tool

pyannote speaker identification tool.

Functions

cropped_waveform – Crops an audio track and returns its corresponding waveform.

Classes

PyannoteIdentificationTool – pyannote speaker identification tool.

class PyannoteIdentificationTool(model_names, api_token=None, device='cpu', overwrite=False, verbose=True)

pyannote speaker identification tool.

Parameters:

model_names (Sequence[str]) – The names of the models to use.
api_token (Optional[str]) – The HuggingFace API token to use.
device (str) – The device where the computation should be executed.
overwrite (bool) – Whether to overwrite existing files; if False, an error is raised instead.
verbose (Union[bool, int]) – Whether to execute the computation verbosely.
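
The overwrite behaviour described above can be illustrated with a minimal sketch. The helper name check_output_path and the choice of FileExistsError are assumptions made for illustration; the exception the tool actually raises is not stated here.

```python
from pathlib import Path


def check_output_path(path, overwrite=False):
    """Hypothetical helper: refuse to replace an existing file unless overwrite is set."""
    path = Path(path)
    if path.exists() and not overwrite:
        # Assumption: the real tool may raise a different exception type.
        raise FileExistsError(f"{path} already exists; pass overwrite=True to replace it")
    return path
```

With overwrite=False, a second run that targets the same output file fails early instead of silently replacing the earlier result.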

inference(mixed_audio_path, diarization_path, mono_audio_paths, identification_path)

pyannote-backed inference method.

Parameters:

mixed_audio_path (Union[str, Path]) – Path to the mixed audio track.
diarization_path (Union[str, Path]) – Path to the diarization file.
mono_audio_paths (Sequence[Union[str, Path]]) – Paths to the mono audio tracks.
identification_path (Union[str, Path]) – Path to the identification file.

Returns:

cropped_waveform(path, start, end, sample_rate=32000)

Crops an audio track and returns its corresponding waveform.

Parameters:

Return type:

Tensor

Returns:

Tensor containing the waveform of the audio segment.
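
The cropping step can be sketched as follows. This is not the library's implementation: the real function reads audio from path and returns a Tensor, whereas this sketch works on an in-memory list of samples. It also assumes that start and end are given in seconds, which the reference does not state, and crop_samples is a hypothetical name.

```python
def crop_samples(samples, start, end, sample_rate=32000):
    """Return the samples between start and end (assumed to be in seconds)."""
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    # Convert the time bounds to sample indices at the given rate.
    first = int(round(start * sample_rate))
    last = int(round(end * sample_rate))
    # Slicing clamps to the sequence length, so an end past the track is tolerated.
    return samples[first:last]


# One second of silence at the default rate, cropped to 0.25 s .. 0.75 s.
track = [0.0] * 32000
segment = crop_samples(track, 0.25, 0.75)
print(len(segment))  # 16000
```

Under these assumptions, a segment of duration end - start yields roughly (end - start) * sample_rate samples.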