tool
pyannote speaker identification tool.
Functions
cropped_waveform – Crops an audio track and returns its corresponding waveform.
Classes
PyannoteIdentificationTool – pyannote speaker identification tool.
- class PyannoteIdentificationTool(model_names, api_token=None, device='cpu', overwrite=False, verbose=True)
Bases: IdentificationTool
pyannote speaker identification tool.
- Parameters:
model_names (Sequence[str]) – The names of the models to use.
api_token (Optional[str]) – The HuggingFace API token to use.
device (str) – The device on which the computation is executed.
overwrite (bool) – Whether to overwrite existing files; if not, an error is raised instead.
verbose (Union[bool, int]) – Whether to execute the computation verbosely.
- inference(mixed_audio_path, diarization_path, mono_audio_paths, identification_path)
pyannote-backed inference method.
- Parameters:
mixed_audio_path (Union[str, Path]) – Path to the mixed audio track.
diarization_path (Union[str, Path]) – Path to the diarization file.
mono_audio_paths (Sequence[Union[str, Path]]) – Paths to the mono audio tracks.
identification_path (Union[str, Path]) – Path to the identification file.
- Returns:
- cropped_waveform(path, start, end, sample_rate=32000)
Crops an audio track and returns its corresponding waveform.
- Parameters:
path (Union[str, Path]) – Path to the audio track.
start (float) – Start of the segment in seconds.
end (float) – End of the segment in seconds.
sample_rate (int) – Sample rate of the audio track.
- Return type:
Tensor
- Returns:
Tensor containing the waveform of the audio segment.
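The relationship between start, end, and sample_rate can be illustrated with a small sketch. It is not the library's implementation. The helper name crop_bounds is hypothetical, and rounding to the nearest sample is an assumption. The sketch only converts a segment given in seconds into a frame offset and a frame count, the form that audio loaders such as torchaudio.load accept through their frame_offset and num_frames arguments.

```python
from typing import Tuple


def crop_bounds(start: float, end: float, sample_rate: int = 32000) -> Tuple[int, int]:
    """Convert a segment in seconds to (frame_offset, num_frames).

    Hypothetical helper for illustration only; rounding to the nearest
    sample is an assumption, not documented behaviour of cropped_waveform.
    """
    if start < 0:
        raise ValueError("start must be non-negative")
    if end < start:
        raise ValueError("end must not precede start")
    # Index of the first sample inside the segment.
    frame_offset = int(round(start * sample_rate))
    # Number of samples between the segment start and the segment end.
    num_frames = int(round(end * sample_rate)) - frame_offset
    return frame_offset, num_frames


if __name__ == "__main__":
    # A half-second segment starting at 1.5 s, at the default sample rate.
    print(crop_bounds(1.5, 2.0))
```

Under these assumptions, a waveform cropped from a mono track would hold num_frames samples along its time dimension.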