tool
pyannote speaker identification tool.
Functions
cropped_waveform | Crops an audio track and returns its corresponding waveform.
Classes
PyannoteIdentificationTool | pyannote speaker identification tool.
- class PyannoteIdentificationTool(model_names, api_token=None, device='cpu', overwrite=False, verbose=True)
Bases: IdentificationTool
pyannote speaker identification tool.
- Parameters:
  - model_names (Sequence[str]) – The names of the models to use.
  - api_token (Optional[str]) – The HuggingFace API token to use.
  - device (str) – The device where the computation should be executed.
  - overwrite (bool) – Whether to overwrite existing files, otherwise raise an error.
  - verbose (Union[bool, int]) – Whether to execute the computation verbosely.
- inference(mixed_audio_path, diarization_path, mono_audio_paths, identification_path)
pyannote-backed inference method.
- Parameters:
  - mixed_audio_path (Union[str, Path]) – Path to the mixed audio track.
  - diarization_path (Union[str, Path]) – Path to the diarization file.
  - mono_audio_paths (Sequence[Union[str, Path]]) – Paths to the mono audio tracks.
  - identification_path (Union[str, Path]) – Path to the identification file.
- Returns:
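The argument shapes expected by inference can be sketched as follows. This is only an illustration of the documented parameter types, not part of the library: the helper name `build_inference_args`, the file names, and the rule that at least one mono track is required are all assumptions. The check against a bare string is worth noting because a single `str` also satisfies `Sequence`, so it would otherwise be iterated character by character.

```python
from pathlib import Path
from typing import Dict, List, Sequence, Union

PathLike = Union[str, Path]


def build_inference_args(
    mixed_audio_path: PathLike,
    diarization_path: PathLike,
    mono_audio_paths: Sequence[PathLike],
    identification_path: PathLike,
) -> Dict[str, Union[Path, List[Path]]]:
    """Hypothetical helper: normalize the four inference arguments to Path objects."""
    # A lone str or Path is not a sequence of tracks; reject it early.
    if isinstance(mono_audio_paths, (str, Path)):
        raise TypeError("mono_audio_paths must be a sequence of paths, not a single path")
    mono = [Path(p) for p in mono_audio_paths]
    # Assumption: identification needs at least one mono reference track.
    if not mono:
        raise ValueError("expected at least one mono audio track")
    return {
        "mixed_audio_path": Path(mixed_audio_path),
        "diarization_path": Path(diarization_path),
        "mono_audio_paths": mono,
        "identification_path": Path(identification_path),
    }


# Hypothetical file names, chosen only for illustration.
args = build_inference_args(
    "session/mixed.wav",
    "session/diarization.txt",
    ["session/speaker_a.wav", "session/speaker_b.wav"],
    "session/identification.txt",
)
# With a constructed tool instance, the call would then be: tool.inference(**args)
```

Keeping the keys identical to the documented parameter names lets the dictionary be unpacked directly into the method call.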
- cropped_waveform(path, start, end, sample_rate=32000)
Crops an audio track and returns its corresponding waveform.
- Parameters:
  - path (Union[str, Path]) – Path to the audio track.
  - start (float) – Start of the segment in seconds.
  - end (float) – End of the segment in seconds.
  - sample_rate (int) – Sample rate of the audio track.
- Return type:
  Tensor
- Returns:
  Tensor containing the waveform of the audio segment.
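The relationship between start, end, and sample_rate can be illustrated with a minimal sketch. It does not reproduce how cropped_waveform loads or crops audio, and it works on a plain Python sequence rather than a Tensor; the helper names `segment_to_sample_range` and `crop_samples` and the rounding and validation choices are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def segment_to_sample_range(
    start: float, end: float, sample_rate: int = 32000
) -> Tuple[int, int]:
    """Hypothetical helper: convert a segment in seconds to sample indices."""
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    # Seconds times samples-per-second gives a sample index.
    return round(start * sample_rate), round(end * sample_rate)


def crop_samples(
    samples: Sequence[float], start: float, end: float, sample_rate: int = 32000
) -> List[float]:
    """Hypothetical helper: return the samples that fall inside [start, end)."""
    first, last = segment_to_sample_range(start, end, sample_rate)
    return list(samples[first:last])


print(segment_to_sample_range(1.0, 2.5))  # (32000, 80000)
```

At the default rate of 32000 samples per second, a segment from 1.0 s to 2.5 s therefore covers 48000 samples.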