audio¶
psifx audio¶
Command-line interface for processing audio tracks.
usage: psifx audio [-h] [--all-help]
{diarization,identification,manipulation,speech,transcription}
...
- -h, --help¶
show this help message and exit
- --all-help¶
show help recursively and exit
psifx audio diarization¶
Command-line interface for diarizing audio tracks.
usage: psifx audio diarization [-h] [--all-help] {pyannote,visualization} ...
- -h, --help¶
show this help message and exit
- --all-help¶
show help recursively and exit
psifx audio diarization pyannote¶
Command-line interface for running pyannote diarization tool.
usage: psifx audio diarization pyannote [-h] [--all-help]
{inference,visualization} ...
- -h, --help¶
show this help message and exit
- --all-help¶
show help recursively and exit
psifx audio diarization pyannote inference¶
Command-line interface for diarizing an audio track with pyannote.
usage: psifx audio diarization pyannote inference [-h] --audio AUDIO
--diarization DIARIZATION
[--num_speakers NUM_SPEAKERS]
[--model_name MODEL_NAME]
[--api_token API_TOKEN]
[--device DEVICE]
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --audio <audio>¶
path to the input audio file, such as
/path/to/audio.wav
- --diarization <diarization>¶
path to the output diarization file, such as
/path/to/diarization.rttm
- --num_speakers <num_speakers>¶
number of speaking participants; if omitted, the model will try to guess it, so specifying it is advised
- --model_name <model_name>¶
name of the diarization model to use, cf. https://huggingface.co/pyannote/speaker-diarization/tree/main/reproducible_research
- --api_token <api_token>¶
API token for downloading the models from Hugging Face
- --device <device>¶
device on which to run the inference, either ‘cpu’ or ‘cuda’
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
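To illustrate how the options above combine, the following Python sketch assembles the argument list for a diarization run. The helper name `diarization_command` and the example paths are hypothetical; only the flags come from the usage text above. The sketch builds the command without executing it. Passing the list to `subprocess.run(cmd, check=True)` would run it, assuming `psifx` is installed and on the PATH.

```python
from typing import List, Optional


def diarization_command(
    audio: str,
    diarization: str,
    num_speakers: Optional[int] = None,
    device: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `psifx audio diarization pyannote inference`."""
    cmd = [
        "psifx", "audio", "diarization", "pyannote", "inference",
        "--audio", audio,
        "--diarization", diarization,
    ]
    # Optional flags are appended only when a value is given,
    # so psifx falls back to its own defaults otherwise.
    if num_speakers is not None:
        cmd += ["--num_speakers", str(num_speakers)]
    if device is not None:
        cmd += ["--device", device]
    return cmd


cmd = diarization_command(
    "/path/to/audio.wav", "/path/to/diarization.rttm", num_speakers=2
)
print(" ".join(cmd))
```

Passing `num_speakers` explicitly follows the advice given for `--num_speakers` above.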
psifx audio diarization pyannote visualization¶
Command-line interface for visualizing the diarization of a track.
usage: psifx audio diarization pyannote visualization [-h] --diarization
DIARIZATION
--visualization
VISUALIZATION
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --diarization <diarization>¶
path to the input diarization file, such as
/path/to/diarization.rttm
- --visualization <visualization>¶
path to the output visualization file, such as
/path/to/visualization.png
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
psifx audio diarization visualization¶
Command-line interface for visualizing the diarization of a track.
usage: psifx audio diarization visualization [-h] --diarization DIARIZATION
--visualization VISUALIZATION
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --diarization <diarization>¶
path to the input diarization file, such as
/path/to/diarization.rttm
- --visualization <visualization>¶
path to the output visualization file, such as
/path/to/visualization.png
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
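The visualization command takes one input and one output path. A minimal Python sketch, with a hypothetical helper name `visualization_command` and placeholder paths, shows how the optional `--overwrite` switch could be added only when requested:

```python
import shlex
from typing import List


def visualization_command(
    diarization: str, visualization: str, overwrite: bool = False
) -> List[str]:
    """Build the argv list for `psifx audio diarization visualization`."""
    cmd = [
        "psifx", "audio", "diarization", "visualization",
        "--diarization", diarization,
        "--visualization", visualization,
    ]
    # --overwrite is a plain switch: it takes no value.
    if overwrite:
        cmd.append("--overwrite")
    return cmd


# shlex.join quotes arguments so the printed line can be pasted into a shell.
print(shlex.join(visualization_command(
    "/path/to/diarization.rttm", "/path/to/visualization.png", overwrite=True
)))
```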
psifx audio identification¶
Command-line interface for identifying speakers in audio tracks.
usage: psifx audio identification [-h] [--all-help] {pyannote} ...
- -h, --help¶
show this help message and exit
- --all-help¶
show help recursively and exit
psifx audio identification pyannote¶
Command-line interface for running pyannote identification tool.
usage: psifx audio identification pyannote [-h] [--all-help] {inference} ...
- -h, --help¶
show this help message and exit
- --all-help¶
show help recursively and exit
psifx audio identification pyannote inference¶
Command-line interface for identifying speakers from an audio track with pyannote.
usage: psifx audio identification pyannote inference [-h] --mixed_audio
MIXED_AUDIO --diarization
DIARIZATION --mono_audios
MONO_AUDIOS
[MONO_AUDIOS ...]
--identification
IDENTIFICATION
[--model_names MODEL_NAMES [MODEL_NAMES ...]]
[--api_token API_TOKEN]
[--device DEVICE]
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --mixed_audio <mixed_audio>¶
path to the input mixed audio file, such as
/path/to/mixed-audio.wav
- --diarization <diarization>¶
path to the input diarization file, such as
/path/to/diarization.rttm
- --mono_audios <mono_audios>¶
paths to the input mono audio files, such as
/path/to/mono-audio-1.wav /path/to/mono-audio-2.wav
- --identification <identification>¶
path to the output identification file, such as
/path/to/identification.json
- --model_names <model_names>¶
names of the embedding models
- --api_token <api_token>¶
API token for downloading the models from Hugging Face
- --device <device>¶
device on which to run the inference, either ‘cpu’ or ‘cuda’
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
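Because `--mono_audios` accepts one or more paths, each path becomes its own argument after the flag. The sketch below shows one way to assemble such a call in Python; `identification_command` and the file names are illustrative assumptions, not part of psifx.

```python
from typing import List, Sequence


def identification_command(
    mixed_audio: str,
    diarization: str,
    mono_audios: Sequence[str],
    identification: str,
) -> List[str]:
    """Build the argv list for `psifx audio identification pyannote inference`."""
    # The usage line shows MONO_AUDIOS [MONO_AUDIOS ...], i.e. at least one path.
    if not mono_audios:
        raise ValueError("at least one mono audio path is required")
    return [
        "psifx", "audio", "identification", "pyannote", "inference",
        "--mixed_audio", mixed_audio,
        "--diarization", diarization,
        "--mono_audios", *mono_audios,  # one argv entry per file
        "--identification", identification,
    ]


cmd = identification_command(
    "/path/to/mixed-audio.wav",
    "/path/to/diarization.rttm",
    ["/path/to/mono-audio-1.wav", "/path/to/mono-audio-2.wav"],
    "/path/to/identification.json",
)
print(" ".join(cmd))
```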
psifx audio manipulation¶
Command-line interface for manipulating audio tracks.
usage: psifx audio manipulation [-h] [--all-help]
{extraction,conversion,split,mixdown,normalization,trim}
...
- -h, --help¶
show this help message and exit
- --all-help¶
show help recursively and exit
psifx audio manipulation conversion¶
Command-line interface for converting any audio track to a mono audio track at a 16 kHz sample rate.
usage: psifx audio manipulation conversion [-h] --audio AUDIO --mono_audio
MONO_AUDIO
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --audio <audio>¶
path to the input audio file, such as
/path/to/audio.wav
(or .mp3, etc.)
- --mono_audio <mono_audio>¶
path to the output audio file, such as
/path/to/mono-audio.wav
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
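When several recordings need converting, the command can be generated once per file. The following sketch derives each output name from its input; the `-mono.wav` suffix, the helper `conversion_command`, and the `/data/...` paths are assumptions made for illustration only.

```python
from pathlib import Path
from typing import List


def conversion_command(audio: Path) -> List[str]:
    """Build the argv list for `psifx audio manipulation conversion`."""
    # Hypothetical naming scheme: recording.mp3 -> recording-mono.wav
    mono_audio = audio.with_name(f"{audio.stem}-mono.wav")
    return [
        "psifx", "audio", "manipulation", "conversion",
        "--audio", str(audio),
        "--mono_audio", str(mono_audio),
    ]


inputs = [Path("/data/a.mp3"), Path("/data/b.wav")]
commands = [conversion_command(p) for p in inputs]
for command in commands:
    print(" ".join(command))
```

Appending a suffix keeps the output path distinct from the input even when the input is already a `.wav` file.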
psifx audio manipulation extraction¶
Command-line interface for extracting the audio track from a video.
usage: psifx audio manipulation extraction [-h] --video VIDEO --audio AUDIO
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --video <video>¶
path to the input video file, such as
/path/to/video.mp4
(or .avi, .mkv, etc.)
- --audio <audio>¶
path to the output audio file, such as
/path/to/audio.wav
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
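A script that calls the extraction command may want to check that the executable exists before launching it. The sketch below is one possible approach using only the Python standard library; `extraction_command` and `run_if_available` are hypothetical helper names, and the paths are placeholders.

```python
import shutil
import subprocess
from typing import List


def extraction_command(video: str, audio: str) -> List[str]:
    """Build the argv list for `psifx audio manipulation extraction`."""
    return [
        "psifx", "audio", "manipulation", "extraction",
        "--video", video,
        "--audio", audio,
    ]


def run_if_available(cmd: List[str]) -> bool:
    """Run cmd only if its executable is on PATH; return whether it ran."""
    if shutil.which(cmd[0]) is None:
        return False
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(cmd, check=True)
    return True


cmd = extraction_command("/path/to/video.mp4", "/path/to/audio.wav")
print(" ".join(cmd))  # printed only; call run_if_available(cmd) to execute
```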
psifx audio manipulation mixdown¶
Command-line interface for mixing multiple mono audio tracks.
usage: psifx audio manipulation mixdown [-h] --mono_audios MONO_AUDIOS
[MONO_AUDIOS ...] --mixed_audio
MIXED_AUDIO
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --mono_audios <mono_audios>¶
paths to the input mono audio files, such as
/path/to/mono-audio-1.wav /path/to/mono-audio-2.wav
- --mixed_audio <mixed_audio>¶
path to the output mixed audio file, such as
/path/to/mixed-audio.wav
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
psifx audio manipulation normalization¶
Command-line interface for normalizing an audio track.
usage: psifx audio manipulation normalization [-h] --audio AUDIO
--normalized_audio
NORMALIZED_AUDIO
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --audio <audio>¶
path to the input audio file, such as
/path/to/audio.wav
- --normalized_audio <normalized_audio>¶
path to the output normalized audio file, such as
/path/to/normalized-audio.wav
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
psifx audio manipulation split¶
Command-line interface for splitting a stereo audio track into two mono tracks.
usage: psifx audio manipulation split [-h] --stereo_audio STEREO_AUDIO
--left_audio LEFT_AUDIO --right_audio
RIGHT_AUDIO
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --stereo_audio <stereo_audio>¶
path to the input stereo audio file, such as
/path/to/stereo-audio.wav
- --left_audio <left_audio>¶
path to the output left channel mono audio file, such as
/path/to/left-audio.wav
- --right_audio <right_audio>¶
path to the output right channel mono audio file, such as
/path/to/right-audio.wav
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
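The split and mixdown commands can be chained: the two mono files produced by `split` match the kind of input that `mixdown` expects. The sketch below only prints the two commands in order. The helper names are hypothetical, and whether such a round trip is useful depends on the recording setup.

```python
from typing import List, Sequence


def split_command(stereo_audio: str, left_audio: str, right_audio: str) -> List[str]:
    """Build the argv list for `psifx audio manipulation split`."""
    return [
        "psifx", "audio", "manipulation", "split",
        "--stereo_audio", stereo_audio,
        "--left_audio", left_audio,
        "--right_audio", right_audio,
    ]


def mixdown_command(mono_audios: Sequence[str], mixed_audio: str) -> List[str]:
    """Build the argv list for `psifx audio manipulation mixdown`."""
    return [
        "psifx", "audio", "manipulation", "mixdown",
        "--mono_audios", *mono_audios,
        "--mixed_audio", mixed_audio,
    ]


left, right = "/path/to/left-audio.wav", "/path/to/right-audio.wav"
steps = [
    split_command("/path/to/stereo-audio.wav", left, right),
    mixdown_command([left, right], "/path/to/mixed-audio.wav"),
]
for step in steps:
    print(" ".join(step))
```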
psifx audio manipulation trim¶
Command-line interface for trimming an audio track.
usage: psifx audio manipulation trim [-h] --audio AUDIO --trimmed_audio
TRIMMED_AUDIO [--start_time START_TIME]
[--end_time END_TIME]
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --audio <audio>¶
path to the input audio file, such as
/path/to/audio.wav
- --trimmed_audio <trimmed_audio>¶
path to the output trimmed audio file, such as
/path/to/trimmed-audio.wav
- --start_time <start_time>¶
start time in seconds (None to keep from beginning)
- --end_time <end_time>¶
end time in seconds (None to keep until end)
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
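Both time bounds are optional, so a wrapper can omit either flag to keep the track from its beginning or until its end. The sketch below adds a local sanity check that the start precedes the end. That check is an assumption of this example; the help text above does not say how psifx itself handles such input.

```python
from typing import List, Optional


def trim_command(
    audio: str,
    trimmed_audio: str,
    start_time: Optional[float] = None,
    end_time: Optional[float] = None,
) -> List[str]:
    """Build the argv list for `psifx audio manipulation trim`."""
    # Local check only; not a documented psifx behavior.
    if start_time is not None and end_time is not None and start_time >= end_time:
        raise ValueError("start_time must be smaller than end_time")
    cmd = [
        "psifx", "audio", "manipulation", "trim",
        "--audio", audio,
        "--trimmed_audio", trimmed_audio,
    ]
    # Times are given in seconds; omitted bounds are simply not passed.
    if start_time is not None:
        cmd += ["--start_time", str(start_time)]
    if end_time is not None:
        cmd += ["--end_time", str(end_time)]
    return cmd


cmd = trim_command("/path/to/audio.wav", "/path/to/trimmed-audio.wav", start_time=12.5)
print(" ".join(cmd))
```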
psifx audio speech¶
Command-line interface for extracting non-verbal speech features from an audio track.
usage: psifx audio speech [-h] [--all-help] {opensmile} ...
- -h, --help¶
show this help message and exit
- --all-help¶
show help recursively and exit
psifx audio speech opensmile¶
Command-line interface for running openSMILE.
usage: psifx audio speech opensmile [-h] [--all-help] {inference} ...
- -h, --help¶
show this help message and exit
- --all-help¶
show help recursively and exit
psifx audio speech opensmile inference¶
Command-line interface for extracting non-verbal speech features from an audio track with openSMILE.
usage: psifx audio speech opensmile inference [-h] --audio AUDIO --diarization
DIARIZATION --features FEATURES
[--feature_set FEATURE_SET]
[--feature_level FEATURE_LEVEL]
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --audio <audio>¶
path to the input audio file, such as
/path/to/audio.wav
- --diarization <diarization>¶
path to the input diarization file, such as
/path/to/diarization.rttm
- --features <features>¶
path to the output feature archive, such as
/path/to/opensmile.tar.gz
- --feature_set <feature_set>¶
available sets: [‘ComParE_2016’, ‘GeMAPSv01a’, ‘GeMAPSv01b’, ‘eGeMAPSv01a’, ‘eGeMAPSv01b’, ‘eGeMAPSv02’, ‘emobase’]
- --feature_level <feature_level>¶
available levels: [‘lld’, ‘lld_de’, ‘func’]
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
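Since `--feature_set` and `--feature_level` accept only the values listed above, a wrapper can reject typos before launching a potentially long run. The following sketch copies the allowed values from the help text; the helper name `opensmile_command` is hypothetical.

```python
from typing import List, Optional

# Values copied from the option descriptions above.
FEATURE_SETS = (
    "ComParE_2016", "GeMAPSv01a", "GeMAPSv01b",
    "eGeMAPSv01a", "eGeMAPSv01b", "eGeMAPSv02", "emobase",
)
FEATURE_LEVELS = ("lld", "lld_de", "func")


def opensmile_command(
    audio: str,
    diarization: str,
    features: str,
    feature_set: Optional[str] = None,
    feature_level: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `psifx audio speech opensmile inference`."""
    if feature_set is not None and feature_set not in FEATURE_SETS:
        raise ValueError(f"unknown feature set: {feature_set!r}")
    if feature_level is not None and feature_level not in FEATURE_LEVELS:
        raise ValueError(f"unknown feature level: {feature_level!r}")
    cmd = [
        "psifx", "audio", "speech", "opensmile", "inference",
        "--audio", audio,
        "--diarization", diarization,
        "--features", features,
    ]
    if feature_set is not None:
        cmd += ["--feature_set", feature_set]
    if feature_level is not None:
        cmd += ["--feature_level", feature_level]
    return cmd


cmd = opensmile_command(
    "/path/to/audio.wav", "/path/to/diarization.rttm", "/path/to/opensmile.tar.gz",
    feature_set="eGeMAPSv02", feature_level="func",
)
print(" ".join(cmd))
```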
psifx audio transcription¶
Command-line interface for transcribing audio tracks.
usage: psifx audio transcription [-h] [--all-help] {whisperx,enhance} ...
- -h, --help¶
show this help message and exit
- --all-help¶
show help recursively and exit
psifx audio transcription enhance¶
Command-line interface for enhancing a transcription with diarization and identification.
usage: psifx audio transcription enhance [-h] --transcription TRANSCRIPTION
--diarization DIARIZATION
--identification IDENTIFICATION
--enhanced_transcription
ENHANCED_TRANSCRIPTION
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --transcription <transcription>¶
path to the input transcription file, such as
/path/to/transcription.vtt
- --diarization <diarization>¶
path to the input diarization file, such as
/path/to/diarization.rttm
- --identification <identification>¶
path to the input identification file, such as
/path/to/identification.json
- --enhanced_transcription <enhanced_transcription>¶
path to the output transcription file, such as
/path/to/enhanced-transcription.vtt
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
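The enhance command needs four paths, three inputs and one output. One way to keep them together is a mapping keyed by flag name, as in this sketch. The function `enhance_command` and the dictionary layout are assumptions of the example.

```python
from typing import Dict, List

# Flag names taken from the usage text above, in the order shown there.
REQUIRED = ("transcription", "diarization", "identification", "enhanced_transcription")


def enhance_command(paths: Dict[str, str]) -> List[str]:
    """Build the argv list for `psifx audio transcription enhance`."""
    missing = [name for name in REQUIRED if name not in paths]
    if missing:
        raise KeyError(f"missing required paths: {', '.join(missing)}")
    cmd = ["psifx", "audio", "transcription", "enhance"]
    for name in REQUIRED:
        cmd += [f"--{name}", paths[name]]
    return cmd


cmd = enhance_command({
    "transcription": "/path/to/transcription.vtt",
    "diarization": "/path/to/diarization.rttm",
    "identification": "/path/to/identification.json",
    "enhanced_transcription": "/path/to/enhanced-transcription.vtt",
})
print(" ".join(cmd))
```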
psifx audio transcription whisperx¶
Command-line interface for running WhisperX.
usage: psifx audio transcription whisperx [-h] [--all-help] {inference} ...
- -h, --help¶
show this help message and exit
- --all-help¶
show help recursively and exit
psifx audio transcription whisperx inference¶
Command-line interface for transcribing an audio track with WhisperX.
usage: psifx audio transcription whisperx inference [-h] --audio AUDIO
--transcription
TRANSCRIPTION
[--language LANGUAGE]
[--model_name MODEL_NAME]
[--translate_to_english | --no-translate_to_english]
[--batch_size BATCH_SIZE]
[--device DEVICE]
[--overwrite | --no-overwrite]
[--verbose | --no-verbose]
- -h, --help¶
show this help message and exit
- --audio <audio>¶
path to the input audio file, such as
/path/to/audio.wav
- --transcription <transcription>¶
path to the output transcription file, such as
/path/to/transcription.vtt
- --language <language>¶
language of the audio; if omitted, the model will try to guess it, so specifying it is advised
- --model_name <model_name>¶
size of the model to use (tiny, tiny.en, base, base.en, small, small.en, distil-small.en, medium, medium.en, distil-medium.en, large-v1, large-v2, large-v3, large, distil-large-v2, distil-large-v3, large-v3-turbo, or turbo), a path to a converted model directory, or a CTranslate2-converted Whisper model ID from the HF Hub
- --translate_to_english, --no-translate_to_english¶
whether to transcribe the audio in its original language or to translate it to English
- --batch_size <batch_size>¶
batch size, reduce if low on GPU memory
- --device <device>¶
device on which to run the inference, either ‘cpu’ or ‘cuda’
- --overwrite, --no-overwrite¶
overwrite existing files, otherwise raises an error
- --verbose, --no-verbose¶
verbosity of the script
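The paired `--translate_to_english` / `--no-translate_to_english` switch can be modeled as a three-state value: on, off, or left to the psifx default. The sketch below follows that idea. The helper name is hypothetical, and the `"fr"` language value is an assumed example, since the accepted format for `--language` is not stated above.

```python
from typing import List, Optional


def whisperx_command(
    audio: str,
    transcription: str,
    language: Optional[str] = None,
    translate_to_english: Optional[bool] = None,
    batch_size: Optional[int] = None,
    device: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `psifx audio transcription whisperx inference`."""
    cmd = [
        "psifx", "audio", "transcription", "whisperx", "inference",
        "--audio", audio,
        "--transcription", transcription,
    ]
    if language is not None:
        cmd += ["--language", language]
    # None leaves the choice to psifx; True/False pick one of the paired switches.
    if translate_to_english is True:
        cmd.append("--translate_to_english")
    elif translate_to_english is False:
        cmd.append("--no-translate_to_english")
    if batch_size is not None:
        cmd += ["--batch_size", str(batch_size)]
    if device is not None:
        cmd += ["--device", device]
    return cmd


cmd = whisperx_command(
    "/path/to/audio.wav", "/path/to/transcription.vtt",
    language="fr",  # assumed value format
    translate_to_english=False,
    batch_size=4,  # a smaller batch may help when GPU memory is limited
    device="cuda",
)
print(" ".join(cmd))
```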