Separate multiple speaker in mono audio files

Mai 2025 Release

Speaker Diarization - Voice Harbor V3.0

Changelog

We are excited to unveil Monster, our new speaker diarization system, Nijta’s latest innovation in audio segmentation that redefines both accuracy and efficiency. This first release marks a major step forward in speaker diarization, offering precise segmentation, multilingual support, and robust performance on your noisy data. From medical conversations to customer service calls and teleconference, our advanced model adapts to various acoustic conditions, ensuring high accuracy where other diarization systems fall short.

Diarization Benchmark – CALLHOME Eval Set

Model / Dataset	de	en	sp	jp	zh
Pyanote 3.1	23	31	34	42	26
Voice Harbor V3.0	26	31	35	43	26
revAI	27	35	38	44	33
Pyanote AI	19	16	18	21	17
Voice Harbor V3.0-Large	18	22	26	31	22

Bolded values indicate best performance for the language.

Features

Precise speaker diarization
Seamless handling of overlapping speech
Robust performance across varied acoustic environments
Unlimited number of speakers

Diarize and Redact PHI in audio, followed by transcription using Python SDK

Set Up the Client and Create a Job Once you have the necessary setup, you can create a job on the server and interact with the API. Here’s how to do it:

BASE_URL = "https://voiceharbor.ai"
usage_token = "your_usage_token_here"

# Create a new job on the server via the class method.
# Create a new job immediately.
job_id = VoiceHarborClient.create_job(BASE_URL, usage_token)
print(f"Job created with ID: {job_id}")

Next, initialize the client and define the parameters for your job, including the files you want to submit for transcription.

# Initialize the client with the new job_id and input directory.
client = VoiceHarborClient(
    base_url=BASE_URL,
    job_id=job_id,
    token=usage_token,  # Ensure correct token is used here
    inputs_dir="./inputs"
)

# Build job parameters (e.g., list of agents and any other details).
job_params = {
    "model": "advanced",
    "diar": True
    "files": []  # The submit_files method will append uploaded file names.
}

# Upload input files.
job_params = client.submit_files(job_params)

Now, submit the job and wait for the transcription results. Here’s how to complete the process:

# Immediately submit the job file (YAML)
job_file = client.submit_job(job_params)
print(f"Job file submitted: {job_file}")

# Wait for and download the processed results.
downloaded_results = client.download_results(output_dir="./results")
print("Downloaded results:", downloaded_results)

PHI and Biometric reduction

Build with Voice Harbor

Speech to Text

Speaker Diarization

Playground

Separate multiple speaker in mono audio files

Changelog

Diarization Benchmark – CALLHOME Eval Set

Features

Diarize and Redact PHI in audio, followed by transcription using Python SDK

PHI and Biometric reduction

Build with Voice Harbor

Speech to Text

Speaker Diarization

Playground

​Changelog

​Diarization Benchmark – CALLHOME Eval Set

​Features

​Diarize and Redact PHI in audio, followed by transcription using Python SDK

Changelog

Diarization Benchmark – CALLHOME Eval Set

Features

Diarize and Redact PHI in audio, followed by transcription using Python SDK