Parameters

File parameters

files
list
required

List of local input audio file name(s).

files:
- filename_XYZ.mp3
- filename_XYZ.wav
prefix
string
default:""

Apply the task only to files that have the given prefix in their file name.

prefix: XYZ

Task, model and agent parameters

Below is a table summarizing the output files for each task mode.

Task           Output           Explanation
transcribe     .json and .wav   Full text transcript, and the original audio
phi            .json and .wav   Transcript plus PHI spans, and the original audio with all PHI segments muted
biometric      .json and .wav   Transcript, and audio with a new synthetic identity, gender, and age
phi-biometric  .json and .wav   Transcript plus PHI spans, and audio with PHI removed and a new synthetic identity, gender, and age
task
string
default:"protect"

Task to apply to the attached files. Available tasks:

  • transcribe (Transcription)
  • phi (transcribe + PHI redaction)
  • biometric (Speech to speech voice print redaction)
  • phi-biometric (transcribe + phi + biometric)
task: transcribe
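
The four task modes listed above combine two independent choices: whether PHI should be redacted and whether the voice print should be replaced. A minimal sketch of that mapping follows; `pick_task` is a hypothetical helper written for illustration and is not part of the SDK.

```python
def pick_task(redact_phi: bool, anonymise_voice: bool) -> str:
    """Return the task name matching the requested protections.

    Hypothetical helper: only the four task names come from the
    parameter reference; the function itself is an illustration.
    """
    if redact_phi and anonymise_voice:
        return "phi-biometric"
    if redact_phi:
        return "phi"
    if anonymise_voice:
        return "biometric"
    return "transcribe"


# Example: PHI redaction only, original voice kept.
job_params = {"files": ["filename_XYZ.wav"], "task": pick_task(True, False)}
```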
model
string
default:"mini"

Model tier to use for PHI redaction. Pass advanced to use advanced reasoning via our private LLM. Available values:

  • mini
  • advanced
Example:
model: mini
model: advanced
agents
list
default:"health-generic"

Agents with a specific domain and purpose. Available values:

  • hipaa
  • health-generic
  • clinical
model: advanced
agents:
- hipaa
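
When job parameters are assembled in code, it can help to check the model and agents values against the lists above before submitting. The sketch below assumes a hypothetical helper `phi_job_params`; the allowed values are copied from this reference, and whether the service enforces them in the same way is not confirmed here.

```python
# Allowed values copied from the parameter reference above.
MODELS = ("mini", "advanced")
AGENTS = ("hipaa", "health-generic", "clinical")


def phi_job_params(files, model="mini", agents=("health-generic",)):
    """Build a parameter dict for a PHI redaction job (hypothetical helper)."""
    if model not in MODELS:
        raise ValueError(f"model must be one of {MODELS}, got {model!r}")
    unknown = [agent for agent in agents if agent not in AGENTS]
    if unknown:
        raise ValueError(f"unknown agents: {unknown}")
    return {
        "files": list(files),
        "task": "phi",
        "model": model,
        "agents": list(agents),
    }


params = phi_job_params(["filename_XYZ.mp3"], model="advanced", agents=["hipaa"])
```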

Diarization and transcription parameters

language
string
default:"en"

Set the target language for transcription. Omit this parameter to detect the language automatically among the supported languages; set it to force a target language.

language: en
code-switch
boolean
default:"false"

Whether automatic code-switching should be applied to your files. Languages supported for code-switching transcription:

  • de
  • en
  • fr
  • sp
code-switch: true
diar
boolean
default:"false"

Whether speaker diarization should be applied to your files.

diar: true
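
In Python, the `code-switch` key contains a hyphen, so it cannot be passed as a keyword argument and has to be written as a string key in the parameter dict. The sketch below uses a hypothetical helper, `transcription_params`, which leaves out `language` when automatic detection is wanted, following the description of that parameter above.

```python
def transcription_params(language=None, code_switch=False, diar=False):
    """Build the diarization and transcription part of a job (hypothetical helper).

    language=None means the key is left out so the language is detected
    automatically, as described for the language parameter.
    """
    params = {"code-switch": code_switch, "diar": diar}
    if language is not None:
        params["language"] = language
    return params


# Force German transcription with speaker diarization enabled.
params = transcription_params(language="de", diar=True)
```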

Biometric parameters

biometric
string
default:"en"

Whether biometric voice print redaction should be applied to your files and, if so, in which language. Available values:

  • en
  • fr
biometric: en
biometric_gender
string
default:"random"

The gender which should be applied to the anonymised version. Available values:

  • random
  • same
  • opposite
biometric_gender: random
biometric_age
string
default:"middle-aged-adult"

The age group which should be applied to the anonymised version. Available values:

  • young-adult (18-39)
  • middle-aged-adult (40-69)
  • same
  • random
biometric_age: middle-aged-adult
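
The three biometric parameters each accept a small, fixed set of values, so a client-side check is straightforward. The following is a sketch under that assumption; `biometric_params` is a hypothetical helper, with defaults and allowed values copied from this reference.

```python
# Allowed values copied from the biometric parameter reference above.
ALLOWED = {
    "biometric": {"en", "fr"},
    "biometric_gender": {"random", "same", "opposite"},
    "biometric_age": {"young-adult", "middle-aged-adult", "same", "random"},
}


def biometric_params(language="en", gender="random", age="middle-aged-adult"):
    """Validate and return the biometric part of a job (hypothetical helper)."""
    params = {
        "biometric": language,
        "biometric_gender": gender,
        "biometric_age": age,
    }
    for name, value in params.items():
        if value not in ALLOWED[name]:
            raise ValueError(
                f"{name} must be one of {sorted(ALLOWED[name])}, got {value!r}"
            )
    return params


# French voice print redaction, keeping the speaker's gender and age group.
params = biometric_params(language="fr", gender="same", age="same")
```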

Submit your Job using Voice Harbor’s SDK

Speech-to-text is applied by default for the task protect. To get the transcription without any redaction, use transcribe as the task.

1. Define your parameters based on data sensitivity.

2. Build Job File

Build your job as a YAML file containing the parameters, or define the same parameters in your target programming language.

Minimal job example:

files:
- filename1.mp3
- filename2.wav
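model: mini

import logging

# VoiceHarborClient is provided by the Voice Harbor SDK;
# import it as described in the SDK's installation instructions.

logger = logging.getLogger(__name__)

BASE_URL = "https://voiceharbor.ai"
usage_token = "USAGE_TOKEN"  # replace with your own usage token

# Create a new job on the server via the class method.
job_id = VoiceHarborClient.create_job(BASE_URL, usage_token)

# Bind a client to the new job.
client = VoiceHarborClient(
    base_url=BASE_URL,
    job_id=job_id,
    token=usage_token,
    inputs_dir="./inputs/tests"
)

# Submit the input files first, then the job file.
# submit_files returns the job parameters to pass on to submit_job.
job_params = {"files": [], "model": "mini"}
job_params = client.submit_files(job_params)
job_file = client.submit_job(job_params)
logger.info("Job file created: %s", job_file)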
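
If the job is defined as a Python dict but a YAML job file is also wanted, the flat structure used in the minimal job example is simple enough to serialize by hand. The sketch below is an illustration only: `dump_job_yaml` is a hypothetical helper, it handles only strings, booleans and flat lists, and it does no quoting, so values containing characters such as `:` or `#` would need a real YAML library (for example PyYAML's `yaml.safe_dump`).

```python
def _scalar(value):
    """Render a scalar the way the job examples write it (true/false for booleans)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def dump_job_yaml(params):
    """Serialize a flat job dict to YAML text (hypothetical helper, no quoting)."""
    lines = []
    for key, value in params.items():
        if isinstance(value, (list, tuple)):
            if not value:
                # An empty list needs the inline form to stay a list in YAML.
                lines.append(f"{key}: []")
                continue
            lines.append(f"{key}:")
            lines.extend(f"- {_scalar(item)}" for item in value)
        else:
            lines.append(f"{key}: {_scalar(value)}")
    return "\n".join(lines) + "\n"


job_yaml = dump_job_yaml(
    {"files": ["filename1.mp3", "filename2.wav"], "model": "mini", "diar": True}
)
print(job_yaml)
```

The returned text can then be written to a file of your choice and kept alongside the input audio.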

Get Started with SDK

Start coding today using Python and integrate the Voice Harbor API into your workflows.

Transcription, Translation, and Protection Use-Case examples