API reference
Authentication
Authenticate each call by sending your Nijta token in the TOKEN request header. Without a valid token you cannot use Voice Harbor.
If you do not own a Nijta token yet, contact [email protected].
Workflow
To use the Voice Harbor API:
- First, start a session using your authentication token.
- Once the session is established, submit your job (anonymizing audio files, transcribing audio files, or masking PII in text) to the appropriate endpoint.
- Each task submission returns a unique task_id which you will use to track the status of your job.
- Periodically check the status of your task.
- Once the task status is finished, you can retrieve the result.
Important Note
Task results remain available on the server for only 5 seconds. Implement a loop that checks the task status frequently and retrieves the result as soon as the task is marked as finished. An example is given at the end of this page. If you do not retrieve the result within this window, the data is lost and the task must be resubmitted.
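The polling loop described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the reference client: the helper names poll_task and fetch_task, their arguments, and the attempt limit are made up for illustration. Only the GET /tasks/{task_id} call, the TOKEN header, and the task_status values come from this page. The fetch step is passed in as a command so that the result is taken from the same response that reports finished, with no second request inside the 5-second window.

```shell
#!/bin/bash
# Hypothetical helper: poll until the task is finished or failed.
# $1: name of a command or function that prints the task JSON
# $2: seconds between checks (default 1, well under the 5-second window)
# $3: maximum number of checks (default 600; an assumed limit, not an API value)
poll_task() {
  local fetch_cmd interval max_tries tries response status
  fetch_cmd=$1
  interval=${2:-1}
  max_tries=${3:-600}
  tries=0
  while [ "$tries" -lt "$max_tries" ]; do
    response=$("$fetch_cmd") || return 2            # the fetch command itself failed
    status=$(printf '%s' "$response" | jq -r '.data.task_status')
    case "$status" in
      finished) printf '%s\n' "$response"; return 0 ;;   # result is in this response
      failed)   printf '%s\n' "$response" >&2; return 1 ;;
    esac
    tries=$((tries + 1))
    sleep "$interval"
  done
  return 3                                           # gave up waiting
}

# Example fetch command using the endpoint and header documented on this page.
# TOKEN and TASK_ID are expected to be set by the caller.
fetch_task() {
  curl -s -X GET "https://api.nijta.com/tasks/$TASK_ID" -H "TOKEN:$TOKEN"
}
```

A caller would then write something like RESPONSE=$(poll_task fetch_task) and decode RESPONSE only when the function returns 0. The attempt limit keeps a stuck task from blocking the script forever.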
Endpoints
1. Start Session
URL
POST /session
Description
Once you have your token, start a new session. You will need to pass the session ID obtained here, together with the token, to submit jobs to the API.
Headers
Content-Type: application/json; charset=utf-8
TOKEN: <YOUR_NIJTA_TOKEN>
Example cURL Command
curl -X POST --url https://api.nijta.com/session \
  -H "Content-Type: application/json; charset=utf-8" \
  -H "TOKEN:<YOUR_NIJTA_TOKEN>"
Response
- session_id (string): The ID of the session if the request is authorized.
- status (string): Status of the request if unauthorized.
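Since the response carries either session_id or status, a script can branch on which field is present. The sketch below assumes that session_id is simply absent or null on an unauthorized request; the helper name session_id_from_response is hypothetical, and the exact text of status is not documented here, so it is printed as received.

```shell
#!/bin/bash
# Hypothetical helper: print the session ID from a POST /session response,
# or report the returned status and fail.
# $1: raw JSON response body
session_id_from_response() {
  local response session_id
  response=$1
  session_id=$(printf '%s' "$response" | jq -r '.session_id // empty')
  if [ -n "$session_id" ]; then
    printf '%s\n' "$session_id"
  else
    # Assumption: an unauthorized request returns "status" instead of "session_id".
    printf 'Session not started: %s\n' \
      "$(printf '%s' "$response" | jq -r '.status // "unknown error"')" >&2
    return 1
  fi
}
```

For example, SESSION_ID=$(session_id_from_response "$SESSION_RESPONSE") || exit 1 stops a script before it submits a job with an empty session ID.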
2. Submit a job
Voice Harbor offers three primary tasks you can submit for processing: anonymizing audio files, transcribing audio files (ASR), and masking PII in text only. Each task requires a different endpoint and set of parameters to ensure accurate processing. Below, you will find detailed instructions and example cURL commands for each task.
Anonymize Audio Files
URL
POST /tasks/{session_id}
Description
Sends a request for anonymizing audio files.
Headers
Content-Type: multipart/form-data
TOKEN: <YOUR_NIJTA_TOKEN>
Parameters
Include the parameters in the request:
- audio files with name and binary content
- The parameters of anonymisation (only language and gender are required)
Parameter | Description |
---|---|
language | 'french_8', 'english_16'. Only French/8kHz and English/16kHz are available in this version. |
gender | choose a gender for the target "pseudo" speakers (values: 'f' or 'm'). |
robotic | (optional) By default, the original variations of the pitch are preserved. With this option turned on, they are removed and the result sounds robotic (values: True, False. Default is False). |
seed | (optional): you can set a seed for reproducibility during evaluation (not recommended in production). |
voice | (default True) Set this parameter to False if you don't wish to transform the original voice. |
content | (default False) Set this parameter to True if you wish to remove sensitive content. |
entities | Give the list of entity types you want to hide. Some examples: Name,Organization,Location,City,Country,Numbers,Age,Date,Credit Card Number,Email,Concept,Product,Event,Technology,Group,Medical Condition,Characteristic,Research,County, ... |
ner_threshold | (optional) Our NER hides entities based on a score. Depending on your needs, you might want to mask more or fewer entities than the default. Set a value between 0 and 1 (default is 0.5): closer to zero hides more entities, while closer to 1 only hides highly scored ones. |
regex_entities | (optional) In addition to NER, you can also mask entities based on regular expressions. Format: add the key 'regex_entities' to the params with a value in the following format: [ ('regex1', 'tag1'), ('regex2', 'tag2'), ... ]. For example, to mask emails based on regex: params = { 'regex_entities': [('[a-zA-Z0-9_.]+[@]{1}[a-z0-9]+[\.][a-z]+', 'email'),] }. Any email address matching the regex will be replaced with the tag <email>. |
mask | If the content parameter is set to True, specify how you would like to conceal the content. Choose between "silence" (default) or "beep". |
words | Set this parameter to True to get the timestamps of each word with the transcription, when the content parameter is True. |
separate | (optional) Set this parameter to True to apply speaker separation to your input files before applying anonymisation. This is useful for mono files with 2 speakers. If the input file is stereo, no separation will be applied. Check the Speaker Separation section below for more information. |
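Before sending a regex_entities rule, it can help to try the pattern locally and see what it would replace. The sketch below is only a rough preview using sed -E: the helper name preview_regex_mask and the sample sentence are made up, and the service's regex engine may behave differently from POSIX extended regular expressions in edge cases.

```shell
#!/bin/bash
# Hypothetical helper: local preview of a regex_entities rule.
# Replaces every match of the pattern with <tag>.
# $1: regular expression (must not contain "/", which sed uses as delimiter here)
# $2: tag
# stdin: text to check
preview_regex_mask() {
  sed -E "s/$1/<$2>/g"
}

# The email pattern from the regex_entities row above.
EMAIL_REGEX='[a-zA-Z0-9_.]+[@]{1}[a-z0-9]+[\.][a-z]+'

echo "Write to jane.doe@example.com for details." | preview_regex_mask "$EMAIL_REGEX" email
# prints: Write to <email> for details.
```

A pattern that matches too much or too little in this preview is worth tightening before it is used on real audio or text.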
Example cURL Command
curl -X POST https://api.nijta.com/tasks/<SESSION_ID> \
  -H "Content-Type: multipart/form-data" \
  -H "TOKEN:<YOUR_NIJTA_TOKEN>" \
  -F "audio_file.wav=@audio_file.wav" \
  -F "voice=true" \
  -F "language=french_8" \
  -F "gender=f" \
  -F "content=true" \
  -F "entities=Name,Organization,Location"
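The command above sends a single file. To send several files in one request, one form field per file can be generated in a bash array. This sketch assumes that each file goes in its own field named after the file, as in the single-file example; the helper name build_file_args and the FILE_ARGS array are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: fill the FILE_ARGS array with one -F field per audio file,
# each field named after the file as in the single-file example.
# Arguments: paths of the audio files to send
build_file_args() {
  FILE_ARGS=()
  local path
  for path in "$@"; do
    FILE_ARGS+=(-F "$(basename "$path")=@$path")
  done
}

# Usage (SESSION_ID and TOKEN are expected to be set by the caller;
# curl sets the multipart Content-Type itself when -F is used):
#   build_file_args calls/a.wav calls/b.wav
#   curl -s -X POST "https://api.nijta.com/tasks/$SESSION_ID" \
#     -H "TOKEN:$TOKEN" "${FILE_ARGS[@]}" \
#     -F "language=english_16" -F "gender=f"
```

Keeping the fields in an array preserves paths that contain spaces, which string concatenation would split.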
Transcribe Audio Files (ASR)
URL
POST /tasks/asr/{session_id}
Description
Sends a processing request for transcribing audio files.
Headers
Content-Type: multipart/form-data
TOKEN: <YOUR_NIJTA_TOKEN>
Parameters
Include the audio files to transcribe as params with name and binary content.
Example cURL Command
curl -X POST https://api.nijta.com/tasks/asr/<SESSION_ID> \
  -H "Content-Type: multipart/form-data" \
  -H "TOKEN:<YOUR_NIJTA_TOKEN>" \
  -F "audio_file.wav=@audio_file.wav"
Process Text
URL
POST /tasks/text/{session_id}
Description
Sends a request to mask PII in text.
Headers
Content-Type: application/json; charset=utf-8
TOKEN: <YOUR_NIJTA_TOKEN>
Parameters
Include the parameters in the request body in JSON format.
- text (str): The text data to process.
- entities (string): Comma-separated list of entities to extract (e.g., Name,Organization,...).
Example cURL Command
curl -X POST https://api.nijta.com/tasks/text/<SESSION_ID> \
  -H "Content-Type: application/json; charset=utf-8" \
  -H "TOKEN:<YOUR_NIJTA_TOKEN>" \
  -d '{
    "text": ["Your text data here"],
    "entities": "Name,Organization,Location"
  }'
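When the text comes from a variable, writing the JSON body by hand breaks as soon as the text contains a quote or a backslash. One way around this is to let jq build the body. The sketch below assumes jq 1.6 or later (for --args) and follows the body shape of the example above, with text as a list of strings; the helper name build_text_payload is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build the request body for POST /tasks/text/{session_id}.
# $1: comma-separated entity list
# remaining arguments: the texts to mask
build_text_payload() {
  local entities
  entities=$1
  shift
  # jq escapes quotes, backslashes and newlines in the texts for us.
  jq -cn --arg entities "$entities" \
    '{text: $ARGS.positional, entities: $entities}' --args "$@"
}

# Usage (SESSION_ID and TOKEN are expected to be set by the caller):
#   curl -s -X POST "https://api.nijta.com/tasks/text/$SESSION_ID" \
#     -H "Content-Type: application/json; charset=utf-8" \
#     -H "TOKEN:$TOKEN" \
#     -d "$(build_text_payload "Name,Organization,Location" "My name is John")"
```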
Response
All three task types return the same response:
{
"data": {
"task_id": "TASK_ID"
},
"failed_files": [],
"submission_status": "success",
"submission_time": "1717776327.8996203"
}
- data.task_id (string): The ID of the submitted task.
- failed_files (list): List of any files that failed during processing.
- submission_status (string): Status of the submission (e.g., success or failed).
- submission_time (string): Timestamp of the submission.
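A script can use these fields to stop early when a submission is rejected and to report files that were not accepted. The sketch below is one possible check, not part of the API: the helper name task_id_from_submission is hypothetical, and it assumes that any submission_status other than success means there is no usable task_id.

```shell
#!/bin/bash
# Hypothetical helper: validate a submission response and print the task ID.
# $1: raw JSON response body from one of the three submit endpoints
task_id_from_submission() {
  local response status failed
  response=$1
  status=$(printf '%s' "$response" | jq -r '.submission_status // empty')
  if [ "$status" != "success" ]; then
    printf 'Submission not accepted: %s\n' "$response" >&2
    return 1
  fi
  # Report files listed in failed_files, but keep going with the rest.
  failed=$(printf '%s' "$response" | jq -r '(.failed_files // []) | map(tostring) | join(", ")')
  if [ -n "$failed" ]; then
    printf 'Files that failed: %s\n' "$failed" >&2
  fi
  printf '%s' "$response" | jq -r '.data.task_id'
}
```

For example, TASK_ID=$(task_id_from_submission "$TASK_RESPONSE") || exit 1.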
3. Read Result
URL
GET /tasks/{task_id}
Description
Checks the status of the submitted task and retrieves the result once the task is finished.
Headers
Content-Type: application/json; charset=utf-8
TOKEN: <YOUR_NIJTA_TOKEN>
Example cURL Command
curl -X GET https://api.nijta.com/tasks/$TASK_ID \
  -H "Content-Type: application/json; charset=utf-8" \
  -H "TOKEN:<YOUR_NIJTA_TOKEN>"
Response
When the task is still processing:
{
"data": {
"task_id": "TASK_ID",
"task_status": "processing"
}
}
- data.task_id (string): The ID of the submitted task.
- data.task_status (string): Status of the task (e.g., processing, finished, failed).
When the task is finished:
{
"data": {
"task_id": "TASK_ID",
"task_status": "finished",
"task_result": {
"response": "hex_encoded_response"
}
}
}
- data.task_id (string): The ID of the submitted task.
- data.task_status (string): Status of the task (e.g., processing, finished, failed).
- data.task_result.response (string): The hex-encoded result of the processed task. This should be decoded to get the actual result.
Decoding the Response
The response for a finished task is hex-encoded. To decode the response, you can use a command like xxd to convert it back to the original format. Below is an example of decoding the response in a shell script:
RESPONSE=$(curl -s -X GET https://api.nijta.com/tasks/$TASK_ID -H "Content-Type: application/json; charset=utf-8" -H "TOKEN:<YOUR_NIJTA_TOKEN>")
STATUS=$(echo "$RESPONSE" | jq -r '.data.task_status')
if [ "$STATUS" == "finished" ]; then
  # Extract the hex string and convert it back to the original bytes
  HEX_RESPONSE=$(echo "$RESPONSE" | jq -r '.data.task_result.response')
  RESULT=$(echo "$HEX_RESPONSE" | xxd -r -p)
  echo "$RESULT"
else
  echo "Task status: $STATUS"
fi
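For large results it can be more convenient to decode straight into a file than into a shell variable. Files also stay correct if the decoded data ever contains NUL bytes, which shell variables cannot hold. The sketch below is one way to do this; the helper name decode_result_to_file and the file names in the usage comment are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: decode the hex-encoded result of a finished task into a file.
# $1: file holding the JSON returned by GET /tasks/{task_id}
# $2: path to write the decoded result to
decode_result_to_file() {
  # jq extracts the hex string; xxd -r -p turns plain hex back into bytes.
  jq -r '.data.task_result.response' "$1" | xxd -r -p > "$2"
}

# Usage (RESPONSE holds the JSON of a finished task):
#   printf '%s' "$RESPONSE" > response.json
#   decode_result_to_file response.json result.json
```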
Example Result
For an audio anonymization task, the decoded result may look like:
{
"original_file_path_1": {
"audio": "binary_audio_data",
"transcription": "Transcribed text of the audio"
},
"original_file_path_2": {
"audio": "binary_audio_data",
"transcription": "Transcribed text of the audio"
}
}
For a text masking task, the decoded result may look like:
[
"Masked text 1",
"Masked text 2"
]
- original_file_path_x.audio (binary): The binary data of the anonymized audio.
- original_file_path_x.transcription (string): The transcription of the audio.
- Masked text x (string): The masked text.
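When a job contains several files, the decoded result can be walked entry by entry. The sketch below assumes the shape shown in the example result above, with one top-level key per original file; if the real payload nests the entries under another key (the anonymization script at the end of this page reads them under .data), the jq filter needs to be adjusted. The helper name list_transcriptions is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print "<file><TAB><transcription>" for every entry of a
# decoded audio result shaped like the example result above.
# stdin: decoded result JSON
list_transcriptions() {
  jq -r 'to_entries[] | "\(.key)\t\(.value.transcription)"'
}

# Usage (RESULT holds the decoded result of an audio task):
#   printf '%s' "$RESULT" | list_transcriptions
```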
Example scripts
Anonymizing audio
#!/bin/bash
# Constants
API_URL="https://api.nijta.com"
TOKEN=YOUR_NIJTA_TOKEN
AUDIO_FILE="my_file.wav"
LANGUAGE="english_16"
GENDER="f"
VOICE="true"
CONTENT="true"
ENTITIES="Name,Organization,Location"
OUTPUT_DIR="output"
mkdir -p $OUTPUT_DIR
# Start a session
SESSION_RESPONSE=$(curl -s -X POST "$API_URL/session" \
-H "Content-Type: application/json; charset=utf-8" \
-H "TOKEN:$TOKEN")
SESSION_ID=$(echo "$SESSION_RESPONSE" | jq -r '.session_id')
if [ "$SESSION_ID" == "null" ]; then
echo "Failed to start session: $SESSION_RESPONSE"
exit 1
fi
echo "Session started: $SESSION_ID"
# Submit the anonymize audio job
TASK_RESPONSE=$(curl -s -X POST "$API_URL/tasks/$SESSION_ID" \
-H "Content-Type: multipart/form-data" \
-H "TOKEN:$TOKEN" \
-F "$(basename $AUDIO_FILE)=@$AUDIO_FILE" \
-F "voice=$VOICE" \
-F "language=$LANGUAGE" \
-F "gender=$GENDER" \
-F "content=$CONTENT" \
-F "entities=$ENTITIES")
TASK_ID=$(echo "$TASK_RESPONSE" | jq -r '.data.task_id')
if [ "$TASK_ID" == "null" ]; then
echo "Failed to submit task: $TASK_RESPONSE"
exit 1
fi
echo "Task submitted: $TASK_ID"
# Poll for the task result
while true; do
  RESPONSE=$(curl -s -X GET "$API_URL/tasks/$TASK_ID" \
    -H "TOKEN:$TOKEN")
  TASK_STATUS=$(echo "$RESPONSE" | jq -r '.data.task_status')
  if [ "$TASK_STATUS" == "finished" ]; then
    echo "Job finished. Retrieving results..."
    RESULT=$(echo "$RESPONSE" | jq -r '.data.task_result.response' | xxd -r -p)
    # Save the anonymized audio to the output directory and print the masked transcription
    FILENAME=$(basename "$AUDIO_FILE")
    AUDIO_CONTENT=$(echo "$RESULT" | jq -r --arg filename "$FILENAME" '.data[$filename].audio')
    TRANSCRIPTION=$(echo "$RESULT" | jq -r --arg filename "$FILENAME" '.data[$filename].transcription')
    echo "$AUDIO_CONTENT" | xxd -r -p > "$OUTPUT_DIR/$FILENAME"
    echo "$FILENAME transcription: $TRANSCRIPTION"
    break
  elif [ "$TASK_STATUS" == "failed" ]; then
    echo "Task failed: $RESPONSE"
    exit 1
  else
    echo "Task status: $TASK_STATUS. Checking again in 1 second..."
    sleep 1
  fi
done
Masking text
#!/bin/bash
# Replace with your actual token
TOKEN=YOUR_NIJTA_TOKEN
API_URL="https://api.nijta.com"
TEXT_DATA='["My name is John"]'
ENTITIES="Name,Organization,Location"
# Start a new session
SESSION_ID=$(curl -s -X POST "$API_URL/session" \
-H "Content-Type: application/json; charset=utf-8" \
-H "TOKEN: $TOKEN" | jq -r '.session_id')
# Send request to mask PII in text
RESPONSE=$(curl -s -X POST "$API_URL/tasks/text/$SESSION_ID" \
-H "Content-Type: application/json; charset=utf-8" \
-H "TOKEN: $TOKEN" \
-d '{
"text": '"$TEXT_DATA"',
"entities": "'"$ENTITIES"'"
}')
TASK_ID=$(echo $RESPONSE | jq -r '.data.task_id')
# Check job status every second and read the response
while true; do
  sleep 1
  RESPONSE=$(curl -s -X GET "$API_URL/tasks/$TASK_ID" \
    -H "Content-Type: application/json; charset=utf-8" \
    -H "TOKEN: $TOKEN")
  STATUS=$(echo "$RESPONSE" | jq -r '.data.task_status')
  if [ "$STATUS" == "finished" ]; then
    echo "Job finished. Retrieving results..."
    RESULT=$(echo "$RESPONSE" | jq -r '.data.task_result.response' | xxd -r -p)
    echo "$RESULT"
    break
  elif [ "$STATUS" == "failed" ]; then
    # Stop instead of polling forever when the task fails
    echo "Task failed: $RESPONSE"
    exit 1
  else
    echo "Job status: $STATUS"
  fi
done
echo "Done."