Processing Text for Entity Masking

The VoiceHarbor API can also process text-only input to mask specified categories of entities.

Quickstart

import json
import time
import nijtaio

TOKEN = '<token>'
API_URL = 'https://api.nijta.com'
headers = {'Content-Type': 'application/json; charset=utf-8', 'TOKEN': TOKEN}
# Entity categories to mask, given as one comma-separated string.
params = {
    'entities': 'Name,Location',
}

# Submit the batch of strings as a text task.
response = nijtaio.send_request(
    ['My name is John', 'I live in New York'],
    params,
    nijtaio.session(TOKEN, api_url=API_URL),
    task='text',
    headers=headers,
    api_url=API_URL
)

# The response body carries the task id used to poll for the result.
task_id = json.loads(response.content)['data']['task_id']
print(f'{task_id = }')

print('Waiting for the batch to be processed.')
status = ''

# Poll once per second until the batch is reported as finished.
while status != 'finished':
    time.sleep(1)
    status, anonymized_batch = nijtaio.read_response(task_id, api_url=API_URL)
    print(f'{status = }', end='\r')
print()

# Print each masked string.
for masked_text in anonymized_batch:
    print(masked_text)

print('Done.')
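
The script above reads the task id with json.loads(response.content)['data']['task_id'], which fails with a bare KeyError or decoding error if the body has another shape. The following is a minimal sketch of a more defensive lookup. The helper name extract_task_id is hypothetical and not part of nijtaio, and the error body used in the comment is an invented example, since the source does not document what the API returns on failure.

```python
import json


def extract_task_id(content):
    """Return data.task_id from a JSON response body, or raise ValueError.

    Hypothetical helper: only the {'data': {'task_id': ...}} shape is taken
    from the quickstart; everything else is treated as unexpected.
    """
    try:
        payload = json.loads(content)
        return payload['data']['task_id']
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Covers invalid JSON, a missing key, or 'data' not being a mapping.
        raise ValueError(f'unexpected response body: {content!r}') from exc


# Stand-in for response.content so the sketch runs without network access.
sample_body = b'{"data": {"task_id": "abc123"}}'
task_id = extract_task_id(sample_body)
print(f'{task_id = }')
```

In the quickstart this would be called as extract_task_id(response.content) in place of the inline json.loads expression.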

Step by step

Follow the steps below to use the API for text processing:

  1. Setup and Configuration: Import the necessary libraries, then set your API token, the API URL, the request headers, and the entity categories to mask.

    import json
    import time
    import nijtaio
    
    TOKEN = '<token>'
    API_URL = 'https://api.nijta.com'
    headers = {'Content-Type': 'application/json; charset=utf-8', 'TOKEN': TOKEN}
    params = {
        'entities': 'Name,Location',
    }
    
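  2. Send Text for Processing: Pass a list of text strings together with params, whose 'entities' value is a comma-separated string of the categories you wish to mask. The response contains the task_id used in the next step.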

    response = nijtaio.send_request(
        ['My name is John', 'I live in New York'],
        params,
        nijtaio.session(TOKEN, api_url=API_URL),
        task='text',
        headers=headers,
        api_url=API_URL
    )
    
    task_id = json.loads(response.content)['data']['task_id']
    print(f'{task_id = }')
    
  3. Wait for Processing to Complete: Poll the API once per second until the batch status is 'finished'.

    print('Waiting for the batch to be processed.')
    status = ''
    
    while status != 'finished':
        time.sleep(1)
        status, anonymized_batch = nijtaio.read_response(task_id, api_url=API_URL)
        print(f'{status = }', end='\r')
    print()
    

📘 Good to know

The result is kept on our server for only 5 seconds.

  4. Retrieve and Display Results: Once processing is complete, retrieve and print the anonymized text.

    for masked_text in anonymized_batch:
        print(masked_text)
    
    print('Done.')
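
The polling loop in step 3 runs until the status is 'finished' and would wait forever if that never happens. The following is a minimal sketch of the same loop with a time limit. The helper wait_for_result and its timeout and interval parameters are hypothetical and not part of nijtaio; the only thing taken from the steps above is that the fetch call returns a (status, batch) tuple. The 'started' status in the stand-in is an invented placeholder, since the source does not list the intermediate status values.

```python
import time


def wait_for_result(fetch, task_id, timeout=60.0, interval=1.0):
    """Poll fetch(task_id) until it reports 'finished' or the timeout elapses.

    fetch is any callable returning a (status, batch) tuple, as
    nijtaio.read_response does in the steps above.
    """
    deadline = time.monotonic() + timeout
    while True:
        status, batch = fetch(task_id)
        if status == 'finished':
            return batch
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f'task {task_id} not finished after {timeout} s '
                f'(last status: {status!r})'
            )
        time.sleep(interval)


def make_fake_fetch():
    """Stand-in for the real API call so the sketch runs offline."""
    replies = iter([
        ('started', None),   # hypothetical intermediate status
        ('started', None),
        ('finished', ['My name is <Name>', 'I live in <Location>']),
    ])
    return lambda task_id: next(replies)


batch = wait_for_result(make_fake_fetch(), 'demo-task', timeout=5.0, interval=0.01)
print(batch)
```

With the real client, fetch could be lambda tid: nijtaio.read_response(tid, api_url=API_URL). Because the result is kept on the server for only 5 seconds, the interval should stay well below that.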
    

Expected Result

The API returns a list of text strings in which each specified entity is replaced by a placeholder naming its category. For example:

['My name is <Name>', 'I live in <Location>']
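
To check which categories were masked in each string, the placeholders can be extracted locally. The following is a minimal sketch that assumes every placeholder has the form <Category>, as in the example above; the source does not specify the placeholder syntax further, and masked_categories is a hypothetical helper, not part of nijtaio.

```python
import re

# Assumed placeholder form: a category name between angle brackets.
PLACEHOLDER = re.compile(r'<([A-Za-z_]+)>')


def masked_categories(text):
    """Return the category names of all placeholders found in text, in order."""
    return PLACEHOLDER.findall(text)


anonymized_batch = ['My name is <Name>', 'I live in <Location>']
requested = set('Name,Location'.split(','))

for masked_text in anonymized_batch:
    found = masked_categories(masked_text)
    # Categories that appear in the output but were not requested.
    unexpected = set(found) - requested
    print(f'{masked_text!r}: masked={found}, unexpected={sorted(unexpected)}')
```

A literal angle-bracketed word in the input text would also match this pattern, so the check is a convenience rather than a guarantee.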