API Documentation
Job Management
Job Content
Choose the right model
Context-aware PHI redaction, fast and efficient.
Advanced vs. Mini
Our Mini and Advanced systems are designed to meet different customer needs when it comes to PHI redaction.
The Mini system harnesses a fine-tuned transformer model to quickly and accurately reduce PHI. With benchmark results showing it processes a 6-minute mono file in about 83.58 seconds, Mini is ideal for high-volume or time-sensitive scenarios. Its streamlined architecture ensures that redaction is performed rapidly without compromising on quality, making it an excellent choice for environments where speed is crucial.
In contrast, the Advanced system leverages a private large language model (LLM) equipped with sophisticated reasoning capabilities. Although it takes around 185.39 seconds to process the same 6-minute file, the Advanced system excels in handling complex redaction tasks. Its integrated reasoning allows for a deeper contextual analysis, ensuring that even subtle or ambiguous PHI elements are accurately identified and redacted. This level of precision is particularly beneficial for applications where data security and comprehensive compliance are paramount.
Using Mini model
BASE_URL = "https://voiceharbor.ai"
usage_token = "USAGE_TOKEN"
# Create a new job on the server via the class method.
job_id = VoiceHarborClient.create_job(BASE_URL, usage_token)
client = VoiceHarborClient(
base_url=BASE_URL,
job_id=job_id,
token=usage_token,
inputs_dir="./inputs/tests"
)
# Submit input files and the job file.and
job_params = {"files": [], "model":"mini"}
job_params = client.submit_files(job_params)
job_file = client.submit_job(job_params)
logger.info(f"Job file created: {job_file}")
Labels
Available labels
"medical_record_number": "A unique identifier assigned to a patient's medical records.",
"date_of_birth": "The birth date of an individual.",
"ssn": "Social Security Number, a unique number assigned to U.S. citizens and some residents.",
"date": "A generic date field representing an event or record date.",
"first_name": "The given name of a person.",
"email": "An electronic mail address used for communication.",
"last_name": "The family or surname of a person.",
"customer_id": "A unique identifier for a customer account.",
"employee_id": "A unique identifier for an employee within an organization.",
"name": "A general name field which could represent a full name or title.",
"street_address": "The physical street address for a residence or business.",
"phone_number": "A contact telephone number.",
"ipv4": "An IPv4 address used in computer networking.",
"credit_card_number": "A number associated with a credit card account.",
"license_plate": "A vehicle's registration identifier.",
"address": "A general address field for various uses.",
"user_name": "A chosen username for account login or identification.",
"device_identifier": "A unique identifier assigned to a device.",
"bank_routing_number": "A number used to identify a financial institution in transactions.",
"date_time": "Combined date and time information.",
"company_name": "The name of a company or business entity.",
"unique_identifier": "A general purpose unique identifier.",
"biometric_identifier": "A unique biometric feature, such as a fingerprint or facial recognition data.",
"account_number": "A number associated with a financial or user account.",
"city": "The name of a city in an address.",
"certificate_license_number": "An identifier for a professional or business license.",
"time": "A specific time, separate from the date.",
"postcode": "A postal or ZIP code used for mail delivery.",
"vehicle_identifier": "An identifier unique to a vehicle.",
"coordinate": "Geographical coordinate data (latitude/longitude).",
"country": "The name of a country.",
"api_key": "A key used to authenticate and authorize API requests.",
"ipv6": "An IPv6 address used in modern computer networks.",
"password": "A secret word or phrase used for authentication.",
"health_plan_beneficiary_number": "An identifier for a health plan beneficiary.",
"national_id": "A national identification number.",
"tax_id": "A number used for tax identification purposes.",
"url": "A web address.",
"state": "A state or region within a country.",
"swift_bic": "A code used for international bank transfers (SWIFT/BIC).",
"cvv": "The Card Verification Value found on credit cards.",
"pin": "A Personal Identification Number used for security."
Each title and subtitle creates an anchor and also shows up on the table of contents on the right.
Using Advanced model
BASE_URL = "https://voiceharbor.ai"
usage_token = "USAGE_TOKEN"
# Create a new job on the server via the class method.
job_id = VoiceHarborClient.create_job(BASE_URL, usage_token)
client = VoiceHarborClient(
base_url=BASE_URL,
job_id=job_id,
token=usage_token,
inputs_dir="./inputs/tests"
)
# Submit input files and the job file.and
job_params = {"files": [], "model":"advanced", "agents": ["health-generic"]}
job_params = client.submit_files(job_params)
job_file = client.submit_job(job_params)
logger.info(f"Job file created: {job_file}")
Labels
"health-generic": {
"NAME": "Any personal name (first, middle, last, suffix).",
"LOCATION": "Indicates the geographical location relevant to the health record, such as city or country.",
"ORGANIZATION": "Names of organizations like hospitals, clinics, or labs.",
"PHONE": "Represents a contact telephone number associated with the record.",
"AGE": "Age or age group of an individual.",
"DATE": "A significant date related to the health event, such as admission or diagnosis date.",
"DISEASE": "Diseases that can affect an individual (e.g., cancer, diabetes).",
"DISORDER": "Mental or physical disorders (e.g., anxiety, arthritis).",
"DRUG": "Medications or drugs prescribed or used.",
"RESULT": "Clinical or lab results indicating test outcomes.",
"SYMPTOM": "Reported symptoms indicating potential conditions.",
"SYNDROME": "Recognized medical syndromes.",
"TEST": "Medical tests or procedures (e.g., blood test).",
"TREATMENT": "Medical treatments or therapeutic procedures.",
"ORGANIZATION_ID": "Unique identification codes for organizations."
},
"clinical": {
"GENE": "Denotes a specific gene relevant to the clinical or genetic data.",
"PROTEIN": "Represents a protein that is studied or measured in clinical settings.",
"CHEMICAL": "Specifies a chemical substance involved in biomedical studies or treatments.",
"DISEASE_Syndrome_Disorder": "A combined label representing various clinical conditions such as diseases, syndromes, or disorders.",
"DRUG_Ingredient": "Refers to the active component or ingredient of a drug formulation.",
"DRUG_BrandName": "Represents the commercial brand name under which the drug is marketed.",
"MUTATION": "Indicates a genetic mutation relevant to diagnosis or research.",
"PATHWAY": "Represents a biological pathway involved in physiological or pathological processes.",
"CELL_TYPE": "Specifies the type of cell that is relevant in the clinical context.",
"BIOPROCESS": "Describes a biological process that is significant in clinical research or treatment."
},
"hipaa": {
"NAME": "Any personal name (first, middle, last, suffix).",
"GEOGRAPHIC": "Any geographic data (street, city, county, ZIP, etc.).",
"DATES": "Dates (except year) directly related to an individual (birth, admission, discharge, death).",
"TELEPHONE": "Telephone numbers (landline or mobile).",
"FAX": "Fax numbers.",
"EMAIL": "Email addresses.",
"SSN": "Social Security numbers or equivalent identifiers.",
"MEDICAL_RECORD": "Unique medical record numbers.",
"HEALTH_PLAN": "Health plan beneficiary or insurance policy numbers.",
"ACCOUNT": "Account numbers related to healthcare billing.",
"LICENSE": "Certificate or license numbers issued by authorities.",
"VEHICLE": "Vehicle identifiers or license plate numbers.",
"DEVICE": "Device identifiers and serial numbers for medical equipment.",
"WEB": "Web URLs or website addresses.",
"IP": "IP address numbers.",
"BIOMETRIC": "Biometric identifiers (fingerprints, retinal scans, voice prints).",
"PHOTOGRAPHIC": "Full face photographic images or comparable images.",
"OTHER_UNIQUE": "Any other unique identifying number, characteristic, or code.",
"ORGANIZATION": "Names of organizations like hospitals, clinics, or labs.",
"AGE": "Age or age group of an individual.",
"DISEASE": "Diseases that can affect an individual (e.g., cancer, diabetes).",
"DISORDER": "Mental or physical disorders (e.g., anxiety, arthritis).",
"DRUG": "Medications or drugs prescribed or used.",
"RESULT": "Clinical or lab results indicating test outcomes.",
"SYMPTOM": "Reported symptoms indicating potential conditions.",
"SYNDROME": "Recognized medical syndromes.",
"TEST": "Medical tests or procedures (e.g., blood test).",
"TREATMENT": "Medical treatments or therapeutic procedures.",
"ORGANIZATION_ID": "Unique identification codes for organizations."
}