Data Sources

All data on Voxsanity comes from publicly available registries and open datasets, most of them operated by governments. This page lists each source, its licence, whether commercial use is permitted, and how we handle attribution.

Confirmed sources

Source (endpoint) | Data type | Licence | Commercial use | Attribution
ClinicalTrials.gov (clinicaltrials.gov/api/v2) | Clinical trial registrations | US Federal public domain | Yes | Source linked on every trial page
openFDA (api.fda.gov) | Drug approvals, labelling data | CC0 1.0 Universal | Yes | Not required (CC0 waives all rights)
NIH Reporter (api.reporter.nih.gov) | Research funding grants | US Federal public domain | Yes | Not required
OpenAlex (api.openalex.org) | Academic papers and research volume | CC0 | Yes | Not required (CC0)
PatentsView (search.patentsview.org) | Patent filings (pipeline signal) | US Federal public domain | Yes | Not required
ANZCTR (api.anzctr.org.au) | Australian and New Zealand clinical trials | Conditional | Pending verification | Required: source, modification disclosure, processing date
PBS, Pharmaceutical Benefits Scheme (api.pbs.gov.au) | Australian drug subsidy listings | Australian Government open data | Yes | Not required

Excluded sources

Source | Reason for exclusion
WHO ICTRP | Commercial use expressly prohibited by WHO terms of use. Permanently excluded.

Notes on specific sources

ClinicalTrials.gov

Clinical trial data comes from the ClinicalTrials.gov v2 API, operated by the US National Library of Medicine. The API returns eligibility criteria as CommonMark-formatted Markdown. As a work of the United States Federal Government, this data is in the public domain.
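To make the shape of this data concrete, here is a minimal Python sketch of building a v2 search URL and reading the eligibility text out of one study record. It is an illustration, not Voxsanity's actual sync code. The helper names are hypothetical, the sample record is hand-written rather than real trial data, and the field path (protocolSection, eligibilityModule, eligibilityCriteria) reflects our reading of the v2 response schema and should be checked against the official API documentation.

```python
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://clinicaltrials.gov/api/v2"


def build_studies_url(condition: str, page_size: int = 10) -> str:
    """Build a search URL for the v2 /studies endpoint (hypothetical helper)."""
    params = {"query.cond": condition, "pageSize": page_size}
    return f"{API_BASE}/studies?{urlencode(params)}"


def extract_eligibility(study: dict) -> Optional[str]:
    """Return the raw Markdown eligibility text, or None if the field is absent."""
    module = study.get("protocolSection", {}).get("eligibilityModule", {})
    return module.get("eligibilityCriteria")


# Hand-written sample record, trimmed to the fields used here.
# The NCT identifier is a placeholder, not a real trial.
sample_study = {
    "protocolSection": {
        "identificationModule": {"nctId": "NCT00000000"},
        "eligibilityModule": {
            "eligibilityCriteria": (
                "Inclusion Criteria:\n\n* Adults aged 18 or older\n\n"
                "Exclusion Criteria:\n\n* Pregnancy"
            )
        },
    }
}

print(build_studies_url("heart failure", page_size=5))
print(extract_eligibility(sample_study))
```

Returning None for a missing field, rather than an empty string, keeps "no criteria supplied by the registry" distinguishable from "criteria supplied but empty".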

Trial records are synced nightly. The date of the most recent sync is shown on every trial card. Trial status can change at any time. Always verify current status directly with the trial site before acting on this information.

openFDA

Drug approval and labelling data comes from the openFDA API, operated by the US Food and Drug Administration. openFDA data is released under the Creative Commons CC0 1.0 Universal licence, which waives copyright and related rights to the extent the law allows, so no attribution is legally required. We link to FDA source records regardless.

Note: GMDN device data accessed via openFDA is not covered by the CC0 licence and is not used by Voxsanity.

ANZCTR

ANZCTR data is used conditionally while we formally verify its commercial-use terms. Wherever ANZCTR data is displayed, we name ANZCTR as the source, disclose that eligibility criteria have been rewritten in plain English, and show the date the data was processed. Voxsanity accesses ANZCTR through its official web service API and does not scrape the site.
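The three attribution elements described above (source, modification disclosure, processing date) could be assembled along these lines. This is a sketch only: the class name, field names, and the exact wording of the rendered notice are assumptions for illustration, not the text Voxsanity displays.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AnzctrAttribution:
    """Hypothetical container for the three required attribution elements."""

    processed_on: date
    criteria_rewritten: bool = True

    def render(self) -> str:
        # 1. Name the source.
        parts = ["Source: ANZCTR"]
        # 2. Disclose the modification, when one was made.
        if self.criteria_rewritten:
            parts.append("Eligibility criteria rewritten in plain English")
        # 3. Show the processing date in ISO 8601 form.
        parts.append(f"Processed: {self.processed_on.isoformat()}")
        return ". ".join(parts) + "."


notice = AnzctrAttribution(processed_on=date(2026, 1, 15))
print(notice.render())
```

Making the processing date a required field means an attribution notice cannot be constructed without it.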

PatentsView

PatentsView is mid-migration to the USPTO Open Data Portal as of 2026. Patent count data is treated as a supplementary pipeline signal and is displayed with a note where the migration affects data availability.

How we process data

Raw data from each source is stored in our database after each nightly sync. Plain English descriptions of eligibility criteria are generated using AI language models and stored separately from the original source text. The original source text is always preserved and linked. When a plain English interpretation is not yet available, the original text is displayed.
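The fallback rule above, which shows the plain English version when one exists and the original text otherwise, can be sketched as follows. The record type and function name are hypothetical, and treating a blank interpretation as "not yet available" is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class EligibilityText:
    """Hypothetical record: the original text plus an optional AI rewrite."""

    source_text: str                     # always preserved, never overwritten
    plain_english: Optional[str] = None  # stored separately from the source


def text_to_display(record: EligibilityText) -> Tuple[str, bool]:
    """Return (text, is_ai_generated) for display.

    Falls back to the original source text when no plain English
    interpretation is available. A blank interpretation counts as
    unavailable (an assumption of this sketch).
    """
    if record.plain_english and record.plain_english.strip():
        return record.plain_english, True
    return record.source_text, False


pending = EligibilityText(source_text="Inclusion Criteria: age >= 18 years")
done = EligibilityText(
    source_text="Inclusion Criteria: age >= 18 years",
    plain_english="You must be 18 or older.",
)
print(text_to_display(pending))
print(text_to_display(done))
```

The returned flag lets the display layer label AI-generated text while the source text stays available for linking.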

AI-generated plain English summaries are reviewed against source data on a sample basis. If you notice a discrepancy, please contact us.

Update frequency

Trial data: updated nightly.
Drug approval data: updated nightly.
Pipeline and funding data (NIH, OpenAlex): updated nightly.
PBS listing data: updated monthly.

The timestamp on each data record shows when it was last pulled from its source.
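As an illustration of how the schedule above relates to record timestamps, the sketch below flags a record whose last pull is older than its category's update interval. The category keys, the 31-day reading of "monthly", and the six-hour grace period are all assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Intervals follow the schedule above. The keys are hypothetical names,
# and "monthly" is approximated as 31 days.
UPDATE_INTERVALS = {
    "trials": timedelta(days=1),
    "drug_approvals": timedelta(days=1),
    "pipeline_and_funding": timedelta(days=1),
    "pbs_listings": timedelta(days=31),
}


def is_stale(
    category: str,
    last_pulled: datetime,
    now: datetime,
    grace: timedelta = timedelta(hours=6),  # assumed tolerance for late syncs
) -> bool:
    """True if the record is older than its update interval plus the grace period."""
    return now - last_pulled > UPDATE_INTERVALS[category] + grace


now = datetime(2026, 3, 10, 12, 0, tzinfo=timezone.utc)
print(is_stale("trials", datetime(2026, 3, 9, 0, 0, tzinfo=timezone.utc), now))
print(is_stale("pbs_listings", datetime(2026, 3, 1, 0, 0, tzinfo=timezone.utc), now))
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the check deterministic and easy to test.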

Data quality and errors

Voxsanity presents data as it appears in source registries. Errors or inconsistencies in source data (for example, incomplete eligibility criteria or missing dates) are shown as received. If you identify a data quality issue, please let us know and we will review it.
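One way to surface gaps such as missing dates without altering the record is to compute flags alongside it. This is a sketch under assumptions: the field names and the sample record are hypothetical, and the set of fields worth checking would depend on the source.

```python
from typing import Iterable, List


def find_missing_fields(record: dict, expected: Iterable[str]) -> List[str]:
    """List expected fields that are absent, None, or empty in the record.

    The record itself is not modified, so it can still be shown as received.
    """
    return [name for name in expected if record.get(name) in (None, "")]


# Hypothetical record with an empty start date and no eligibility text.
record = {"title": "Example study", "start_date": "", "status": "Recruiting"}
flags = find_missing_fields(record, ["title", "start_date", "eligibility"])
print(flags)
```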