Request body
- sourceUrl: HTTPS or signed URL pointing to the PDF. Required if file is not provided.
- file: Optional uploaded PDF file (use multipart/form-data when sending the binary).
- sourceName: Optional label saved with the extracted records (e.g., Playbook 2025).
- options: Optional object. Supported keys:
  - segmentLength: Target characters per chunk (default 1000).
  - language: ISO language hint (en, fr, etc.) to improve OCR accuracy.
  - ocr: Boolean; force OCR on scan-heavy PDFs (defaults to automatic detection).
- metadata: Optional object for custom tags (e.g., {"department":"Support"}).
- webhookUrl: Optional HTTPS URL Horizon should call when the extraction finishes.
Sample request
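As a minimal sketch, a JSON request using the fields above could be assembled in Python as follows. The base URL, the extract/pdf path, and the POST method are assumptions for illustration, and any authentication headers your account needs are omitted; only the field names come from this page.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the real Horizon URL for PDF extraction.
EXTRACT_PDF_URL = "https://api.example.com/extract/pdf"


def build_extract_request(source_url, source_name=None, segment_length=1000,
                          language=None, metadata=None, webhook_url=None):
    """Build (but do not send) a JSON request for PDF extraction."""
    body = {"sourceUrl": source_url, "options": {"segmentLength": segment_length}}
    if source_name:
        body["sourceName"] = source_name
    if language:
        body["options"]["language"] = language
    if metadata:
        body["metadata"] = metadata
    if webhook_url:
        body["webhookUrl"] = webhook_url
    return urllib.request.Request(
        EXTRACT_PDF_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumed; the page only states that 202 Accepted is returned
    )


request = build_extract_request(
    "https://example.com/files/playbook.pdf",
    source_name="Playbook 2025",
    language="en",
    metadata={"department": "Support"},
)
# Send with urllib.request.urlopen(request) once the URL and credentials are real.
```

Building the request separately from sending it makes it easy to inspect the payload before any billing or network call happens.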
Response
Returns 202 Accepted with a jobId, status, and statusUrl. When the PDF is small enough to finish synchronously, the normalized chunks are included in result.
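One way to branch on that response is sketched below. It works on an already parsed JSON body; the sample values (including the relative statusUrl) are illustrative, and the internal shape of result is not specified here, so it is passed through untouched.

```python
def summarize_job(response_body):
    """Pull the documented fields out of a parsed 202 response body."""
    return {
        "job_id": response_body["jobId"],
        "status": response_body["status"],
        "status_url": response_body["statusUrl"],
        # result is only present when the job finished synchronously
        "chunks": response_body.get("result"),
    }


# Illustrative body for a job that was queued rather than finished inline.
accepted = {
    "jobId": "job_01hx9q9",
    "status": "queued",
    "statusUrl": "/jobs/job_01hx9q9",
}

summary = summarize_job(accepted)
if summary["chunks"] is None:
    next_step = "poll " + summary["status_url"]
else:
    next_step = "use inline result"
```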
Notes
- Signed URLs should remain valid until the job completes; most files process within a few minutes.
- Set segmentLength to align with downstream token budgets.
- OCR runs automatically when vector text is unavailable; use ocr: false to bypass it for machine-generated PDFs.
- Poll GET /jobs/{jobId} (the same as the returned statusUrl) to monitor progress or retrieve the final result later.
- To upload the file directly, send multipart/form-data with a file field instead of sourceUrl (e.g., curl -F "file=@playbook.pdf").
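The polling note can be sketched as a small loop. This is an illustration, not a prescribed client: fetch_status is a hypothetical caller-supplied function that performs GET on the statusUrl and returns the parsed JSON, and the interval and attempt limit are arbitrary defaults. The terminal status names are taken from the status values listed in the response reference.

```python
import time

# Statuses after which the job will not change again.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_status, interval_seconds=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_status() until the job completes or fails, then return it."""
    for attempt in range(max_attempts):
        job = fetch_status()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before the next poll
    raise TimeoutError("job did not reach a terminal status in time")


# Simulated status responses standing in for real GET /jobs/{jobId} calls.
responses = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "result": []},
])
naps = []  # records requested sleeps instead of actually waiting
final = wait_for_job(lambda: next(responses), interval_seconds=2.0, sleep=naps.append)
```

Passing the sleep function in as a parameter keeps the loop testable without real delays. If you supply a webhookUrl, polling like this is only needed as a fallback.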
x402 flow
PDF extraction is billed per document via Coinbase’s x402 protocol. When payment is required, Horizon returns a structured 402 challenge.
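As a rough sketch of how a client might react to that challenge, the Python below walks the verify, settle, and replay sequence this section describes. Several details are assumptions: the challenge body is taken to list payment options under an accepts key, the first option is chosen, and the facilitator's settle reply is taken to carry the Base64 payload under a hypothetical payload field. Both HTTP helpers are caller-supplied stand-ins, so no particular client library or facilitator API is implied.

```python
def pay_and_replay(send_request, call_facilitator):
    """Run one request through the 402 -> verify -> settle -> replay sequence.

    send_request(extra_headers) -> (status_code, headers_dict, body_dict)
    call_facilitator(path, payload) -> dict parsed from the facilitator reply
    """
    status, headers, body = send_request({})
    if status != 402:
        return status, headers, body  # no payment was required

    # Assumption: payment options are listed under "accepts"; take the first.
    requirement = body["accepts"][0]

    call_facilitator("/verify", requirement)
    settlement = call_facilitator("/settle", requirement)

    # Hypothetical field name for the Base64 payload handed back by the facilitator.
    payment_payload = settlement["payload"]

    # Replay the original request with the payment attached.
    return send_request({"X-PAYMENT": payment_payload})


# Simulated server and facilitator, for illustration only.
facilitator_calls = []


def fake_send(extra_headers):
    if "X-PAYMENT" not in extra_headers:
        return 402, {}, {"accepts": [{"scheme": "exact"}]}  # placeholder entry
    return 202, {"X-PAYMENT-RESPONSE": "receipt"}, {"jobId": "job_01hx9q9", "status": "queued"}


def fake_facilitator(path, payload):
    facilitator_calls.append(path)
    return {"payload": "ZmFrZQ=="}  # Base64 of the string "fake"


status, headers, body = pay_and_replay(fake_send, fake_facilitator)
```

A real client would also inspect the verify reply before settling and keep the X-PAYMENT-RESPONSE receipt for reconciliation.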
Complete the payment by sending the accepts entry to your facilitator, calling /verify and /settle, then replaying the request with the facilitator-provided Base64 payload inside X-PAYMENT. Successful responses include X-PAYMENT-RESPONSE with the settlement receipt. See the Coinbase quickstart if you need help provisioning facilitator credentials.
Body
application/json
Provide either sourceUrl or file.
options: Extraction hints such as language, segmentLength, transcriptionModel, or sheet preferences, depending on the endpoint.
webhookUrl: Webhook to call when the extraction completes.
file: Upload the raw file instead of providing sourceUrl.
Response
Extraction job accepted
jobId example: "job_01hx9q9"
status options: queued, processing, completed, failed
statusUrl: Canonical link to GET /jobs/{jobId} for this job.
Example: "extract/pdf"
result: Present when the job completes synchronously.
Estimated seconds until completion.