Request body
- `sourceUrl`: HTTPS or signed URL pointing to the PDF. Required if `file` is not provided.
- `file`: Optional uploaded PDF file (use `multipart/form-data` when sending the binary).
- `sourceName`: Optional label saved with the extracted records (e.g., `Playbook 2025`).
- `options`: Optional object. Supported keys:
  - `segmentLength`: Target characters per chunk (default `1000`).
  - `language`: ISO language hint (`en`, `fr`, etc.) to improve OCR accuracy.
  - `ocr`: Boolean; force OCR on scan-heavy PDFs (defaults to automatic detection).
- `webhookUrl`: Optional HTTPS URL Horizon should call when the extraction finishes.
Sample request
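A minimal sketch of a JSON request assembled from the fields listed under Request body. The endpoint URL, the file URL, and the helper name `build_extract_request` are placeholders chosen for illustration; this section does not state the real path, so substitute the one from your Horizon account. The request is only built and printed here, not sent.

```python
import json
import urllib.request

# Hypothetical endpoint: the real base URL and path are not given in this section.
EXTRACT_URL = "https://api.example.com/extract/pdf"


def build_extract_request(source_url, source_name=None, segment_length=None,
                          language=None, ocr=None, webhook_url=None):
    """Build a JSON POST request from the documented body fields."""
    # Only include option keys the caller actually set, so server defaults apply.
    options = {}
    if segment_length is not None:
        options["segmentLength"] = segment_length
    if language is not None:
        options["language"] = language
    if ocr is not None:
        options["ocr"] = ocr

    body = {"sourceUrl": source_url}
    if source_name is not None:
        body["sourceName"] = source_name
    if options:
        body["options"] = options
    if webhook_url is not None:
        body["webhookUrl"] = webhook_url

    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_extract_request(
    "https://files.example.com/playbook.pdf",  # placeholder file URL
    source_name="Playbook 2025",
    segment_length=1000,
    language="en",
)
print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; any authentication headers your deployment needs are not shown.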
Response
Returns `202 Accepted` with a `jobId`, `status`, and `statusUrl`. When the PDF is small enough to finish synchronously, the normalized chunks are included in `result`.
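The split between a synchronous and an asynchronous outcome can be handled with a small helper like the one below. It is a sketch: the function name is hypothetical, the sample body is hand-written for illustration, and the `"queued"` status value is a placeholder because the possible status values are not listed here.

```python
import json


def interpret_accepted(status_code, body_text):
    """Split a 202 response into (job_id, status_url, result_or_None)."""
    if status_code != 202:
        raise ValueError(f"expected 202 Accepted, got {status_code}")
    body = json.loads(body_text)
    # `result` is only present when the job finished synchronously.
    return body["jobId"], body["statusUrl"], body.get("result")


# Illustrative body only; real identifiers and status values will differ.
sample = '{"jobId": "job_123", "status": "queued", "statusUrl": "/jobs/job_123"}'
job_id, status_url, result = interpret_accepted(202, sample)
if result is None:
    print(f"job {job_id} still running, poll {status_url}")
else:
    print(f"job {job_id} finished synchronously")
```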
Notes
- Signed URLs should remain valid until the job completes; most files process within a few minutes.
- Set `segmentLength` to align with downstream token budgets.
- OCR runs automatically when vector text is unavailable; use `ocr: false` to bypass it for machine-generated PDFs.
- Poll `GET /jobs/{jobId}` (the same as the returned `statusUrl`) to monitor progress or retrieve the final result later.
- To upload the file directly, send `multipart/form-data` with a `file` field instead of `sourceUrl` (e.g., `curl -F "file=@<path-to-pdf>"`).
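The polling note above could be implemented roughly as follows. This is a sketch under assumptions: the pending status names (`"queued"`, `"processing"`) are placeholders rather than documented values, the interval and timeout are arbitrary, and `fetch_json` assumes the `statusUrl` is an absolute URL that needs no extra headers.

```python
import json
import time
import urllib.request


def fetch_json(url):
    """GET a URL and decode its JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_job(fetch_status, status_url, interval_s=5.0, timeout_s=600.0,
                 pending=("queued", "processing"), sleep=time.sleep):
    """Poll until the job leaves a pending state or the timeout elapses.

    fetch_status(status_url) must return the decoded job body as a dict.
    The names in `pending` are placeholders, not documented status values.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status(status_url)
        if job.get("status") not in pending:
            return job  # finished (or failed); the caller inspects the body
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still pending after {timeout_s}s")
        sleep(interval_s)
```

A call such as `wait_for_job(fetch_json, status_url)` would block until the job settles. If you supplied a `webhookUrl`, polling is a fallback rather than the primary notification path.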
x402 flow
PDF extraction is billed per document via Coinbase’s x402 protocol. When payment is required, Horizon returns a structured `402` challenge.
Resolve it by forwarding the `accepts` entry to your facilitator, calling `/verify` and `/settle`, then replaying the request with the facilitator-provided Base64 payload inside `X-PAYMENT`. Successful responses include `X-PAYMENT-RESPONSE` with the settlement receipt. See the Coinbase quickstart if you need help provisioning facilitator credentials.
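The sequence described above (challenge, `/verify`, `/settle`, replay) can be sketched with the transport injected as callables. Everything about the facilitator's request and response shape is an assumption here: the `isValid` and `payload` field names, sending the `accepts` entry as the request body, and picking the first entry are illustrative choices, so check your facilitator's documentation for the real contract. The stubs at the bottom stand in for real HTTP calls.

```python
def pay_and_replay(challenge, facilitator_post, replay):
    """Settle one 402 challenge and replay the original request.

    challenge: decoded JSON body of the 402 response.
    facilitator_post: callable(path, json_body) -> dict, talks to your facilitator.
    replay: callable(extra_headers) -> (status_code, headers, body).
    """
    requirement = challenge["accepts"][0]  # assumption: take the first offer

    verification = facilitator_post("/verify", requirement)
    if not verification.get("isValid"):  # assumed response field
        raise RuntimeError("facilitator rejected the payment requirement")

    settlement = facilitator_post("/settle", requirement)
    payment_header = settlement["payload"]  # hypothetical field holding the Base64 payload

    status, headers, body = replay({"X-PAYMENT": payment_header})
    receipt = headers.get("X-PAYMENT-RESPONSE")  # settlement receipt on success
    return status, receipt, body


# Stub transports for illustration; replace with real HTTP calls.
def stub_facilitator(path, json_body):
    return {"isValid": True} if path == "/verify" else {"payload": "QkFTRTY0"}


def stub_replay(extra_headers):
    return 202, {"X-PAYMENT-RESPONSE": "stub-receipt"}, {"jobId": "job_123"}


status, receipt, body = pay_and_replay(
    {"accepts": [{"scheme": "exact"}]}, stub_facilitator, stub_replay
)
print(status, receipt, body["jobId"])
```

Injecting the two callables keeps the payment logic independent of any particular HTTP client and makes it easy to exercise without spending funds.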