robots.txt.
Request body
url— Required absolute URL to the page or sitemap.depth— Optional integer for how many link levels to follow (0–3); defaults to1.priority— Optional value (low,normal,high) that influences queue time.notify— Optional webhook configuration ({ "url": "...", "events": ["completed"] }).webhookUrl— Optional shorthand to register a completion webhook for this crawl job.
Sample request
Response
Returns202 Accepted with a crawlId, status (queued), and the normalized rootUrl. Monitor progress via the statusUrl in the response or subscribe to webhook notifications.
Behavior
- The crawler deduplicates submissions; repeated requests extend expiration but reuse the same
crawlId. - Scripts, tracking pixels, and assets over 10 MB are skipped to keep the dataset lean.
- Poll
GET /jobs/{jobId}(thestatusUrltargets this endpoint) to observe progress until the crawl completes.
x402 flow
Ingestion work is priced per crawl via Coinbase’s x402 protocol. When the account needs funding, Horizon returns a402 challenge:
- Provide the
acceptsobject to your facilitator as described in Client & server responsibilities and call/verify+/settle. - Replay the crawl request with the facilitator-issued Base64 payload in the
X-PAYMENTheader: - Horizon validates the payment, queues the crawl, and returns the usual
202 Accepted. Successful responses can exposeX-PAYMENT-RESPONSEfor bookkeeping.
Body
application/json