Start a crawl

POST /api/v1/crawl

Start an asynchronous crawl of a site, up to maxPages pages. The call returns a jobId immediately (HTTP 202); poll the summary endpoint until the job is done, then pull per-page data and duplicate reports. The crawl stays on the start URL's host.

Parameters

Name                 Type     Description
url (required)       string   The site to crawl (query string or body). The crawl stays on this host.
maxPages (optional)  integer  Max pages to crawl. Default 50, max 200.

Request
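
As a rough illustration of the call described above, the sketch below builds the POST request with both parameters in the query string, using only the Python standard library. The host is taken from the sample response on this page, the helper names build_crawl_request and start_crawl are hypothetical, the lower bound of 1 on maxPages is an assumption, and any authentication your account needs is not shown because this section does not describe it.

```python
import json
import urllib.parse
import urllib.request

# Host taken from the sample response on this page; adjust if yours differs.
BASE_URL = "https://www.ranknibbler.com"


def build_crawl_request(start_url, max_pages=50, base_url=BASE_URL):
    """Build the POST request for /api/v1/crawl with parameters in the query string."""
    # The documented maximum is 200; the lower bound of 1 is an assumption.
    if not 1 <= max_pages <= 200:
        raise ValueError("maxPages must be between 1 and 200")
    query = urllib.parse.urlencode({"url": start_url, "maxPages": max_pages})
    return urllib.request.Request(f"{base_url}/api/v1/crawl?{query}", method="POST")


def start_crawl(start_url, max_pages=50):
    """Send the request and return the decoded JSON body of the 202 response."""
    # Assumption: no authentication headers are set here; add what your account requires.
    with urllib.request.urlopen(build_crawl_request(start_url, max_pages)) as resp:
        return json.load(resp)
```

Calling start_crawl("https://example.com") should return a dictionary shaped like the response below, from which jobId and links["summary"] can be read.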

Response

202 · application/json
  {
    "jobId": "crw_828120c927867e4244a921a0",
    "status": "queued",
    "startUrl": "https://example.com",
    "maxPages": 50,
    "links": {
      "summary": "https://www.ranknibbler.com/api/v1/crawl/crw_828120c927867e4244a921a0/summary"
    }
  }

Response fields

Field          Description
jobId          Opaque id; pass it to the crawl result endpoints.
status         queued, then running, then done or error.
startUrl       The resolved URL the crawl started from.
maxPages       The effective page budget for this crawl.
links.summary  A ready-made URL to poll for progress and results.
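
One way to use links.summary is a small polling loop that stops on either terminal status. The sketch below is a minimal version under a few assumptions: the summary response carries a status field with the same values listed above, the 2-second interval and 150-poll cap are arbitrary choices rather than documented limits, and fetch_json and wait_for_crawl are hypothetical helper names.

```python
import json
import time
import urllib.request


def fetch_json(url):
    """GET a URL and decode its JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_crawl(summary_url, fetch=fetch_json, interval=2.0, max_polls=150, sleep=time.sleep):
    """Poll the summary URL until the job reports done or error."""
    for _ in range(max_polls):
        summary = fetch(summary_url)
        # Assumption: the summary body exposes the same status values as the 202 response.
        status = summary.get("status")
        if status == "done":
            return summary
        if status == "error":
            raise RuntimeError(f"crawl failed: {summary}")
        sleep(interval)  # still queued or running; wait before the next poll
    raise TimeoutError(f"crawl not finished after {max_polls} polls")
```

The fetch and sleep arguments are injectable so the loop can be exercised without network access or real delays; in normal use, pass only the links.summary value from the 202 response.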

Polling the crawl

Poll /api/v1/crawl/{jobId}/summary until status is done, then read the result endpoints: