--crawl-max-links
Cap the total number of pages that a link crawl or sitemap walk fetches into a WARC. The seed counts as the first page, so --crawl-max-links 20 fetches at most 20 pages. When the cap is reached, the walk stops and exits successfully with what it has captured. Without this option, the walk is bounded only by depth, scope, and the request timeout.
Applies to both --crawl-links crawls and --crawl-url-is-sitemap walks. Status lines report progress toward the budget: Fetching (1/N) url, Fetching (2/N) url, and so on, where N is the budget. Without a budget, they report position against the currently known total.
The budget takes precedence over --crawl-link-depth: when both are set, a crawl follows links beyond the depth bound, shallowest first, until the budget is met.
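The interaction between the budget and the depth bound can be pictured with a small model. The Python sketch below is not zshot's implementation; it is one reading of the behaviour described above, run against a hypothetical in-memory link graph. The names crawl_order, SITE, max_links and max_depth are all invented for illustration. It assumes a breadth-first walk, counts the seed as the first page, and ignores the depth bound whenever a budget is set.

```python
from collections import deque


def crawl_order(links, seed, max_links=None, max_depth=None):
    """Return the pages a budgeted breadth-first walk would fetch, in order.

    Illustrative model only: `links` maps a page to the pages it links to.
    """
    fetched = []
    seen = {seed}
    queue = deque([(seed, 0)])  # (url, depth), shallowest entries first
    while queue:
        if max_links is not None and len(fetched) >= max_links:
            break  # budget met: stop with what has been captured
        url, depth = queue.popleft()
        fetched.append(url)  # the seed is the first page counted
        # Assumption: the depth bound only applies when no budget is set.
        if max_links is None and max_depth is not None and depth >= max_depth:
            continue
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return fetched


# Hypothetical link graph standing in for a real site.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/a/2"],
    "/b": ["/b/1"],
    "/a/1": ["/a/1/x"],
}

if __name__ == "__main__":
    # Depth bound alone: stops at the pages one link away from the seed.
    print(crawl_order(SITE, "/", max_depth=1))
    # Budget plus depth bound: the budget wins and a depth-2 page is fetched.
    print(crawl_order(SITE, "/", max_links=4, max_depth=1))
```

In this model a budget of 4 with a depth bound of 1 still reaches /a/1, a page two links from the seed, because the walk keeps taking the shallowest unvisited page until the budget is spent.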
Example
zshot -t warc -f site.warc.gz --crawl-links --crawl-max-links 20 https://example.com