{"description":"Structured crawl evidence contract for Screaming Frog exports, summary reports, severity rules, pass/warn/fail criteria, and launch-readiness evidence.","objective":"Give a future agent enough structure to start from the main URL and understand how to inspect, plan, build, validate, and improve a serious website without needing prior chat context.","crawl_evidence":{"objective":"Make crawl evidence reproducible, machine-readable, and usable by future agents before launch or handoff.","why":"A site can pass local route and browser checks while still having crawl-specific issues that only appear in the exported indexability, metadata, H1, canonical, response-code, or structured-data reports.","crawlTool":"Screaming Frog SEO Spider","approvedWrapper":"workspace-local licensed Screaming Frog wrapper (private; invoked by scripts/run-crawl-evidence.mjs, never published)","command":"npm run qa:crawl -- --export-dir <crawl-export-dir>","runnerCommands":[{"id":"plan-local-crawl","command":"npm run crawl:plan","why":"Writes the local static crawl plan without consuming the Screaming Frog license."},{"id":"execute-local-crawl","command":"npm run crawl:local","why":"Serves the static build locally, runs Screaming Frog through the approved wrapper, and summarizes exports."},{"id":"execute-production-crawl","command":"QA_PRODUCTION_URL=<canonical-url> npm run crawl:production","why":"Runs the same crawl evidence path against the deployed canonical URL when launch evidence is required."},{"id":"verify-runner-shape","command":"npm run qa:crawl-runner","why":"Validates wrapper availability and writes a plan during normal QA without launching a long crawl."}],"defaultReportPath":"reports/crawl/crawl-evidence-report.json","repoReportDirectory":"reports/crawl/","requiredExports":[{"id":"crawl-overview","fileName":"crawl_overview.csv","severity":"critical","why":"Provides crawl totals and high-level issue 
counts."},{"id":"internal-all","fileName":"internal_all.csv","severity":"critical","why":"Provides complete crawled internal URL inventory."}],"issueExports":[{"id":"client-error-4xx","fileName":"response_codes_client_error_(4xx).csv","severity":"critical","why":"Identifies broken internal links and crawler-visible client errors."},{"id":"server-error-5xx","fileName":"response_codes_server_error_(5xx).csv","severity":"critical","why":"Identifies server-side failures."},{"id":"no-response","fileName":"response_codes_no_response.csv","severity":"critical","why":"Identifies URLs that failed to respond."},{"id":"h1-missing","fileName":"h1_missing.csv","severity":"high","why":"Identifies indexable pages with missing primary headings."},{"id":"meta-description-missing","fileName":"meta_description_missing.csv","severity":"high","why":"Identifies indexable pages with missing search snippets."}],"optionalExports":[{"id":"page-title-missing","fileName":"page_titles_missing.csv","why":"Useful for deeper metadata audits."},{"id":"canonical-missing","fileName":"canonicals_missing.csv","why":"Useful for launch SEO audits."},{"id":"structured-data-validation-errors","fileName":"structured_data_validation_errors.csv","why":"Useful when schema is part of the work."}],"severityRules":[{"id":"missing-required-export","severity":"critical","condition":"A required inventory export file is absent from the crawl export directory."},{"id":"missing-issue-export","severity":"info","condition":"An issue export is absent because Screaming Frog --skip-empty omitted an empty export."},{"id":"internal-4xx","severity":"critical","condition":"Client error export contains one or more rows after headers."},{"id":"internal-5xx","severity":"critical","condition":"Server error export contains one or more rows after headers."},{"id":"no-response","severity":"critical","condition":"No-response export contains one or more rows after headers."},{"id":"missing-h1","severity":"high","condition":"H1 missing 
export contains one or more rows after headers."},{"id":"missing-meta-description","severity":"high","condition":"Meta description missing export contains one or more rows after headers."}],"passCriteria":["Required inventory crawl exports exist.","Crawl evidence report is written to reports/crawl/.","Critical response-code exports contain no rows after headers.","High-severity metadata/heading findings are either zero or explicitly documented for follow-up.","The crawl target, export directory, generated date, and finding counts are recorded."],"warningCriteria":["High-severity findings exist but are documented for review.","Issue exports are missing only when Screaming Frog --skip-empty omitted empty exports.","Optional exports are missing because the crawl scope did not include them.","Crawler-visible 403s require direct/browser validation before being treated as broken."],"failCriteria":["Required inventory exports are missing.","Critical internal 4xx, 5xx, or no-response findings exist.","The report cannot be generated from the crawl export directory.","The approved Screaming Frog wrapper is missing when a crawl runner plan is requested."],"stressSuite":{"command":"npm run qa:crawl-stress","why":"Screaming Frog evidence depends on a licensed external tool and a manual export step. 
The stress suite re-crawls the static export with a built-in crawler on every QA run, so broken internal links, redirect chains, orphaned sitemap URLs, and slow or oversized responses fail QA without waiting for licensed crawl evidence.","reportPath":"reports/crawl/crawl-stress-report.json","responseBudgets":[{"id":"max-response-ms","limit":2000,"severity":"high","why":"A local static response slower than two seconds signals a pathological document or server problem that will only get worse over a real network."},{"id":"max-response-bytes","limit":1500000,"severity":"high","why":"Oversized documents waste crawl budget and slow answer-engine retrieval; anything beyond roughly 1.5 MB needs explicit justification."},{"id":"max-redirect-hops","limit":1,"severity":"critical","why":"Redirect chains waste crawl budget and dilute link signals; one hop is tolerated for canonical normalization, but longer chains fail."}],"failureRules":[{"id":"broken-internal-link","severity":"critical","condition":"Any crawled internal URL returns 4xx, 5xx, or no response, with the referring page recorded."},{"id":"redirect-chain","severity":"critical","condition":"Any crawled URL needs more redirect hops to resolve than the max-redirect-hops budget allows."},{"id":"budget-exceeded","severity":"high","condition":"Any crawled URL exceeds a documented response-time or response-size budget."},{"id":"orphan-sitemap-url","severity":"high","condition":"A sitemap URL cannot be reached by following internal links from the homepage."}],"negativeControls":"Synthetic broken-link, redirect-chain, slow-response, and oversized-response records must produce their expected failure kinds through the same evaluation rules applied to the live crawl, which proves the rules still fail when the site regresses."},"objectiveAlignment":"Turns crawl outputs into durable evidence an agent can inspect, compare, and cite in launch or handoff decisions."}}