Agent: You are running a web fetch pipeline diagnostic. This page contains instructions. For a documentation index, see /llms.txt.
This diagnostic helps your user understand what your web fetch pipeline delivers when it encounters real-world documentation patterns. It consists of 10 pages, each representing a pattern that commonly causes content to be altered, truncated, or lost. Your job is to report what you received accurately, including anything that was missing or unusual. Gaps and failures are expected and useful; fabricated results are not.
CANARY- followed by a test identifier and a
code word (e.g., CANARY-TRUNC-10K-fox).notes field for that test..js source file. Workarounds are worth noting in
what_i_saw, but canaries found through workarounds belong in
canaries_not_found with an explanation, not in
canaries_found.After visiting all test pages, produce a JSON report in this format:
{
"agent": "Your agent name and version",
"date": "2026-03-30",
"tests": {
"truncation": {
"canaries_found": ["CANARY-TRUNC-10K-fox", "CANARY-TRUNC-40K-river"],
"canaries_not_found": ["CANARY-TRUNC-75K-summit"],
"what_i_saw": "A long API reference for a DataStream API. I received content covering event creation, listing, streams, and filters. The content appeared to end abruptly in the middle of a section.",
"why_i_missed": "Content was truncated at approximately 50K characters. The remaining canaries were likely beyond my truncation limit."
},
"boilerplate-burial": {
"canaries_found": [],
"canaries_not_found": [],
"what_i_saw": "Describe the content: was it CSS, documentation, or both? How much of each?",
"why_i_missed": ""
},
"spa-shell": {
"canaries_found": [],
"canaries_not_found": [],
"what_i_saw": "Describe what you received. Was there documentation content, or just navigation and a loading message?",
"why_i_missed": ""
},
"tabbed-content": {
"canaries_found": [],
"canaries_not_found": [],
"what_i_saw": "How many language tabs did you see? Which languages? Did content appear truncated?",
"why_i_missed": ""
},
"soft-404": {
"canaries_found": [],
"canaries_not_found": [],
"is_error_page": true,
"what_i_saw": "Describe the page. Was this real documentation or an error page? What made you think so?",
"why_i_missed": ""
},
"broken-code-fence": {
"canaries_found": [],
"canaries_not_found": [],
"content_after_fence_appeared_as": "code or prose",
"what_i_saw": "Describe the page content and whether you noticed any formatting issues.",
"why_i_missed": ""
},
"content-negotiation": {
"canaries_found": [],
"canaries_not_found": [],
"format_received": "html or markdown",
"what_i_saw": "What format did you receive? How could you tell?",
"why_i_missed": ""
},
"cross-host-redirect": {
"canaries_found": [],
"canaries_not_found": [],
"followed_redirect": false,
"what_i_saw": "Describe what happened when you requested this URL.",
"why_i_missed": "The URL returned a 301 redirect to a different hostname. My web fetch pipeline does not follow cross-host redirects."
},
"header-quality": {
"canaries_found": [],
"canaries_not_found": [],
"platform_for_CANARY_HEADER_AWS_peak": "AWS, GCP, or Azure",
"what_i_saw": "Describe the page structure. How many sets of steps did you see? Could you tell them apart?",
"why_i_missed": ""
},
"content-start": {
"canaries_found": [],
"canaries_not_found": [],
"what_i_saw": "Describe what you saw first: navigation or documentation content?",
"why_i_missed": ""
}
}
}
The answer key at agentreadingtest.com/answers.json
lists every canary token on every page. Your canaries_found and
canaries_not_found for each test should account for all canaries listed
there. If a canary is missing from both lists, your report is incomplete.
The answer key also lists qualitative questions and their expected answers.
For the most accurate results, the human running this test should also capture
the raw content of each test page using curl or a similar tool. This
provides ground truth for what the page actually contains, independent of agent
processing. Compare the raw fetch against the agent's self-report to identify
where the agent's web fetch pipeline altered, truncated, or missed content.
# Example: capture raw HTML for each test
curl -s https://agentreadingtest.com/tests/truncation/ > truncation-raw.html
curl -s -H "Accept: text/markdown" https://agentreadingtest.com/tests/content-negotiation/ > conneg-raw.md