Gaffa Documentation Audit
The docs are a thin GitBook surface over a small REST API — the happy path is described, but the API Reference link is broken on the pages that recommend it, several parameter contracts contradict each other across pages, and agent-friendly affordances are advertised more than they're delivered.
1. The API Reference link is dead on every page that points to it (critical)
Location: /docs (Introduction), /docs/get-started, /docs/features/browser-requests
Problem: The Introduction's card table sends "API Reference" to /broken/pages/Jer3HvlR3KNzesxDbiIL (rendered as the literal text "Broken link"). Get Started repeats the same dead anchor for "interactive API definitions" and "other endpoints that are part of the API." The Browser Requests page does it again under "API Reference — Complete endpoint documentation and technical details." The same broken-id pattern reappears for the llms.txt tutorial under /broken/pages/TM6N5OaBEOPp2EA1LBbI.
Consequence: Every prominent CTA to the reference 404s. A new developer following the documented onboarding ("read about all the other actions… as well as the other endpoints that are part of the API") cannot get to the reference at all without hand-editing the URL or guessing route names. AI coding agents indexing the site will record the reference target as an error page.
The fix: Repoint every /broken/pages/Jer3HvlR3KNzesxDbiIL link to the actual /docs/api-reference/... index, and audit the GitBook export — these are unresolved GitBook page references, not 404s in the underlying content.
2. max_cache_age unit contradicts itself between pages (critical)
Location: /docs/features/browser-requests/parameters vs /docs/features/mapping-requests
Problem: Parameters page: "Users can set a max_cache_age parameter (in seconds, ≥0)." Mapping Requests page describes the same field on the sitemap endpoint as "a max_cache_age in milliseconds." Same parameter name, two different units, with a 1000× gap.
Consequence: A developer who sets max_cache_age: 3600 expecting "an hour" gets either 1 hour or 3.6 seconds depending on the endpoint — silently. Cache misses look like Gaffa bugs; cache hits on stale data look like data freshness bugs. Agents that paste an example from one page into the other endpoint will produce hard-to-diagnose drift.
The fix: Pick one unit (seconds is the more common API convention) and update the contradicting page. Annotate the OpenAPI schema with the unit so both pages render from one source.
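Until the two pages agree, a caller can make the unit explicit at the call site rather than passing a bare integer. The following is a minimal sketch, assuming each endpoint really uses the unit its page documents (which is exactly what the finding above cannot confirm). The endpoint labels and the `max_cache_age_for` helper are hypothetical names, not part of the Gaffa API.

```python
from datetime import timedelta

# Multiplier per second, taken from the unit each docs page states.
# These reflect the documentation, not verified API behavior.
DOCUMENTED_UNITS_PER_SECOND = {
    "browser_request": 1,   # Parameters page: seconds
    "site_map": 1000,       # Mapping Requests page: milliseconds
}


def max_cache_age_for(endpoint: str, age: timedelta) -> int:
    """Convert a duration into the integer the given page says it expects."""
    factor = DOCUMENTED_UNITS_PER_SECOND[endpoint]
    return int(age.total_seconds() * factor)


one_hour = timedelta(hours=1)
browser_value = max_cache_age_for("browser_request", one_hour)
site_map_value = max_cache_age_for("site_map", one_hour)

# What 3600 means if it is read as milliseconds instead of seconds.
misread_seconds = timedelta(milliseconds=3600).total_seconds()
```

Passing a `timedelta` instead of a number keeps the 1000× gap visible in code review, whichever unit the API turns out to use.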
3. The advertised llms.txt file doesn't exist at /llms.txt (significant)
Location: /docs/get-started ("Want to build faster with AI assistance?")
Problem: The docs say: "You can use Gaffa's llms.txt file to give AI assistants like ChatGPT or Claude instant, accurate context about the Gaffa API." The link then points to https://gaffa.dev/docs/llms-full.txt. There is no llms.txt at the conventional root path that agents probe for (/llms.txt); only the -full.txt variant exists, and only under /docs/.
Consequence: The whole point of the llms.txt convention is auto-discovery — agents fetch /llms.txt at a site root. Hosting only llms-full.txt under /docs/ defeats discovery, and the prose calling it "llms.txt" misleads developers who try the canonical filename and get a 404.
The fix: Publish a proper llms.txt index at https://gaffa.dev/llms.txt (short, linked) and keep llms-full.txt as the verbose companion, also at the site root. Update copy so the name in the docs matches the file actually served.
4. The "browser request examples" link points to a private Claude.ai chat (critical)
Location: /docs/features/browser-requests (Examples section)
Problem: Verbatim from the page: "We've created a number of sample browser requests you can read about here." That URL is a private Claude.ai conversation, not a documentation page.
Consequence: External visitors get a Claude.ai sign-in / "conversation not found" page when they click the most prominent Examples link. This looks like a draft-content leak that shipped to production.
The fix: Replace the link with the api-playground-examples/ index (which already exists in the docs tree) and add a CI check that rejects claude.ai/chat/, notion.so, and similar private URLs in docs source.
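The CI check suggested above could be as small as a regex scan over the docs source. This is a sketch only: the pattern list is an assumption to be extended per team, and the sample URL uses a placeholder ID, not the link found on the page.

```python
import re

# Hosts that usually require a login. The list is illustrative, not exhaustive.
PRIVATE_URL_PATTERNS = [
    re.compile(r"https?://(?:www\.)?claude\.ai/chat/[^\s)]*", re.IGNORECASE),
    re.compile(r"https?://(?:www\.)?notion\.so/[^\s)]*", re.IGNORECASE),
]


def find_private_links(text: str) -> list:
    """Return (line_number, matched_url) for every private-looking URL."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for pattern in PRIVATE_URL_PATTERNS:
            for match in pattern.finditer(line):
                hits.append((line_number, match.group(0)))
    return hits


# Placeholder markdown; EXAMPLE-ID stands in for a real conversation ID.
sample = (
    "Read about them [here](https://claude.ai/chat/EXAMPLE-ID).\n"
    "See [Get Started](https://gaffa.dev/docs/get-started).\n"
)
hits = find_private_links(sample)
```

A CI job would run this over every docs source file and fail the build when `hits` is non-empty.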
5. Request-status enum disagrees between two list endpoints (significant)
Location: /docs/api-reference/get-v1-browser-requests vs /docs/api-reference/get-v1-site-map
Problem: GET /v1/browser/requests documents status values as "pending, running, completed, failed" (four states). GET /v1/site/map documents status as "pending, completed, or failed" (no running). The endpoints share a job-lifecycle model but expose different vocabularies.
Consequence: A developer who polls both endpoints with a shared status-filter helper will either (a) get 400s when filtering running on /site/map, or (b) miss in-flight site-map jobs because the docs never mention a running state. An agent generating typed clients will emit two incompatible status enums for what is plausibly the same field.
The fix: Confirm with the API whether /site/map jobs actually have a running state. Align the enums (and the OpenAPI spec) in both directions — either add running to /site/map docs or remove it from /browser/requests if the API really only emits three.
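Until the enums are aligned, a client can encode both documented vocabularies and tolerate values that neither page lists. The sketch below uses only the status strings quoted above; `JobStatus`, `undocumented_for`, and `parse_status` are hypothetical helper names.

```python
from enum import Enum


class JobStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Status vocabularies exactly as each reference page documents them.
DOCUMENTED_STATUSES = {
    "GET /v1/browser/requests": {
        JobStatus.PENDING, JobStatus.RUNNING, JobStatus.COMPLETED, JobStatus.FAILED,
    },
    "GET /v1/site/map": {
        JobStatus.PENDING, JobStatus.COMPLETED, JobStatus.FAILED,
    },
}


def undocumented_for(endpoint: str) -> set:
    """Statuses documented on some endpoint but not on this one."""
    all_statuses = set().union(*DOCUMENTED_STATUSES.values())
    return all_statuses - DOCUMENTED_STATUSES[endpoint]


def parse_status(raw: str):
    """Return a JobStatus, or the raw string if the value is not listed."""
    try:
        return JobStatus(raw)
    except ValueError:
        return raw
```

Returning the raw string for unknown values keeps a polling loop alive if the API emits a state that the docs omit.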
6. The action count contradicts itself, and the lists don't add up (significant)
Location: /docs/features/browser-requests/settings, /docs/features/browser-requests/actions
Problem: Settings page: "We currently support ten different types of actions." Actions overview: "Without Outputs (4 actions): click, scroll, type, and wait operations" plus "With Outputs (10 actions): capture_cookies, capture_dom, capture_screenshot, capture_snapshot, download_file, generate_markdown, generate_simplified_dom, parse_json, and print." The "10" list contains only 9 named actions. capture_element is documented as a beta action on its own page but doesn't appear in either bucket.
Consequence: Developers can't enumerate the action surface from the docs. Agents that build action-name allowlists from this prose will either over-include (10) or under-include (miss capture_element). The "ten total" claim on the Settings page is wrong under either reading.
The fix: Render one canonical actions table from the OpenAPI spec / a single source file, including beta actions with a beta: true flag. Remove the "we currently support ten" claim — it will go stale immediately.
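The mismatch can be checked mechanically. The lists below are copied from the Actions overview as quoted above; the variable names are for illustration only.

```python
# Action names as listed in the Actions overview.
without_outputs = ["click", "scroll", "type", "wait"]
with_outputs = [
    "capture_cookies", "capture_dom", "capture_screenshot", "capture_snapshot",
    "download_file", "generate_markdown", "generate_simplified_dom",
    "parse_json", "print",
]

claimed_with_outputs = 10   # "With Outputs (10 actions)"
claimed_total = 10          # Settings page: "ten different types of actions"

listed_total = len(without_outputs) + len(with_outputs)

# Actions documented on their own page but absent from both buckets.
documented_elsewhere = {"capture_element"}
missing_from_overview = documented_elsewhere - set(without_outputs) - set(with_outputs)

total_including_beta = listed_total + len(missing_from_overview)
```

The overview names 13 actions, or 14 with capture_element, so neither reading matches the Settings page's count of ten.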
7. The OpenAPI spec is served from an inconsistent, time-limited URL (significant)
Location: /docs/api-reference/get-v1-browser-requests-id, /docs/api-reference/post-v1-site-map, /docs/api-reference/post-v1-schemas
Problem: GET /v1/browser/requests/{id} embeds a signed Cloudflare R2 URL with X-Amz-Date=20260313T160749Z&X-Amz-Expires=172800 — i.e., a presigned link with a 48-hour TTL. Other API-reference pages point to a stable URL: https://api.gaffa.dev/swagger.json. Two ways to reach the same spec, one of them expires.
Consequence: Agents that follow the embedded OpenAPI link to programmatically discover the schema will hit a 403 once the signature expires. Humans on different reference pages get different views of the spec depending on which page they landed on.
The fix: Always reference https://api.gaffa.dev/swagger.json (or a versioned, stable URL). Regenerate the GitBook OpenAPI embeds so they all point to the same canonical source — the signed R2 URL is a GitBook export artifact, not something a consumer should ever see.
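The expiry of a presigned link can be read straight from its query string, which makes this class of problem easy to lint for. A minimal sketch follows; the host in the sample URL is a placeholder, and only the two `X-Amz-*` values come from the page described above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def presigned_expiry(url: str) -> Optional[datetime]:
    """Return when a SigV4-style presigned URL expires, or None if unsigned."""
    query = parse_qs(urlsplit(url).query)
    if "X-Amz-Date" not in query or "X-Amz-Expires" not in query:
        return None
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    return signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))


# Placeholder host and path; the query values are those quoted in the finding.
signed_url = (
    "https://example.invalid/openapi.json"
    "?X-Amz-Date=20260313T160749Z&X-Amz-Expires=172800"
)
expiry = presigned_expiry(signed_url)
stable = presigned_expiry("https://api.gaffa.dev/swagger.json")
```

A docs lint step could flag any embedded spec URL for which `presigned_expiry` returns a value rather than `None`.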
8. Schema field type is exposed as a 0-7 integer enum with no mapping (significant)
Location: /docs/api-reference/get-v1-schemas vs /docs/features/browser-requests/actions/parse-json
Problem: GET /v1/schemas describes schemaField.type as "(integer): Field type (0-7 enum values)." The parse-json doc lists eight named types (array, boolean, datetime, decimal, double, integer, object, string) but never says which integer maps to which name.
Consequence: A developer who lists their schemas via the API gets back integers with no documented decode. They have to guess (alphabetical? declaration order?), or create a schema with one type and read it back to reverse-engineer the mapping. Agents generating typed clients will emit type: number with no enum guard.
The fix: Either change the API to return the string name (preferred) or publish the integer↔name table in the GET /v1/schemas reference. Mirror it in the OpenAPI spec as an x-enum-varnames extension or a named enum.
9. proxy_location advertised values are never given as actual strings (significant)
Location: /docs/features/browser-requests/parameters (Proxy Servers)
Problem: The Parameters page says proxies are available in "four locations: the United States, Ireland, Singapore, and France." Nowhere is the actual proxy_location parameter value documented — is it "US", "united_states", "us-east", "USA"? The pricing page references proxy_location by name ("All requests that use a proxy_location parameter use our network of residential proxies"), but no enum is shown.
Consequence: Developers must guess the string and iterate via 400s, or open the API Playground to reverse-engineer it. Agents have nothing to put in generated code beyond a placeholder.
The fix: Add an enum table with the four accepted values and example requests for each. Ideally surface them in the OpenAPI spec as a constrained enum on the proxy_location field.
10. No error reference, rate limits, or retry guidance anywhere (significant)
Location: Entire docs tree
Problem: The scraped content covers auth, endpoints, actions, and pricing — but contains no documentation of: HTTP error status codes, error response body shape, rate limits, concurrency limits per plan, retry-after semantics, or what happens when a browser request fails partway through an action array. The closest the docs come is mentioning that responses include "either an output URL or error message" per action — without any error-message schema.
Consequence: Developers building anything beyond a one-shot script can't write correct error handling or backoff. "Scalable… 100s in parallel" (Introduction) is asserted but the practical limit is undocumented. Agents writing production code default to retry-on-anything, which can compound credit burn given the 1 credit / 30 seconds billing model.
The fix: Add an Errors page with the full HTTP status table and an example error body; document per-plan concurrency and rate limits; explain whether failed actions consume credits and how partial-array failures are reported.
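To illustrate what documented retry guidance would let a client do, here is a sketch of a bounded retry policy. Every value in it is an assumption based on generic HTTP conventions: Gaffa documents no retryable status codes and no Retry-After behavior, so the status set, base delay, and cap are placeholders.

```python
from typing import Mapping, Optional

# Assumption: generic HTTP convention, not documented Gaffa behavior.
RETRYABLE_STATUSES = {429, 502, 503, 504}


def retry_delay(
    status: int,
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
) -> Optional[float]:
    """Seconds to wait before retrying, or None if the call should not be retried.

    `attempt` is zero-based. Only the delta-seconds form of Retry-After is
    handled; an HTTP-date value falls through to exponential backoff. The
    header lookup is case-sensitive, so normalize header names beforehand.
    """
    if status not in RETRYABLE_STATUSES:
        return None
    retry_after = headers.get("Retry-After", "")
    if retry_after.isdigit():
        return float(retry_after)
    return min(cap, base * (2 ** attempt))
```

Refusing to retry client errors such as 400 matters here because each retried browser request may consume credits again.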
11. Beta-feature access has no documented path (significant)
Location: /docs/features/browser-requests/actions/capture-element
Problem: "This feature is currently restricted to approved users. Access requires contacting support for enablement on your account." No link to support, no form, no expected turnaround. The capture_cookies action is also marked beta but the gating model isn't explained at all. Similarly, POST /v1/schemas carries an unexplained {% include "../.gitbook/includes/beta-feature.md" %} snippet that suggests beta gating without describing it.
Consequence: Developers read about capture_element, decide to use it, write code, and then hit a 403 in production with no path forward. The "contact support" instruction lacks an actionable target.
The fix: Link to a concrete request channel (support email, gated form, or an opt-in toggle in the dashboard). Add a Beta Program page that lists every beta-gated capability and the access mechanism for each.
12. No SDKs in the docs despite shipping SDK example repos (significant)
Location: Docs tree vs github.com/GaffaAI (GaffaPythonExamples, GaffaNodeSamples)
Problem: The GitHub org publishes GaffaPythonExamples and GaffaNodeSamples repositories, but the docs site contains only REST examples — no pip install, no npm install, no "Quickstart in Python/Node" page. Get Started's only path is curl-equivalent JSON to the REST endpoints.
Consequence: Developers don't discover the example repos from the docs and instead reinvent HTTP clients. Agents asked to "use the Gaffa SDK" will hallucinate a package name because nothing canonical is documented.
The fix: Either ship and document a real SDK, or rename the example repos and explicitly call them out in Get Started as "code samples, not an SDK — Gaffa is REST-only." Link to both repos from Get Started.
13. Sign-up link has a double-slash and bounces through GitBook's onboarding (minor)
Location: /docs/get-started ("Create an account")
Problem: The Create-an-account link is https://accounts.gaffa.dev/sign-up?redirect_url=https%3A%2F%2Fgaffa.dev%2F%2Fauth%2Fsign-in — note the encoded %2F%2F after the host, producing gaffa.dev//auth/sign-in after decoding. The emailto:support@gaffa.dev link in the same page's hint is also malformed (should be mailto:).
Consequence: The double-slash may or may not break depending on the auth provider's tolerance; the emailto: link definitely does not open a mail client.
The fix: Fix to https://gaffa.dev/auth/sign-in (single slash) and mailto:support@gaffa.dev. Add a docs linkchecker to CI that catches malformed schemes.
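Both defects on this page are detectable with the standard library. The sketch below is one possible shape for such a check, assuming the docs contain only absolute links; `link_problems` and the allowed-scheme set are hypothetical choices.

```python
from urllib.parse import parse_qs, urlsplit

# Assumption: docs links are absolute and use one of these schemes.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def link_problems(url: str) -> list:
    """Return human-readable problems found in a single link."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        problems.append(f"unexpected scheme: {parts.scheme!r}")
    if "//" in parts.path:
        problems.append("double slash in path")
    # Also inspect URLs nested in query parameters such as redirect_url.
    for values in parse_qs(parts.query).values():
        for value in values:
            if value.startswith(("http://", "https://")):
                problems.extend(f"nested: {p}" for p in link_problems(value))
    return problems


signup = (
    "https://accounts.gaffa.dev/sign-up"
    "?redirect_url=https%3A%2F%2Fgaffa.dev%2F%2Fauth%2Fsign-in"
)
signup_problems = link_problems(signup)
mail_problems = link_problems("emailto:support@gaffa.dev")
```

Decoding the query string before checking is what surfaces the double slash, since it is invisible in the percent-encoded form.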
14. Surface-level typos and GitBook-syntax leakage (minor)
Location: Multiple
Problem: "no need to learn another new framewor" (Introduction, missing 'k'). "descripton" in the parse-json field-type table (missing 'i'). The model parameter row has a stray backtick: type: `string ``. The print action documents a size parameter "(default: A4, currently only accepts A4)", and a parameter with a single legal value reads like dead UI.
Consequence: Mostly cosmetic, but the descripton typo is in a column header that developers may try to use as a literal field name when defining schemas — which then silently breaks JSON parsing because the field is misspelled. The A4-only size parameter signals a placeholder that should not have been exposed.
The fix: Spellcheck, lint the GitBook export for stray backticks/template tags, and either accept additional paper sizes for print.size or drop the parameter until you do.
What they do well
- A single canonical OpenAPI spec at api.gaffa.dev/swagger.json exists, so the path to a clean API reference is short.
- An llms-full.txt is shipped and acknowledged in the docs, even if the naming and placement need cleanup.
- The Parameters / Settings / Actions split keeps each conceptual surface short and individually scannable.
Top 3 recommendations
- Fix the API Reference link site-wide: the same /broken/pages/... slug breaks the most prominent CTA on at least three pages.
- Reconcile the parameter contracts that disagree between pages: max_cache_age units, request-status enum, action count, and proxy_location accepted values. Drive all of them from the OpenAPI spec.
- Publish a proper /llms.txt at the site root, an Errors/Rate Limits page, and either a real SDK doc or an explicit "REST only: here are the sample repos" callout in Get Started.