turbopuffer Documentation Audit
The architecture, guarantees, and limits material is unusually strong for the category — but the on-ramp (quickstart, auth, vector search guide) is alarmingly thin, API versioning is inconsistent without explanation, and several pages contain truncated text, missing code samples, or values that contradict the changelog announcing them.
1. Quickstart is seven bare links, not a quickstart (critical)
Location: /docs/quickstart
Problem: The page body, in full, is: "Get a quick feel for the API with some examples." followed by seven bare link labels (Python client, TypeScript client, Go client, Java client, Ruby client, Community Rust client, Sample Python notebook). There is no inline curl example, no "first write + first query" walkthrough, no shape of an API call, no mention of the namespace concept, no API key acquisition steps.
Consequence: A developer (or coding agent) landing on the canonical "quickstart" gets zero copy-pasteable code. They are forced to navigate to a client-specific README to learn the most basic shape of a request. For an agent following docs to wire up a first integration, this page is dead weight — it cannot bootstrap a working request. The intro page also tells readers "To get started with turbopuffer, see the quickstart guide," so this is the explicit recommended entry point.
The fix: Put a minimal end-to-end example directly on the page: create a namespace, upsert a vector with an attribute, run a query, and show the JSON request/response for at least curl + one SDK. Keep the client links, but anchor them under a real walkthrough.
2. Auth page ends before the example (critical)
Location: /docs/auth
Problem: The entire auth documentation, as rendered, is two sentences: "All API calls require authenticating with your API key. You can create and expire tokens in the dashboard." and "The HTTP API expects the API key to be formatted as a standard Bearer token and passed in the Authorization header:" — the page ends on a colon. There is no header example, no token prefix shown, no mention of expiration behavior, no explanation of the read-only / read/write / admin permission tiers that other pages reference (the namespaces page says "This endpoint is available to API keys with read-only, read/write, or admin permissions" — but no page defines those scopes).
Consequence: Auth is the single most common point of friction for a first integration. Developers can't see the header shape without trial-and-error or jumping into a client library. Agents trying to construct a request from docs alone have no anchor for the Authorization header syntax. The permission tiers referenced elsewhere are also unanchored.
The fix: Add an Authorization: Bearer tpuf_... example, document the three permission scopes (read-only, read/write, admin), specify token expiration semantics ("can expire tokens" is mentioned but TTL/rotation behavior isn't), and link the OpenAPI spec from this page rather than burying it.
3. Endpoints split between v1 and v2 with no map (significant)
Location: /docs/write, /docs/query, /docs/namespaces, /docs/metadata, /docs/recall, /docs/warm-cache, /docs/delete-namespace
Problem: The API mixes versions inconsistently with no documentation explaining why:
POST /v2/namespaces/:namespace(write)POST /v2/namespaces/:namespace/queryDELETE /v2/namespaces/:namespaceGET /v1/namespaces(list)GET /v1/namespaces/:namespace/metadataPATCH /v1/namespaces/:namespace/metadataGET /v1/namespaces/:namespace/hint_cache_warmPOST /v1/namespaces/:namespace/_debug/recall
The roadmap mentions "v2 query API (unifies vector and full-text ranking)" in May 2025 and "/v1/vectors deprecated in favor of /v1/namespaces" in December 2024, but no current page tells a developer which endpoints have been migrated to v2 and which haven't, or whether the v1 endpoints will eventually be migrated.
Consequence: Developers and agents constructing URLs from memory or pattern-matching will get 404s. There is no way to predict which version any given endpoint lives at. The /v1/.../_debug/recall prefix in particular signals "unstable internal" but the page documents it as a normal feature.
The fix: Add an "API versions" section that lists every endpoint with its current version, deprecation status, and migration plan. Either move metadata/namespaces/recall/warm-cache to v2 or explicitly mark them as stable v1 endpoints that won't move.
4. Limits table contradicts the changelog that announced the limit (significant)
Location: /docs/limits vs /docs/roadmap
Problem: The limits table says:
Max full-text query length | 8,192 | 1,024
The two columns are "Observed in production" and "Production limits (current)" — so the current production limit is 1,024 characters. But the April 2026 changelog entry on /docs/roadmap reads:
💗 Increased full-text query length limit to 8,192 chars
The changelog announces an 8,192 limit; the limits table caps it at 1,024.
Consequence: Developers who read the changelog and design queries up to 8,192 chars will hit 1,024-char rejections in production. Agents that ingest both pages get two contradictory ceilings for the same parameter, with no way to resolve which is authoritative.
The fix: Reconcile the table with the changelog. If the rollout happened, update the "current" column to 8,192. If it didn't, retract or qualify the changelog entry.
5. Pricing page has no per-unit rates and the calculator never loads (significant)
Location: /pricing
Problem: The rendered pricing page lists three plans (launch $64/mo min, scale $256/mo min, enterprise ≥$4,096/mo with a 35% usage premium) and a list of FAQ titles ("How are queries billed?", "How are writes billed?", "How is vector logical storage size calculated?") — but no per-unit rates are visible in the page text, and the "Cost calculator" section renders as "Loading...". The pricing changelog (/docs/pricing-log) references rates like "$5/PB" decreasing to "$1/PB" for queried data, but those numbers never appear on the actual pricing page.
Consequence: A developer cannot estimate cost without either running the calculator (which depends on JS that didn't load during the audit) or piecing together numbers from a changelog meant to describe historical changes. Agents and procurement reviewers cannot extract a structured price table from this page at all.
The fix: Render per-unit rates server-side as a static table — TB queried, write GB, storage GB-month, copy throughput, pinned GB-hour — independent of the cost calculator. Expand the FAQ titles into answered text inline rather than collapsible widgets that the static HTML doesn't expose.
6. Vector Search Guide is five lines for the flagship feature (significant)
Location: /docs/vector
Problem: The Vector Search Guide, in full, is roughly:
turbopuffer supports vector search with filtering. Vectors are incrementally indexed in an SPFresh vector index for performant search. Writes appear in search results immediately. The vector index is automatically tuned for 90-100% recall ("accuracy"). We automatically monitor recall for production queries. You can use the recall endpoint to test yourself.
That's it. No worked example, no discussion of distance metrics, no link to ANN/kNN distinctions, no schema example for declaring a vector column, no [N]f32 vs [N]f16 tradeoff (which exists on /docs/write but isn't cross-linked here).
Consequence: "Vector and full-text search built on object storage" is the headline product. The guide for the headline feature offers less depth than the tradeoffs page. New users have to reverse-engineer vector usage from the query reference (/docs/query) and the writes reference (/docs/write), neither of which is a guide.
The fix: Expand /docs/vector into a real guide: schema declaration, dimensions/precision tradeoffs (with link into /docs/performance), worked upsert + ANN query + kNN query, when to use sparse vs dense, and a pointer to hybrid search.
7. Homepage labels cold-query latency as full-text-search latency (significant)
Location: / (homepage), /docs/architecture, /docs (intro)
Problem: /docs/architecture clearly attributes 874ms to a cold first-query: "The first query to a namespace reads object storage directly and is slow (p50=874ms for 1M documents), but subsequent, cached queries to that node are faster (p50=14ms for 1M documents)." The homepage (per WebFetch) describes the same 874ms number as "p50 latency of 14ms for vector search and 874ms for full-text search" — re-labeling a cold-vs-warm distinction as a vector-vs-FTS distinction. The intro page meanwhile cites p90=1214ms cold for 1M vectors, which is a different percentile of (plausibly) the same workload but isn't reconciled with the 874ms figure either.
Consequence: Latency claims are the most consulted numbers on a vector-DB site. A prospective customer reading the homepage will conclude FTS is intrinsically ~60x slower than vector search, when in fact the 874ms is a cold-cache figure for either workload. Procurement reviewers and agents cannot cite a single canonical number.
The fix: Pick one canonical performance table (workload, percentile, cold/warm, vector/FTS) and use it verbatim on the homepage, intro, and architecture page. In particular, rewrite the homepage line so 874ms is labeled "cold p50" rather than "full-text search".
8. Architecture page contains truncated text and leaked UI helpers (significant)
Location: /docs/architecture
Problem: Two unfinished sentences and a UI helper string are baked into the rendered page:
"When the namespace is cached in NVME/memory rather than fetched directly from object storage, the query time drops dramatically to p50=14."
No unit — should be "p50=14ms".
"The roundtrip to object storage for consistency, which we can relax on request for eventually consistent sub 10ms queries."
This is a sentence fragment with no main clause.
"Press enter or space to select a node.You can then use the arrow keys to move the node around. Press delete to remove it and escape to cancel. Press enter or space to select an edge. You can then press delete to remove it or escape to cancel."
This is the keyboard-shortcut tooltip for an interactive diagram that's leaking into the prose body of the page.
Consequence: The architecture page is one of the strongest pieces of marketing/engineering content turbopuffer has — and it ends mid-sentence and includes UI helper strings. Agents scraping for technical claims will ingest "p50=14." and the diagram instructions as content.
The fix: Fix the truncated sentences, append "ms" to the latency number, and exclude the diagram's accessibility tooltip from the rendered markdown body.
9. Hybrid search is recommended but has no code (significant)
Location: /docs/hybrid
Problem: The hybrid search page is positioned as the recommended starting pattern (the intro page says "See Hybrid Search to get started.") and the page recommends: "Keep search logic in {search.py, search.ts}. Use turbopuffer for initial retrieval to narrow millions of results to dozens for rank fusion and re-ranking." But it shows zero code — no example multi-query payload, no client-side RRF function, no scoring snippet, no schema for an FTS+vector namespace. The query reference mentions "Multi-queries: send multiple queries to the same namespace used for hybrid searches" but doesn't link back here either.
Consequence: The recommended entry point for the product is conceptual prose with no implementation. Developers must figure out the request shape, the response shape, and the rank-fusion math themselves. Agents have nothing to copy-paste.
The fix: Add a worked example: schema with one text column and one vector column, a multi-query payload combining BM25 and ANN, and a reference RRF implementation in Python or TypeScript.
10. Backups page promises a script and doesn't include it (significant)
Location: /docs/backups
Problem: Under "Running Backups on Schedule":
"Here's an example script (run via cron or any scheduler) that backs up all namespaces matching a prefix. It appends the date to each backup namespace name and automatically cleans up backups older than 7 days:"
The sentence ends on a colon. No script follows in the scraped content. The "Recovering a Namespace" section similarly says "copy it to a new namespace in your preferred region as shown below:" with no example below in the captured text.
Consequence: Backup is a compliance/DR feature where users expect a paste-and-run reference. Without the script, customers must implement their own naming scheme, retention, and parallelism — and the page implies one already exists somewhere.
The fix: Inline the script (with copy_from_namespace invocations, prefix filtering, naming, and retention pruning), or remove the lead-in if the script lives elsewhere and link it explicitly.
11. BYOC script filename is hyphenated in instructions but underscored on disk (significant)
Location: /docs/byoc/deployment
Problem: The kit's directory listing shows scripts/generate_secrets.py (underscore). Step 4 of the runlist tells operators to "Run ./scripts/generate-secrets.py to generate values.secret.yaml" — with a hyphen. Same script, two different filenames in the same document.
Consequence: Operators copy-pasting the command will hit a "No such file or directory" error on the very first secrets-generation step of a BYOC deployment. BYOC is an enterprise-only setup flow where the runlist is the contract.
The fix: Pick one canonical filename (underscore is the convention shown in the tree) and use it everywhere — directory listing, runlist text, and any other doc references.
12. BYOC runlist stops at step 6 with no upgrade procedure (minor)
Location: /docs/byoc/deployment
Problem: The runlist contains steps 0–6: Mise en place, Cluster configuration, kubectl, Configure Helm, Generate API keys, Deploy turbopuffer, Run post-deployment sanity checks. There is no documented helm upgrade or maintenance step, even though the install step (5) uses helm install and most operators will eventually need an upgrade procedure.
Consequence: Operators who deploy turbopuffer BYOC successfully have no documented path for routine upgrades, leaving them to either guess at helm upgrade flags or open a support ticket every time turbopuffer publishes a new chart.
The fix: Add a step 7 (or a separate "Maintenance" section) documenting the helm upgrade invocation, expected downtime, and rollback procedure.
13. Recall endpoint is documented under a _debug path with no stability note (minor)
Location: /docs/recall
Problem: The recall endpoint is POST /v1/namespaces/:namespace/_debug/recall. The _debug prefix conventionally signals an internal or unstable endpoint, but the page documents it as a normal feature with billing rules and a use-case ("You can use the recall endpoint to test yourself" on /docs/vector). No note explains whether the URL is stable or whether it may move.
Consequence: Developers integrating recall monitoring into their own dashboards can't tell whether to depend on this URL. If it gets renamed, downstream tooling breaks silently.
The fix: Either move it out of the _debug namespace, or add a stability/SLA note clarifying that the path is committed despite the prefix.
14. disable_backpressure exceptions buried in prose (significant)
Location: /docs/write
Problem: The disable_backpressure parameter has an important asymmetry buried in a single prose paragraph: "Only takes effect for upserts and delete-by-id. Ignored for patch-by-id, patch-by-filter, delete-by-filter, and conditional writes, since those operations require a strongly consistent read of existing rows." This isn't a callout or a table — it's the last line of the parameter description, and it changes the semantics of bulk ingest pipelines that mix upserts and patches.
Consequence: A bulk loader that sets disable_backpressure: true and uses patches for partial updates will silently still backpressure, get 429s under load, and the developer will believe they disabled it.
The fix: Pull this into a callout/admonition under the parameter, and add an "Effective on" / "Ignored on" table.
15. Conditional-write semantics described twice with different rules on the same page (significant)
Location: /docs/write
Problem: The same conditional-write behavior is documented in two places on /docs/write with different rules for the "document does not exist" case. The first description, under upsert_condition, says:
"If the document does not exist, the write is applied unconditionally (i.e. the document is created)."
The second description, later in the "Conditional writes" section, correctly distinguishes upserts from patches and deletes:
"If the document does not exist, the write is applied unconditionally for upserts and skipped unconditionally for patches and deletes."
The first statement is wrong (or at best incomplete) for conditional patches and deletes.
Consequence: A developer reading the parameter description for upsert_condition and using it with a conditional delete or patch will expect the document to be created — but per the later section it would be skipped. The exact bug is then invisible at runtime: a "delete" that silently no-ops.
The fix: Rewrite the first description to match the more complete rules in the second, or factor the conditional-write semantics into a single canonical table referenced from both spots.
16. Two changelogs, no single index (minor)
Location: /docs/roadmap, /docs/pricing-log
Problem: Product changes live on /docs/roadmap (titled "Roadmap & Changelog"); pricing changes live on /docs/pricing-log (titled "Pricing Changelog"). A pricing change like the April 2026 introduction of pinning's GB-hour billing is on /docs/pricing-log, while the pinning feature itself is on /docs/roadmap. There is no unified changelog page surfaced in navigation.
Consequence: A customer reviewing "what changed since I last integrated" has to remember to check two pages. Agents indexing changes for migration help will only find half the story per fetch.
The fix: Add a unified /docs/changelog page that interleaves both streams, or surface a top-of-page "see also" cross-link between the two pages.
17. Write commit latency claim in /docs/tradeoffs is below the documented p50 (significant)
Location: /docs/tradeoffs vs /docs/write, /docs/architecture
Problem: /docs/tradeoffs says: "writes take up to 200ms to commit". /docs/write and /docs/architecture both quote p50=285ms for 500kB upserts ("Latency: Upsert latency 500kb docs — p50 285ms, p90 370ms, p99 688ms"; "high write latency (p50=285ms for 500kB)"). The "up to 200ms" claim is below the median documented elsewhere, let alone p90 (370ms) or p99 (688ms).
Consequence: Customers reading /docs/tradeoffs design write-path SLOs around 200ms, then in production observe steady-state latencies 40-200% above that number with no docs-supported explanation. Agents asked "what's the write latency?" pull contradictory answers from two pages.
The fix: Replace "up to 200ms" on /docs/tradeoffs with the canonical figures (p50=285ms / p90=370ms / p99=688ms for 500kB), or qualify it ("p50 for small writes can be under 200ms").
18. Recall band stated differently across three pages (minor)
Location: /docs/concepts vs /docs/vector, /docs/limits
Problem: Same metric, three different numbers:
- /docs/concepts: "maintaining >90-95% recall@10 even in large namespaces"
- /docs/vector: "automatically tuned for 90-100% recall"
- /docs/limits: "Vector search recall@10 | 90-100% | 90-100%"
>90-95% reads as "above 90, below or around 95"; 90-100% reads as a wider band that includes the 95-100% region.
Consequence: Customers and procurement reviewers comparing turbopuffer to alternatives need a single number for recall@10. Three different bands let prospects pick the worst (or competitors cite the worst) — and undermine the strength of the actual figure.
The fix: Pick one canonical band (likely 90-100% since limits and vector agree) and use it on /docs/concepts as well, or explain why concepts uses a tighter range.
19. /docs/delete-namespace ends on "## Examples" with no examples (minor)
Location: /docs/delete-namespace
Problem: The full body of the page is a one-paragraph description of DELETE /v2/namespaces/:namespace, followed by an ## Examples heading and nothing under it. Same truncation pattern as /docs/auth and /docs/backups.
Consequence: Anyone scanning the page for a request example (or an agent scraping the page for one) leaves with nothing. The endpoint is simple enough that this is low-impact for humans but still signals that the page is unfinished.
The fix: Add a curl example showing the namespace name, the Authorization header, and a successful response — or remove the empty "Examples" heading.
20. /docs/export tells you to "use the query API" but shows no example (significant)
Location: /docs/export
Problem: The export page reads, in full: "To export all documents in a namespace, use the query API to page through documents by advancing a filter on the id attribute. Documents inserted while the export is in progress will be included." There is no example payload, no pagination-cursor scheme shown, no filter shape for "advance on id", and no client-side loop. The link to copy_from_namespace covers server-side copies but is positioned as an alternative, not an example.
Consequence: "Export" is a primary operational concern — backups, migrations, debugging. Telling a developer "page through documents by advancing a filter on the id attribute" without a code sample means the first export script is hand-rolled and error-prone (off-by-one cursors, missing the inclusive/exclusive boundary on id).
The fix: Add a worked pagination loop: an id > last_seen_id filter, the request shape, the page-size choice, and a stop condition.
21. copy_from_namespace is referenced across many pages but has no canonical reference (minor)
Location: /docs/regions, /docs/backups, /docs/branching, /docs/export, /docs/cmek (all referencing it); /docs/write (only listed as one bullet in the writes list)
Problem: copy_from_namespace is mentioned as the recommended approach on at least five pages (cross-region moves, backups, branching alternative, export-with-transform, CMEK re-encryption). The scraped /docs/write page mentions it only as one bullet in a list of write types ("Copy from namespace: copies all documents from another namespace.") with no expanded parameter spec, throughput model, or example. There is no dedicated /docs/copy-from-namespace reference page visible.
Consequence: The single API call that backups, region migrations, and CMEK rotation all hinge on doesn't have a canonical reference. Developers piecing together a backup pipeline have to triangulate behavior from /docs/backups, /docs/regions, /docs/branching, and the one-line mention on /docs/write.
The fix: Add a /docs/copy-from-namespace reference page (parameters, throughput limits, billing discount, cross-region/cross-cloud behavior, encryption semantics) and link it from all five pages that already mention the operation.
22. Unit drift between /docs/limits and /docs/write for the same threshold (minor)
Location: /docs/limits vs /docs/write (disable_backpressure)
Problem: /docs/limits says "Max ingested, unindexed data | 2 GB". The disable_backpressure parameter description on /docs/write says it "Disables HTTP 429 backpressure on writes when unindexed data exceeds 2 GiB." Same threshold, two different units — and 2 GiB ≠ 2 GB (≈7.4% off in a database where the customer is being told to design pipelines around this number). The /docs/permissions page also uses "8Mib" while /docs/limits consistently uses "8 MiB", and /docs/metadata contains a duplicated phrase: "The timestamp when the namespace when the namespace's data or schema was last modified".
Consequence: Customers tuning bulk ingest pipelines don't know whether the actual cutoff is 2,000,000,000 bytes or 2,147,483,648 bytes. Agents asked "what's the unindexed-data limit?" return whichever page they hit first.
The fix: Pick IEC units (GiB/MiB/KiB) and apply consistently across /docs/limits, /docs/write, and /docs/permissions; fix the duplicated phrase on /docs/metadata.
What they do well
- The /docs/architecture, /docs/guarantees, /docs/tradeoffs, and /docs/limits pages are unusually clear and specific for the category — concrete latency numbers, named index (SPFresh), CAP positioning, and a candid "may not be the best fit for" column.
- Public OpenAPI spec is published (github.com/turbopuffer/turbopuffer-openapi), which gives agents a programmatic discovery path even where the prose docs are thin.
- The pricing changelog (/docs/pricing-log) is a genuinely useful artifact most vendors don't publish.
Top 3 recommendations
- Fix the on-ramp. A real /docs/quickstart with end-to-end curl + SDK examples, an /docs/auth page that doesn't end on a colon, and a /docs/export page that shows the pagination loop. These are the highest-leverage pages and currently among the thinnest.
- Reconcile contradictions and truncations. FTS length limit (1,024 vs 8,192), write latency (200ms vs p50=285ms), recall band (>90-95% vs 90-100%), unindexed-data threshold (2 GB vs 2 GiB), homepage labeling cold-p50 as FTS-p50, the conditional-write rule mismatch on /docs/write, and the unfinished sentences plus leaked diagram tooltip on /docs/architecture.
- Document the v1/v2 split and the BYOC script name. Either migrate the v1 stragglers to v2 or add a single "API versions" table; and fix the
generate_secrets.pyvsgenerate-secrets.pyfilename mismatch in /docs/byoc/deployment so step 4 doesn't fail on a fresh shell.