Teradata Documentation Audit
The docs.teradata.com portal is a JavaScript SPA fronting a sprawling, multi-decade reference archive. The surface index doesn't render without JS, sibling reference manuals for the same engine carry mismatched version stamps (17.20, 20.00, 24.08, 3.x), deep-links into manual subsections 404 server-side, and AI Unlimited has been quietly redirected off the docs host to a github.io page that itself returns 404 to non-browser clients. Agent-discovery infrastructure (llms.txt, OpenAPI surfaces, a server-rendered search) is absent.
1. SPA shell prevents non-browser access to every doc page (critical)
Location: https://docs.teradata.com/ (root) and every /r/... deep link
Problem: A bare web_fetch of the root returns only the title string "Docs Teradata"; a Playwright render waited more than 20 s and still surfaced only cookie/consent chrome with no docs content. Deep links such as https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/SQL-Data-Manipulation-Language/INSERT and its sibling /UPDATE return server-side HTTP 404, because the route is resolved only client-side after the SPA boots.
Consequence: AI coding agents (Claude, Cursor, Copilot), curl/wget-based ingestion, archival crawlers, and link-checkers cannot read the docs at all. Developers who bookmark or share a sub-section URL hand colleagues a 404. Search engines that fall back from JS rendering see effectively nothing.
The fix: Server-render (or pre-render) the docs HTML for at least page title, breadcrumbs, headings, and prose. Make every TOC node a real GET-able URL with the same content the SPA renders client-side. At minimum, ship <noscript> content with the page body.
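A check like the following could run in CI to confirm that a page still carries real content when no JavaScript executes. It is a minimal sketch using only the Python standard library; the 50-word threshold and the function names are assumptions chosen for illustration, not anything Teradata ships.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects the text a non-JS client would see, skipping script/style bodies."""

    SKIPPED = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._in_title = False
        self.title = ""
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.title += text
        else:
            self.chunks.append(text)


def looks_like_empty_shell(html, min_words=50):
    """True when the body text (title excluded) has fewer than min_words words.

    min_words is an arbitrary illustrative threshold.
    """
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    words = " ".join(parser.chunks).split()
    return len(words) < min_words
```

Text inside noscript counts as visible here, so a page that ships its body in a noscript fallback would pass the check.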
2. No llms.txt / llms-full.txt and no agent-curated index (significant)
Location: https://docs.teradata.com/llms.txt, https://docs.teradata.com/llms-full.txt
Problem: Both probes return HTTP 404. The only machine-discovery surface is sitemap.xml, which fans out into home.xml (1 URL), pages.xml (6 URLs, half of them non-English mirrors), and large structured/ and unstructured/ lists containing opaque hashed slugs (e.g. /r/OH1tUhcqCBT6UiaF1iFe7Q/root, /r/w89~KD_M_nbsCg28uZMTGg/root, /v/u/wzGlv88OlovE~4rNqGzkFA) with no semantic titles.
Consequence: Agents have no curated entry point into Teradata's docs. Combined with the SPA problem (Issue 1), agents either skip the site or chew through tens of thousands of stale PDF-mirror entries that don't even carry titles in their URLs.
The fix: Publish llms.txt with a hand-curated map of the current product line (VantageCloud Enterprise, VantageCloud Lake, Database Engine 20, AI Factory, AI Unlimited, ClearScape Analytics, Viewpoint, Python/LangChain SDKs). Drop legacy archive pages from the main sitemap or move them to an archive/ sitemap. Replace hash-only slugs with named ones.
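As a rough illustration of how small a curated llms.txt can be, the sketch below renders one from a hand-maintained mapping. It follows the commonly proposed llms.txt layout (an H1 title, a blockquote summary, then H2 sections of Markdown links). The section names, link notes, and summary text are placeholders; the two URLs are manual paths cited in this audit.

```python
def render_llms_txt(title, summary, sections):
    """Render an llms.txt body from {section heading: [(name, url, note), ...]}."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content: headings and notes are illustrative only.
SECTIONS = {
    "Database Engine": [
        (
            "Database Utilities",
            "https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Database-Utilities",
            "utilities reference, banner 20.00",
        ),
    ],
    "AI": [
        (
            "AI Factory",
            "https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Teradata-AI-Factory",
            "product landing page",
        ),
    ],
}

LLMS_TXT = render_llms_txt(
    "Teradata Documentation",
    "Curated entry points into current Teradata product documentation.",
    SECTIONS,
)
```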
3. Sitemap index stamps every child sitemap with today's date (significant)
Location: https://docs.teradata.com/sitemap.xml
Problem: The top-level sitemap index lists all four child sitemaps (home.xml, pages.xml, structured/1.xml, unstructured/1.xml) with lastmod = 2026-05-18T19:14:24.031Z, a timestamp that falls on the date of this audit, even though the child sitemaps include legacy documents like Teradata-Tools-and-Utilities-for-Oracle-Solaris-on-SPARC-Systems-Installation-Guide/March-2014 and 2800-Platform-Product-and-Site-Preparation-Guide/February-2018 that have not been touched in years.
Consequence: lastmod is the primary freshness signal for crawlers, and on this host it is uniform across hundreds of pages of differing ages. Search engines have no way to prefer truly fresh docs over abandoned ones; agents inherit the same blindness.
The fix: Emit lastmod per child sitemap from the most recent real page change in that sitemap, and emit per-URL <lastmod> from each page's actual "Last Update" timestamp (the Data Mover guide already carries 2023-07-14, the Python ML reference February 2024). If true mtimes aren't tracked, omit lastmod rather than emit a uniform value.
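The rule above (take the newest real page date per child sitemap, and omit lastmod when nothing is known) can be sketched as follows. The child sitemap URLs and the assignment of pages to them are assumptions for illustration; the three dates are "Last Update" stamps recorded elsewhere in this audit.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(children):
    """children maps a child sitemap URL to its pages' last-update dates.

    Unknown dates are passed as None. A child with no known dates gets
    no <lastmod> element at all, rather than a made-up one.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, page_dates in children.items():
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = loc
        known = [d for d in page_dates if d is not None]
        if known:
            ET.SubElement(entry, "lastmod").text = max(known).isoformat()
    return ET.tostring(root, encoding="unicode")


# Illustrative input: which pages live in which child sitemap is assumed.
INDEX_XML = build_sitemap_index({
    "https://docs.teradata.com/sitemap/structured/1.xml": [
        date(2023, 7, 14), date(2023, 8, 30), date(2025, 11, 21),
    ],
    "https://docs.teradata.com/sitemap/unstructured/1.xml": [None, None],
})
```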
4. AI Unlimited has been moved off the docs host and the redirect target 404s (critical)
Location: https://docs.teradata.com/p/ai-unlimited → https://teradata.github.io/ai-unlimited-docs/
Problem: The only "AI Unlimited" landing on the docs portal 302-redirects to a different host (teradata.github.io), and that destination returns HTTP 404 to non-browser fetchers. A separate rendering shows the canonical "Get started" link is a relative path /ai-unlimited/install-ai-unlimited/ that only resolves on developers.teradata.com, not on the github.io host the redirect actually points at.
Consequence: A first-time AI Unlimited user clicking from the docs hub lands on a 404 (or a page whose internal links lead to a 404). AI Unlimited is also one of the four featured tutorials on developers.teradata.com/quickstarts/, so the contradiction is visible to anyone who lands from a search engine.
The fix: Either host AI Unlimited docs back under docs.teradata.com/p/ai-unlimited with a real (non-redirect) page, or fix the github.io target to respond 200 with the rendered docs and absolute links. Add the AI Unlimited product to sitemap/pages.xml.
5. Database Utilities is the only engine reference on 20.00; every other reference stayed on 17.20 (critical)
Location: Concurrent reference manuals:
- /r/Enterprise_IntelliFlex_VMware/Database-Utilities: "Teradata Vantage | 20.00", "Database Engine 20", June 2025, Last Update 2026-03-26
- /r/Enterprise_IntelliFlex_VMware/SQL-Data-Manipulation-Language: "Teradata Vantage | 17.20", June 2022, Last Updated April 2, 2025
- /r/Enterprise_IntelliFlex_VMware/Database-Design: "Vantage | 17.20", June 2022, Last Update 2025-11-21
- /r/Enterprise_IntelliFlex_VMware/SQL-Operators-and-User-Defined-Functions: 17.20 / June 2022, Last Update 2025-03-29
- /r/Enterprise_IntelliFlex_VMware/Geospatial-Data-Types: 17.20 / June 2022, Last Update 2023-08-30
- /r/Enterprise_IntelliFlex_VMware/XML-Data-Type: 17.20 / June 2022, Last Update 2023-10-30
- /r/Enterprise_IntelliFlex_VMware/Time-Series-Tables-and-Operations: 17.20 / June 2022, Last Update 2023-10-30
- /r/Enterprise_IntelliFlex_Lake_VMware/Teradata-Viewpoint-User-Guide-24.08: Release 24.08
- VantageCloud Enterprise release summaries: 2.4.1, 3.0.0.0, 3.3.0.0
Problem: The Database Utilities reference has been refreshed and rebadged to 20.00 (Database Engine 20). Every other engine-side reference manual — DML, Database Design, SQL Operators, Geospatial, XML, Time Series — stayed on the 17.20 banner from June 2022, even though several have been edited as recently as 2025. There is no compatibility note on the docs portal explaining whether the 17.20 references are still authoritative for Database Engine 20.
Consequence: A developer or DBA cannot tell whether the SELECT/MERGE/UPDATE syntax in the 17.20 DML reference is current for Database Engine 20. The asymmetry — one manual on 20.00, everything else on 17.20 — looks like an in-progress migration, not a deliberate compatibility statement. Agents synthesizing answers from multiple manuals will silently mix versions.
The fix: Either roll the remaining engine-side references to the 20.00 banner once re-validated, or add an explicit "Validated for Database Engine 20.00" header on every 17.20 manual that is in fact current. Publish a single canonical compatibility table at the top of every reference manual.
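A simple consistency check can surface this kind of asymmetry automatically. The sketch below groups the engine-side manuals from the Location list by banner version and reports those that would need either a rebadge or an explicit validation note. The function names and the choice of "20.00" as the current banner are illustrative assumptions.

```python
from collections import defaultdict

# (manual slug, banner version) pairs as recorded in the Location list.
MANUALS = [
    ("Database-Utilities", "20.00"),
    ("SQL-Data-Manipulation-Language", "17.20"),
    ("Database-Design", "17.20"),
    ("SQL-Operators-and-User-Defined-Functions", "17.20"),
    ("Geospatial-Data-Types", "17.20"),
    ("XML-Data-Type", "17.20"),
    ("Time-Series-Tables-and-Operations", "17.20"),
]


def group_by_banner(manuals):
    """Map each banner version to the manuals that carry it."""
    groups = defaultdict(list)
    for name, version in manuals:
        groups[version].append(name)
    return dict(groups)


def manuals_needing_note(manuals, current="20.00"):
    """Manuals whose banner differs from the assumed current engine version."""
    return sorted(name for name, version in manuals if version != current)
```

Run against the list above, `manuals_needing_note(MANUALS)` returns the six 17.20 references, which is exactly the set that needs a compatibility statement.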
6. Deprecation guidance and current DBA guidance live on different pages with no cross-reference (significant)
Location: /r/Fast-Facts/Loading-and-Unloading-Data (February 2020) and /r/Database-Administration/June-2020
Problem: The Fast Facts page flags FastLoad, MultiLoad, TPump, and FastExport as "Deprecated – use TPT instead". The Database Administration guide — explicitly tagged "Previous lifecycle version", but still in the sitemap and the only DBA guide surfaced — instructs admins to use "FastLoad, FastExport, and MultiLoad utilities" with no banner pointing to the deprecation list or to a current DBA guide for Database Engine 20.
Consequence: Operators following the DBA guide design ETL pipelines around utilities the rest of the docs marks deprecated. There's no inline migration pointer.
The fix: Add a top-of-page banner on the legacy DBA guide pointing to the current Database Engine 20 equivalent and to the deprecation list. Inline a "TPT replaces this" callout next to each FastLoad/FastExport/MultiLoad usage. Date-stamp the Fast Facts page as authoritative or replace it with a refreshed guide — six years is long for a "Fast Facts" surface.
7. Sitemap-published page returns 404; Lake landing slug 404s too (significant)
Location: /r/Fast-Facts/Vantage-Documentation-on-a-Page (entry 47 in sitemap/unstructured/1.xml); /r/Lake
Problem: Teradata's own crawl index points search engines and agents at Vantage-Documentation-on-a-Page; the URL returns HTTP 404. The /r/Lake short slug — implied by every /r/Lake/... reference page, and listed as a supported edition by the Viewpoint 24.08 user guide ("Lake editions") — also returns HTTP 404 despite Lake being the only product surfaced in sitemap/pages.xml.
Consequence: Self-inflicted dead links. Crawlers report them; agents that follow the sitemap or follow Viewpoint's "supported deployments" list waste budget on 404s and may downgrade the site's perceived quality.
The fix: Either restore the pages or remove them from the sitemap. Render a real /r/Lake landing that matches what Viewpoint and other products advertise as a supported edition. Run a CI link-checker against your own sitemap on every deploy.
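The CI link-check suggested above could start as small as this. It is a sketch using only the standard library: `find_dead_links` takes any callable that maps a URL to a status code, so it can be exercised offline, and `http_status` is one possible implementation using a HEAD request. Some servers reject HEAD, and connection failures are not handled here, so a production check would need a GET fallback and error handling.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Extract every <loc> value from a sitemap or sitemap index."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", NS) if loc.text]


def find_dead_links(xml_text, get_status):
    """Return (url, status) for every sitemap URL whose status is 400 or above."""
    dead = []
    for url in sitemap_urls(xml_text):
        status = get_status(url)
        if status >= 400:
            dead.append((url, status))
    return dead


def http_status(url, timeout=10):
    """Status code for a HEAD request; HTTP error responses are returned, not raised."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
```

In a deploy pipeline, `find_dead_links(sitemap_text, http_status)` returning a non-empty list would fail the build.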
8. "Getting Started" navigation points to a redirect stub and a sibling portal with broken links (significant)
Location: https://docs.teradata.com/ header → "Getting Started" → quickstarts.teradata.com → developers.teradata.com/quickstarts/
Problem: quickstarts.teradata.com resolves only as a "Redirect Notice" page. developers.teradata.com/docs returns HTTP 404. developers.teradata.com/quickstarts/get-access-to-vantage returns HTTP 404 for one of the advertised tutorial categories ("Get access to Vantage"). The four featured "Getting Started" guides (Docker / AWS / Open Table Formats / AI Unlimited) live only on developers.teradata.com and have no equivalents on docs.teradata.com's product index.
Consequence: A first-time visitor clicking "Getting Started" from the docs portal hops two hosts to reach the actual content; clicking a sibling category from the destination lands on 404. Documentation discoverability fails at the first click.
The fix: Pick one canonical "Getting Started" surface. Fix the broken category URLs on the developers portal. Either fold the quickstart tutorials into docs.teradata.com or have docs.teradata.com deep-link directly into specific quickstart pages, not through quickstarts.teradata.com.
9. No server-rendered search endpoint (significant)
Location: https://docs.teradata.com/search
Problem: Returns HTTP 404. Search is only available client-side after the SPA boots; there is no deep-linkable query URL (no ?q=…), no application/json search result endpoint, and no OpenSearch description discoverable on the portal.
Consequence: Agents and link-share workflows can't construct a search URL ("look at the search results for X" / "share this query"). Users can't bookmark queries. Browsers can't register the site as a search engine.
The fix: Add a server-rendered /search?q=… route that returns at least an HTML list of titles + URLs. Expose an OpenSearch descriptor and a JSON results endpoint.
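For reference, an OpenSearch 1.1 description document is only a few elements. The sketch below generates one with the standard library. The /search?q=… route is the one proposed above, and the JSON endpoint path is purely hypothetical; neither exists on the portal today.

```python
import xml.etree.ElementTree as ET

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def opensearch_descriptor(short_name, description, html_template, json_template=None):
    """Build an OpenSearch 1.1 description document as a string.

    Templates use the spec's {searchTerms} placeholder.
    """
    root = ET.Element("OpenSearchDescription", xmlns=OPENSEARCH_NS)
    ET.SubElement(root, "ShortName").text = short_name
    ET.SubElement(root, "Description").text = description
    ET.SubElement(root, "Url", type="text/html", template=html_template)
    if json_template:
        ET.SubElement(root, "Url", type="application/json", template=json_template)
    return ET.tostring(root, encoding="unicode")


DESCRIPTOR = opensearch_descriptor(
    "Teradata Docs",
    "Search Teradata documentation",
    "https://docs.teradata.com/search?q={searchTerms}",
    # Hypothetical JSON endpoint, shown only to illustrate the second Url element.
    json_template="https://docs.teradata.com/api/search?q={searchTerms}",
)
```

Browsers discover the descriptor through a `link rel="search"` element in the page head, which also requires the head to be server-rendered.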
10. Release summary HTML pages publish only a title; the change log is PDF-gated (significant)
Location: /v/u/Enterprise/VantageCloud-Enterprise-Release-Summary-3.3.0.0 (and 3.0.0.0, 2.4.1, …)
Problem: The 3.3.0.0 release summary page's entire body is a title, an edition label ("Enterprise"), a version string ("3.3.0.0"), a language label, and a footer. There are no release notes, no "what's new", no "fixed", and no "known issues". The 3.0.0.0 page links out to B700-4000-075C_Enterprise_3.0.0.0_ReleaseSummary.pdf; the 2.4.1 page links to B700-4000-013C_VantageCloud_Enterprise_2.4.1_ReleaseSummary.pdf. The 2.4.1 entry is also still titled "Teradata Vantage on Google Cloud", a product name predating the rename to VantageCloud Enterprise.
Consequence: Release-summary HTML is unsearchable, unindexable, unlinkable to specific changes, and invisible to agents. Renamed product lines aren't back-renamed, so users searching "VantageCloud Enterprise 2.4.1" can't find the 2.4.1 release summary by its current product name.
The fix: Render the PDF's change log content as actual HTML on the summary page. Add anchors per change item. Reconcile historical product names (either rename old titles or add an alias/redirect from current name to historical entries).
11. Last-update dates diverge widely among 17.20 sibling manuals, with no staleness indicator (significant)
Location: 17.20-branded references — last updates range from 2023-08-30 (Geospatial Data Types), 2023-10-30 (XML Data Type), 2023-10-30 (Time Series), through 2025-03-29 (SQL Operators), 2025-04-02 (SQL DML), to 2025-11-21 (Database Design)
Problem: Same product release; same banner version; one manual hasn't been touched in ~2.5 years (Geospatial, last update August 2023) while a peer (Database Design) was updated within the last six months. There is no visible "stale since" or "last verified against engine X.Y" indicator on any of these pages.
Consequence: Readers can't tell whether the Geospatial reference is just stable or quietly abandoned. Agents lifting examples from a 2023 page assume parity with a 2025 page.
The fix: Add an explicit "Verified against engine release N as of date D" header. Auto-warn (UI banner) when a manual hasn't been re-verified within a release cycle.
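The auto-warn rule could be driven by a check like this one. It is a sketch: the dates are the "Last Update" stamps listed in the Location line, the audit date is the one carried by the sitemap timestamp in Issue 3, and the 365-day threshold is an assumed stand-in for "one release cycle".

```python
from datetime import date

# Last-update stamps for the 17.20-branded references, as recorded in this audit.
LAST_UPDATES = {
    "Geospatial Data Types": date(2023, 8, 30),
    "XML Data Type": date(2023, 10, 30),
    "Time Series": date(2023, 10, 30),
    "SQL Operators": date(2025, 3, 29),
    "SQL DML": date(2025, 4, 2),
    "Database Design": date(2025, 11, 21),
}


def stale_manuals(last_updates, today, max_age_days=365):
    """Manuals not updated within max_age_days, oldest first.

    max_age_days is an assumed threshold, not a Teradata policy.
    """
    stale = [
        name for name, updated in last_updates.items()
        if (today - updated).days > max_age_days
    ]
    return sorted(stale, key=lambda name: last_updates[name])


STALE = stale_manuals(LAST_UPDATES, today=date(2026, 5, 18))
```

Under that threshold, every manual except Database Design would carry the banner, which suggests the threshold itself should be tied to actual engine release dates rather than a fixed number of days.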
12. Python SDK reference is 2 years stale while sibling LangChain reference is current; no compatibility matrix (significant)
Location: /r/Lake/Teradata-Package-for-Python-Function-Reference-on-VantageCloud-Lake vs /r/Enterprise/Teradata-Package-for-LangChain-Function-Reference
Problem: The Python (teradataml) reference is "Published in November 2022 with a last update in February 2024" and banner-branded 17.20. The LangChain reference is labelled version 20.00.00.01, published in December 2025. Neither page tells the reader which teradataml + langchain-teradata versions are tested together, or which engine each is validated against.
Consequence: A developer wiring up RAG against Teradata picks a teradataml version that may or may not work with langchain-teradata 20.00.00.01 and against an unspecified engine version. Agents have no signal to recommend a known-good combination.
The fix: Publish a compatibility table on both reference pages: SDK version × LangChain version × engine version × tested status. Refresh the teradataml reference against current engine and bump its banner.
13. Python SQL driver and MCP server are documented only on GitHub; no link from docs.teradata.com (significant)
Location: github.com/Teradata/python-driver README and github.com/Teradata/teradata-mcp-server vs the teradataml reference under docs.teradata.com
Problem: There are at least three distinct, agent-relevant Python entry points: the high-level teradataml analytics package (documented on docs.teradata.com, stale at 17.20); the low-level teradatasql PEP-249 driver (documented only in the GitHub README, which covers pip install teradatasql, the auth mechanisms KRB5, LDAP, TD2, TDNEGO, BEARER, BROWSER, CODE, CRED, JWT, ROPC, SECRET, and TLS detail); and teradata-mcp-server (hosted under the Teradata GitHub org and described as community-developed, 51 stars on GitHub, MIT-licensed, and the only MCP surface for Teradata). docs.teradata.com does not link to any of these repos individually, only to the generic github.com/teradata org.
Consequence: Developers hitting the docs portal for "Python" find only the analytics SDK and miss the driver entirely; agents asked "is there an MCP server for Teradata" find nothing in the docs and have to discover the repo independently. An agent asked "how do I connect to Teradata from Python" can answer using the wrong package because the docs surface only one.
The fix: On the docs Python/AI landing, explicitly distinguish teradataml (analytics), teradatasql (driver), and teradata-mcp-server (MCP). Link to each GitHub repo with version and install commands. Document logmech values, especially the OIDC-era ones (BEARER, JWT, BROWSER), on docs.teradata.com rather than only in the README.
14. AI Factory landing has no introductory text — only a link list (significant)
Location: /r/Enterprise_IntelliFlex_VMware/Teradata-AI-Factory
Problem: AI Factory is a new flagship product (Published: June 2025; Last Update 2025-07-03) but its landing page is "a documentation index rather than a detailed introduction. No introductory text, procedures, or expanded table of contents content is present." The body offers six navigation links (AI Factory, AI Workbench, Database Engine 20, AI Microservices with NVIDIA, Customer GPU, Get your Documentation Guides) and a generic "Get your Documentation Guides" CTA — nothing that tells a first-time reader what AI Factory is, what it includes, or how to start.
Consequence: First-time visitors and agents arriving at the canonical AI Factory page get no signal about scope, prerequisites, or what differentiates AI Factory from AI Workbench, AI Unlimited, or AI Microservices. The product is invisible to anyone who doesn't already know the navigation tree.
The fix: Write a real overview on the AI Factory landing: what it is, who it's for, what components ship, how it relates to AI Workbench / AI Microservices, and a concrete "start here" path. Treat the landing as the canonical introduction, not just a table of contents.
15. Data Mover guide is 3 years stale despite being listed as supported on every current edition (significant)
Location: /r/Enterprise_IntelliFlex_Lake_VMware/Teradata-Data-Mover-User-Guide-20.00
Problem: The Data Mover User Guide is labeled v20.00 and "Published: July 2023" with Last Update: 2023-07-14, the same month as publication, meaning the document has not been edited in nearly 3 years. The same page still claims supported deployments across "VantageCloud, VantageCore, VMware, Enterprise, IntelliFlex, Lake", which is every current edition. There is no staleness banner.
Consequence: Operators rely on a 3-year-old reference for command-line syntax, RESTful API behavior, encryption utilities, and DSA incremental copy details against engines and editions that have shipped major releases since. Anything that changed in the product after July 2023 is silently missing.
The fix: Either re-validate the Data Mover guide against current engine/edition releases and update the "Last Update" stamp with real changes, or surface a banner stating the document was last verified in July 2023 and listing which editions/releases it is still authoritative for.
16. Deprecated DBS settings list features without "use X instead" pointers (minor)
Location: /r/Enterprise_IntelliFlex_VMware/Database-Utilities (Release 20.00, June 2025)
Problem: Multiple utility settings carry a bare [Deprecated] flag with no inline replacement: DMLStatementShipping, LockLogger, LockLogger Delay Filter, LockLogger Delay Filter Time, LockLogSegmentSize, RepCacheSegSize.
Consequence: A DBA reading the current 20.00 Utilities reference learns a setting is deprecated but not what to migrate to. Elsewhere in the docs (see Fast Facts), deprecation notices come with "use TPT instead"-style guidance, so readers reasonably expect the same here.
The fix: For every [Deprecated] entry, append "Use X instead" or "Removed in release N; see [link]". If no replacement exists, say so explicitly.
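A docs-build lint could enforce this rule. The sketch below flags entries marked [Deprecated] whose text contains none of the three accepted pointer phrasings. The entry texts are illustrative stand-ins (the audit records only the bare flag), and `ExampleSetting` and `OtherSetting` are hypothetical names.

```python
import re

# Accepts "Use X instead", "Removed in ...", or an explicit "no replacement".
REPLACEMENT_HINT = re.compile(
    r"\b(use\s+.+\s+instead|removed in|no replacement)\b", re.IGNORECASE
)


def deprecated_without_pointer(entries):
    """entries maps a setting name to its description text."""
    return sorted(
        name
        for name, text in entries.items()
        if "[deprecated]" in text.lower() and not REPLACEMENT_HINT.search(text)
    )


# Illustrative input; ExampleSetting and OtherSetting are hypothetical.
ENTRIES = {
    "LockLogger": "[Deprecated]",
    "DMLStatementShipping": "[Deprecated]",
    "ExampleSetting": "[Deprecated] Use OtherSetting instead.",
    "ActiveSetting": "Controls something that is still supported.",
}
```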
17. Localized URLs contain dakuten-stripped slugs (minor)
Location: /r/Enterprise/VantageCloud-Enterprise-on-AWS-DIY-インストールと管理カイト-3.2.0.0; sitemap entries like Teradata-QueryGridTM-インストールとユーサー-カイト-2.20, SQLテータ-タイフおよひリテラル
Problem: Japanese slugs lose their dakuten and handakuten marks (ガイド → カイト, データ → テータ, ユーザー → ユーサー, タイプ → タイフ), producing strings that read as misspellings to native readers and that can't be reliably typed or shared. Some surfaces mix English and mark-stripped Japanese in the same URL.
Consequence: Japanese-speaking developers can't search for, type, or accurately share their own docs' URLs. Copy-paste through clients that normalize Unicode breaks the link.
The fix: Either give localized pages romanized or English slugs (e.g. installation-guide for the Japanese variant) or store the original Japanese with full dakuten and serve it URL-encoded. Don't strip combining marks.
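One plausible cause, assumed here rather than confirmed, is a slug generator that decomposes text to NFD and then drops combining marks (a common way to strip Latin accents). The sketch below reproduces the damaged forms seen in the sitemap and shows a mark-preserving alternative: normalize to NFC and percent-encode.

```python
import unicodedata
from urllib.parse import quote, unquote


def strip_combining_marks(text):
    """Reproduces the suspected bug: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def safe_slug(text):
    """Keeps the marks: normalize to NFC, then percent-encode as UTF-8."""
    return quote(unicodedata.normalize("NFC", text), safe="-")


# The damaged forms match the slugs observed in the sitemap.
DAMAGED = [strip_combining_marks(word) for word in ("ガイド", "データ", "ユーザー")]

# Round trip: the encoded slug decodes back to the original text.
ROUND_TRIP = unquote(safe_slug("ガイド"))
```

Normalizing to NFC before encoding also means a URL pasted from a client that emits decomposed text resolves to the same slug.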
18. Opaque hash-only slugs in the public sitemap (minor)
Location: sitemap/structured/1.xml and sitemap/unstructured/1.xml
Problem: Entries like /r/OH1tUhcqCBT6UiaF1iFe7Q/root, /r/w89~KD_M_nbsCg28uZMTGg/root, /r/eytOQ0OSK3HmPUoq9CR_RA/root, /v/u/wzGlv88OlovE~4rNqGzkFA are listed alongside titled pages.
Consequence: Crawlers and agents have no semantic signal to decide whether to fetch these. They look like junk and degrade the trust of the sitemap as a whole.
The fix: Either include the human-readable slug as the canonical URL and 301 the hash slug to it, or omit hash-only slugs from the public sitemap.
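Filtering these entries out of the public sitemap needs only a heuristic. In the examples above, each opaque segment is exactly 22 characters drawn from letters, digits, underscore, and tilde, so the sketch below flags any URL containing such a segment. The pattern is inferred from those four examples, not from any documented ID format, and a titled slug of exactly 22 characters with no hyphen would be a false positive.

```python
import re
from urllib.parse import urlsplit

# Inferred from the examples in this audit: 22 chars of [A-Za-z0-9_~].
HASH_SEGMENT = re.compile(r"^[A-Za-z0-9_~]{22}$")


def is_hash_only(url):
    """True when any path segment looks like an opaque 22-character ID."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return any(HASH_SEGMENT.match(s) for s in segments)
```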
What they do well
- The Viewpoint 24.08 guide carries an honest, recent "Last Updated: November 21, 2025" stamp and is current.
- The LangChain reference (20.00.00.01, December 2025) is on a current release and matches the pace of LLM tooling.
- The product mix surfaced via the developers portal (dbt-teradata, Airbyte, JDBC, ModelOps, Iceberg/Delta) shows the developer-facing surface area exists; it just isn't linked from the docs portal.
Top 3 recommendations
- Server-render the SPA. Every /r/... and sub-section URL must respond 200 with real HTML; this is the root cause behind agent invisibility, broken deep-links, and search-engine starvation. Pair with a real /search?q=… endpoint.
- Reconcile version banners. Either roll the remaining 17.20 engine references (DML, Database Design, SQL Operators, Geospatial, XML, Time Series) onto the same 20.00 banner Database Utilities already uses, or add an explicit "Validated for Database Engine 20.00" header on every 17.20 manual that is in fact current. Publish a single cross-product compatibility matrix.
- Ship llms.txt, clean the sitemaps, and surface the GitHub stack. Curate a hand-built index of current products (Database Engine 20, VantageCloud Enterprise/Lake, AI Factory, AI Unlimited, ClearScape, Viewpoint, Python SDKs, LangChain SDK, dbt adapter, teradatasql driver, teradata-mcp-server). Move legacy and hash-only entries to an archive/ sitemap, and emit honest per-page lastmod timestamps instead of stamping every entry with today's date.