# P06 — AI-Legibility

<div class="principle-header" id="bkmrk-scd-p06-%C2%A0%C2%B7%C2%A0-principl"><div class="principle-id">SCD-P06 · Principle 6 of 10</div><div class="principle-title">AI-Legibility</div><div class="principle-tagline">“If an AI agent cannot cite you, you do not exist.”</div> <span class="category-badge">Digital Layer</span></div>

## Definition

<div class="definition-box" id="bkmrk-climate-data-and-imp">Climate data and impact claims must be structured so that AI agents can discover, parse, cite, and cross-reference them without human intermediation. Minimum requirements: schema.org/Dataset markup on all public pages, JSON-LD structured data, persistent canonical URLs, and inclusion in at least one major open index (Global Forest Watch, Climate TRACE, EDGAR, or Copernicus).</div>

## Rationale

<div class="rationale-box" id="bkmrk-the-ai%2Besg-verificat">The AI+ESG verification market is growing at 28% CAGR, and its tools already process over 100,000 ESG sources daily using NLP models (WEF, 2024). Investors, regulators, and procurement teams use AI agents for automated due diligence, without notifying the organisations being assessed. An organisation invisible to AI agents is therefore invisible to the decision-makers those agents serve. AI that lacks grounded sovereign data can also hallucinate plausible-sounding figures, which makes sovereign data a truth anchor as well as a visibility tool.</div>

## Implementation Steps

1. Add schema.org/Dataset JSON-LD to every public data page (see JSON-LD reference below).
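2. Make datasets discoverable in Google Dataset Search, which indexes pages by crawling their schema.org/Dataset markup rather than through a submission form; listing the dataset pages in a sitemap helps the crawler find them.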
3. Register with at least one global open data index (for example a CKAN-based portal, DataCite, or the GFW API).
4. Maintain a public llms.txt file (analogous to robots.txt) guiding AI agents to authoritative sources.
5. Run quarterly AI scans: query Climate TRACE, GFW, and EDGAR to confirm your data is indexed.
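
Step 4 can be illustrated with a small generator. The sketch below assumes the Markdown layout proposed by the llms.txt specification (an H1 title, a blockquote summary, then H2 sections of links); the function name `build_llms_txt` and the summary text are hypothetical, and the dataset URL is reused from this page's JSON-LD reference example.

```python
def build_llms_txt(title, summary, sections):
    """Render an llms.txt document as Markdown text.

    `sections` maps a heading to a list of (link_text, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for text, url, note in links:
            entry = f"- [{text}]({url})"
            if note:
                # The note is optional context shown after the link.
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Illustrative content only: the summary sentence is a placeholder.
example = build_llms_txt(
    title="CleantechHUB Sovereign Climate Data",
    summary="Authoritative climate datasets published by CleantechHUB.",
    sections={
        "Datasets": [
            (
                "Colombia Deforestation Alerts 2024",
                "https://data.cleantechhub.net/datasets/colombia-deforestation-2024",
                "GLAD-L primary forest loss alerts",
            ),
        ],
    },
)
print(example)
```

The resulting text would be served as a plain file at the site root, next to robots.txt.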

## JSON-LD Reference Example

```json
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "Colombia Deforestation Alerts 2024",
  "description": "GLAD-L primary forest loss alerts for Colombia, 2024.",
  "url": "https://data.cleantechhub.net/datasets/colombia-deforestation-2024",
  "identifier": "https://doi.org/10.XXXX/cth-col-def-2024",
  "creator": { "@type": "Organization", "name": "CleantechHUB" },
  "datePublished": "2025-01-15",
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "spatialCoverage": { "@type": "Place", "name": "Colombia" },
  "temporalCoverage": "2024-01-01/2024-12-31",
  "keywords": ["deforestation", "Colombia", "GLAD-L", "primary forest", "sovereign data"],
  "distribution": [{
    "@type": "DataDownload",
    "encodingFormat": "application/json",
    "contentUrl": "https://data.cleantechhub.net/api/v1/datasets/colombia-deforestation-2024"
  }]
}
```
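
Before publishing a record like the one above, it may help to check it for missing fields and wrap it in the `<script>` element that crawlers read. The following is a minimal sketch, not a definitive validator: the `REQUIRED_FIELDS` list is an assumption chosen for this example (it is not an official schema.org requirement list), and both function names are hypothetical.

```python
import json

# Assumption: a minimal field set for this sketch, not an official requirement list.
REQUIRED_FIELDS = ("@context", "@type", "name", "description", "url", "license")


def missing_fields(record):
    """Return the required fields that are absent or empty in a JSON-LD record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


def to_script_tag(record):
    """Serialise a record into a JSON-LD <script> element for embedding in a page."""
    payload = json.dumps(record, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# A shortened copy of the reference record above.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Colombia Deforestation Alerts 2024",
    "description": "GLAD-L primary forest loss alerts for Colombia, 2024.",
    "url": "https://data.cleantechhub.net/datasets/colombia-deforestation-2024",
    "license": "https://creativecommons.org/licenses/by/4.0/",
}

problems = missing_fields(dataset)
if problems:
    raise ValueError(f"Missing fields: {problems}")
print(to_script_tag(dataset))
```

A local check of this kind complements, and does not replace, the Google Rich Results Test named in the compliance checklist.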

## Compliance Checklist

<table class="checklist-table" id="bkmrk-criterionwhat-it-mea"> <thead><tr><th></th><th>Criterion</th><th>What it means</th></tr></thead> <tbody><tr><td>☐</td><td>**schema.org/Dataset markup live**</td><td>JSON-LD is present on all public data pages and passes Google Rich Results test.</td></tr><tr><td>☐</td><td>**Listed in open index**</td><td>Dataset appears in at least one: GFW API, Climate TRACE, EDGAR, DataCite.</td></tr><tr><td>☐</td><td>**llms.txt published**</td><td>A public llms.txt file at cleantechhub.net/llms.txt guides AI agents.</td></tr><tr><td>☐</td><td>**Quarterly AI scan**</td><td>Last scan date recorded; data confirmed indexed in at least one major platform.</td></tr></tbody></table>
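
The "Quarterly AI scan" criterion can be checked mechanically once the last scan date is recorded. The sketch below is one possible reading, assuming "quarterly" means no more than 92 days between scans; the function name, the threshold, and the dates are all hypothetical.

```python
from datetime import date, timedelta

# Assumption: "quarterly" is read as "no more than 92 days between scans".
MAX_SCAN_AGE = timedelta(days=92)


def scan_is_current(last_scan, today, indexed_in):
    """Return True when the last scan is recent and found the data in an index.

    `indexed_in` lists the platforms where the scan confirmed the dataset.
    """
    age = today - last_scan
    return timedelta(0) <= age <= MAX_SCAN_AGE and len(indexed_in) > 0


# Hypothetical scan records for illustration.
recent = scan_is_current(date(2025, 1, 15), date(2025, 3, 1), ["Climate TRACE"])
stale = scan_is_current(date(2024, 9, 1), date(2025, 3, 1), ["Climate TRACE"])
print(recent, stale)
```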

## Regulatory References

- EU CSRD — Art. 8 (machine-readable XBRL tagging requirement)
- TCFD Recommendations — Pillar 4 (Metrics and Targets, digital disclosure)
- IICSR AI+ESG Market Report 2025

## Recommended Tools and Platforms

<span class="tag">Google Rich Results Test</span> <span class="tag">schema.org validator</span> <span class="tag">llms.txt specification</span> <span class="tag">DataCite</span>

## Keywords

<span class="tag tag-kw">AI legibility</span> <span class="tag tag-kw">schema.org</span> <span class="tag tag-kw">JSON-LD</span> <span class="tag tag-kw">ESG AI</span> <span class="tag tag-kw">llms.txt</span> <span class="tag tag-kw">due diligence</span> <span class="tag tag-kw">NLP</span>

<div class="related" id="bkmrk-related-principles%3A-"> **Related Principles:** [SCD-P01](https://wiki.cleantechhub.net/books/sovereign-climate-data/page/pp01) · [SCD-P02](https://wiki.cleantechhub.net/books/sovereign-climate-data/page/pp02)</div><div class="meta-footer" id="bkmrk-document-id%3A-scd-p06"> **Document ID:** SCD-P06 | **Version:** 1.0.0 | **Last Updated:** 2026-05-26 | **Category:** Digital Sovereignty | **Source:** CleantechHUB Sovereign Climate Data Framework | **Licence:** CC-BY 4.0   
  
 *This page is part of the [Sovereign Climate Data Wiki](https://wiki.cleantechhub.net/books/sovereign-climate-data), maintained by CleantechHUB. It is AI-legible, machine-readable, and available via the [BookStack REST API](https://wiki.cleantechhub.net/api/pages).*</div>