# The Three-Tier Validation Pipeline

The Three-Tier Validation Pipeline is the data architecture that converts raw operational records into verified SUI events. It is named for its three sequential layers: Ingest, Digital Twin, and Conversion. Each tier has a defined responsibility, a defined data format, and a defined handoff protocol to the next tier.

## Pipeline Overview

```

┌─────────────────────────────────────────────────────────────────┐
│                    THREE-TIER VALIDATION PIPELINE               │
├─────────────────┬───────────────────────┬───────────────────────┤
│   TIER 1        │      TIER 2           │      TIER 3           │
│   INGEST        │   DIGITAL TWIN        │   CONVERSION          │
│                 │                       │                       │
│ Raw operational │ LCA simulation &      │ MDB-ready metrics &   │
│ data from all   │ counterfactual        │ financial instrument  │
│ source systems  │ modelling engine      │ trigger outputs       │
│                 │                       │                       │
│ ERP records     │ Emission factors      │ IRIS+ mapped values   │
│ IoT sensors     │ Baseline comparison   │ EU Taxonomy aligned   │
│ Lab results     │ Uncertainty calc.     │ Auditor export pkg.   │
│ GPS/satellite   │ Version-locked params │ SUI Ledger entry      │
└─────────────────┴───────────────────────┴───────────────────────┘
```

## Tier 1: Ingest

### Purpose

Capture all raw evidence of product application events in structured, timestamped form. The Ingest tier is the intake valve of the SSOT (single source of truth): every piece of data relevant to a SUI claim must enter through it.

### Data Sources by Sector

<table id="bkmrk-sectortypical-ingest"><thead><tr><th>Sector</th><th>Typical Ingest Sources</th></tr></thead><tbody><tr><td>AgTech / Bio-inputs</td><td>Batch production records (ERP), field application GPS logs, customer delivery confirmations, soil lab analysis PDFs</td></tr><tr><td>Clean Energy / EV</td><td>IoT charging session data (kWh, duration, vehicle ID), grid connection records, utility meter readings</td></tr><tr><td>Water Treatment</td><td>Flow meter readings, water quality sensors (turbidity, pH, pathogen count), treatment plant operational logs</td></tr><tr><td>Circular Economy</td><td>Material inflow/outflow manifests, weight measurements, recycler receipts, chain-of-custody certificates</td></tr><tr><td>Built Environment</td><td>BMS (Building Management System) data, energy audit reports, occupancy sensors, utility bills</td></tr></tbody></table>

### Ingest Requirements

- **Timestamp:** Every record carries a machine-generated UTC timestamp, not a manually entered date
- **Source ID:** Every record carries the identifier of the system or device that generated it
- **Immutability flag:** Once ingested, records are locked; corrections create new records with a "supersedes" link to the original
- **Schema validation:** Records are validated against a defined schema on ingest; malformed records are quarantined, not silently dropped
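
The four requirements above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual implementation: the names `IngestRecord`, `IngestStore`, and `REQUIRED_FIELDS`, and the two-field schema, are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
import uuid

# Hypothetical minimal schema: field name -> required Python type.
REQUIRED_FIELDS = {"source_id": str, "payload": dict}


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class IngestRecord:
    source_id: str                     # system or device that produced the data
    payload: dict                      # the raw evidence itself
    supersedes: Optional[str] = None   # record_id of the record this one corrects
    record_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    ingested_at: datetime = field(     # machine-generated UTC timestamp
        default_factory=lambda: datetime.now(timezone.utc)
    )


class IngestStore:
    def __init__(self):
        self.accepted = {}     # record_id -> IngestRecord
        self.quarantine = []   # (raw input, reason) pairs kept for review

    def ingest(self, raw: dict, supersedes: Optional[str] = None):
        """Validate raw input, then store it or quarantine it; never drop it."""
        for name, expected in REQUIRED_FIELDS.items():
            if not isinstance(raw.get(name), expected):
                self.quarantine.append((raw, f"bad or missing field: {name}"))
                return None
        if supersedes is not None and supersedes not in self.accepted:
            self.quarantine.append((raw, f"unknown supersedes target: {supersedes}"))
            return None
        record = IngestRecord(raw["source_id"], raw["payload"], supersedes)
        self.accepted[record.record_id] = record
        return record
```

A correction is ingested as a new record with `supersedes=<original record_id>`; the original stays in `accepted` unchanged. Note that `frozen=True` only blocks attribute reassignment. A production store would also need to protect the nested `payload` and the storage layer itself.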

## Tier 2: Digital Twin

### Purpose

Apply the SUI calculation logic to the ingested records — comparing observed outcomes to the counterfactual baseline, applying emission factors, calculating uncertainty, and producing a per-application SUI magnitude.

### What "Digital Twin" Means Here

In this context, "Digital Twin" refers to a computational model of the enterprise's impact mechanism — not a real-time operational simulation. The Digital Twin encodes:

- The Life Cycle Assessment (LCA) model for the product's impact pathway
- The baseline values (counterfactual scenario) and their uncertainty ranges
- The emission factors or conversion coefficients (e.g., IPCC AR6 values for N₂O emissions from synthetic nitrogen)
- The aggregation rules (how individual application events are summed to period totals)

### Version Control for Model Parameters

Every change to the Digital Twin model — a new emission factor, an updated baseline, a revised LCA boundary — must be version-controlled. Each SUI event in the ledger is tagged with the model version that produced it. This allows historical SUI calculations to be reproduced exactly, even after model updates.
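
One way to picture this is an append-only parameter registry keyed by version tag. The sketch below is illustrative only: `ModelParams`, `ParamRegistry`, and `sui_magnitude` are hypothetical names, the v2.1 values are taken from the Becaps example later in this section, and the second version is invented purely to show that old entries still replay against their own parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelParams:
    version: str
    baseline_kg_n_per_ha: float      # counterfactual baseline
    factor_kg_co2e_per_kg_n: float   # emission factor


class ParamRegistry:
    """Append-only store: a published version can never be overwritten."""

    def __init__(self):
        self._versions = {}

    def publish(self, params: ModelParams) -> None:
        if params.version in self._versions:
            raise ValueError(f"{params.version} already published; bump the version")
        self._versions[params.version] = params

    def get(self, version: str) -> ModelParams:
        return self._versions[version]


def sui_magnitude(observed_kg_n_per_ha: float, params: ModelParams) -> float:
    """Displaced nitrogen times the emission factor, in kg CO2e per ha."""
    displaced = params.baseline_kg_n_per_ha - observed_kg_n_per_ha
    return displaced * params.factor_kg_co2e_per_kg_n


registry = ParamRegistry()
registry.publish(ModelParams("DT-Becaps-v2.1", 220.0, 0.758))
# Invented later version, for illustration only (not from this document).
registry.publish(ModelParams("DT-Becaps-v2.2-example", 210.0, 0.758))

# The ledger entry stores its version tag, so it can be recomputed later
# with the parameters that were in force when it was produced.
ledger_entry = {"observed_kg_n_per_ha": 85.0, "model_version": "DT-Becaps-v2.1"}
replayed = sui_magnitude(
    ledger_entry["observed_kg_n_per_ha"],
    registry.get(ledger_entry["model_version"]),
)
```

Refusing to overwrite a published version is the design choice that makes replay trustworthy: a change of any parameter forces a new tag.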

### Digital Twin Outputs

For each ingested application event, the Digital Twin produces:

- SUI magnitude (central estimate)
- Uncertainty range (±N%, at 95% confidence)
- Model version tag
- Calculation audit log (step-by-step computation)
- Data quality flag (complete data vs. estimated vs. proxy)
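
The five outputs listed above map naturally onto a single immutable record. The following is a sketch under stated assumptions: `TwinOutput`, `DataQuality`, and `evaluate` are hypothetical names, and the calculation is reduced to the simple "displaced quantity times factor" form used in the Becaps example rather than a full LCA model.

```python
from dataclasses import dataclass
from enum import Enum


class DataQuality(Enum):
    COMPLETE = "complete"
    ESTIMATED = "estimated"
    PROXY = "proxy"


@dataclass(frozen=True)
class TwinOutput:
    magnitude: float         # central estimate
    uncertainty_pct: float   # half-width of the 95% interval, as % of magnitude
    model_version: str
    audit_log: tuple         # ordered, human-readable computation steps
    quality: DataQuality

    @property
    def interval(self):
        """Lower and upper bound implied by the relative uncertainty."""
        half = self.magnitude * self.uncertainty_pct / 100
        return (self.magnitude - half, self.magnitude + half)


def evaluate(baseline, observed, factor, uncertainty_pct, version, quality):
    """Compute one event's output, logging each step as it is performed."""
    log = []
    displaced = baseline - observed
    log.append(f"displaced = {baseline} - {observed} = {displaced}")
    magnitude = displaced * factor
    log.append(f"magnitude = {displaced} * {factor} = {magnitude:.2f}")
    return TwinOutput(magnitude, uncertainty_pct, version, tuple(log), quality)
```

Writing the audit log inside the same function that does the arithmetic keeps the log and the result from drifting apart.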

## Tier 3: Conversion

### Purpose

Transform the Digital Twin outputs into the formats required by different stakeholders — investors, MDBs, auditors, regulators, and the SUI Ledger itself.

### Output Formats

<table id="bkmrk-outputformataudience"><thead><tr><th>Output</th><th>Format</th><th>Audience</th></tr></thead><tbody><tr><td>SUI Ledger Entry</td><td>Structured JSON record in the SSOT</td><td>SSOT system, auditors</td></tr><tr><td>IRIS+ Report</td><td>Indicator values mapped to IRIS+ codes</td><td>Impact investors, GIIN reporting</td></tr><tr><td>EU Taxonomy Contribution Statement</td><td>% revenue / capex / opex aligned</td><td>European institutional investors</td></tr><tr><td>MDB Project Brief</td><td>AIMM-compatible impact narrative + data table</td><td>IFC, IDB Invest, ADB co-investors</td></tr><tr><td>Auditor Export Package</td><td>CSV + methodology PDF + raw data links</td><td>Third-party verifiers (ISAE 3000)</td></tr><tr><td>Investor Dashboard</td><td>Aggregated charts + drill-down to unit level</td><td>Board, VCs, DFI monitors</td></tr></tbody></table>

### Conversion Layer Controls

The Conversion tier must enforce several data integrity controls:

- **No manual overrides:** Conversion outputs are computed, not manually adjusted. Any "rounding" or "presentation formatting" must be documented and must not change material values.
- **Uncertainty propagation:** Uncertainty from the Digital Twin is carried through to all Conversion outputs, not dropped at the reporting layer.
- **Audit trail linkage:** Every Conversion output includes a reference back to the Tier 2 records that produced it, and through those, to the Tier 1 raw inputs.
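
A rough sketch of how these three controls might look in code follows. The function names, the dictionary layout, and the event IDs are assumptions for illustration. The source does not say how per-event uncertainties combine, so both common options are shown: linear addition (errors fully correlated, for example through a shared emission factor) and addition in quadrature (errors independent).

```python
import math


def aggregate(events, correlated=True):
    """Sum per-event magnitudes and propagate their absolute uncertainties."""
    total = sum(e["magnitude"] for e in events)
    if correlated:
        uncertainty = sum(e["uncertainty"] for e in events)
    else:
        uncertainty = math.sqrt(sum(e["uncertainty"] ** 2 for e in events))
    return {
        "total": total,
        "uncertainty": uncertainty,
        # Audit trail linkage: every output names the Tier 2 events behind it.
        "source_event_ids": [e["event_id"] for e in events],
    }


def present(report, decimals=0):
    """Formatting for display only; the stored report is not modified."""
    return (
        f"{report['total']:,.{decimals}f} ± "
        f"{report['uncertainty']:,.{decimals}f} kg CO2e"
    )


# Hypothetical event IDs; magnitudes follow the Becaps per-hectare figures.
events = [
    {"event_id": f"BC-2024-0441-ha{i:03d}", "magnitude": 102.3, "uncertainty": 12.3}
    for i in range(1, 501)
]
report = aggregate(events)
label = present(report)
```

For 500 events at ±12.3 kg CO₂e each, the linear rule gives roughly ±6,150 kg CO₂e and the quadrature rule roughly ±275 kg CO₂e, so the choice between them is a methodology decision that belongs in the documented model, not in the reporting layer.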

## The Pipeline in Practice: Becaps Example

1. **Ingest:** Becaps ERP records batch #BC-2024-0441: 500 kg product shipped to cooperative Finca Verde, La Calera, Colombia. GPS delivery confirmed. Cooperative confirms application to 500 ha (1 kg/ha). Soil lab reports uploaded: pre/post nitrogen content for 12 sample plots.
2. **Digital Twin:** The model applies a baseline of 220 kg N/ha (DANE 2023 Colombian synthetic fertiliser use) against an observed 85 kg N/ha (lab-confirmed), giving a net displacement of 135 kg N/ha. IPCC AR6 conversion: 135 kg N/ha × 0.758 kg CO₂e/kg N = 102.3 kg CO₂e/ha. Uncertainty: ±12.3 kg CO₂e/ha (±12%). Model version: DT-Becaps-v2.1.
3. **Conversion:** 500 SUI events recorded in ledger (one per hectare). IRIS+ PI5765 report: 51,150 kg CO₂e avoided this batch (500 ha × 102.3 kg CO₂e/ha). Auditor export package generated. MDB project brief updated with cumulative totals.
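
The arithmetic in this example can be checked with a short script. All inputs come from the steps above; the variable names are illustrative. One detail worth noticing: the batch figure of 51,150 kg CO₂e appears to be built from the per-hectare value already rounded to one decimal, whereas carrying the unrounded value through gives a slightly different total.

```python
BASELINE_KG_N_PER_HA = 220.0
OBSERVED_KG_N_PER_HA = 85.0
FACTOR_KG_CO2E_PER_KG_N = 0.758
RELATIVE_UNCERTAINTY = 0.12
HECTARES = 500

displaced = BASELINE_KG_N_PER_HA - OBSERVED_KG_N_PER_HA   # kg N/ha
per_ha = displaced * FACTOR_KG_CO2E_PER_KG_N              # kg CO2e/ha, unrounded
per_ha_reported = round(per_ha, 1)                        # value quoted in step 2
uncertainty_per_ha = round(per_ha_reported * RELATIVE_UNCERTAINTY, 1)

batch_reported = per_ha_reported * HECTARES   # total built from the rounded value
batch_unrounded = per_ha * HECTARES           # total if rounding is deferred

print(f"displaced:        {displaced:.0f} kg N/ha")
print(f"per hectare:      {per_ha_reported} ± {uncertainty_per_ha} kg CO2e/ha")
print(f"batch (rounded):  {batch_reported:,.0f} kg CO2e")
print(f"batch (deferred): {batch_unrounded:,.0f} kg CO2e")
```

The two batch totals differ by 15 kg CO₂e (about 0.03%), which is small but illustrates why the Conversion controls ask for rounding to be documented and applied only at presentation time.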

---

*Next: [The Digital Twin for Impact Verification](#bkmrk-next%3A-the-digital-tw) — building and validating the computational model.*