Emit institutionally-signed TRACE TRO for every webapp simulation run

## Context

The 2026-04-21 working meeting with Lars Vilhuber (AEA Data Editor), Tara Watson, John Sabelhaus, and Tim Clark / Casper of the TRACE project reframed where a TRACE Transparent Research Object (TRO) adds value for PolicyEngine.

Lars, Tim, and Casper converged on a specific answer: **TRACE's value for us is institutional certification of runs researchers cannot easily re-run themselves.** For PolicyEngine, that fits two places:

1. The **us-data microdata build** (us-data PR #746 already emits a build-TRO per release ✓)
2. The **policyengine.org simulation runs** — because the webapp runs the simulation on infrastructure and against data (the calibrated enhanced CPS `.h5`) that the researcher does not fully control. This issue is about that second surface.

Lars explicitly said TRACE *does not* primarily serve the researcher-running-policyengine-locally case — in that case the reader can just `pip install` the same versions and rerun without a TRO.

## What to build

When a simulation runs through the policyengine-api (household-level via `/calculate`, or economy-wide via `/economy/{country_id}/over/{policy_id}`), the server should emit a signed TRACE TRO that binds:

- **Software**: `policyengine`, `policyengine-{country}`, `policyengine-{country}-data` wheel SHA-256 + versions. *Not* the full transitive list of installed Python packages: TRACE has explicitly not built in transitive-dependency tracing (per the 2026-04-21 meeting with Tim/Casper); a verifier who wants the full environment can resolve declared dependencies against a public index.
- **Data**: the HF-hosted `enhanced_cps_*.h5` (or UK / Canada equivalent) SHA-256 + `DataReleaseManifest` SHA-256
- **Reform**: the reform JSON the user submitted, content-hashed
- **Inputs**: the household JSON (for `/calculate`) or the simulation config (for `/economy/...`), content-hashed
- **Results**: a content-hashed `results.json` with the aggregate metrics we currently return to the UI. Whether to additionally bind a full per-household weighted output frame (parquet) is an open design question — see below.
- **Institutional attestation**: CI/deployment run URL, git SHA, cloud region, timestamp, and a signature by a PolicyEngine service account so the TRO is verifiably "PolicyEngine ran this" not "someone with a fork ran this"

All hashes are computed over canonical-JSON-normalized bytes produced by `policyengine.provenance.trace.canonical_json_bytes`, so third-party validators can reproduce them.
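To make the content-hashing requirement concrete, here is a minimal sketch of hashing a reform over canonical JSON bytes. It does not call the real `policyengine.provenance.trace.canonical_json_bytes`; `canonical_json_bytes_sketch` is a hypothetical local stand-in that assumes sorted keys, compact separators, and UTF-8, and the real helper's normalization rules may differ. The parameter names in the reform are invented for illustration.

```python
import hashlib
import json


def canonical_json_bytes_sketch(obj) -> bytes:
    """Hypothetical stand-in for the real helper (assumption: sorted keys,
    compact separators, UTF-8 output)."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def content_hash(obj) -> str:
    """SHA-256 hex digest of the canonical byte form of a JSON-like object."""
    return hashlib.sha256(canonical_json_bytes_sketch(obj)).hexdigest()


# Two submissions of the same reform with different key order (invented names).
reform_a = {
    "gov.example.rate": {"2026-01-01": 0.1},
    "gov.example.cap": {"2026-01-01": 500},
}
reform_b = {
    "gov.example.cap": {"2026-01-01": 500},
    "gov.example.rate": {"2026-01-01": 0.1},
}

# Key order does not change the digest once the bytes are canonicalized.
assert content_hash(reform_a) == content_hash(reform_b)
```

The point of the normalization step is that a verifier who receives the same reform with different key ordering or whitespace still derives the same digest.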

## Storage and retrieval

Each emitted TRO is a JSON-LD document (a few KB). Persist:

- In GCS / object storage under `traces/{country}/{year}/{run-id}.trace.tro.jsonld` with a durable public URL
- Indexed in the policyengine-api database so the result page can fetch its own TRO
- Retention: indefinite (these are citations; they must never 404)

The TRO URL lives in the simulation result JSON returned to the frontend — see the companion issue on policyengine-app for the download button and version badge UX.
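As a rough illustration of the storage layout and of returning the URL with the result, the sketch below builds the object path from the pattern above and attaches a URL to a result dict. The field name `trace_tro_url`, the base URL, and the run-id character restriction are assumptions made for this example, not decisions recorded in this issue.

```python
import re

# Assumption: run IDs are restricted to URL-safe characters.
_RUN_ID = re.compile(r"[A-Za-z0-9_-]+")


def tro_object_path(country: str, year: int, run_id: str) -> str:
    """Object key following traces/{country}/{year}/{run-id}.trace.tro.jsonld."""
    if not _RUN_ID.fullmatch(run_id):
        raise ValueError(f"run_id contains unsafe characters: {run_id!r}")
    return f"traces/{country}/{year}/{run_id}.trace.tro.jsonld"


def attach_tro_url(
    result: dict, base_url: str, country: str, year: int, run_id: str
) -> dict:
    """Return a copy of the result JSON with a (hypothetical) TRO URL field."""
    path = tro_object_path(country, year, run_id)
    return {**result, "trace_tro_url": f"{base_url.rstrip('/')}/{path}"}


result = attach_tro_url(
    {"budget": {"net_impact": 0}},  # placeholder result payload
    "https://example.org/",  # placeholder, not a real PolicyEngine URL
    country="us",
    year=2026,
    run_id="abc123",
)
```

Rejecting unsafe run IDs up front keeps the object key predictable, which matters if the URL is meant to be cited.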

## Non-goals (per meeting)

- Do **not** embed TRO emission in the end-user `policyengine` Python package as a default researcher-facing feature. `policyengine trace-tro` CLI already exists; that's fine for the build process and power users, but the researcher-laptop case is not where TRACE adds value.
- Do **not** rebuild the TRO schema. `policyengine.provenance.trace.build_trace_tro_from_release_bundle` + `build_simulation_trace_tro` already emit canonical TROv 0.1. The work here is wrapping those in the API request lifecycle + signing + persistence.

## Additional requirements surfaced on review

Codex review of this issue (2026-04-21, against the meeting transcript) flagged that the original scope conflates "institutional attestation metadata" with "institutional certification." They are not the same. To actually deliver the latter, this issue needs:

- **Durable storage commitment.** The TRO URL in a paper citation must resolve forever. "Persist to GCS" is not a commitment; it is an implementation. The policy needs to specify: retention duration (indefinite), content-addressing scheme, a migration plan for bucket / region / provider changes, and a URL resolver that survives service rewrites. Zenodo deposit as a durable mirror is on Lars's mind (the meeting flagged HuggingFace lacking a clear preservation policy, and pointed at Zenodo as the reference pattern for this kind of artifact), so it is worth considering at least as a secondary location.
- **Verifier-facing trust model for the signature/key.** A PolicyEngine service-account signature is only as credible as a reader's ability to verify (a) the key actually belongs to PolicyEngine, (b) it was not rotated between emission and verification without a traceable chain, and (c) the signing service itself cannot be spoofed. Possible answers: GCP workload-identity + short-lived signatures, a published keychain rooted in a DNS TXT record at policyengine.org, or a Sigstore-style transparency log. Needs an explicit design choice, not just "sign with a service account."
- **Binding to the actual production runtime and request**, not merely CI/deploy metadata. The TRO should attest "this specific request, running on this specific container image / function version, at this specific time, produced these outputs." CI run URL + git SHA documents how the container was *built*; the TRO also needs to bind the running container image SHA + region + pod / function instance at the time of execution.

## On "institutional self-certification"

The meeting transcript is explicit that an institution certifying its own runs "carries technically no difference" from an author certifying their own runs: the arms-length property is lost. Our value comes from institutional *reputation* and from providing structured evidence that a verifier can query, not from a cryptographic equivalent of arms-length independence. The issue should avoid language that oversells this. We are producing an institution-backed self-attestation; that is valuable and aligns with TRACE's current scope, but it is not arms-length third-party certification.

## Open design questions

- **Per-household frame default**: always include in the TRO, opt-in, or opt-out? The meeting transcript does not reach consensus on this. Max raised it, and Sabelhaus noted the app already produces both the full counterfactual microdata and the summary statistics, but Lars/Tim/Casper did not endorse "make the full frame the TRO default." The design choice should be ours, made with the trade-offs listed explicitly: TRO file size, downstream-analysis utility, and privacy posture in UK-style restricted-data cases.
- **Signing mechanism**: GCP workload-identity-based signing of the TRO bytes, GPG key, or just rely on the attestation fields being bound to an auditable CI/deploy provenance chain?
- **Back-population**: should we retroactively emit TROs for simulations already stored, or only emit going forward?
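Whichever signing mechanism is chosen, the emission code can stay mechanism-agnostic by signing the canonical TRO bytes through a pluggable callable. The sketch below is one way to structure that, under stated assumptions: the envelope field names (`tro_sha256`, `mechanism`, `key_id`, `signature`) are hypothetical, and the HMAC signer is a demo stand-in chosen only so the example runs with the standard library. A shared-secret HMAC does not give third-party verifiability, so a real deployment would swap in an asymmetric signer.

```python
import hashlib
import hmac
import json
from typing import Callable


def canonical_bytes(doc: dict) -> bytes:
    # Assumption: sorted keys + compact separators approximate canonical JSON.
    return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_tro(
    tro: dict, signer: Callable[[bytes], bytes], key_id: str, mechanism: str
) -> dict:
    """Build a detached signature envelope over the canonical TRO bytes."""
    payload = canonical_bytes(tro)
    return {
        "tro_sha256": hashlib.sha256(payload).hexdigest(),
        "mechanism": mechanism,
        "key_id": key_id,
        "signature": signer(payload).hex(),
    }


def verify_tro(
    tro: dict, envelope: dict, verifier: Callable[[bytes, bytes], bool]
) -> bool:
    """Check the digest first, then delegate the signature check."""
    payload = canonical_bytes(tro)
    if hashlib.sha256(payload).hexdigest() != envelope["tro_sha256"]:
        return False
    return verifier(payload, bytes.fromhex(envelope["signature"]))


# Demo-only stand-in signer and verifier (NOT a proposed mechanism).
_DEMO_KEY = b"demo-only-secret"


def demo_signer(payload: bytes) -> bytes:
    return hmac.new(_DEMO_KEY, payload, hashlib.sha256).digest()


def demo_verifier(payload: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(demo_signer(payload), signature)


tro = {"run_id": "abc123", "results_sha256": "0" * 64}
envelope = sign_tro(tro, demo_signer, key_id="demo-key-1", mechanism="hmac-sha256-demo")
assert verify_tro(tro, envelope, demo_verifier)
assert not verify_tro({**tro, "run_id": "tampered"}, envelope, demo_verifier)
```

Recording `key_id` and `mechanism` in the envelope is what would let a verifier locate the right public key or log entry later, which is the part the trust-model decision has to pin down.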

## Dependencies

**Blocker: policyengine-api needs migration to policyengine>=4.0 before this issue can be implemented.** The current pin is `policyengine>0.12.0,<1` (pre-v4 orchestrator), and the API imports pre-v4 modules such as `policyengine.simulation.SimulationOptions` that do not exist in current policyengine.py. The TRACE emission helpers (`policyengine.provenance.trace.*`) exist only in v4. See the separate migration issue filed as a prerequisite.

- Requires `policyengine>=4.3.1`, which exposes `build_simulation_trace_tro` (already shipped)
- Requires `policyengine-us-data>=1.85.2`, whose `DataReleaseManifest` ships alongside every HF h5 upload (already shipped)
- Requires `policyengine.py` PR #314 to have merged (TROv 0.1 canonical namespace migration)
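The version floors above could be enforced with a startup check along these lines. This is a simplified sketch: `unmet_requirements` is a hypothetical helper, the comparison handles only plain dotted numeric versions (no pre-release or local segments), and the installed versions are passed in as a dict rather than read from the environment. A real check could obtain them with `importlib.metadata.version`.

```python
# Minimum versions taken from the dependency list in this issue.
MINIMUMS = {
    "policyengine": "4.3.1",
    "policyengine-us-data": "1.85.2",
}


def parse_version(version: str) -> tuple[int, ...]:
    """Parse a plain dotted numeric version such as '4.3.1' (simplification)."""
    return tuple(int(part) for part in version.split("."))


def unmet_requirements(installed: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means all floors are met."""
    problems = []
    for name, minimum in MINIMUMS.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name} is not installed")
        elif parse_version(found) < parse_version(minimum):
            problems.append(f"{name} {found} < {minimum}")
    return problems


# A pre-v4 pin fails the check; a v4 install passes.
old_env = {"policyengine": "0.12.5", "policyengine-us-data": "1.85.2"}
new_env = {"policyengine": "4.10.0", "policyengine-us-data": "1.85.2"}
assert unmet_requirements(old_env) == ["policyengine 0.12.5 < 4.3.1"]
assert unmet_requirements(new_env) == []
```

Comparing integer tuples rather than strings avoids the classic mistake of treating `4.10.0` as older than `4.3.1`.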

## Related

- us-data PR #746 (build-TRO emission, landed)
- policyengine.py PR #314 (TROv 0.1 namespace migration)
- household-api-docs PR #7 (will need a follow-up commit to put the webapp-run flow ahead of the local-Python-CLI flow)
- Meeting on 2026-04-21 with Lars Vilhuber and the TRACE team




Emit institutionally-signed TRACE TRO for every webapp simulation run #3485
