Open Census MCP Server

09_reproducibility.md•2.29 KiB

# Chapter 9: Reproducibility Contract

[← Ch. 8](08_execution_procedure.md) | [README](README.md)

---

## 9.1 Reproducibility Components

Per QR-016, results are reproducible given these four components:

| Component | Artifact | Versioning Method |
|---|---|---|
| Server configuration | `src/census_mcp/config.py` | Git commit hash |
| Pack content | `packs/acs.db` | SHA-256 content hash |
| Test battery | `src/eval/battery/queries.yaml` | Git commit hash |
| Model identifiers | Recorded in JSONL output | Pinned checkpoint strings |

All four are recorded in `results/config_state.txt` and in the JSONL output metadata per QR-014.

## 9.2 What "Reproducible" Means Here

LLM outputs are non-deterministic. Exact response text will vary across runs even with identical configuration. The evaluation protocol accounts for this through statistical aggregation across 39 queries and 3 judges, not through exact reproducibility of individual responses.

**Reproducible means:** Given the same four components, a re-run should produce statistically consistent conclusions (same direction and approximate magnitude of treatment effect), not identical output text.

## 9.3 Configuration File as Single Source of Truth

Per DEC-4B-019 and C-006: ALL parameters that affect outputs reside in `src/census_mcp/config.py`. No output-affecting defaults are hardcoded in application logic. This is a permanent, non-negotiable project rule.

The config file supports environment variable overrides for deployment flexibility, but the file itself documents the defaults and serves as the audit trail for what values were in effect for any given git commit.

## 9.4 Versioning Discipline

Any change to any of the four reproducibility components creates a **new experiment**, not a reproduction of the old one. Results files include timestamps and git hashes to distinguish runs.

| Change | Consequence |
|---|---|
| Update `config.py` defaults | New experiment. Old results are from a different configuration. |
| Add/modify pack content | New experiment. Pragmatics available to treatment changed. |
| Add/modify battery queries | New experiment. Query set changed. |
| Model version update | New experiment. Different model checkpoint. |
| Re-run with identical components | Statistical replication. Expect consistent conclusions, not identical text. |
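
The four components in Section 9.1 and the "new experiment" rule in Section 9.4 can be pictured as a fingerprint comparison. The following is a minimal sketch, not the project's actual implementation: `Fingerprint`, `make_fingerprint`, `is_replication`, and `git_commit_hash` are hypothetical names, and the real layout of `results/config_state.txt` is not specified in this chapter.

```python
import hashlib
import subprocess
from dataclasses import dataclass
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def git_commit_hash(repo_dir: Path = Path(".")) -> str:
    """Return the current commit hash of the git repository at repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


@dataclass(frozen=True)
class Fingerprint:
    """Hypothetical record of the four reproducibility components."""
    config_commit: str          # git hash covering src/census_mcp/config.py
    pack_sha256: str            # content hash of packs/acs.db
    battery_commit: str         # git hash covering queries.yaml
    model_ids: tuple[str, ...]  # pinned checkpoint strings


def make_fingerprint(config_commit, pack_path, battery_commit, model_ids):
    """Build a fingerprint; model ids are sorted so order does not matter."""
    return Fingerprint(
        config_commit=config_commit,
        pack_sha256=sha256_file(Path(pack_path)),
        battery_commit=battery_commit,
        model_ids=tuple(sorted(model_ids)),
    )


def is_replication(a: Fingerprint, b: Fingerprint) -> bool:
    """True only if all four components match; otherwise a new experiment."""
    return a == b
```

Under these assumptions, two runs count as a statistical replication only when `is_replication` returns `True`; any differing field marks the later run as a new experiment.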
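
Section 9.2 defines reproducibility as agreement in direction and approximate magnitude of the treatment effect. One way to make that concrete is sketched below. It assumes a paired design with one treatment score and one control score per query, and it uses an arbitrary relative tolerance of 0.5; neither the pairing nor the threshold comes from this chapter, and the function names are hypothetical.

```python
from statistics import mean


def treatment_effect(treatment_scores, control_scores):
    """Mean paired difference (treatment minus control) across queries."""
    if len(treatment_scores) != len(control_scores):
        raise ValueError("score lists must cover the same queries")
    return mean(t - c for t, c in zip(treatment_scores, control_scores))


def is_consistent(effect_a, effect_b, rel_tol=0.5):
    """Check same direction and roughly similar magnitude of two effects.

    rel_tol is an illustrative assumption, not a project-defined threshold.
    """
    same_direction = effect_a * effect_b > 0
    larger = max(abs(effect_a), abs(effect_b))
    similar_magnitude = abs(effect_a - effect_b) <= rel_tol * larger
    return same_direction and similar_magnitude
```

A real comparison would more likely use confidence intervals over the 39 queries and 3 judges; this sketch only illustrates the distinction between consistent conclusions and identical text.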
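
The pattern described in Section 9.3, where defaults live in one file and environment variables may override them, might look like the sketch below. The parameter names (`retrieval_top_k`, `judge_temperature`), the `CENSUS_MCP_` prefix, and `load_config` are all hypothetical; the actual contents of `src/census_mcp/config.py` are not shown in this chapter.

```python
import os
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Config:
    """Hypothetical output-affecting parameters with documented defaults."""
    retrieval_top_k: int = 5
    judge_temperature: float = 0.0
    pack_path: str = "packs/acs.db"


def load_config(env=None, prefix="CENSUS_MCP_"):
    """Return Config defaults, overridden by PREFIX + FIELD_NAME variables."""
    env = os.environ if env is None else env
    overrides = {}
    for field in fields(Config):
        raw = env.get(prefix + field.name.upper())
        if raw is not None:
            # Coerce the string to the type of the documented default.
            overrides[field.name] = type(field.default)(raw)
    return Config(**overrides)
```

Because every default sits in the dataclass, the file at a given git commit documents which values were in effect unless an override was set. Coercing through the default's type works for `int`, `float`, and `str` fields; boolean fields would need explicit parsing.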
