How It Works
Census computes SHA-256 hashes of files, sends only hashes to the CertiSigma API, and receives three layers of cryptographic proof:
- T0 — ECDSA Signature — Immediate, server signs the hash with P-256
- T1 — TSA Timestamp — Minutes, Merkle tree + RFC 3161 Time Stamping Authority
- T2 — Bitcoin Anchor — Hours, Merkle root anchored via OpenTimestamps
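The client-side hashing step described above can be sketched in a few lines of Python. This is a minimal illustration, not the Census implementation: the function names `sha256_file` and `hash_tree` and the 1 MiB chunk size are assumptions made for the example. It shows that only hex digests, never file content, need to leave the machine.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Stream the file so large files never have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root: Path) -> dict[str, str]:
    """Map each regular file under root (by relative path) to its digest."""
    return {
        str(p.relative_to(root)): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

Only the values of the mapping returned by `hash_tree` would be submitted for attestation; the relative paths stay local.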
Installation
Requires Python 3.10 or later. On Python 3.10, which lacks the standard-library tomllib module, TOML config support uses tomli (auto-installed).
# Base install
pip install certisigma-census
# With watch mode (filesystem monitoring)
pip install "certisigma-census[watch]"
# With PDF report generation
pip install "certisigma-census[report]"
# Everything
pip install "certisigma-census[watch,report]"
Quick Start
Inventory scan
export CERTISIGMA_API_KEY=cs_...
# Scan and attest all file hashes
census scan /path/to/sensitive-files --source inventory-hr
# Dry run (hash only, no attestation)
census scan /path/to/files --dry-run
Breach comparison
# Compare suspect files against the registry
census compare /path/to/suspect-files --manifest inventory.db
# Exit 0 = no matches, 1 = exfiltration detected
Integrity check
# Check files against manifest baseline (100% local)
census integrity manifest.db
# Differential: only new findings since last run
census integrity manifest.db --since auto --write-state auto
GitHub Action
Composite action with zero Docker overhead. SARIF auto-upload to GitHub Security tab.
# Breach detection with SARIF upload
- uses: certisigma/census-action@v1
  with:
    command: compare
    target: ./artifacts
    manifest: ./inventory.db
  env:
    CERTISIGMA_API_KEY: ${{ secrets.CERTISIGMA_API_KEY }}
# Integrity check (no API key needed)
- uses: certisigma/census-action@v1
  with:
    command: integrity
    manifest: ./inventory.db
| Input | Required | Default | Description |
|---|---|---|---|
| command | Yes | — | scan, integrity, compare, bulk-scan |
| target | No | . | Directory to scan or check |
| manifest | No | .census-manifest.db | Manifest file path |
| api-key | No | — | API key (or set env var) |
| format | No | auto | text, json, jsonl, sarif (sarif only for compare) |
| upload-sarif | No | true | Auto-upload SARIF to Security tab |
| source | No | — | Audit label for the scan |
| exit-zero | No | false | Report-only mode (compare/bulk-scan) |
| version | No | latest | Pin certisigma-census version |
| extra-args | No | — | Additional CLI flags |
| python-version | No | 3.12 | Python version for setup-python |
Commands
Census provides 40+ commands; the entries below summarize their options and examples.
census scan <dir>
Walk directory, compute SHA-256 hashes, attest in batch, save manifest.
| Option | Description |
|---|---|
| --source LABEL | Source label for attestations |
| --manifest PATH | Manifest output path |
| --dry-run | Hash only, no attestation |
| --resume | Resume interrupted scan |
| --workers N | Parallel hashing (1–8) |
| --attest-manifest | Attest manifest’s own hash |
| --include/--exclude | Glob patterns for filtering |
| --min-size/--max-size | Size filters (e.g. 1K, 100M) |
| --json | Machine-readable output |
census compare <dir>
Hash suspect files and verify against the CertiSigma registry.
| Option | Description |
|---|---|
| --manifest PATH | Local manifest for cross-reference |
| --format text\|json\|sarif\|jsonl | Output format |
| --detailed | Enriched results (source, T0/T1/T2 level) |
| --exit-zero | Report-only mode (always exit 0) |
| --summary | Counts only, no match details |
| --on-match CMD | Execute CMD on matches (JSON on stdin) |
census integrity <manifest>
Tamper detection against manifest baseline. 100% local, no API calls.
| Option | Description |
|---|---|
| --strict | Exit 1 on any discrepancy |
| --since PATH | Differential mode (auto = sidecar) |
| --write-state PATH | Save state for next run |
| --format text\|json\|jsonl | Output format |
census diff <base> <target>
Compare two manifests. AIDE-style bitmask exit codes (1=added, 2=removed, 4=modified).
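A wrapper script can decode the bitmask exit code of `census diff` as follows. This is a small sketch under the bit values stated above (1=added, 2=removed, 4=modified); the names `DiffResult` and `describe_diff_exit` are hypothetical and not part of Census.

```python
from enum import IntFlag


class DiffResult(IntFlag):
    """Bit values documented for `census diff` exit codes."""
    ADDED = 1
    REMOVED = 2
    MODIFIED = 4


def describe_diff_exit(code: int) -> list[str]:
    """Return the change kinds encoded in a `census diff` exit code."""
    return [flag.name.lower() for flag in DiffResult if code & flag]
```

For example, an exit code of 5 decodes to added and modified entries, while 0 means the two manifests are identical.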
census bulk-scan <dir>
Bulk leak detection via /scan endpoint. Up to 50K hashes per call with auto-chunking.
| Option | Description |
|---|---|
| --dry-run | Hash only, no API call |
| --exit-zero | Report-only mode |
| --summary | Counts only |
| --source LABEL | Incident tracking label |
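The auto-chunking behaviour mentioned for `bulk-scan` amounts to splitting the hash list into batches no larger than the per-call limit. The sketch below illustrates the idea only; the helper name `chunked` is an assumption, and how Census actually batches requests internally is not documented here.

```python
from collections.abc import Iterator, Sequence

# Per-call limit stated above for the /scan endpoint.
MAX_HASHES_PER_CALL = 50_000


def chunked(hashes: Sequence[str],
            size: int = MAX_HASHES_PER_CALL) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` hashes."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(hashes), size):
        yield hashes[start:start + size]
```

A list of 120,001 hashes would therefore be sent as three calls of 50,000, 50,000, and 20,001 hashes.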
census verify <hash|--file>
Verify a hash or file against the registry. Full T0/T1/T2 evidence chain. No API key required.
census verify-manifest <manifest>
Full-chain verification: all manifest hashes against the registry.
census update <manifest>
AIDE-style baseline update: detect → review → accept. New entries are unattested.
census report <manifest> -o <file>
Forensic reports: HTML (zero deps), PDF (fpdf2), evidence bundles (ZIP with OTS proofs).
| Option | Description |
|---|---|
| -o / --output PATH | Output file (.html, .pdf, or .json) |
| --evidence | Include full T0/T1/T2 evidence chain |
| --bundle | ZIP evidence bundle (report + OTS proofs + SHA256SUMS) |
| --attest | Attest the report hash for tamper-evidence |
| --integrity | Include integrity check results |
| --json | Machine-readable JSON output |
census watch <dir>
Continuous filesystem monitoring via native OS events (inotify/FSEvents). Batched attestation with debouncing. Requires [watch] extra.
| Option | Description |
|---|---|
| --on-change CMD | Shell command on file change (JSON on stdin) |
| --on-attest CMD | Shell command after attestation (JSON on stdin) |
| --on-t1 CMD | Shell command on T1 (TSA) webhook event |
| --on-t2 CMD | Shell command on T2 (Bitcoin) webhook event |
| --webhook-secret-file PATH | Signing secret for embedded webhook receiver |
| --webhook-port N | Webhook receiver port (default: 9514) |
| --webhook-bind ADDR | Webhook bind address (default: 127.0.0.1) |
| --debounce SEC | Debounce interval (default: 2) |
| --scan-on-start / --no-scan-on-start | Full scan at startup |
| --include/--exclude | Glob patterns for filtering |
Combine --on-t1 / --on-t2 with --webhook-secret-file to start an embedded webhook receiver alongside the watcher. See the Webhooks section.
census seal / verify-seal
HMAC-SHA256 tamper-evidence seal for manifests (Tripwire/AIDE pattern).
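The general idea behind an HMAC-SHA256 seal can be sketched as below. This assumes the seal is a keyed digest over the raw manifest bytes; the actual seal format, key handling, and storage used by Census are not specified here, and the function names are hypothetical.

```python
import hashlib
import hmac


def seal(data: bytes, key: bytes) -> str:
    """Compute a hex HMAC-SHA256 tag over the given bytes."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_seal(data: bytes, key: bytes, expected_hex: str) -> bool:
    """Check a tag in constant time; False if data or key differ."""
    return hmac.compare_digest(seal(data, key), expected_hex)
```

Unlike a plain SHA-256 checksum, the tag cannot be recomputed by someone who modifies the manifest without knowing the key.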
census track <attestation_id>
Track attestation status and T0/T1/T2 progression. Poll until a target proof level is reached.
| Option | Description |
|---|---|
| --poll | Poll until target level is reached |
| --level T1\|T2 | Target proof level (default: T2). Use T1 for faster TSA-only. |
| --poll-interval SEC | Seconds between polls (default: 60) |
| --timeout SEC | Max seconds to poll (default: 3600) |
| --json | Machine-readable output |
census archive <manifest>
Create a forensic evidence preservation package (ZIP) from a manifest. Includes manifest, inventory, chain of custody metadata, and SHA256SUMS integrity file.
| Option | Description |
|---|---|
| -o, --output PATH | Output ZIP path (default: evidence-YYYY-MM-DD.census.zip) |
| --compress / --no-compress | ZIP compression (default: compressed) |
| --include-seal | Include manifest seal in archive |
| --json | Machine-readable output |
census verify-archive <archive>
Verify the integrity of a forensic archive package using the embedded SHA256SUMS file.
census webhook register | list | delete | deliveries | verify-payload | serve
Webhook management for T1/T2 lifecycle push notifications. See Webhooks section for full documentation.
census export / hash / stats
Manifest export (CSV/JSON/sha256sum), standalone hashing, org statistics.
census compliance-report <manifest>
Generate compliance reports mapping Census data to NIS2, DORA, or ISO 27001 requirements. 100% local — no API calls.
census compliance-report manifest.db -o report.html
census compliance-report manifest.db --template dora -o report.html
census compliance-report manifest.db --template iso27001 --json
census compliance-report manifest.db --integrity -o report.html
| Option | Description |
|---|---|
| --template nis2\|dora\|iso27001 | Compliance framework (default: nis2) |
| -o, --output PATH | Output file (.html or .json) |
| --integrity / --no-integrity | Run integrity check and include results |
| --json | Machine-readable JSON output |
census ai-policy init | apply | report
AI governance: classify inventoried assets for ML/AI training compliance.
census ai-policy init # generate .census-ai-policy.toml template
census ai-policy apply manifest.db --dry-run # classify only, no API calls
census ai-policy apply manifest.db # classify and tag attestations
census ai-policy report manifest.db -o ai.html # HTML compliance report
census ai-policy report manifest.db --json # JSON output
| Option | Description |
|---|---|
| -p, --policy PATH | TOML policy file (default: .census-ai-policy.toml) |
| --dry-run | Classify only, do not tag attestations |
| -o, --output PATH | Save report to file (.html or .json) |
| --json | Machine-readable JSON output |
Policy files use TOML with [policy] section and [[rules]] array. Rules support glob patterns (*.md, docs/*.txt), size filters (min_size, max_size), and regulatory framework mapping (eu-ai-act, iso42001, c2pa). Safety-first: default_action = "exclude". Most-restrictive-wins on shared attestation IDs.
census sbom attest | verify | summary
SBOM attestation: parse SPDX 2.x / CycloneDX JSON, extract SHA-256 component hashes, batch-attest or verify via the CertiSigma API.
census sbom attest sbom.spdx.json --source "ci-pipeline"
census sbom attest bom.cdx.json --dry-run --json
census sbom verify sbom.spdx.json --json
census sbom verify bom.cdx.json --exit-zero --detailed
census sbom summary sbom.spdx.json --json
| Option | Description |
|---|---|
| --format auto\|spdx\|cyclonedx | Force SBOM format (auto-detected by default) |
| --source LABEL | Source label for attestations (attest only) |
| --manifest PATH | Save attested hashes to manifest (attest only) |
| --dry-run | Parse only, do not call the API (attest only) |
| --detailed | Include attestation level, source, timestamps (verify only) |
| --exit-zero | Always exit 0, report-only mode for CI (verify only) |
| --json | Machine-readable JSON output |
Supports SPDX 2.2/2.3 and CycloneDX 1.4/1.5/1.6 JSON. File size limit: 100 MB. No external SBOM libraries required. Hashes are normalised to lowercase hex and deduplicated before submission. Supports EU CRA, NIS2, and US EO 14028 compliance.
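The extract, normalise, and deduplicate steps can be illustrated for CycloneDX JSON, where each component may carry a `hashes` array of `{"alg", "content"}` objects. This is a rough sketch, not the Census parser: the function name `cyclonedx_sha256` is hypothetical, and the sorted return order is a choice made here for determinism.

```python
import re

_SHA256_RE = re.compile(r"[0-9a-f]{64}")


def cyclonedx_sha256(bom: dict) -> list[str]:
    """Collect unique, lowercase SHA-256 component hashes from a CycloneDX BOM."""
    found: set[str] = set()
    stack = list(bom.get("components", []))
    while stack:
        component = stack.pop()
        # Components may nest further components.
        stack.extend(component.get("components", []))
        for entry in component.get("hashes", []):
            if entry.get("alg") != "SHA-256":
                continue
            value = str(entry.get("content", "")).strip().lower()
            if _SHA256_RE.fullmatch(value):
                found.add(value)
    return sorted(found)
```

Hashes that differ only in letter case collapse to a single entry, so the same component listed twice is submitted once.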
census status <manifest>
Show manifest summary: total files, attested/pending counts, root directory, schema version.
census doctor / config / completion
Self-diagnostic (--manifest, --json), TOML configuration (config init, config show, config paths), shell completions (bash/zsh/fish).
census audit-log / snapshot
Tamper-evident JSONL audit log (audit-log show, verify, clear). Named snapshots for compliance baselines (snapshot create, list, diff, delete).
census share / tag / derived-list / annotate / metadata / key-rotate / key-gen
Forensic cooperation: share tokens, structured tagging, HMAC-derived lists, annotations, key rotation.
Output Formats
| Format | Flag | Use case |
|---|---|---|
| Text | (default) | Human-readable terminal output |
| JSON | --json or --format json | CI/CD automation, machine parsing |
| JSONL | --format jsonl | SIEM/ELK streaming, log pipelines |
| SARIF | --format sarif | GitHub Security tab, VS Code, Defect Dojo |
| CSV | --output report.csv | Spreadsheets, compliance reporting |
| sha256sum | --format sha256sum | GNU coreutils compatible (sha256sum -c) |
| HTML/PDF | -o report.html, -o report.pdf | Forensic reports (census report, compliance-report) |
| ZIP | --bundle | Evidence bundle (report + OTS proofs + SHA256SUMS) |
All JSON output includes census_version and elapsed_seconds for forensic traceability. JSONL streams end with a _summary trailer.
Forensic Features
- Evidence chain — census verify with T0/T1/T2 details, OTS proof export
- Forensic reports — HTML, PDF, evidence bundles (ZIP with OTS proofs + SHA256SUMS)
- Audit log — Tamper-evident JSONL with SHA-256 hash chain (census audit-log verify)
- Named snapshots — Compliance baselines with diff comparison
- Manifest seal — HMAC-SHA256 tamper-evidence (Tripwire/AIDE pattern)
- Differential integrity — --since auto --write-state auto for new-findings-only mode
- Baseline update — AIDE-style detect → review → accept workflow
- Forensic annotation — Case IDs, notes, tags with AES-256-GCM zero-knowledge encryption
- Forensic archive — census archive packages manifest, inventory, chain of custody, and SHA256SUMS into a verifiable ZIP. census verify-archive checks integrity.
- Webhook evidence — census webhook verify-payload cryptographically verifies a saved webhook delivery against its HMAC-SHA256 signature, proving authenticity in the evidence chain.
- File attribution — Owner, group, and permissions captured during scan (schema v3), available in reports and archives.
- Attested reports — census report --attest attests the report’s own hash for tamper-evidence. census verify-report verifies it.
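To show how a SHA-256 hash chain makes a JSONL log tamper-evident, here is a minimal sketch. The record layout is an assumption made for illustration: each line stores a `prev_hash` field holding the SHA-256 of the previous raw line, with 64 zeros for the first entry. The real Census audit-log schema may differ.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def append_entry(lines: list[str], event: dict) -> None:
    """Append an event whose prev_hash links it to the previous line."""
    prev = hashlib.sha256(lines[-1].encode()).hexdigest() if lines else GENESIS
    lines.append(json.dumps({**event, "prev_hash": prev}, sort_keys=True))


def verify_chain(lines: list[str]) -> bool:
    """Return True if every line's prev_hash matches the line before it."""
    prev = GENESIS
    for line in lines:
        if json.loads(line).get("prev_hash") != prev:
            return False
        prev = hashlib.sha256(line.encode()).hexdigest()
    return True
```

Editing or deleting any entry other than the last one changes its digest and breaks the link stored in the following line, which is what a verification pass detects.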
Cooperation
Share forensic data with third parties without exposing original content.
- Derived lists — HMAC-SHA256 opaque hash lists for third-party breach detection. The third party can match suspects without seeing your inventory.
- Share tokens — Time-limited, use-limited tokens for chain of custody.
- Structured tagging — Key-value classification with encrypted tags and cursor-paginated query.
- Annotations — Add forensic notes, case IDs, and metadata to attestations.
# Create an opaque derived list from your manifest
census derived-list create --manifest ./inventory.db --label "Q1 2026"
# Third party matches their suspects
census derived-list match <list_id> --list-key <hex64> --hashes-file suspects.txt
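The matching flow can be sketched with keyed hashing. This assumes each derived entry is HMAC-SHA256(list key, raw 32-byte file hash); the exact derivation used by CertiSigma is not documented here, and all function names are hypothetical.

```python
import hashlib
import hmac
from collections.abc import Iterable


def derive(hash_hex: str, list_key: bytes) -> str:
    """Derive an opaque value from a SHA-256 file hash and the list key."""
    return hmac.new(list_key, bytes.fromhex(hash_hex), hashlib.sha256).hexdigest()


def build_derived_list(inventory: Iterable[str], list_key: bytes) -> set[str]:
    """Owner side: turn inventory hashes into an opaque derived list."""
    return {derive(h, list_key) for h in inventory}


def match_suspects(suspects: Iterable[str], derived: set[str],
                   list_key: bytes) -> list[str]:
    """Third-party side: return suspect hashes present in the derived list."""
    return [h for h in suspects if derive(h, list_key) in derived]
```

The third party learns only which of its own suspect hashes match; the derived list itself does not contain the original inventory hashes.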
CI/CD Integration
Census is designed for automation. Exit codes, report-only mode, and SARIF output integrate with any CI/CD pipeline.
| Feature | Description |
|---|---|
| --exit-zero | Report-only: always exit 0 (upload SARIF without gating) |
| --summary | Counts only, no match details (concise CI logs) |
| --format sarif | SARIF v2.1.0 for GitHub Security tab upload |
| --on-match CMD | Execute command with results on stdin when matches > 0 |
| --format jsonl | Streaming output for SIEM/ELK log pipelines |
| --no-color | Disable colored output (also respects NO_COLOR env var) |
| -q / --quiet | Suppress info output (errors and JSON always shown) |
| sbom verify --exit-zero | Verify SBOM components against the registry without gating the build |
| sbom attest --source ci | Attest SBOM component hashes as part of the build pipeline |
Use certisigma/census-action@v1 for seamless CI/CD integration. See the GitHub Action section.
Exit Codes
| Code | Context | Meaning |
|---|---|---|
| 0 | All commands | Success (or --exit-zero report-only mode) |
| 1 | All commands | General error (API, I/O, config, or matches found) |
| 2 | All commands | Usage error (invalid arguments) |
| 1 | integrity --strict | Violations detected |
| bitmask | diff | 1=added, 2=removed, 4=modified (OR'd together) |
Webhooks
Push-based T1/T2 lifecycle notifications. Instead of polling for attestation completion, register a webhook and receive server-side callbacks when proofs are ready.
Register a webhook
# Register and save the signing secret
census webhook register \
--url https://hooks.example.com/certisigma \
--events t1_complete,t2_complete \
--label prod-monitor \
--save-secret .census-webhook-secret
The signing secret is displayed only once, at registration. Use --save-secret to persist it with 0o600 permissions, or copy it immediately. It cannot be retrieved later.
Manage webhooks
# List registered webhooks
census webhook list --json
# Show delivery history
census webhook deliveries wh_abc123
# Delete a webhook
census webhook delete wh_abc123
Receive webhooks
Start a lightweight HTTP receiver with HMAC-SHA256 verification, anti-replay guard, and hook dispatch:
# Standalone receiver with shell hooks
census webhook serve \
--secret-file .census-webhook-secret \
--on-t1 'notify-send "T1 certified"' \
--on-t2 'curl -X POST https://slack/hook -d @-'
# Or embed in watch mode for full lifecycle
census watch /data \
--on-change 'echo "changed"' \
--on-t1 'echo "T1 done"' \
--on-t2 'echo "T2 anchored"' \
--webhook-secret-file .census-webhook-secret
| Option | Description |
|---|---|
| --secret-file PATH | Signing secret file (from register --save-secret) |
| --on-t1 CMD | Shell command on T1 (TSA) event (JSON payload on stdin) |
| --on-t2 CMD | Shell command on T2 (Bitcoin) event (JSON payload on stdin) |
| --port N | Listen port (default: 9514) |
| --bind ADDR | Bind address (default: 127.0.0.1 — loopback only) |
| --tls-cert / --tls-key | PEM files for built-in TLS (reverse proxy recommended) |
| --replay-window SEC | Anti-replay window in seconds (default: 300) |
Forensic verification
Verify a saved webhook delivery is authentic and unmodified:
census webhook verify-payload delivery.json \
--signature "sha256=abc..." \
--secret-file .census-webhook-secret
Security properties
- HMAC-SHA256 on every delivery — Signature verified before JSON parsing. Invalid signatures are rejected (401).
- Anti-replay guard — Bounded delivery ID deduplication (10K entries, FIFO) + timestamp window (300s). Prevents replay attacks.
- Secret file permissions — --save-secret writes with 0o600 (owner-only). Load strips comments and whitespace.
- Loopback by default — Binds to 127.0.0.1. For public exposure, use --bind 0.0.0.0 behind a reverse proxy with TLS termination.
- Optional built-in TLS — --tls-cert/--tls-key for environments without a reverse proxy. Minimum TLS 1.2.
- Payload size limit — 1 MB maximum. Requests exceeding this are rejected before reading the body.
- Graceful shutdown — SIGINT/SIGTERM cleanly stops the receiver.
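The signature check and anti-replay guard listed above could look roughly like this for a custom receiver. It is a sketch under assumptions: the signature is taken to be `sha256=` followed by the hex HMAC-SHA256 of the raw request body (matching the `--signature "sha256=..."` form shown earlier), and the names `signature_ok` and `ReplayGuard` are hypothetical. How the delivery ID and timestamp are transported is not specified here.

```python
import hashlib
import hmac
import time
from collections import OrderedDict
from typing import Optional


def signature_ok(body: bytes, header: str, secret: bytes) -> bool:
    """Verify the signature over the raw body before any JSON parsing."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), header.encode())


class ReplayGuard:
    """Bounded FIFO of seen delivery IDs plus a timestamp window."""

    def __init__(self, max_entries: int = 10_000, window: float = 300.0):
        self._seen: "OrderedDict[str, None]" = OrderedDict()
        self.max_entries = max_entries
        self.window = window

    def accept(self, delivery_id: str, timestamp: float,
               now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        if abs(now - timestamp) > self.window:
            return False  # outside the allowed time window
        if delivery_id in self._seen:
            return False  # duplicate delivery
        self._seen[delivery_id] = None
        if len(self._seen) > self.max_entries:
            self._seen.popitem(last=False)  # evict the oldest entry
        return True
```

Bounding the ID cache keeps memory constant; the timestamp window is what rejects a replayed delivery whose ID has already been evicted.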
Configuration
Census resolves configuration from the following sources, in descending order of precedence:
- CLI flags (highest priority)
- Environment variables (CERTISIGMA_API_KEY, CERTISIGMA_BASE_URL)
- Project config (.census.toml in current directory)
- User config (~/.config/census/config.toml)
# Create a project config template
census config init --project
# View effective configuration
census config show
# Shell completions
eval "$(census completion bash)"
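The precedence order above can be modelled as a layered lookup in which the first source that defines a key wins. The sketch below is illustrative only: the setting names `api_key`, `base_url`, and `workers` and the function `resolve` are assumptions, not the actual Census config schema.

```python
from collections import ChainMap
from collections.abc import Mapping

# Environment variables named in the list above, mapped to assumed setting names.
ENV_KEYS = {"api_key": "CERTISIGMA_API_KEY", "base_url": "CERTISIGMA_BASE_URL"}


def resolve(cli: Mapping, env: Mapping, project: Mapping, user: Mapping) -> dict:
    """Merge config layers: CLI flags > environment > project file > user file."""
    cli_layer = {k: v for k, v in cli.items() if v is not None}  # unset flags are None
    env_layer = {k: env[name] for k, name in ENV_KEYS.items() if env.get(name)}
    return dict(ChainMap(cli_layer, env_layer, dict(project), dict(user)))
```

In practice `env` would be `os.environ` and the two file layers would be the parsed `.census.toml` and `~/.config/census/config.toml` contents.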
Security Model
- Content never leaves the client — Only SHA-256 hashes are transmitted to the API. The original file content stays on your infrastructure.
- Zero-knowledge metadata — Annotations and tag values can be encrypted client-side with AES-256-GCM before sending to the API. The server stores ciphertext only.
- HMAC-derived lists — Third-party breach detection uses HMAC-SHA256 derivation. The third party sees opaque derived hashes, not your original inventory.
- Manifest is local — The hash-to-filepath mapping lives on your filesystem. CertiSigma never sees file paths or directory structure.
- Manifest encryption at rest — Manifests can be encrypted with AES-256-GCM using --encryption-key or the CENSUS_ENCRYPTION_KEY env var. Encrypted files use a compact binary format with 96-bit random nonce and authenticated encryption. Auto-detected on load.
- API key scoping — RBAC scoped keys allow read-only access for analysts with full audit trail.
- Webhook HMAC-SHA256 — Every webhook delivery is signed with a per-webhook secret. The receiver verifies signatures before processing. Anti-replay guard prevents stored replay attacks.
- Secret file management — Webhook secrets are written with 0o600 permissions (owner-only). Never logged, never committed. Display-once semantics at registration.
config show and doctor output.
Compliance Mapping
Census provides cryptographic evidence chains that map to regulatory requirements:
| Requirement | Framework | Census capability |
|---|---|---|
| Asset inventory | NIS2 Art.21, ISO 27001 A.8.1 | census scan + manifest |
| Change detection | NIS2 Art.21, DORA Art.9 | census integrity + differential |
| Incident response evidence | NIS2 Art.23, DORA Art.17 | census compare + forensic reports |
| Data integrity verification | DORA Art.11, ISO 27001 A.14 | census verify-manifest |
| Audit trail | NIS2 Art.21, ISO 27001 A.12.4 | census audit-log (tamper-evident) |
| Third-party risk | NIS2 Art.21, DORA Art.28 | Derived lists + share tokens |
| Data classification | ISO 27001 A.8.2 | Structured tagging + encryption |
| Cryptographic controls | ISO 27001 A.10, DORA Art.9 | T0/T1/T2 proof chain, AES-256-GCM |
| Supply chain integrity | NIS2 Art.21(2d) | census seal + verify-seal |
| Continuous monitoring | DORA Art.9(2) | census watch + webhooks + systemd |
| Evidence preservation | ISO 27001 A.16.1.7 | census archive + verify-archive |
Architecture
Census is a client of the CertiSigma API. It uses the published Python SDK and treats it as a black box.
| Component | Description |
|---|---|
| CLI | Click-based, 40+ commands, global flags (-v, -q, --no-color) |
| Manifest | SQLite (WAL mode), schema v3, auto-migration from JSON |
| Scanner | Streamed SHA-256, parallel hashing (ProcessPoolExecutor), glob filters |
| Watcher | watchdog + producer/consumer, debounce, batch attestation |
| Retry | Exponential backoff on 429/5xx with Retry-After header |
| Reports | HTML (zero deps), PDF (fpdf2), ZIP bundles with OTS proofs |
| Audit | JSONL with SHA-256 hash chain, tail-read for last hash |
| Webhooks | Lightweight HTTP receiver, HMAC-SHA256 verification, anti-replay guard, TLS optional |
| Archive | Forensic evidence ZIP packages with SHA256SUMS integrity |
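The retry policy in the table (exponential backoff on 429/5xx, honouring Retry-After) can be sketched as a pure delay calculation. The base delay, the 60-second cap, and the function name `retry_delay` are assumptions for this example; the values Census actually uses are not stated here.

```python
from typing import Optional

RETRYABLE = {429} | set(range(500, 600))


def retry_delay(attempt: int, status: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Seconds to wait before the next attempt, or None if not retryable."""
    if status not in RETRYABLE:
        return None
    if retry_after is not None:
        try:
            # Prefer the server's hint when it is given in seconds.
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not handled in this sketch
    return min(cap, base * 2 ** attempt)
```

A production client would typically add random jitter to the computed delay so that many clients do not retry in lockstep.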
Global Options
| Option | Description |
|---|---|
| -v / --verbose | Enable debug logging |
| -q / --quiet | Suppress informational output |
| --log-format text\|json | Log output format |
| --no-color | Disable colored output (also NO_COLOR env) |
| --encryption-key HEX64 | AES-256-GCM key for manifest encryption at rest (or CENSUS_ENCRYPTION_KEY env) |
| --version | Show version |