How It Works #

Step 1: Scan. SHA-256 hash every file
Step 2: Attest. Three-layer proof (T0/T1/T2)
Step 3: Compare. Detect exfiltration

Census computes SHA-256 hashes of files, sends only hashes to the CertiSigma API, and receives three layers of cryptographic proof:

  1. T0 — ECDSA Signature — Immediate, server signs the hash with P-256
  2. T1 — TSA Timestamp — Minutes, Merkle tree + RFC 3161 Time Stamping Authority
  3. T2 — Bitcoin Anchor — Hours, Merkle root anchored via OpenTimestamps

Zero knowledge: Original file content never leaves the client. Only 64-character hex hashes are transmitted.

SBOM attestation: Parse SPDX 2.x and CycloneDX JSON SBOMs and attest every component hash with the same three-layer proof chain used for files. Supports compliance workflows for EU CRA, NIS2, and US EO 14028.
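
The client-side hashing step can be sketched in a few lines of Python. This is a minimal illustration, not Census's actual scanner: the function names `sha256_file` and `hash_tree` and the chunk size are assumptions made for the example. It reads each file in chunks, so large files are never loaded fully into memory, and produces the 64-character lowercase hex digests that would be the only data sent to the API.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the lowercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root: Path) -> dict[str, str]:
    """Map every regular file under root (by relative path) to its digest.

    Hypothetical helper: the real manifest is a SQLite database, not a dict.
    """
    return {
        str(path.relative_to(root)): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Calling `hash_tree(Path("/path/to/sensitive-files"))` would return one digest per file; the file contents themselves stay local.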

Installation #

Requires Python 3.10+. On Python 3.10, TOML config support uses tomli (installed automatically).

bash
# Base install
pip install certisigma-census

# With watch mode (filesystem monitoring)
pip install "certisigma-census[watch]"

# With PDF report generation
pip install "certisigma-census[report]"

# Everything
pip install "certisigma-census[watch,report]"

Quick Start #

Inventory scan

bash
export CERTISIGMA_API_KEY=cs_...

# Scan and attest all file hashes
census scan /path/to/sensitive-files --source inventory-hr

# Dry run (hash only, no attestation)
census scan /path/to/files --dry-run

Breach comparison

bash
# Compare suspect files against the registry
census compare /path/to/suspect-files --manifest inventory.db

# Exit 0 = no matches, 1 = exfiltration detected

Integrity check

bash
# Check files against manifest baseline (100% local)
census integrity manifest.db

# Differential: only new findings since last run
census integrity manifest.db --since auto --write-state auto

GitHub Action #

Composite action with zero Docker overhead. SARIF auto-upload to GitHub Security tab.

yaml
# Breach detection with SARIF upload
- uses: certisigma/census-action@v1
  with:
    command: compare
    target: ./artifacts
    manifest: ./inventory.db
  env:
    CERTISIGMA_API_KEY: ${{ secrets.CERTISIGMA_API_KEY }}

# Integrity check (no API key needed)
- uses: certisigma/census-action@v1
  with:
    command: integrity
    manifest: ./inventory.db

Input           Required  Default              Description
command         Yes       (none)               scan, integrity, compare, bulk-scan
target          No        .                    Directory to scan or check
manifest        No        .census-manifest.db  Manifest file path
api-key         No        (none)               API key (or set env var)
format          No        auto                 text, json, jsonl, sarif (sarif only for compare)
upload-sarif    No        true                 Auto-upload SARIF to Security tab
source          No        (none)               Audit label for the scan
exit-zero       No        false                Report-only mode (compare/bulk-scan)
version         No        latest               Pin certisigma-census version
extra-args      No        (none)               Additional CLI flags
python-version  No        3.12                 Python version for setup-python

Commands #

Census provides 40+ commands. Options and examples for each are listed below.

census scan <dir>

Walk directory, compute SHA-256 hashes, attest in batch, save manifest.

Option                 Description
--source LABEL         Source label for attestations
--manifest PATH        Manifest output path
--dry-run              Hash only, no attestation
--resume               Resume interrupted scan
--workers N            Parallel hashing (1–8)
--attest-manifest      Attest manifest’s own hash
--include/--exclude    Glob patterns for filtering
--min-size/--max-size  Size filters (e.g. 1K, 100M)
--json                 Machine-readable output

census compare <dir>

Hash suspect files and verify against the CertiSigma registry.

Option                          Description
--manifest PATH                 Local manifest for cross-reference
--format text|json|sarif|jsonl  Output format
--detailed                      Enriched results (source, T0/T1/T2 level)
--exit-zero                     Report-only mode (always exit 0)
--summary                       Counts only, no match details
--on-match CMD                  Execute CMD on matches (JSON on stdin)

census integrity <manifest>

Tamper detection against manifest baseline. 100% local, no API calls.

Option                    Description
--strict                  Exit 1 on any discrepancy
--since PATH              Differential mode (auto = sidecar)
--write-state PATH        Save state for next run
--format text|json|jsonl  Output format

census diff <base> <target>

Compare two manifests. AIDE-style bitmask exit codes (1=added, 2=removed, 4=modified).
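
Because the diff exit code is a bitmask, a calling script has to test individual bits rather than compare against a single value. The sketch below decodes the three documented flags (1=added, 2=removed, 4=modified); the function name `decode_diff_exit` is illustrative and not part of Census.

```python
# Bit values as documented for `census diff`.
ADDED, REMOVED, MODIFIED = 1, 2, 4


def decode_diff_exit(code: int) -> list[str]:
    """Translate a diff exit code into the list of change kinds it encodes."""
    kinds = []
    if code & ADDED:
        kinds.append("added")
    if code & REMOVED:
        kinds.append("removed")
    if code & MODIFIED:
        kinds.append("modified")
    return kinds
```

For example, an exit code of 5 (1 OR 4) decodes to `["added", "modified"]`, and 0 decodes to an empty list.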

census bulk-scan <dir>

Bulk leak detection via /scan endpoint. Up to 50K hashes per call with auto-chunking.

Option          Description
--dry-run       Hash only, no API call
--exit-zero     Report-only mode
--summary       Counts only
--source LABEL  Incident tracking label

census verify <hash|--file>

Verify a hash or file against the registry. Full T0/T1/T2 evidence chain. No API key required.

census verify-manifest <manifest>

Full-chain verification: all manifest hashes against the registry.

census update <manifest>

AIDE-style baseline update: detect → review → accept. New entries are unattested.

census report <manifest> -o <file>

Forensic reports: HTML (zero deps), PDF (fpdf2), evidence bundles (ZIP with OTS proofs).

Option              Description
-o / --output PATH  Output file (.html, .pdf, or .json)
--evidence          Include full T0/T1/T2 evidence chain
--bundle            ZIP evidence bundle (report + OTS proofs + SHA256SUMS)
--attest            Attest the report hash for tamper-evidence
--integrity         Include integrity check results
--json              Machine-readable JSON output

census watch <dir>

Continuous filesystem monitoring via native OS events (inotify/FSEvents). Batched attestation with debouncing. Requires [watch] extra.

Option                                Description
--on-change CMD                       Shell command on file change (JSON on stdin)
--on-attest CMD                       Shell command after attestation (JSON on stdin)
--on-t1 CMD                           Shell command on T1 (TSA) webhook event
--on-t2 CMD                           Shell command on T2 (Bitcoin) webhook event
--webhook-secret-file PATH            Signing secret for embedded webhook receiver
--webhook-port N                      Webhook receiver port (default: 9514)
--webhook-bind ADDR                   Webhook bind address (default: 127.0.0.1)
--debounce SEC                        Debounce interval (default: 2)
--scan-on-start / --no-scan-on-start  Full scan at startup
--include/--exclude                   Glob patterns for filtering

Full T1/T2 lifecycle: Combine --on-t1 / --on-t2 with --webhook-secret-file to start an embedded webhook receiver alongside the watcher. See the Webhooks section.

census seal / verify-seal

HMAC-SHA256 tamper-evidence seal for manifests (Tripwire/AIDE pattern).
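
The idea behind an HMAC seal is that only someone holding the key can produce a valid tag for the manifest bytes, so any later modification is detectable. The following is a generic sketch of that pattern using Python's standard library. How Census derives the key, what exactly it covers, and where it stores the seal are not documented here, so `seal` and `verify_seal` are hypothetical names.

```python
import hashlib
import hmac


def seal(manifest_bytes: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the manifest contents."""
    return hmac.new(key, manifest_bytes, hashlib.sha256).hexdigest()


def verify_seal(manifest_bytes: bytes, key: bytes, expected_hex: str) -> bool:
    """Recompute the tag and compare in constant time."""
    actual = seal(manifest_bytes, key)
    return hmac.compare_digest(actual.encode("ascii"), expected_hex.encode("utf-8"))
```

A plain SHA-256 digest would detect accidental corruption, but an attacker who can rewrite the manifest can also rewrite the digest; the keyed HMAC closes that gap as long as the key is stored separately.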

census track <attestation_id>

Track attestation status and T0/T1/T2 progression. Poll until a target proof level is reached.

Option               Description
--poll               Poll until target level is reached
--level T1|T2        Target proof level (default: T2). Use T1 for faster TSA-only.
--poll-interval SEC  Seconds between polls (default: 60)
--timeout SEC        Max seconds to poll (default: 3600)
--json               Machine-readable output

census archive <manifest>

Create a forensic evidence preservation package (ZIP) from a manifest. Includes manifest, inventory, chain of custody metadata, and SHA256SUMS integrity file.

Option                      Description
-o, --output PATH           Output ZIP path (default: evidence-YYYY-MM-DD.census.zip)
--compress / --no-compress  ZIP compression (default: compressed)
--include-seal              Include manifest seal in archive
--json                      Machine-readable output

census verify-archive <archive>

Verify the integrity of a forensic archive package using the embedded SHA256SUMS file.
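
A SHA256SUMS check of this kind can be approximated with the standard library alone. The sketch below assumes the file uses the GNU coreutils line format (`<hex digest>`, two spaces, `<member name>`) and sits at the archive root under the name `SHA256SUMS`; both are assumptions about the archive layout, and `verify_sha256sums` is a hypothetical name.

```python
import hashlib
import zipfile


def verify_sha256sums(archive: zipfile.ZipFile, sums_name: str = "SHA256SUMS") -> list[str]:
    """Return the members that are missing or whose digest does not match."""
    failures = []
    for line in archive.read(sums_name).decode("utf-8").splitlines():
        if not line.strip():
            continue
        expected, _, name = line.partition(" ")
        # Drop the second separator space (or the "*" binary-mode marker).
        name = name.lstrip(" *")
        try:
            actual = hashlib.sha256(archive.read(name)).hexdigest()
        except KeyError:  # member listed in SHA256SUMS but absent from the ZIP
            failures.append(name)
            continue
        if actual != expected.lower():
            failures.append(name)
    return failures
```

An empty return value means every listed member is present and unmodified. Note that this only proves internal consistency; tying the archive to an external anchor is what the attestation chain is for.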

census webhook register | list | delete | deliveries | verify-payload | serve

Webhook management for T1/T2 lifecycle push notifications. See Webhooks section for full documentation.

census export / hash / stats

Manifest export (CSV/JSON/sha256sum), standalone hashing, org statistics.

census compliance-report <manifest>

Generate compliance reports mapping Census data to NIS2, DORA, or ISO 27001 requirements. 100% local — no API calls.

bash
census compliance-report manifest.db -o report.html
census compliance-report manifest.db --template dora -o report.html
census compliance-report manifest.db --template iso27001 --json
census compliance-report manifest.db --integrity -o report.html

Option                         Description
--template nis2|dora|iso27001  Compliance framework (default: nis2)
-o, --output PATH              Output file (.html or .json)
--integrity / --no-integrity   Run integrity check and include results
--json                         Machine-readable JSON output

census ai-policy init | apply | report

AI governance: classify inventoried assets for ML/AI training compliance.

bash
census ai-policy init                           # generate .census-ai-policy.toml template
census ai-policy apply manifest.db --dry-run    # classify only, no API calls
census ai-policy apply manifest.db              # classify and tag attestations
census ai-policy report manifest.db -o ai.html  # HTML compliance report
census ai-policy report manifest.db --json      # JSON output

Option             Description
-p, --policy PATH  TOML policy file (default: .census-ai-policy.toml)
--dry-run          Classify only, do not tag attestations
-o, --output PATH  Save report to file (.html or .json)
--json             Machine-readable JSON output

Policy files use TOML with [policy] section and [[rules]] array. Rules support glob patterns (*.md, docs/*.txt), size filters (min_size, max_size), and regulatory framework mapping (eu-ai-act, iso42001, c2pa). Safety-first: default_action = "exclude". Most-restrictive-wins on shared attestation IDs.

census sbom attest | verify | summary

SBOM attestation: parse SPDX 2.x / CycloneDX JSON, extract SHA-256 component hashes, batch-attest or verify via the CertiSigma API.

bash
census sbom attest sbom.spdx.json --source "ci-pipeline"
census sbom attest bom.cdx.json --dry-run --json
census sbom verify sbom.spdx.json --json
census sbom verify bom.cdx.json --exit-zero --detailed
census sbom summary sbom.spdx.json --json

Option                        Description
--format auto|spdx|cyclonedx  Force SBOM format (auto-detected by default)
--source LABEL                Source label for attestations (attest only)
--manifest PATH               Save attested hashes to manifest (attest only)
--dry-run                     Parse only, do not call the API (attest only)
--detailed                    Include attestation level, source, timestamps (verify only)
--exit-zero                   Always exit 0, report-only mode for CI (verify only)
--json                        Machine-readable JSON output

Supports SPDX 2.2/2.3 and CycloneDX 1.4/1.5/1.6 JSON. File size limit: 100 MB. No external SBOM libraries required. Hashes are normalised to lowercase hex and deduplicated before submission. Supports EU CRA, NIS2, and US EO 14028 compliance.
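
To make the extraction and normalisation step concrete, here is a minimal sketch that pulls SHA-256 values out of an already-parsed SBOM document. It relies on the published JSON layouts (CycloneDX `components[].hashes[]` with `alg`/`content`, SPDX 2.x `packages[]` and `files[]` with `checksums[]` carrying `algorithm`/`checksumValue`). It is not Census's parser: the name `extract_sha256` is hypothetical, and nested CycloneDX components are not handled.

```python
import re

_HEX64 = re.compile(r"^[0-9a-f]{64}$")


def extract_sha256(sbom: dict) -> list[str]:
    """Collect SHA-256 hashes from a CycloneDX or SPDX 2.x JSON document."""
    found = []
    # CycloneDX: each component may list hashes with alg "SHA-256".
    for component in sbom.get("components", []):
        for entry in component.get("hashes", []):
            if entry.get("alg") == "SHA-256":
                found.append(entry.get("content", ""))
    # SPDX 2.x: packages and files carry checksums with algorithm "SHA256".
    for key in ("packages", "files"):
        for item in sbom.get(key, []):
            for checksum in item.get("checksums", []):
                if checksum.get("algorithm") == "SHA256":
                    found.append(checksum.get("checksumValue", ""))
    # Normalise to lowercase hex, drop malformed values, deduplicate in order.
    normalised = (value.strip().lower() for value in found)
    return list(dict.fromkeys(v for v in normalised if _HEX64.match(v)))
```

`dict.fromkeys` is used for deduplication because it preserves first-seen order, which keeps the output stable across runs.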

census status <manifest>

Show manifest summary: total files, attested/pending counts, root directory, schema version.

census doctor / config / completion

Self-diagnostic (--manifest, --json), TOML configuration (config init, config show, config paths), shell completions (bash/zsh/fish).

census audit-log / snapshot

Tamper-evident JSONL audit log (audit-log show, verify, clear). Named snapshots for compliance baselines (snapshot create, list, diff, delete).

census share / tag / derived-list / annotate / metadata / key-rotate / key-gen

Forensic cooperation: share tokens, structured tagging, HMAC-derived lists, annotations, key rotation.

Output Formats #

Format     Flag                            Use case
Text       (default)                       Human-readable terminal output
JSON       --json or --format json         CI/CD automation, machine parsing
JSONL      --format jsonl                  SIEM/ELK streaming, log pipelines
SARIF      --format sarif                  GitHub Security tab, VS Code, DefectDojo
CSV        --output report.csv             Spreadsheets, compliance reporting
sha256sum  --format sha256sum              GNU coreutils compatible (sha256sum -c)
HTML/PDF   -o report.html, -o report.pdf   Forensic reports (census report, compliance-report)
ZIP        --bundle                        Evidence bundle (report + OTS proofs + SHA256SUMS)

All JSON output includes census_version and elapsed_seconds for forensic traceability. JSONL streams end with a _summary trailer.

Forensic Features #

  • Evidence chain: census verify with T0/T1/T2 details, OTS proof export
  • Forensic reports — HTML, PDF, evidence bundles (ZIP with OTS proofs + SHA256SUMS)
  • Audit log — Tamper-evident JSONL with SHA-256 hash chain (census audit-log verify)
  • Named snapshots — Compliance baselines with diff comparison
  • Manifest seal — HMAC-SHA256 tamper-evidence (Tripwire/AIDE pattern)
  • Differential integrity: --since auto --write-state auto for new-findings-only mode
  • Baseline update — AIDE-style detect → review → accept workflow
  • Forensic annotation — Case IDs, notes, tags with AES-256-GCM zero-knowledge encryption
  • Forensic archive: census archive packages manifest, inventory, chain of custody, and SHA256SUMS into a verifiable ZIP. census verify-archive checks integrity.
  • Webhook evidence: census webhook verify-payload cryptographically verifies a saved webhook delivery against its HMAC-SHA256 signature, proving authenticity in the evidence chain.
  • File attribution — Owner, group, and permissions captured during scan (schema v3), available in reports and archives.
  • Attested reports: census report --attest attests the report’s own hash for tamper-evidence. census verify-report verifies it.
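
The tamper-evident audit log in the list above relies on a hash chain: each entry commits to the hash of the previous one, so editing or removing an earlier line breaks every later link. The sketch below shows the general technique over JSONL lines. The record fields (`event`, `prev`, `hash`), the all-zero genesis value, and the function names are assumptions for illustration, not the actual Census log schema.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous link together with a canonical encoding of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list[str], event: dict) -> None:
    """Append one JSONL line that chains to the last line in the log."""
    prev = json.loads(log[-1])["hash"] if log else GENESIS
    record = {"event": event, "prev": prev, "hash": entry_hash(prev, event)}
    log.append(json.dumps(record, sort_keys=True))


def verify_log(log: list[str]) -> bool:
    """Walk the chain from the start and recompute every link."""
    prev = GENESIS
    for line in log:
        record = json.loads(line)
        if record["prev"] != prev or record["hash"] != entry_hash(prev, record["event"]):
            return False
        prev = record["hash"]
    return True
```

Appending only needs the last hash, which is consistent with the "tail-read for last hash" design mentioned under Architecture; full verification walks the file from the first line.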

Cooperation #

Share forensic data with third parties without exposing original content.

  • Derived lists — HMAC-SHA256 opaque hash lists for third-party breach detection. The third party can match suspects without seeing your inventory.
  • Share tokens — Time-limited, use-limited tokens for chain of custody.
  • Structured tagging — Key-value classification with encrypted tags and cursor-paginated query.
  • Annotations — Add forensic notes, case IDs, and metadata to attestations.

bash
# Create an opaque derived list from your manifest
census derived-list create --manifest ./inventory.db --label "Q1 2026"

# Third party matches their suspects
census derived-list match <list_id> --list-key <hex64> --hashes-file suspects.txt
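
Conceptually, a derived list replaces each inventory hash with a keyed HMAC of that hash, so the list is meaningless without the list key and cannot be matched against public hash databases. The following sketch illustrates that derivation and the third party's matching step. The exact derivation Census uses (input encoding, key handling) is not specified here, so treat `derive`, `build_derived_list`, and `match_suspects` as hypothetical.

```python
import hashlib
import hmac


def derive(hash_hex: str, list_key: bytes) -> str:
    """Derive an opaque value from a SHA-256 hex digest with HMAC-SHA256."""
    return hmac.new(list_key, bytes.fromhex(hash_hex), hashlib.sha256).hexdigest()


def build_derived_list(hashes: list[str], list_key: bytes) -> set[str]:
    """Owner side: derive every inventory hash."""
    return {derive(h, list_key) for h in hashes}


def match_suspects(suspects: list[str], derived: set[str], list_key: bytes) -> list[str]:
    """Third-party side: return the suspect hashes present in the derived list."""
    return [h for h in suspects if derive(h, list_key) in derived]
```

The third party learns only which of its own suspect hashes are in the inventory; hashes it does not already hold remain opaque.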

CI/CD Integration #

Census is designed for automation. Exit codes, report-only mode, and SARIF output integrate with any CI/CD pipeline.

Feature                  Description
--exit-zero              Report-only: always exit 0 (upload SARIF without gating)
--summary                Counts only, no match details (concise CI logs)
--format sarif           SARIF v2.1.0 for GitHub Security tab upload
--on-match CMD           Execute command with results on stdin when matches > 0
--format jsonl           Streaming output for SIEM/ELK log pipelines
--no-color               Disable colored output (also respects NO_COLOR env var)
-q / --quiet             Suppress info output (errors and JSON always shown)
sbom verify --exit-zero  Verify SBOM components against the registry without gating the build
sbom attest --source ci  Attest SBOM component hashes as part of the build pipeline

GitHub Action: Use certisigma/census-action@v1 for seamless CI/CD integration. See the GitHub Action section.

Exit Codes

Code     Context             Meaning
0        All commands        Success (or --exit-zero report-only mode)
1        All commands        General error (API, I/O, config, or matches found)
2        All commands        Usage error (invalid arguments)
1        integrity --strict  Violations detected
bitmask  diff                1=added, 2=removed, 4=modified (OR'd together)

Webhooks #

Push-based T1/T2 lifecycle notifications. Instead of polling for attestation completion, register a webhook and receive server-side callbacks when proofs are ready.

T0 Attest: ECDSA signature (instant)
  → Webhook → T1 Complete: TSA timestamp (minutes)
  → Webhook → T2 Complete: Bitcoin anchor (hours)

Register a webhook

bash
# Register and save the signing secret
census webhook register \
  --url https://hooks.example.com/certisigma \
  --events t1_complete,t2_complete \
  --label prod-monitor \
  --save-secret .census-webhook-secret

One-time secret: The signing secret is returned once at registration. Use --save-secret to persist it with 0o600 permissions, or copy it immediately. It cannot be retrieved later.

Manage webhooks

bash
# List registered webhooks
census webhook list --json

# Show delivery history
census webhook deliveries wh_abc123

# Delete a webhook
census webhook delete wh_abc123

Receive webhooks

Start a lightweight HTTP receiver with HMAC-SHA256 verification, anti-replay guard, and hook dispatch:

bash
# Standalone receiver with shell hooks
census webhook serve \
  --secret-file .census-webhook-secret \
  --on-t1 'notify-send "T1 certified"' \
  --on-t2 'curl -X POST https://slack/hook -d @-'

# Or embed in watch mode for full lifecycle
census watch /data \
  --on-change 'echo "changed"' \
  --on-t1 'echo "T1 done"' \
  --on-t2 'echo "T2 anchored"' \
  --webhook-secret-file .census-webhook-secret

Option                 Description
--secret-file PATH     Signing secret file (from register --save-secret)
--on-t1 CMD            Shell command on T1 (TSA) event (JSON payload on stdin)
--on-t2 CMD            Shell command on T2 (Bitcoin) event (JSON payload on stdin)
--port N               Listen port (default: 9514)
--bind ADDR            Bind address (default: 127.0.0.1, loopback only)
--tls-cert / --tls-key PEM files for built-in TLS (reverse proxy recommended)
--replay-window SEC    Anti-replay window in seconds (default: 300)

Forensic verification

Verify a saved webhook delivery is authentic and unmodified:

bash
census webhook verify-payload delivery.json \
  --signature "sha256=abc..." \
  --secret-file .census-webhook-secret

Security properties

  • HMAC-SHA256 on every delivery — Signature verified before JSON parsing. Invalid signatures are rejected (401).
  • Anti-replay guard — Bounded delivery ID deduplication (10K entries, FIFO) + timestamp window (300s). Prevents replay attacks.
  • Secret file permissions: --save-secret writes with 0o600 (owner-only). Load strips comments and whitespace.
  • Loopback by default — Binds to 127.0.0.1. For public exposure, use --bind 0.0.0.0 behind a reverse proxy with TLS termination.
  • Optional built-in TLS: --tls-cert / --tls-key for environments without a reverse proxy. Minimum TLS 1.2.
  • Payload size limit — 1 MB maximum. Requests exceeding this are rejected before reading the body.
  • Graceful shutdown — SIGINT/SIGTERM cleanly stops the receiver.

Configuration #

Census reads configuration from TOML files with user/project precedence:

  1. CLI flags (highest priority)
  2. Environment variables (CERTISIGMA_API_KEY, CERTISIGMA_BASE_URL)
  3. Project config (.census.toml in current directory)
  4. User config (~/.config/census/config.toml)

bash
# Create a project config template
census config init --project

# View effective configuration
census config show

# Shell completions
eval "$(census completion bash)"
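
The precedence order listed above (CLI flags, then environment variables, then project config, then user config) amounts to a layered lookup in which the first layer that defines a setting wins. A minimal sketch, assuming each layer has already been loaded into a dict and that `None` means "not set"; the key names `api_key` and `base_url` and the function name are illustrative only:

```python
from collections import ChainMap


def effective_config(cli: dict, env: dict, project: dict, user: dict) -> dict:
    """Merge config layers; earlier arguments take priority over later ones."""
    layers = [
        {key: value for key, value in layer.items() if value is not None}
        for layer in (cli, env, project, user)
    ]
    # ChainMap searches its maps left to right and returns the first hit.
    return dict(ChainMap(*layers))
```

For instance, a `base_url` set in the project file overrides the user file, while an API key supplied through the environment overrides both.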

Security Model #

  • Content never leaves the client — Only SHA-256 hashes are transmitted to the API. The original file content stays on your infrastructure.
  • Zero-knowledge metadata — Annotations and tag values can be encrypted client-side with AES-256-GCM before sending to the API. The server stores ciphertext only.
  • HMAC-derived lists — Third-party breach detection uses HMAC-SHA256 derivation. The third party sees opaque derived hashes, not your original inventory.
  • Manifest is local — The hash-to-filepath mapping lives on your filesystem. CertiSigma never sees file paths or directory structure.
  • Manifest encryption at rest — Manifests can be encrypted with AES-256-GCM using --encryption-key or CENSUS_ENCRYPTION_KEY env var. Encrypted files use a compact binary format with 96-bit random nonce and authenticated encryption. Auto-detected on load.
  • API key scoping — RBAC scoped keys allow read-only access for analysts with full audit trail.
  • Webhook HMAC-SHA256 — Every webhook delivery is signed with a per-webhook secret. The receiver verifies signatures before processing. Anti-replay guard prevents stored replay attacks.
  • Secret file management — Webhook secrets are written with 0o600 permissions (owner-only). Never logged, never committed. Display-once semantics at registration.

Important: API keys and webhook secrets should never be committed to source control. Use environment variables or a secrets manager. Census masks keys in config show and doctor output.

Compliance Mapping #

Census provides cryptographic evidence chains that map to regulatory requirements:

Requirement                  Framework                       Census capability
Asset inventory              NIS2 Art.21, ISO 27001 A.8.1    census scan + manifest
Change detection             NIS2 Art.21, DORA Art.9         census integrity + differential
Incident response evidence   NIS2 Art.23, DORA Art.17        census compare + forensic reports
Data integrity verification  DORA Art.11, ISO 27001 A.14     census verify-manifest
Audit trail                  NIS2 Art.21, ISO 27001 A.12.4   census audit-log (tamper-evident)
Third-party risk             NIS2 Art.21, DORA Art.28        Derived lists + share tokens
Data classification          ISO 27001 A.8.2                 Structured tagging + encryption
Cryptographic controls       ISO 27001 A.10, DORA Art.9      T0/T1/T2 proof chain, AES-256-GCM
Supply chain integrity       NIS2 Art.21(2d)                 census seal + verify-seal
Continuous monitoring        DORA Art.9(2)                   census watch + webhooks + systemd
Evidence preservation        ISO 27001 A.16.1.7              census archive + verify-archive

Architecture #

Census is a client of the CertiSigma API. It uses the published Python SDK and treats it as a black box.

Component  Description
CLI        Click-based, 40+ commands, global flags (-v, -q, --no-color)
Manifest   SQLite (WAL mode), schema v3, auto-migration from JSON
Scanner    Streamed SHA-256, parallel hashing (ProcessPoolExecutor), glob filters
Watcher    watchdog + producer/consumer, debounce, batch attestation
Retry      Exponential backoff on 429/5xx with Retry-After header
Reports    HTML (zero deps), PDF (fpdf2), ZIP bundles with OTS proofs
Audit      JSONL with SHA-256 hash chain, tail-read for last hash
Webhooks   Lightweight HTTP receiver, HMAC-SHA256 verification, anti-replay guard, TLS optional
Archive    Forensic evidence ZIP packages with SHA256SUMS integrity

SDK integration: Census consumes certisigma from PyPI. For API-level integration details, see the SDK documentation.

Global Options #

Option                  Description
-v / --verbose          Enable debug logging
-q / --quiet            Suppress informational output
--log-format text|json  Log output format
--no-color              Disable colored output (also NO_COLOR env)
--encryption-key HEX64  AES-256-GCM key for manifest encryption at rest (or CENSUS_ENCRYPTION_KEY env)
--version               Show version