The complete normative reference for producers and consumers of .sdf files.
Covers container format, all four required layers, optional digital signing,
validation pipeline, versioning, and error codes.
An SDF file is a valid ZIP archive with the .sdf extension. The extension is a content-type signal, not a structural one — renaming to .zip MUST produce a valid ZIP archive.
| Path | Required | Description |
|---|---|---|
| visual.pdf | MUST | Human-readable PDF layer. Self-contained. No external resources. |
| data.json | MUST | Structured business data. MUST validate against schema.json. |
| schema.json | MUST | JSON Schema Draft 2020-12. Bundled in archive. No external $ref URIs. |
| meta.json | MUST | SDF identity and provenance record. |
| signature.sig | MAY | Digital signature over the four required layers. |
| vendor/* | MAY | Proprietary extensions under the vendor/ namespace. |
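On the producer side, the container described above can be sketched with Python's standard `zipfile` module. This is a minimal sketch, not a reference implementation: the function name `build_sdf` and its parameters are hypothetical, and validating data.json against schema.json before writing is left out because it needs a JSON Schema validator.

```python
import io
import json
import zipfile


def build_sdf(visual_pdf: bytes, data: dict, schema: dict, meta: dict) -> bytes:
    """Pack the four required layers into an in-memory ZIP archive.

    Hypothetical helper: the caller is assumed to have validated
    `data` against `schema` already.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("meta.json", json.dumps(meta, indent=2))
        archive.writestr("schema.json", json.dumps(schema, indent=2))
        archive.writestr("data.json", json.dumps(data, indent=2))
        archive.writestr("visual.pdf", visual_pdf)
    return buffer.getvalue()
```

The returned bytes can be written to a file with the .sdf extension; because the result is an ordinary ZIP archive, any ZIP tool can open it after renaming.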
| Limit | Value | Error |
|---|---|---|
| Maximum size per archive entry | 50 MB (uncompressed) | SDF_ERROR_ARCHIVE_TOO_LARGE |
| Maximum total uncompressed size | 200 MB | SDF_ERROR_ARCHIVE_TOO_LARGE |
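A consumer can check both limits from the ZIP headers before extracting anything. The sketch below is illustrative only: `exceeds_size_limits` is a hypothetical name, and it assumes 1 MB means 1024 × 1024 bytes, which the limits table does not state.

```python
import io
import zipfile

# Assumption: binary megabytes. Adjust if the limits are meant as 10**6 bytes.
MAX_ENTRY_BYTES = 50 * 1024 * 1024
MAX_TOTAL_BYTES = 200 * 1024 * 1024


def exceeds_size_limits(
    archive: zipfile.ZipFile,
    max_entry: int = MAX_ENTRY_BYTES,
    max_total: int = MAX_TOTAL_BYTES,
) -> bool:
    """Return True if any entry, or the sum of all entries, is over the limit.

    Uses the uncompressed sizes declared in the ZIP headers.
    """
    sizes = [info.file_size for info in archive.infolist()]
    return any(size > max_entry for size in sizes) or sum(sizes) > max_total
```

Declared sizes come from the archive itself and can be forged, so a hardened consumer would also cap the number of bytes it reads while decompressing each entry.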
Consumers MUST process archive entries in this order:

1. meta.json
2. schema.json
3. data.json, validated against schema.json
4. visual.pdf
5. signature.sig, if present

meta.json is the identity and provenance record of an SDF document. It answers: what is this document, who issued it, and when was it created? It is intentionally separate from data.json so that SDF metadata can evolve independently of business document schemas.
| Field | Type | Required | Description |
|---|---|---|---|
| sdf_version | string | MUST | SDF spec version, e.g. "0.1" |
| document_id | string (UUID v4) | MUST | Globally unique document identifier, generated at produce time |
| document_type | string | MUST | Document type, e.g. "invoice", "nomination" |
| issuer | string | MUST | Human-readable name of the issuing party |
| created_at | string (ISO 8601) | MUST | Creation timestamp with timezone offset |
| issuer_id | string | SHOULD | Machine-readable issuer identifier (tax ID, VAT, DUNS) |
| recipient | string | SHOULD | Human-readable name of the receiving party |
| schema_id | string (URI) | SHOULD | URI identifying the schema in schema.json |
| locale | string (BCP 47) | MAY | Document locale, e.g. "en-US", "tr-TR" |
| nomination_ref | string | MAY | Reference key for nomination-to-invoice matching |
document_id MUST be a UUID v4 generated at produce time. It MUST NOT be derived from any business identifier. Invoice numbers, PO numbers, and nomination references belong in data.json.
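The document_id rule maps directly onto Python's standard `uuid` module. A minimal sketch follows; the helper names `new_document_id` and `is_uuid_v4` are hypothetical, and insisting on the canonical hyphenated form is an assumption rather than something this specification states.

```python
import uuid


def new_document_id() -> str:
    """Generate a random (version 4) UUID at produce time."""
    return str(uuid.uuid4())


def is_uuid_v4(value: str) -> bool:
    """Return True only for a canonical, hyphenated version 4 UUID string."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    # uuid.UUID also accepts braces, "urn:uuid:" prefixes and unhyphenated hex;
    # the round-trip comparison rejects those forms (an assumption of this sketch).
    return parsed.version == 4 and str(parsed) == value.lower()
```

Because `uuid.uuid4()` draws from random bytes, the identifier cannot leak or depend on an invoice number or any other business key.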

    {
      "sdf_version": "0.1",
      "document_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
      "document_type": "invoice",
      "issuer": "Acme Supplies GmbH",
      "issuer_id": "DE123456789",
      "created_at": "2026-03-15T14:30:00+01:00",
      "recipient": "Global Logistics AG",
      "locale": "de-DE"
    }

data.json carries the structured business data of the document. Its schema is defined by the bundled schema.json. SDF makes no assumptions about the shape of business data: producers define the schema, and consumers validate against it.

- data.json MUST validate against the bundled schema.json before the archive is written.
- Monetary amounts are expressed as objects of the form {"amount": "string", "currency": "ISO4217"}. Bare numeric values are prohibited.
- Metadata fields (sdf_version, document_id) MUST NOT appear in data.json.

For example:

    // Correct monetary amount
    { "amount": "1250.00", "currency": "EUR" }

    // Wrong: bare numeric value, precision loss risk
    { "total": 1250.50 }

    // Correct date
    { "issue_date": "2026-03-15" }

    // Wrong: numeric timestamp
    { "issue_date": 1742169600 }

schema.json is a JSON Schema Draft 2020-12 document bundled inside the SDF archive. Every .sdf file is self-validating: it carries its own schema, enabling validation decades after creation without network access.
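Because the schema travels inside the archive and may not use external $ref URIs, shared shapes such as the monetary object have to live under `$defs` and be referenced locally. The sketch below shows one possible bundled schema and a stdlib-only scan for external references. `EXAMPLE_SCHEMA`, `external_refs`, `schema_problems`, and the regular expressions are illustrative assumptions. Treating every `$ref` that does not start with `#` as external is also an assumption, and full Draft 2020-12 validation would still need a JSON Schema library.

```python
SCHEMA_DIALECT = "https://json-schema.org/draft/2020-12/schema"

# Hypothetical schema: the money shape is bundled under $defs and
# referenced with a local JSON pointer instead of an external URI.
EXAMPLE_SCHEMA = {
    "$schema": SCHEMA_DIALECT,
    "type": "object",
    "properties": {"total": {"$ref": "#/$defs/money"}},
    "required": ["total"],
    "$defs": {
        "money": {
            "type": "object",
            "properties": {
                "amount": {"type": "string", "pattern": "^-?[0-9]+(\\.[0-9]+)?$"},
                "currency": {"type": "string", "pattern": "^[A-Z]{3}$"},
            },
            "required": ["amount", "currency"],
        }
    },
}


def external_refs(node) -> list[str]:
    """Collect every $ref value that does not point inside the same document."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str) and not value.startswith("#"):
                found.append(value)
            else:
                found.extend(external_refs(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(external_refs(item))
    return found


def schema_problems(schema: dict) -> list[str]:
    """Return human-readable problems; an empty list means both checks passed."""
    problems = []
    if schema.get("$schema") != SCHEMA_DIALECT:
        problems.append("missing or unexpected $schema")
    problems.extend(f"external $ref: {ref}" for ref in external_refs(schema))
    return problems
```

A scan like this is deliberately shallow: it would also flag a business property that happens to be named `$ref`, so it complements rather than replaces a real validator.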

- schema.json MUST be valid JSON Schema Draft 2020-12. The $schema keyword MUST be present and set to "https://json-schema.org/draft/2020-12/schema".
- External $ref URIs MUST NOT be used. All referenced sub-schemas MUST be bundled inline using JSON Schema $defs.

visual.pdf is the human-readable PDF layer. It is generated from data.json by the producer. It must be fully self-contained and security-safe.

- External resources MUST NOT be referenced from visual.pdf.
SDF supports optional digital signatures via the signature.sig file. A signed SDF document provides cryptographic assurance that the four required layers have not been modified since signing.
| Algorithm | Identifier | Recommendation |
|---|---|---|
| ECDSA P-256 | ECDSA-P256 | Recommended — compact signatures, strong security |
| RSA-2048 | RSA-2048 | Supported — for compatibility with legacy PKI |
The signature covers the canonical content of all four required entries:

    signing_input = SHA256(meta.json) || SHA256(data.json) || SHA256(schema.json) || SHA256(visual.pdf)
    signature = Sign(private_key, signing_input)

The concatenation order is deterministic and MUST NOT vary between implementations. signature.sig itself and vendor/* entries are not included.
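The signing input can be computed with the standard `hashlib` module. This is a sketch under two assumptions that the formula does not spell out: each hash is taken over the entry's raw bytes as stored in the archive, and `||` means concatenation of the raw 32-byte digests rather than hex strings. The names `SIGNED_ENTRIES` and `signing_input` are hypothetical. The `Sign` step is omitted because ECDSA and RSA need a cryptography library outside the standard library.

```python
import hashlib

# Fixed order taken from the signing formula; it must not be reordered.
SIGNED_ENTRIES = ("meta.json", "data.json", "schema.json", "visual.pdf")


def signing_input(entries: dict[str, bytes]) -> bytes:
    """Concatenate the SHA-256 digests of the four required entries.

    `entries` maps archive entry names to their raw bytes. Anything else in
    the mapping (signature.sig, vendor/*) is ignored.
    """
    return b"".join(hashlib.sha256(entries[name]).digest() for name in SIGNED_ENTRIES)
```

Under these assumptions the result is always 128 bytes long, regardless of how large the layers are or in which order they were read from the archive.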

    {
      "algorithm": "ECDSA-P256",
      "key_id": "tenant-key-2026-03",
      "signature": "<base64url-encoded signature bytes>",
      "signed_at": "2026-03-15T14:30:05+01:00"
    }

SDF validation is a sequential pipeline. Each step MUST succeed before the next begins. A failure at any step produces a standard error code and halts validation.
| Step | Check | Error on Failure |
|---|---|---|
| 1 | File is a valid ZIP archive | SDF_ERROR_NOT_ZIP |
| 2 | No path traversal in entry names | SDF_ERROR_INVALID_ARCHIVE |
| 3 | Size limits not exceeded | SDF_ERROR_ARCHIVE_TOO_LARGE |
| 4 | All required entries present | SDF_ERROR_MISSING_FILE |
| 5 | meta.json validates against SDF meta schema | SDF_ERROR_INVALID_META |
| 6 | schema.json is valid JSON Schema Draft 2020-12 | SDF_ERROR_INVALID_SCHEMA |
| 7 | data.json validates against schema.json | SDF_ERROR_SCHEMA_MISMATCH |
| 8 | If signature.sig present: signature is valid | SDF_ERROR_INVALID_SIGNATURE |
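The halting behavior of the table above can be sketched for the container-level steps (1 to 4) with the standard library alone. `first_container_error` is a hypothetical name, the megabyte constants assume binary units, and steps 5 to 8 are left out because they need a JSON Schema validator and a signature library.

```python
import io
import zipfile
from typing import Optional

REQUIRED = ("visual.pdf", "data.json", "schema.json", "meta.json")
MAX_ENTRY = 50 * 1024 * 1024   # assumption: binary megabytes
MAX_TOTAL = 200 * 1024 * 1024


def first_container_error(sdf_bytes: bytes) -> Optional[str]:
    """Run steps 1-4 in order and return the first error code, or None."""
    # Step 1: the file must open as a ZIP archive.
    try:
        archive = zipfile.ZipFile(io.BytesIO(sdf_bytes))
    except zipfile.BadZipFile:
        return "SDF_ERROR_NOT_ZIP"
    with archive:
        infos = archive.infolist()
        # Step 2: reject absolute paths and ".." path segments.
        for info in infos:
            parts = info.filename.replace("\\", "/").split("/")
            if info.filename.startswith("/") or ".." in parts:
                return "SDF_ERROR_INVALID_ARCHIVE"
        # Step 3: size limits, based on the sizes declared in the headers.
        sizes = [info.file_size for info in infos]
        if any(size > MAX_ENTRY for size in sizes) or sum(sizes) > MAX_TOTAL:
            return "SDF_ERROR_ARCHIVE_TOO_LARGE"
        # Step 4: all four required entries must be present.
        names = {info.filename for info in infos}
        if any(name not in names for name in REQUIRED):
            return "SDF_ERROR_MISSING_FILE"
    return None
```

Returning on the first failure is what makes the order observable: an archive that contains a `..` path and also lacks meta.json reports SDF_ERROR_INVALID_ARCHIVE, never SDF_ERROR_MISSING_FILE.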
| Level | Description |
|---|---|
| Basic | Implements container, meta.json, data.json, schema.json, and visual.pdf. Signing not required. |
| Full | All Basic requirements plus digital signing and signature verification. |
The SDF version is declared in meta.json via the sdf_version field. The current version is 0.1.
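A consumer that compares the declared sdf_version with the version it supports needs a numeric comparison, since comparing the strings directly would rank "0.10" below "0.9". The sketch below assumes versions are dot-separated non-negative integers, which this specification does not state explicitly; `parse_version` and `version_error` are hypothetical names.

```python
from typing import Optional

SUPPORTED_VERSION = "0.1"


def parse_version(text: str) -> tuple[int, ...]:
    """Turn "0.1" into (0, 1). Raises ValueError on non-numeric parts."""
    return tuple(int(part) for part in text.split("."))


def version_error(declared: str, supported: str = SUPPORTED_VERSION) -> Optional[str]:
    """Return the error code if `declared` is newer than `supported`."""
    if parse_version(declared) > parse_version(supported):
        return "SDF_ERROR_UNSUPPORTED_VERSION"
    return None
```

Tuples compare element by element, so `(0, 10)` correctly sorts after `(0, 9)`.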

- Consumers reject any sdf_version higher than their supported version with SDF_ERROR_UNSUPPORTED_VERSION.
- … sdf_version (forward compatibility).

All SDF error codes are defined here. Implementations MUST NOT invent new codes outside this list. New codes require a spec update via the GitHub repository.
| Code | Trigger | Typical Cause |
|---|---|---|
| SDF_ERROR_NOT_ZIP | File is not a valid ZIP archive | Wrong extension, corrupted download, truncated upload |
| SDF_ERROR_INVALID_META | meta.json fails validation | Missing required fields, wrong types, non-UUID document_id |
| SDF_ERROR_MISSING_FILE | Required archive entry absent | Incomplete producer, partially written archive |
| SDF_ERROR_SCHEMA_MISMATCH | data.json fails against schema.json | Business data does not match schema, producer bug, schema drift |
| SDF_ERROR_INVALID_SCHEMA | schema.json is not valid JSON Schema Draft 2020-12 | Schema syntax error, external $ref URI, invalid $schema value |
| SDF_ERROR_UNSUPPORTED_VERSION | sdf_version higher than consumer supports | Consumer is on older spec version than producer |
| SDF_ERROR_INVALID_SIGNATURE | Signature verification fails | Key mismatch, tampered data, wrong key_id |
| SDF_ERROR_INVALID_ARCHIVE | Path traversal or structural violation | Malformed ZIP, .. in entry path, unrecognized root-level entry |
| SDF_ERROR_ARCHIVE_TOO_LARGE | Archive exceeds size limits | Oversized PDF, ZIP bomb attempt, total > 200 MB uncompressed |
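One way for an implementation to avoid emitting codes outside this list is to keep them in a closed enumeration. The sketch below is illustrative: the class name `SdfError` and the helper `is_known_code` are hypothetical, and only the code strings come from the table above.

```python
from enum import Enum


class SdfError(str, Enum):
    """The nine error codes defined by the table above, and nothing else."""

    NOT_ZIP = "SDF_ERROR_NOT_ZIP"
    INVALID_META = "SDF_ERROR_INVALID_META"
    MISSING_FILE = "SDF_ERROR_MISSING_FILE"
    SCHEMA_MISMATCH = "SDF_ERROR_SCHEMA_MISMATCH"
    INVALID_SCHEMA = "SDF_ERROR_INVALID_SCHEMA"
    UNSUPPORTED_VERSION = "SDF_ERROR_UNSUPPORTED_VERSION"
    INVALID_SIGNATURE = "SDF_ERROR_INVALID_SIGNATURE"
    INVALID_ARCHIVE = "SDF_ERROR_INVALID_ARCHIVE"
    ARCHIVE_TOO_LARGE = "SDF_ERROR_ARCHIVE_TOO_LARGE"


def is_known_code(code: str) -> bool:
    """Return True if `code` is one of the codes listed in the table."""
    return code in {member.value for member in SdfError}
```

Looking up an unlisted string with `SdfError("...")` raises `ValueError`, which turns an invented code into an immediate failure instead of a silent extension.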
This specification uses the following normative keywords per RFC 2119:
| Keyword | Meaning |
|---|---|
| MUST | An absolute requirement of the specification. |
| MUST NOT | An absolute prohibition of the specification. |
| SHOULD | Recommended; valid reasons may exist to deviate, but implications must be understood. |
| SHOULD NOT | Not recommended; valid reasons may exist, but implications must be understood. |
| MAY | Truly optional. |