Normative Specification

SDF Format
Specification v0.1

The complete normative reference for producers and consumers of .sdf files. Covers container format, all four required layers, optional digital signing, validation pipeline, versioning, and error codes.

Status: Draft v0.1
License: CC-BY 4.0
Published: March 2026
Issuer: Etapsky Inc.
§1

Container Format

An SDF file is a valid ZIP archive with the .sdf extension. The extension signals the content type only and has no structural meaning: a file renamed to .zip MUST still open as a valid ZIP archive.

File Manifest

| Path | Required | Description |
|---|---|---|
| visual.pdf | MUST | Human-readable PDF layer. Self-contained. No external resources. |
| data.json | MUST | Structured business data. MUST validate against schema.json. |
| schema.json | MUST | JSON Schema Draft 2020-12. Bundled in archive. No external $ref URIs. |
| meta.json | MUST | SDF identity and provenance record. |
| signature.sig | MAY | Digital signature over the four required layers. |
| vendor/* | MAY | Proprietary extensions under the vendor/ namespace. |

No files other than those listed above may appear at the archive root. Unrecognized root-level entries MUST be rejected with SDF_ERROR_INVALID_ARCHIVE.
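
The manifest rules above can be sketched as a small check over the archive's entry names. This is a minimal illustration, not a reference implementation: the function name `check_manifest` and the helper `build_archive` are hypothetical, and the sketch assumes entry names are compared exactly as stored in the ZIP.

```python
import io
import zipfile
from typing import Dict, Optional

# Entry names taken from the File Manifest table.
REQUIRED_ENTRIES = {"visual.pdf", "data.json", "schema.json", "meta.json"}
OPTIONAL_ENTRIES = {"signature.sig"}
VENDOR_PREFIX = "vendor/"


def check_manifest(archive: zipfile.ZipFile) -> Optional[str]:
    """Return an SDF error code, or None if the manifest is acceptable."""
    names = archive.namelist()
    for name in names:
        if name in REQUIRED_ENTRIES or name in OPTIONAL_ENTRIES:
            continue
        if name.startswith(VENDOR_PREFIX):
            continue
        # Anything else is an unrecognized entry.
        return "SDF_ERROR_INVALID_ARCHIVE"
    if not REQUIRED_ENTRIES.issubset(names):
        return "SDF_ERROR_MISSING_FILE"
    return None


def build_archive(entries: Dict[str, bytes]) -> zipfile.ZipFile:
    """Hypothetical helper: build an in-memory ZIP from name -> bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name, payload in entries.items():
            zf.writestr(name, payload)
    buffer.seek(0)
    return zipfile.ZipFile(buffer)
```

A full consumer would run this only after the ZIP, path, and size checks, as the processing order requires.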

Size Limits

| Limit | Value | Error |
|---|---|---|
| Maximum size per archive entry | 50 MB (uncompressed) | SDF_ERROR_ARCHIVE_TOO_LARGE |
| Maximum total uncompressed size | 200 MB | SDF_ERROR_ARCHIVE_TOO_LARGE |
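
One way to apply these limits is to inspect the uncompressed sizes declared in the ZIP headers before extracting anything. The sketch below assumes "MB" means 1024 × 1024 bytes, which the specification does not state, and `check_size_limits` is a hypothetical name. Declared sizes can be forged, so a hardened consumer would likely also count bytes while decompressing.

```python
import zipfile
from typing import Optional

# Assumption: 1 MB = 1024 * 1024 bytes. The spec does not define the unit.
MAX_ENTRY_BYTES = 50 * 1024 * 1024
MAX_TOTAL_BYTES = 200 * 1024 * 1024


def check_size_limits(
    archive: zipfile.ZipFile,
    max_entry: int = MAX_ENTRY_BYTES,
    max_total: int = MAX_TOTAL_BYTES,
) -> Optional[str]:
    """Return SDF_ERROR_ARCHIVE_TOO_LARGE if a limit is exceeded, else None."""
    total = 0
    for info in archive.infolist():
        # file_size is the uncompressed size declared in the ZIP headers.
        if info.file_size > max_entry:
            return "SDF_ERROR_ARCHIVE_TOO_LARGE"
        total += info.file_size
        if total > max_total:
            return "SDF_ERROR_ARCHIVE_TOO_LARGE"
    return None
```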

Consumer Processing Order

Consumers MUST process archive entries in this order:

  1. Verify the file is a valid ZIP. If not: SDF_ERROR_NOT_ZIP.
  2. Validate all entry paths (traversal check). If unsafe: SDF_ERROR_INVALID_ARCHIVE.
  3. Check size limits. If exceeded: SDF_ERROR_ARCHIVE_TOO_LARGE.
  4. Verify all required entries are present. If missing: SDF_ERROR_MISSING_FILE.
  5. Parse and validate meta.json.
  6. Parse and validate schema.json.
  7. Parse and validate data.json against schema.json.
  8. Process visual.pdf.
  9. Verify signature.sig if present.
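
Step 2 (the traversal check) could look roughly like the sketch below. The specification names only `..` in entry paths as a trigger; rejecting absolute paths, drive letters, backslash separators, and NUL bytes is an additional assumption made here for illustration, and `is_safe_entry_path` is a hypothetical name.

```python
def is_safe_entry_path(name: str) -> bool:
    """Return True if a ZIP entry name cannot escape the extraction root."""
    if not name or "\x00" in name:
        return False
    # ZIP entry names use forward slashes; treat backslashes the same way
    # so Windows-style names cannot bypass the check (assumption).
    normalized = name.replace("\\", "/")
    if normalized.startswith("/"):
        return False  # absolute path
    if len(normalized) >= 2 and normalized[1] == ":":
        return False  # drive letter such as C:
    # Reject any ".." path segment, wherever it appears.
    return ".." not in normalized.split("/")
```

A consumer would map a `False` result to SDF_ERROR_INVALID_ARCHIVE before reading any entry contents.
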
§2

meta.json — Identity Layer

meta.json is the identity and provenance record of an SDF document. It answers: what is this document, who issued it, and when was it created? It is intentionally separate from data.json so that SDF metadata can evolve independently of business document schemas.

Fields

| Field | Type | Required | Description |
|---|---|---|---|
| sdf_version | string | MUST | SDF spec version, e.g. "0.1" |
| document_id | string (UUID v4) | MUST | Globally unique document identifier, generated at produce time |
| document_type | string | MUST | Document type, e.g. "invoice", "nomination" |
| issuer | string | MUST | Human-readable name of the issuing party |
| created_at | string (ISO 8601) | MUST | Creation timestamp with timezone offset |
| issuer_id | string | SHOULD | Machine-readable issuer identifier (tax ID, VAT, DUNS) |
| recipient | string | SHOULD | Human-readable name of the receiving party |
| schema_id | string (URI) | SHOULD | URI identifying the schema in schema.json |
| locale | string (BCP 47) | MAY | Document locale, e.g. "en-US", "tr-TR" |
| nomination_ref | string | MAY | Reference key for nomination-to-invoice matching |

The document_id rule: document_id MUST be a UUID v4 generated at produce time. It MUST NOT be derived from any business identifier. Invoice numbers, PO numbers, and nomination references belong in data.json.
{
  "sdf_version": "0.1",
  "document_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "document_type": "invoice",
  "issuer": "Acme Supplies GmbH",
  "issuer_id": "DE123456789",
  "created_at": "2026-03-15T14:30:00+01:00",
  "recipient": "Global Logistics AG",
  "locale": "de-DE"
}
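
The required-field rules can be checked with the standard library alone. The following is a sketch under stated assumptions: `validate_meta` is a hypothetical helper, not the SDF meta schema referenced in §7, and it covers only the five MUST fields, the UUID version, and the presence of a timezone offset.

```python
import uuid
from datetime import datetime

REQUIRED_META_FIELDS = (
    "sdf_version", "document_id", "document_type", "issuer", "created_at",
)


def validate_meta(meta: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for field in REQUIRED_META_FIELDS:
        if not isinstance(meta.get(field), str):
            problems.append(f"{field}: missing or not a string")
    if problems:
        return problems

    try:
        # uuid.UUID is lenient about braces and hyphens; a stricter
        # consumer might also enforce the canonical 8-4-4-4-12 form.
        if uuid.UUID(meta["document_id"]).version != 4:
            problems.append("document_id: not a UUID v4")
    except ValueError:
        problems.append("document_id: not a UUID")

    try:
        # Note: a trailing "Z" is only accepted from Python 3.11 onward.
        created = datetime.fromisoformat(meta["created_at"])
        if created.tzinfo is None:
            problems.append("created_at: missing timezone offset")
    except ValueError:
        problems.append("created_at: not ISO 8601")
    return problems
```

Any non-empty result would correspond to SDF_ERROR_INVALID_META.
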
§3

data.json — Data Layer

data.json carries the structured business data of the document. Its schema is defined by the bundled schema.json. SDF makes no assumptions about the shape of business data — producers define the schema, and consumers validate against it.

Constraints

// Correct monetary amount
{ "amount": "1250.00", "currency": "EUR" }

// Wrong — bare numeric value, precision loss risk
{ "total": 1250.50 }

// Correct date
{ "issue_date": "2026-03-15" }

// Wrong — numeric timestamp
{ "issue_date": 1742169600 }
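
The examples above suggest why amounts travel as strings: binary floats cannot represent most decimal fractions exactly. A consumer might read such values as sketched below; `parse_amount` and `parse_issue_date` are hypothetical helpers, and the field names come from the examples rather than from any mandated schema.

```python
from datetime import date
from decimal import Decimal, InvalidOperation


def parse_amount(value) -> Decimal:
    """Parse a monetary amount supplied as a decimal string."""
    if not isinstance(value, str):
        raise TypeError("monetary amounts are expected as decimal strings")
    try:
        amount = Decimal(value)
    except InvalidOperation:
        raise ValueError(f"not a decimal number: {value!r}") from None
    if not amount.is_finite():
        raise ValueError("amount must be finite")
    return amount


def parse_issue_date(value) -> date:
    """Parse a calendar date supplied as an ISO 8601 string (YYYY-MM-DD)."""
    if not isinstance(value, str):
        raise TypeError("dates are expected as ISO 8601 strings")
    return date.fromisoformat(value)


# Floats drift, decimals do not:
assert 0.1 + 0.2 != 0.3
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
```
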
§4

schema.json — Schema Layer

schema.json is a JSON Schema Draft 2020-12 document bundled inside the SDF archive. Every .sdf file is self-validating: it carries its own schema, enabling validation decades after creation without network access.

Rules

The offline-safe requirement is a core SDF design decision. Documents must be self-validating for the lifetime of the archive — potentially decades. External URI dependencies violate this guarantee.
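
A producer or consumer could scan schema.json for references that leave the document. The sketch below treats any `$ref` that does not start with `#` as external. That is a simplifying assumption: JSON Schema 2020-12 also allows references to an `$id` bundled in the same document, which this check would flag. `find_external_refs` is a hypothetical name.

```python
def find_external_refs(node) -> list:
    """Collect every $ref value that is not a same-document fragment."""
    refs = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                # "#/..." points inside this document; anything else
                # is treated as external in this sketch.
                if not value.startswith("#"):
                    refs.append(value)
            else:
                refs.extend(find_external_refs(value))
    elif isinstance(node, list):
        for item in node:
            refs.extend(find_external_refs(item))
    return refs
```

A non-empty result would map to SDF_ERROR_INVALID_SCHEMA, whose listed causes include an external $ref URI.
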
§5

visual.pdf — Visual Layer

visual.pdf is the human-readable PDF layer, generated by the producer from data.json. It must be fully self-contained, with no external resources, and safe to open.

Self-Containment

Security

§6

Digital Signing

SDF supports optional digital signatures via the signature.sig file. A signed SDF document provides cryptographic assurance that the four required layers have not been modified since signing.

Algorithms

| Algorithm | Identifier | Recommendation |
|---|---|---|
| ECDSA P-256 | ECDSA-P256 | Recommended: compact signatures, strong security |
| RSA-2048 | RSA-2048 | Supported: for compatibility with legacy PKI |

What Is Signed

The signature covers the canonical content of all four required entries:

signing_input = SHA256(meta.json) || SHA256(data.json) || SHA256(schema.json) || SHA256(visual.pdf)
signature     = Sign(private_key, signing_input)

The concatenation order is deterministic and MUST NOT vary between implementations. signature.sig itself and vendor/* entries are not included.
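
The signing input can be assembled with `hashlib`. Two points in this sketch are assumptions, since the formula does not settle them: each `SHA256(...)` is taken to be the raw 32-byte digest (not hex text), and each entry is hashed exactly as stored in the archive, with no further canonicalization. `build_signing_input` is a hypothetical name.

```python
import hashlib

# Fixed order from the formula: meta, data, schema, visual.
SIGNED_ENTRIES = ("meta.json", "data.json", "schema.json", "visual.pdf")


def build_signing_input(entries: dict) -> bytes:
    """Concatenate the SHA-256 digests of the four required entries.

    `entries` maps entry name -> raw bytes. Other keys, such as
    signature.sig or vendor/* entries, are ignored.
    """
    return b"".join(
        hashlib.sha256(entries[name]).digest() for name in SIGNED_ENTRIES
    )
```

Under these assumptions the result is always 128 bytes (4 × 32), which is then passed to the signing primitive for the chosen algorithm.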

signature.sig Structure

{
  "algorithm": "ECDSA-P256",
  "key_id": "tenant-key-2026-03",
  "signature": "<base64url-encoded signature bytes>",
  "signed_at": "2026-03-15T14:30:05+01:00"
}
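
Reading this record might look like the sketch below. It assumes the base64url value may arrive without `=` padding, which the structure above does not specify, and `parse_signature_record` is a hypothetical helper. Verifying the decoded bytes requires a cryptography library and is out of scope here.

```python
import base64
import json

# Identifiers from the Algorithms table.
SUPPORTED_ALGORITHMS = {"ECDSA-P256", "RSA-2048"}


def parse_signature_record(raw: bytes) -> dict:
    """Parse signature.sig and decode the signature bytes."""
    record = json.loads(raw)
    if record.get("algorithm") not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {record.get('algorithm')!r}")
    encoded = record["signature"]
    # Restore any stripped padding so the decoder accepts the value.
    padded = encoded + "=" * (-len(encoded) % 4)
    record["signature_bytes"] = base64.urlsafe_b64decode(padded)
    return record
```
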
§7

Validation Pipeline

SDF validation is a sequential pipeline. Each step MUST succeed before the next begins. A failure at any step produces a standard error code and halts validation.

| Step | Check | Error on Failure |
|---|---|---|
| 1 | File is a valid ZIP archive | SDF_ERROR_NOT_ZIP |
| 2 | No path traversal in entry names | SDF_ERROR_INVALID_ARCHIVE |
| 3 | Size limits not exceeded | SDF_ERROR_ARCHIVE_TOO_LARGE |
| 4 | All required entries present | SDF_ERROR_MISSING_FILE |
| 5 | meta.json validates against SDF meta schema | SDF_ERROR_INVALID_META |
| 6 | schema.json is valid JSON Schema Draft 2020-12 | SDF_ERROR_INVALID_SCHEMA |
| 7 | data.json validates against schema.json | SDF_ERROR_SCHEMA_MISMATCH |
| 8 | If signature.sig present: signature is valid | SDF_ERROR_INVALID_SIGNATURE |
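
The halt-on-first-failure behaviour can be modelled as an ordered list of checks. This is a structural sketch only: `run_pipeline` and the shape of the check functions (a callable that takes a context dict and returns True on success) are assumptions, and the individual checks are left to the implementation.

```python
from typing import Callable, List, Optional, Tuple

# Each step pairs the error code to report with the check to run.
Step = Tuple[str, Callable[[dict], bool]]


def run_pipeline(context: dict, steps: List[Step]) -> Optional[str]:
    """Run checks in order; return the first failing step's error code.

    Later steps are never executed once a step fails, which matches the
    rule that each step must succeed before the next begins.
    """
    for error_code, check in steps:
        if not check(context):
            return error_code
    return None
```

A caller would pass the eight steps in table order, each wrapping its own check function.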

Conformance Levels

| Level | Description |
|---|---|
| Basic | Implements container, meta.json, data.json, schema.json, and visual.pdf. Signing not required. |
| Full | All Basic requirements plus digital signing and signature verification. |
§8

Versioning

The SDF version is declared in meta.json via the sdf_version field. The current version is 0.1.
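
A consumer that rejects newer documents, per SDF_ERROR_UNSUPPORTED_VERSION in §9, needs a numeric comparison rather than a string one. The sketch below assumes versions are dot-separated integers such as "0.1"; the specification does not define the version grammar, and `check_version` is a hypothetical name.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted version into integers, e.g. "0.10" -> (0, 10).

    Raises ValueError if a component is not an integer.
    """
    return tuple(int(part) for part in version.split("."))


def check_version(declared: str, supported: str = "0.1") -> Optional[str]:
    """Return an error code if the document is newer than the consumer."""
    if parse_version(declared) > parse_version(supported):
        return "SDF_ERROR_UNSUPPORTED_VERSION"
    return None
```

Comparing tuples avoids the string-ordering trap where "0.10" would sort before "0.9".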

Version Negotiation

§9

Error Codes

All SDF error codes are defined here. Implementations MUST NOT invent new codes outside this list. New codes require a spec update via the GitHub repository.

| Code | Trigger | Typical Cause |
|---|---|---|
| SDF_ERROR_NOT_ZIP | File is not a valid ZIP archive | Wrong extension, corrupted download, truncated upload |
| SDF_ERROR_INVALID_META | meta.json fails validation | Missing required fields, wrong types, non-UUID document_id |
| SDF_ERROR_MISSING_FILE | Required archive entry absent | Incomplete producer, partially written archive |
| SDF_ERROR_SCHEMA_MISMATCH | data.json fails against schema.json | Business data does not match schema, producer bug, schema drift |
| SDF_ERROR_INVALID_SCHEMA | schema.json is not valid JSON Schema Draft 2020-12 | Schema syntax error, external $ref URI, invalid $schema value |
| SDF_ERROR_UNSUPPORTED_VERSION | sdf_version higher than consumer supports | Consumer is on older spec version than producer |
| SDF_ERROR_INVALID_SIGNATURE | Signature verification fails | Key mismatch, tampered data, wrong key_id |
| SDF_ERROR_INVALID_ARCHIVE | Path traversal or structural violation | Malformed ZIP, .. in entry path, unrecognized root-level entry |
| SDF_ERROR_ARCHIVE_TOO_LARGE | Archive exceeds size limits | Oversized PDF, ZIP bomb attempt, total > 200 MB uncompressed |
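
Because the code list is closed, an implementation might model it as an enumeration so that unknown codes cannot be constructed by accident. The class name `SdfError` is an assumption; the member values are the codes from the table above.

```python
from enum import Enum


class SdfError(str, Enum):
    """The closed set of SDF error codes."""

    NOT_ZIP = "SDF_ERROR_NOT_ZIP"
    INVALID_META = "SDF_ERROR_INVALID_META"
    MISSING_FILE = "SDF_ERROR_MISSING_FILE"
    SCHEMA_MISMATCH = "SDF_ERROR_SCHEMA_MISMATCH"
    INVALID_SCHEMA = "SDF_ERROR_INVALID_SCHEMA"
    UNSUPPORTED_VERSION = "SDF_ERROR_UNSUPPORTED_VERSION"
    INVALID_SIGNATURE = "SDF_ERROR_INVALID_SIGNATURE"
    INVALID_ARCHIVE = "SDF_ERROR_INVALID_ARCHIVE"
    ARCHIVE_TOO_LARGE = "SDF_ERROR_ARCHIVE_TOO_LARGE"
```

Looking up a string that is not in the list, for example `SdfError("SDF_ERROR_CUSTOM")`, raises `ValueError`.
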
§10

Normative Language

This specification uses the following normative keywords per RFC 2119:

| Keyword | Meaning |
|---|---|
| MUST | An absolute requirement of the specification. |
| MUST NOT | An absolute prohibition of the specification. |
| SHOULD | Recommended; valid reasons may exist to deviate, but implications must be understood. |
| SHOULD NOT | Not recommended; valid reasons may exist, but implications must be understood. |
| MAY | Truly optional. |