

Portable AI Memory (PAM) — Specification v1.0


Status: Published
Date: 2026-02-17
Authors: Daniel Gines [email protected] (https://github.com/danielgines)
License: CC BY 4.0 (specification), Apache 2.0 (schema and reference implementation)


This document defines Portable AI Memory (PAM), an interchange format for user memories generated by AI assistants. PAM enables the portability of user context, preferences, knowledge, and conversation history across any LLM provider without vendor lock-in or semantic data loss.

PAM is to AI memory what vCard is to contacts and iCalendar is to events: a universal interchange format that decouples user data from specific implementations.


AI assistants (ChatGPT, Claude, Gemini, Grok, etc.) accumulate knowledge about users over time — preferences, expertise, projects, goals, and behavioral patterns. This knowledge is stored in proprietary, undocumented formats with no interoperability between providers. Users cannot:

  • Migrate their AI context when switching providers
  • Maintain a unified identity across multiple AI assistants
  • Audit, correct, or manage memories systematically
  • Own and control their AI-generated personal knowledge

PAM defines a standardized JSON interchange format with:

  • A closed taxonomy of memory types
  • Full provenance tracking (which platform, conversation, and method produced each memory)
  • Temporal lifecycle management (creation, validity, supersession, archival)
  • Confidence scoring with decay models
  • Content hashing for deterministic deduplication
  • A semantic relations graph between memories
  • Access control for multi-agent and federation scenarios
  • Optional embeddings as a separate companion file
  • Integrity verification for corruption and tampering detection

PAM is an interchange format, not a storage format. Implementations SHOULD use databases (SQLite, PostgreSQL, vector databases, graph databases) for internal storage and MUST support export and import using this format.

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHOULD”, “SHOULD NOT”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.


A PAM export consists of one required file and optional companion files:

| File | Required | Description |
|---|---|---|
| memory-store.json | Yes | Main interchange file with memories, relations, conversations index, and integrity block |
| conversations/*.json | No | Individual conversation files referenced by conversations_index[].storage.ref |
| embeddings.json | No | Separate file containing vector embeddings for memory objects |

Each file type is validated against its own schema (JSON Schema Draft 2020-12):

  • schemas/portable-ai-memory.schema.json — validates memory-store.json
  • schemas/portable-ai-memory-conversation.schema.json — validates conversation files (see §25)
  • schemas/portable-ai-memory-embeddings.schema.json — validates embeddings.json

{
  "schema": "portable-ai-memory",
  "schema_version": "1.0",
  "spec_uri": "https://portable-ai-memory.org/spec/v1.0",
  "export_id": "e47ac10b-58cc-4372-a567-0e02b2c3d479",
  "exported_by": "system-name/1.0.0",
  "export_date": "2026-02-15T22:00:00Z",
  "owner": { ... },
  "memories": [ ... ],
  "relations": [ ... ],
  "conversations_index": [ ... ],
  "integrity": { ... },
  "export_type": "full",
  "type_registry": "https://portable-ai-memory.org/types/",
  "signature": { ... }
}

Required root fields:

| Field | Type | Description |
|---|---|---|
| schema | string | MUST be "portable-ai-memory" |
| schema_version | string | Semantic version. Current: "1.0" |
| owner | object | Owner identification |
| memories | array | Array of memory objects |

Optional root fields:

| Field | Type | Description |
|---|---|---|
| spec_uri | string or null | URI or URN of the specification version. Implementations MUST NOT require spec_uri to resolve over the network |
| export_id | string or null | Unique identifier for this export (UUID v4). Enables tracking and duplicate detection |
| exported_by | string or null | System that generated the export. Format: "name/semver" |
| export_date | string | ISO 8601 timestamp of export |
| relations | array | Semantic relationships between memories |
| conversations_index | array | Lightweight conversation references |
| integrity | object | Integrity verification block |
| export_type | string | "full" or "incremental". Default: "full" (Section 16) |
| base_export_id | string or null | For incremental exports: export_id of the base export (Section 16) |
| since | string or null | For incremental exports: ISO 8601 cutoff timestamp (Section 16) |
| type_registry | string or null | URI of the official type registry (Section 19) |
| signature | object or null | Cryptographic signature (Section 18) |

The memory object is the fundamental unit of the format. Each memory represents a discrete piece of knowledge about the user.

Required fields:

| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier. SHOULD be UUID v4 |
| type | string | Memory type from the closed taxonomy (Section 5) |
| content | string | Natural language content. Primary semantic payload |
| content_hash | string | SHA-256 of normalized content (Section 6) |
| temporal | object | Temporal metadata. created_at is required |
| provenance | object | Origin metadata. platform is required |

Conditional field:

| Field | Condition | Description |
|---|---|---|
| custom_type | REQUIRED when type == "custom" | Custom type identifier. MUST be null when type is not "custom" |

Optional fields:

| Field | Type | Default | Description |
|---|---|---|---|
| status | string | "active" | Lifecycle status (Section 7) |
| summary | string or null | null | Short summary for display |
| tags | array | [] | Lowercase tags. Pattern: ^[a-z0-9][a-z0-9_-]*$ |
| confidence | object | | Confidence scoring (Section 8) |
| access | object | | Access control (Section 9) |
| embedding_ref | string or null | null | Reference to the embeddings file (Section 12) |
| metadata | object | | Additional metadata (Section 10) |
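
An illustrative memory object combining the required, conditional, and optional fields. All values below are placeholders; in particular, the content_hash shown is not a computed digest.

{
  "id": "f47ac10b-58cc-4372-a567-0e02b2c3d480",
  "type": "preference",
  "custom_type": null,
  "content": "Prefers concise answers with code examples in Python.",
  "content_hash": "sha256:0f1e2d3c4b5a69788796a5b4c3d2e1f00f1e2d3c4b5a69788796a5b4c3d2e1f0",
  "status": "active",
  "summary": "Prefers concise, code-first answers",
  "tags": ["communication", "python"],
  "temporal": {
    "created_at": "2026-02-15T21:30:00Z"
  },
  "provenance": {
    "platform": "chatgpt",
    "extraction_method": "llm_inference"
  },
  "confidence": {
    "initial": 0.8,
    "current": 0.8,
    "decay_model": "time_linear",
    "last_reinforced": "2026-02-15T21:30:00Z"
  }
}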

PAM defines a closed taxonomy of memory types. The taxonomy is extensible via the "custom" type.

| Type | Description |
|---|---|
| fact | Objective, verifiable information about the user |
| preference | User preference, taste, or stated desire |
| skill | Competency, expertise, or demonstrated ability |
| context | Situational or temporal context |
| relationship | Relation to another person, entity, or organization |
| goal | Active objective or aspiration |
| instruction | How the user wants to be treated or addressed |
| identity | Personal identity information |
| environment | Technical or physical environment details |
| project | Active project or initiative |
| custom | Extensible type. REQUIRES the custom_type field |

Constraints:

  • If type == "custom", custom_type MUST be a non-empty string
  • If type != "custom", custom_type MUST be null

Example:

{
  "type": "custom",
  "custom_type": "security_clearance"
}

The content_hash field enables deterministic deduplication across exports from different platforms.

normalize(content):
1. Trim leading and trailing whitespace
2. Convert to lowercase
3. Apply Unicode NFC normalization
4. Collapse multiple consecutive spaces to a single space
content_hash = "sha256:" + hex(SHA-256(UTF-8(normalize(content))))

Pattern: ^sha256:[a-f0-9]{64}$

Example: "sha256:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2"


Each memory has a status field tracking its lifecycle state.

| Status | Description |
|---|---|
| active | Current and valid. Default state |
| superseded | Replaced by a newer memory. temporal.superseded_by SHOULD reference the replacement |
| deprecated | Still valid but no longer prioritized |
| retracted | Explicitly invalidated by the user |
| archived | Retained for historical purposes only |

Valid transitions:

  • active → superseded (new information replaces old)
  • active → deprecated (relevance diminished)
  • active → retracted (user explicitly invalidates)
  • active → archived (user archives for history)
  • superseded → archived (historical retention)
  • deprecated → retracted (user explicitly invalidates)
  • deprecated → archived (historical retention)
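
A minimal sketch of a lifecycle validator derived from the transitions above. The mapping and function name are illustrative, not normative.

ALLOWED_TRANSITIONS = {
    "active": {"superseded", "deprecated", "retracted", "archived"},
    "superseded": {"archived"},
    "deprecated": {"retracted", "archived"},
    "retracted": set(),
    "archived": set(),
}


def is_valid_transition(current: str, new: str) -> bool:
    """Return True if the lifecycle transition is listed in Section 7."""
    return new in ALLOWED_TRANSITIONS.get(current, set())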

The confidence block contains system-computed scores. This is NOT user-defined priority.

| Field | Type | Description |
|---|---|---|
| initial | number [0.0, 1.0] | Confidence at time of extraction |
| current | number [0.0, 1.0] | Current confidence after decay and reinforcement |
| decay_model | string or null | Decay model: "time_linear", "time_exponential", "none", or null |
| last_reinforced | string or null | ISO 8601 timestamp of last reinforcement |

  • time_linear: Confidence decreases linearly with time since last reinforcement
  • time_exponential: Confidence decreases exponentially with time
  • none: No automatic decay (e.g., identity facts)

The specific decay rate is implementation-defined. PAM records the model, not the parameters.
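
Because decay parameters are implementation-defined, the rate below is an arbitrary placeholder; this sketch only shows how the two declared models might be applied when recomputing current.

import math


def apply_decay(initial: float, days_since_reinforced: float,
                decay_model: str | None, rate: float = 0.01) -> float:
    """Recompute current confidence from a decay model (Section 8).
    The rate parameter is implementation-defined, not part of PAM."""
    if decay_model in (None, "none"):
        return initial
    if decay_model == "time_linear":
        return max(0.0, initial - rate * days_since_reinforced)
    if decay_model == "time_exponential":
        return initial * math.exp(-rate * days_since_reinforced)
    raise ValueError(f"unknown decay model: {decay_model}")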


The access block enables multi-agent and federation scenarios.

| Field | Type | Default | Description |
|---|---|---|---|
| visibility | string | "private" | "private", "shared", or "public" |
| exportable | boolean | true | Whether this memory may be included in exports |
| shared_with | array | [] | List of access grants |

Each grant specifies an entity and its permissions:

{
  "entity": "agent-work-assistant",
  "permissions": ["read"]
}

Permissions: "read", "write", "delete".
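
A minimal grant check, assuming the grant shape shown above; the function name is illustrative.

def entity_has_permission(memory: dict, entity: str, permission: str) -> bool:
    """Check whether an entity holds a permission via access.shared_with.
    Visibility semantics beyond explicit grants are implementation-defined."""
    grants = memory.get("access", {}).get("shared_with", [])
    return any(
        g["entity"] == entity and permission in g["permissions"]
        for g in grants
    )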


The metadata block contains non-semantic additional properties. This block allows additionalProperties for extensibility.

| Field | Type | Description |
|---|---|---|
| language | string or null | BCP 47 language tag. Pattern: ^[a-z]{2,3}(-[A-Z][a-z]{3})?(-[A-Z]{2})?$ |
| domain | string or null | Knowledge domain (e.g., "technical", "personal", "professional") |

Implementations MAY add custom fields to this block.


The provenance block enables auditability and cross-platform conflict resolution.

Required field:

| Field | Type | Description |
|---|---|---|
| platform | string | Source platform identifier |

Optional fields:

| Field | Type | Description |
|---|---|---|
| platform_user_id | string or null | User ID on the source platform |
| conversation_ref | string or null | Reference to a conversations_index entry |
| message_ref | string or null | Reference to a specific message |
| extraction_method | string or null | How the memory was extracted |
| extracted_at | string or null | ISO 8601 timestamp of extraction |
| extractor | string or null | System that performed the extraction |

Extraction methods:

| Method | Description |
|---|---|
| llm_inference | An LLM inferred the memory from conversation |
| explicit_user_input | The user explicitly stated the information |
| api_export | Extracted from a platform API or export |
| browser_extraction | Extracted via browser automation or extension |
| manual | Manually entered by a user or operator |

Platform identifiers MUST be lowercase ASCII matching the pattern:

^[a-z0-9_-]{2,32}$

Identifiers SHOULD be registered in a public registry to prevent collisions.

The same identifier namespace MUST be used across all PAM schemas: provenance.platform in the memory store, conversations_index[].platform, and provider.name in the conversation schema. Use product names, not company names.

Recommended identifiers (not an exhaustive list):

chatgpt, claude, gemini, grok, perplexity, copilot, local, manual


Embeddings are OPTIONAL. They are stored in a separate embeddings.json file.

  1. Embeddings MAY be omitted entirely from an export
  2. When omitted, embedding_ref in memory objects MUST be null
  3. Consumers MUST NOT fail if embedding_ref is null or if embeddings.json is missing
  4. Consumers MAY regenerate embeddings from the content field at any time using any model
  5. The content field in the memory object is ALWAYS the authoritative source of semantic content, never the embedding
  6. Each memory object MUST have at most one corresponding embedding in the embeddings file — the memory_id field MUST be unique across all embedding objects. Implementations that maintain multiple embeddings internally (e.g., for different models) SHOULD export only the most recent or preferred embedding
{
  "schema": "portable-ai-memory-embeddings",
  "schema_version": "1.0",
  "embeddings": [
    {
      "id": "emb-uuid",
      "memory_id": "mem-uuid",
      "model": "text-embedding-3-small",
      "dimensions": 1536,
      "created_at": "2026-02-15T22:00:00Z",
      "vector": [0.1, 0.2, ...],
      "storage": null
    }
  ]
}

| Field | Required | Type | Description |
|---|---|---|---|
| id | Yes | string | Unique identifier. Referenced by memory.embedding_ref |
| memory_id | Yes | string | ID of the associated memory object |
| model | Yes | string | Embedding model identifier |
| dimensions | Yes | integer | Vector dimensionality |
| created_at | Yes | string | ISO 8601 timestamp |
| vector | No | array or null | Inline vector. Null if stored externally |
| storage | No | object or null | External storage reference |
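
A consumer-side sketch honoring rules 2 and 3 above: embeddings load opportunistically, and their absence is never an error. The helper name is illustrative.

import json
from pathlib import Path


def load_embeddings(export_dir: str) -> dict:
    """Return a memory_id -> embedding map, or {} when embeddings.json
    is absent. Consumers MUST NOT fail if embeddings are missing."""
    path = Path(export_dir) / "embeddings.json"
    if not path.exists():
        return {}
    data = json.loads(path.read_text(encoding="utf-8"))
    return {e["memory_id"]: e for e in data["embeddings"]}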

The relations array defines semantic relationships between memory objects, forming a knowledge graph.

| Field | Required | Type | Description |
|---|---|---|---|
| id | Yes | string | Unique identifier |
| from | Yes | string | Source memory ID |
| to | Yes | string | Target memory ID |
| type | Yes | string | Relationship type |
| confidence | No | number or null | Confidence in this relationship [0.0, 1.0] |
| created_at | Yes | string | ISO 8601 timestamp |

| Type | Semantics |
|---|---|
| supports | Source provides evidence for target |
| contradicts | Source conflicts with target |
| extends | Source adds detail to target |
| supersedes | Source replaces target |
| related_to | General semantic relation |
| derived_from | Source was inferred from target |

The conversations index provides lightweight references to conversations without embedding full message history.

Exporters MUST ensure consistency between memory.provenance.conversation_ref and the corresponding conversations_index[].derived_memories entry.

Importers SHOULD treat derived_memories as advisory and MAY reconstruct from provenance using:

for memory in memories:
    conv_id = memory.provenance.conversation_ref
    if conv_id:
        conversations_index[conv_id].derived_memories.append(memory.id)

Full conversation data is stored externally and referenced via:

{
  "storage": {
    "type": "file",
    "ref": "conversations/conv-001.json",
    "format": "json"
  }
}

Storage types: "file", "database", "object_storage", "vector_db", "uri".
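
A sketch resolving "file" storage references relative to the export root; the other storage types need implementation-specific resolvers, and the helper name is illustrative.

import json
from pathlib import Path


def load_conversation(export_dir: str, entry: dict) -> dict | None:
    """Load a conversation referenced by conversations_index[].storage.
    Only the "file" storage type is handled here."""
    storage = entry.get("storage")
    if not storage or storage.get("type") != "file":
        return None
    path = Path(export_dir) / storage["ref"]
    return json.loads(path.read_text(encoding="utf-8"))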


The integrity block enables corruption and tampering detection.

PAM uses RFC 8785 (JSON Canonicalization Scheme — JCS) for deterministic serialization. The canonicalization field declares the method used:

| Value | Standard | Description |
|---|---|---|
| RFC8785 | RFC 8785 | JSON Canonicalization Scheme. Default and currently the only supported method |

This eliminates implementation ambiguity across languages and platforms. RFC 8785 defines deterministic rules for key ordering, number serialization, string escaping, and whitespace elimination.

The checksum is computed using the following deterministic pipeline:

1. Take the memories array
2. Sort memory objects by id ascending
3. Canonicalize per RFC 8785 (JCS):
- Sort all object keys lexicographically (recursive)
- Serialize numbers per ECMAScript/IEEE 754 rules (e.g., 1.0 → 1)
- Apply RFC 8785 string escaping
- No whitespace
- UTF-8 encoding
4. Compute SHA-256 over the canonical UTF-8 bytes
5. Format as "sha256:<hex>"

IMPORTANT: Standard json.dumps() in most languages is NOT RFC 8785 compliant. Implementations MUST use a dedicated JCS library. See Appendix C for library recommendations per language.

| Field | Required | Type | Description |
|---|---|---|---|
| canonicalization | No | string | Canonicalization method. Default: "RFC8785" |
| checksum | Yes | string | SHA-256 of the canonicalized memories. Format: sha256:<hex> |
| total_memories | Yes | integer | MUST equal len(memories) |

Validation rules:

  • integrity.total_memories MUST equal len(memories)
  • integrity.checksum MUST match the computed checksum of the canonicalized memories array
  • If integrity.canonicalization is absent, implementations MUST assume RFC8785
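
A verification sketch applying these rules, reusing compute_integrity_checksum from Appendix C:

def verify_integrity(export: dict) -> bool:
    """Validate the integrity block per Section 15."""
    integrity = export["integrity"]
    # An absent canonicalization field means RFC8785 per the rules above.
    if integrity.get("canonicalization", "RFC8785") != "RFC8785":
        raise ValueError("unsupported canonicalization method")
    if integrity["total_memories"] != len(export["memories"]):
        return False
    return integrity["checksum"] == compute_integrity_checksum(export["memories"])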

PAM supports both full and incremental (delta) exports for efficient synchronization.

| Type | Description |
|---|---|
| full | Complete memory store. Default. Self-contained |
| incremental | Delta since a previous export. Contains only new or updated memories |

When export_type is "incremental":

| Field | Required | Description |
|---|---|---|
| base_export_id | SHOULD | The export_id of the base export this delta applies to |
| since | SHOULD | ISO 8601 timestamp. Only memories created or updated after this time are included |

Importers processing incremental exports MUST:

  1. Match base_export_id to a previously imported full export
  2. For each memory in the delta: if id exists in the base, update it; otherwise, insert it
  3. Recompute integrity.checksum after merge
  4. Memories with status: "retracted" in the delta MUST be marked as retracted in the base
  5. Importers MUST NOT physically delete memories marked as "retracted". They MUST preserve the memory object and update its status. This ensures auditability and enables undo operations

Importers MAY reject incremental exports if base_export_id does not match any known export.


PAM supports W3C Decentralized Identifiers for universal cross-platform identity resolution.

The owner.did field accepts any valid DID method:

did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK
did:web:example.com:user:alice
did:ion:EiAnKD8...
did:pkh:eip155:1:0xab16a96D359eC26a11e2C2b3d8f8B8942d5Bfcdb
  1. owner.did is OPTIONAL but RECOMMENDED for exports shared between systems
  2. When present, the DID MUST be resolvable to a DID Document per W3C DID Core (https://www.w3.org/TR/did-1.0/)
  3. If signature is present, signature.public_key SHOULD correspond to a verification method in the DID Document
  4. owner.id remains REQUIRED even when did is present, for backward compatibility
| Method | Use Case | Key Properties |
|---|---|---|
| did:key | Self-contained, no resolution needed | Simplest. The key is the identifier |
| did:web | Organization-hosted identity | DNS-based, easy to set up |
| did:ion | Decentralized, Bitcoin-anchored | Maximum decentralization |
| did:pkh | Blockchain wallet-based | Reuses existing crypto keys |

PAM exports MAY be cryptographically signed to verify authenticity and detect tampering.

{
  "signature": {
    "algorithm": "Ed25519",
    "public_key": "z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK",
    "value": "eyJhbGciOiJFZERTQSJ9..base64url-signature",
    "signed_at": "2026-02-15T22:00:01Z",
    "key_id": "did:key:z6Mk...#z6Mk..."
  }
}

| Algorithm | Type | Recommended |
|---|---|---|
| Ed25519 | EdDSA | Yes (fast, small keys, side-channel resistant) |
| ES256 | ECDSA P-256 | Yes (widely supported) |
| ES384 | ECDSA P-384 | Optional |
| RS256 | RSA 2048+ | Legacy compatibility |
| RS384 | RSA 3072+ | Legacy compatibility |
| RS512 | RSA 4096+ | Legacy compatibility |

The signature MUST cover not only the memories integrity but also export identity and ownership, to prevent replay attacks and export spoofing.

When signature is present (not null), the fields export_id and export_date MUST also be present and non-null. This is enforced by the schema via a conditional dependency.

The signature MUST be computed as follows:

1. Compute integrity.checksum (Section 15)
2. Construct the signature payload object:
   {
     "checksum": integrity.checksum,
     "export_id": export_id,
     "export_date": export_date,
     "owner_id": owner.id
   }
3. Canonicalize the payload using RFC 8785 (JCS)
4. Sign the canonical UTF-8 bytes with the private key using the specified algorithm
5. Base64url-encode the signature (RFC 4648 §5)
6. Store the result in signature.value

This ensures that altering memories (which changes the checksum), export_id, export_date, or owner.id will invalidate the signature. Note that changes to relations, conversations_index, or other owner fields are NOT covered by the signature payload.

To verify a signature:

1. Recompute integrity.checksum from the memories array
2. Verify computed checksum matches integrity.checksum
3. Reconstruct the signature payload object from the export
4. Canonicalize with RFC 8785
5. Decode signature.value from Base64url
6. Verify the signature against the canonical payload using signature.public_key
7. If owner.did is present, optionally resolve the DID Document and verify the key matches
  1. Signature is OPTIONAL but RECOMMENDED for exports shared between systems or users
  2. The signature payload MUST include checksum, export_id, export_date, and owner_id
  3. signature.signed_at MUST be equal to or after export_date
  4. If signature.key_id is present and owner.did is present, key_id SHOULD be a DID URL referencing a verification method in the owner’s DID Document
  5. Importers SHOULD verify signatures when present but MUST NOT reject unsigned exports

PAM provides a centralized registry for custom memory types to enable interoperability between implementations.

The type_registry root field specifies the registry URI:

{
  "type_registry": "https://portable-ai-memory.org/types/"
}

The registry is a publicly accessible JSON document listing registered custom types:

{
  "registry_version": "1.0",
  "types": {
    "security_clearance": {
      "description": "Security clearance level held by the user",
      "proposed_by": "my-exporter/1.0.0",
      "status": "registered",
      "registered_at": "2026-03-01T00:00:00Z"
    },
    "medical_condition": {
      "description": "Known medical condition or diagnosis",
      "proposed_by": "healthai/1.0.0",
      "status": "registered",
      "registered_at": "2026-04-15T00:00:00Z"
    }
  }
}
unregistered → registered → candidate → standard

| Status | Description |
|---|---|
| unregistered | Custom type used locally, not in the registry |
| registered | Listed in the registry, available for cross-platform use |
| candidate | Nominated for promotion to the standard taxonomy |
| standard | Promoted to the core taxonomy in a future spec version |
  1. Custom types SHOULD be registered at the official registry for interoperability
  2. Implementations MUST accept any custom_type value regardless of registry status
  3. The registry is advisory, not prescriptive — implementations MUST NOT reject unregistered types
  4. Community-adopted custom types MAY be promoted to the standard taxonomy in future spec versions
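
An advisory lookup sketch consistent with these rules, assuming the registry URI serves the JSON document shown above. Any failure falls back to "unregistered" rather than rejecting the type.

import json
import urllib.request


def lookup_custom_type(registry_uri: str, custom_type: str) -> str:
    """Return the registry status of a custom type, or "unregistered".
    The registry is advisory: network or lookup failures MUST NOT
    cause an import to be rejected (Section 19)."""
    try:
        with urllib.request.urlopen(registry_uri, timeout=5) as resp:
            registry = json.load(resp)
        return registry["types"].get(custom_type, {}).get("status", "unregistered")
    except Exception:
        return "unregistered"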

20. Interoperability and Migration Compatibility Matrix


IMPORTANT: The interoperability paths described in this section reflect observed export formats and extraction strategies as of the time of publication. AI providers do not natively support PAM at the time of this specification. Implementations SHOULD treat these mappings as best-effort compatibility guidance, not guaranteed or officially supported migration paths. Provider export formats may change without notice. Importers MUST be versioned and resilient to format variations.

Migration sources:

| Source | Export Method | PAM Coverage |
|---|---|---|
| ChatGPT | conversations.json + memory prompt | Full: conversations, memories, preferences |
| Claude | JSON export + memory prompt + memory edits | Full: conversations, memories, instructions |
| Gemini | Google Takeout + prompt extraction | Partial: conversations; memories via prompt |
| Copilot | Privacy Dashboard CSV | Partial: conversations only |
| Grok | Data export (grok.com settings) | Full: conversations, projects, media posts, assets |
| Perplexity | Form request + prompt | Partial: limited conversation access |
| Local LLMs | Direct database access | Full: complete control |
Migration targets:

| Target | Method |
|---|---|
| ChatGPT | Custom instructions, conversation priming |
| Claude | Memory edits, Projects, system prompts |
| Gemini | Gems, conversation priming |
| Any LLM | System prompt injection from PAM memories |
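
A sketch of the "system prompt injection" row above, rendering active, exportable memories into a prompt preamble. Selection, ordering, and wording are entirely implementation-defined.

def memories_to_system_prompt(export: dict, limit: int = 50) -> str:
    """Render active, exportable memories as a system prompt preamble,
    highest-confidence first."""
    active = [
        m for m in export["memories"]
        if m.get("status", "active") == "active"
        and m.get("access", {}).get("exportable", True)
    ]
    active.sort(key=lambda m: m.get("confidence", {}).get("current", 0.0),
                reverse=True)
    lines = [f"- [{m['type']}] {m['content']}" for m in active[:limit]]
    return "Known context about the user:\n" + "\n".join(lines)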

PAM files contain personal information. Implementations MUST:

  • Encrypt PAM files at rest when stored locally
  • Use TLS for any network transmission of PAM files
  • Respect the access.exportable flag when generating exports
  • Not include memories marked exportable: false in exports (a filtering sketch follows this list)
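
A minimal filtering sketch for the last two requirements; the helper name is illustrative.

def filter_exportable(memories: list[dict]) -> list[dict]:
    """Drop memories whose access.exportable flag is false (Section 21).
    The default is exportable: true (Section 9)."""
    return [m for m in memories
            if m.get("access", {}).get("exportable", True)]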

The content_hash uses SHA-256 for deduplication, not for cryptographic authentication. For tamper-proof verification, use the signature block (Section 18).

The provenance.platform_user_id field is OPTIONAL specifically to allow privacy-preserving exports. Implementations SHOULD allow users to strip platform identifiers before sharing.

When the signature block is present, implementations SHOULD verify it before processing the export. A failed verification SHOULD result in a warning to the user, not a silent failure.


New types are added via the "custom" type mechanism and the type registry (Section 19). If a custom type achieves broad adoption, it MAY be promoted to the standard taxonomy in a future version.

The metadata block allows additionalProperties, enabling implementations to add custom fields without breaking schema validation.

Schema versions follow semantic versioning:

  • Patch (1.0.x): Documentation clarifications, no schema changes
  • Minor (1.x.0): Backward-compatible additions (new optional fields, new enum values)
  • Major (x.0.0): Breaking changes requiring migration

A reference implementation is available for PAM. It provides:

  1. Platform extractors — Parse exports from ChatGPT, Claude, Gemini, Copilot, Grok into PAM format
  2. Validator — CLI tool for schema validation (pam validate)
  3. Converter — Export PAM memories to platform-specific import formats
  4. Integrity checker — Verify checksums and consistency rules
  5. Signature tools — Sign and verify exports (pam sign, pam verify)

This specification was informed by:

  • University of Stavanger PKG research — Krisztian Balog, Martin G. Skjæveland, and the Personal Knowledge Graph research group
  • Solid Project — Tim Berners-Lee’s vision of user-owned data stores
  • Mem0, Zep, Letta — Commercial memory layer implementations that demonstrated practical memory management patterns
  • Samsung Personal Data Engine — Production-scale personal knowledge graph deployment
  • EU Digital Markets Act — Regulatory framework driving data portability requirements
  • W3C DID Core — Decentralized identity standard enabling cross-platform identity resolution

PAM defines a companion schema for storing full conversation data referenced by conversations_index[].storage.ref. While the main memory-store schema contains extracted knowledge, the conversation schema preserves the raw dialogue from which memories were derived.

The conversation schema serves as the normalized intermediate format between provider-specific exports and the PAM memory store. The import pipeline is:

Raw Provider Export → Parse → Normalize → Conversation Schema → Extract Memories → PAM Memory Store

portable-ai-memory-conversation.schema.json — JSON Schema Draft 2020-12

DAG support: OpenAI conversations use branching (mapping with parent/children). The schema supports parent_id and children_ids per message to preserve this structure. Linear conversations (Claude, Gemini) set parent_id: null and children_ids: [].

Role normalization: Each provider uses different role names (human, Request, AI, ASSISTANT, model names). The schema normalizes to: user, assistant, system, tool. See importer-mappings.md section 6 for the full verified mapping table.

Multipart content: Messages may contain text, images, code, and files. The content field supports both simple text (type: "text") and multipart content (type: "multipart" with parts[]).

Import metadata: Normalized conversations SHOULD record the importer version, source file, and source checksum. This enables debugging, re-import, and format version tracking. The import_metadata block is OPTIONAL in the schema to allow lightweight exports, but implementations SHOULD populate it for auditability.

raw_metadata: Provider-specific fields that don’t map to PAM are preserved verbatim in raw_metadata for lossless round-tripping.

See importer-mappings.md for field-by-field mappings from each provider format to the normalized conversation schema:

  • OpenAI (ChatGPT): conversations.json with DAG mapping
  • Anthropic (Claude): conversations.json with chat_messages array
  • Google (Gemini): Google Takeout MyActivity.json (single activity log array)
  • Microsoft (Copilot): Privacy Dashboard CSV
  • xAI (Grok): Data export via grok.com settings (conversations, projects, assets)

Provider export formats change without notice. Importers MUST be versioned:

importer_version: "openai-importer/2025.01"

When a format changes, create a new importer version while keeping the old one for processing older exports.
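
One way to satisfy this requirement, sketched below with hypothetical parser names: keep every importer version registered and dispatch on the recorded importer_version, so older exports stay processable.

from typing import Callable

# Versioned importer registry. Parser names and versions are illustrative.
IMPORTERS: dict[str, Callable[[dict], dict]] = {}


def importer(version: str):
    """Register a parser under its importer_version string."""
    def register(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        IMPORTERS[version] = fn
        return fn
    return register


@importer("openai-importer/2025.01")
def parse_openai_2025_01(raw: dict) -> dict:
    ...  # map the conversations.json DAG to the normalized schema


def normalize_conversation(raw: dict, importer_version: str) -> dict:
    """Dispatch to the importer that matches the recorded version."""
    return IMPORTERS[importer_version](raw)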


This specification document (spec.md) is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You may share and adapt this document for any purpose, provided appropriate credit is given.

Full text: https://creativecommons.org/licenses/by/4.0/

The JSON Schema files (portable-ai-memory.schema.json, portable-ai-memory-conversation.schema.json, portable-ai-memory-embeddings.schema.json) and all reference implementation code are licensed under the Apache License, Version 2.0.

Full text: https://www.apache.org/licenses/LICENSE-2.0


  • Memory store: examples/example-memory-store.json
  • Conversation: examples/example-conversation.json
  • Embeddings: examples/example-embeddings.json
  • schemas/portable-ai-memory.schema.json — Main memory store schema (JSON Schema Draft 2020-12)
  • schemas/portable-ai-memory-embeddings.schema.json — Embeddings schema (JSON Schema Draft 2020-12)
  • schemas/portable-ai-memory-conversation.schema.json — Normalized conversation schema (JSON Schema Draft 2020-12)

Appendix C: Content Hash Reference Implementation

import hashlib
import unicodedata


def normalize_content(content: str) -> str:
    """Normalize content for deterministic hashing (Section 6)."""
    text = content.strip()
    text = text.lower()
    text = unicodedata.normalize("NFC", text)
    text = " ".join(text.split())
    return text


def compute_content_hash(content: str) -> str:
    """Compute the SHA-256 hash of normalized content."""
    normalized = normalize_content(content)
    hash_hex = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"sha256:{hash_hex}"


def compute_integrity_checksum(memories: list) -> str:
    """Compute the deterministic checksum of the memories array using RFC 8785.

    IMPORTANT: This function MUST use an RFC 8785 (JCS) compliant
    serializer. Standard json.dumps() is NOT sufficient because:
    - json.dumps serializes 1.0 as "1.0"; RFC 8785 requires "1"
    - json.dumps does not guarantee RFC 8785 Unicode escaping rules
    - json.dumps number formatting differs from IEEE 754/ECMAScript

    Python:  pip install rfc8785
    Node.js: npm install canonicalize
    Go:      github.com/nicktrav/canonicaljson
    Java:    org.erdtman:java-json-canonicalization
    """
    import rfc8785

    sorted_memories = sorted(memories, key=lambda m: m["id"])
    canonical_bytes = rfc8785.dumps(sorted_memories)
    hash_hex = hashlib.sha256(canonical_bytes).hexdigest()
    return f"sha256:{hash_hex}"

WARNING: Do NOT use json.dumps(..., sort_keys=True, separators=(",", ":")) for checksum computation. It produces different output than RFC 8785 for floating-point numbers, which will result in checksum mismatches across implementations.

Appendix D: Signature Reference Implementation

from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)
import base64

import rfc8785


def build_signature_payload(export: dict) -> bytes:
    """Construct and canonicalize the signature payload per Section 18.

    The payload MUST include checksum, export_id, export_date, and
    owner_id to prevent replay attacks and export spoofing.
    """
    payload = {
        "checksum": export["integrity"]["checksum"],
        "export_id": export["export_id"],
        "export_date": export["export_date"],
        "owner_id": export["owner"]["id"],
    }
    return rfc8785.dumps(payload)


def sign_export(export: dict, private_key: Ed25519PrivateKey) -> str:
    """Sign an export with Ed25519 over the canonical payload."""
    payload_bytes = build_signature_payload(export)
    signature_bytes = private_key.sign(payload_bytes)
    return base64.urlsafe_b64encode(signature_bytes).decode("ascii")


def verify_export(export: dict, signature_b64: str, public_key_bytes: bytes) -> bool:
    """Verify an Ed25519 signature over the canonical payload."""
    public_key = Ed25519PublicKey.from_public_bytes(public_key_bytes)
    payload_bytes = build_signature_payload(export)
    signature_bytes = base64.urlsafe_b64decode(signature_b64)
    try:
        public_key.verify(signature_bytes, payload_bytes)
        return True
    except Exception:
        return False
# Relies on compute_integrity_checksum from Appendix C.


def merge_incremental(base: dict, delta: dict) -> dict:
    """Merge an incremental export into a base export (Section 16).

    Retracted memories arriving in the delta replace their base versions
    with status "retracted"; they are preserved, never deleted.
    """
    base_memories = {m["id"]: m for m in base["memories"]}
    for mem in delta["memories"]:
        base_memories[mem["id"]] = mem  # insert or update (incl. retractions)
    merged = list(base_memories.values())
    base["memories"] = merged
    base["integrity"]["total_memories"] = len(merged)
    base["integrity"]["checksum"] = compute_integrity_checksum(merged)
    return base

See importer-mappings.md for complete field-by-field mappings from each provider to the normalized PAM format.