Portable AI Memory (PAM) — Specification v1.0
Status: Published
Date: 2026-02-17
Authors: Daniel Gines [email protected] (https://github.com/danielgines)
License: CC BY 4.0 (specification), Apache 2.0 (schema and reference implementation)
Abstract
This document defines Portable AI Memory (PAM), an interchange format for user memories generated by AI assistants. PAM enables the portability of user context, preferences, knowledge, and conversation history across any LLM provider without vendor lock-in or semantic data loss.
PAM is to AI memory what vCard is to contacts and iCalendar is to events: a universal interchange format that decouples user data from specific implementations.
1. Introduction
1.1 Problem Statement
AI assistants (ChatGPT, Claude, Gemini, Grok, etc.) accumulate knowledge about users over time — preferences, expertise, projects, goals, and behavioral patterns. This knowledge is stored in proprietary, undocumented formats with no interoperability between providers. Users cannot:
- Migrate their AI context when switching providers
- Maintain a unified identity across multiple AI assistants
- Audit, correct, or manage memories systematically
- Own and control their AI-generated personal knowledge
1.2 Solution
PAM defines a standardized JSON interchange format with:
- A closed taxonomy of memory types
- Full provenance tracking (which platform, conversation, and method produced each memory)
- Temporal lifecycle management (creation, validity, supersession, archival)
- Confidence scoring with decay models
- Content hashing for deterministic deduplication
- A semantic relations graph between memories
- Access control for multi-agent and federation scenarios
- Optional embeddings as a separate companion file
- Integrity verification for corruption and tampering detection
1.3 Scope
PAM is an interchange format, not a storage format. Implementations SHOULD use databases (SQLite, PostgreSQL, vector databases, graph databases) for internal storage and MUST support export and import using this format.
1.4 Terminology
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHOULD”, “SHOULD NOT”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.
2. Format Overview
A PAM export consists of one required file and optional companion files:
| File | Required | Description |
|---|---|---|
memory-store.json | Yes | Main interchange file with memories, relations, conversations index, and integrity block |
conversations/*.json | No | Individual conversation files referenced by conversations_index[].storage.ref |
embeddings.json | No | Separate file containing vector embeddings for memory objects |
Each file type is validated against its own JSON Schema Draft 2020-12 schema:
- `schemas/portable-ai-memory.schema.json` — validates `memory-store.json`
- `schemas/portable-ai-memory-conversation.schema.json` — validates conversation files (see §25)
- `schemas/portable-ai-memory-embeddings.schema.json` — validates `embeddings.json`
3. Root Structure
```json
{
  "schema": "portable-ai-memory",
  "schema_version": "1.0",
  "spec_uri": "https://portable-ai-memory.org/spec/v1.0",
  "export_id": "e47ac10b-58cc-4372-a567-0e02b2c3d479",
  "exported_by": "system-name/1.0.0",
  "export_date": "2026-02-15T22:00:00Z",
  "owner": { ... },
  "memories": [ ... ],
  "relations": [ ... ],
  "conversations_index": [ ... ],
  "integrity": { ... },
  "export_type": "full",
  "type_registry": "https://portable-ai-memory.org/types/",
  "signature": { ... }
}
```
3.1 Required Root Fields
| Field | Type | Description |
|---|---|---|
schema | string | MUST be "portable-ai-memory" |
schema_version | string | Semantic version. Current: "1.0" |
owner | object | Owner identification |
memories | array | Array of memory objects |
3.2 Optional Root Fields
| Field | Type | Description |
|---|---|---|
spec_uri | string|null | URI or URN of the specification version. Implementations MUST NOT require spec_uri to resolve over network |
export_id | string|null | Unique identifier for this export (UUID v4). Enables tracking and duplicate detection |
exported_by | string|null | System that generated the export. Format: "name/semver" |
export_date | string | ISO 8601 timestamp of export |
relations | array | Semantic relationships between memories |
conversations_index | array | Lightweight conversation references |
integrity | object | Integrity verification block |
export_type | string | "full" or "incremental". Default: "full" (Section 16) |
base_export_id | string|null | For incremental exports: export_id of the base export (Section 16) |
since | string|null | For incremental exports: ISO 8601 cutoff timestamp (Section 16) |
type_registry | string|null | URI of official type registry (Section 19) |
signature | object|null | Cryptographic signature (Section 18) |
4. Memory Object
The memory object is the fundamental unit of the format. Each memory represents a discrete piece of knowledge about the user.
4.1 Required Fields
| Field | Type | Description |
|---|---|---|
id | string | Unique identifier. SHOULD be UUID v4 |
type | string | Memory type from closed taxonomy (Section 5) |
content | string | Natural language content. Primary semantic payload |
content_hash | string | SHA-256 of normalized content (Section 6) |
temporal | object | Temporal metadata. created_at is required |
provenance | object | Origin metadata. platform is required |
4.2 Conditional Required Fields
| Field | Condition | Description |
|---|---|---|
custom_type | REQUIRED when type == "custom" | Custom type identifier. MUST be null when type is not "custom" |
4.3 Optional Fields
| Field | Type | Default | Description |
|---|---|---|---|
status | string | "active" | Lifecycle status (Section 7) |
summary | string|null | null | Short summary for display |
tags | array | [] | Lowercase tags. Pattern: ^[a-z0-9][a-z0-9_-]*$ |
confidence | object | — | Confidence scoring (Section 8) |
access | object | — | Access control (Section 9) |
embedding_ref | string|null | null | Reference to embeddings file (Section 12) |
metadata | object | — | Additional metadata (Section 10) |
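For illustration, a memory object combining the required and optional fields above might look like the following non-normative example (all values are invented; the `content_hash` placeholder stands for the 64-hex-character digest defined in Section 6):

```json
{
  "id": "f47ac10b-58cc-4372-a567-0e02b2c3d480",
  "type": "preference",
  "content": "Prefers concise answers with Python code examples.",
  "content_hash": "sha256:<64-hex-digest-of-normalized-content>",
  "status": "active",
  "summary": "Prefers concise, code-first answers",
  "tags": ["communication-style", "python"],
  "temporal": { "created_at": "2026-02-15T21:58:00Z" },
  "provenance": { "platform": "chatgpt", "extraction_method": "llm_inference" },
  "confidence": { "initial": 0.8, "current": 0.8, "decay_model": "none", "last_reinforced": null },
  "access": { "visibility": "private", "exportable": true, "shared_with": [] },
  "metadata": { "language": "en", "domain": "technical" }
}
```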
5. Memory Type Taxonomy
PAM defines a closed taxonomy of memory types. The taxonomy is extensible via the "custom" type.
| Type | Description |
|---|---|
fact | Objective, verifiable information about the user |
preference | User preference, taste, or stated desire |
skill | Competency, expertise, or demonstrated ability |
context | Situational or temporal context |
relationship | Relation to another person, entity, or organization |
goal | Active objective or aspiration |
instruction | How the user wants to be treated or addressed |
identity | Personal identity information |
environment | Technical or physical environment details |
project | Active project or initiative |
custom | Extensible type. REQUIRES custom_type field |
5.1 Custom Type Rule
```
IF type == "custom" THEN custom_type MUST be a non-empty string
IF type != "custom" THEN custom_type MUST be null
```
Example:

```json
{
  "type": "custom",
  "custom_type": "security_clearance"
}
```
6. Content Hash Normalization
The content_hash field enables deterministic deduplication across exports from different platforms.
6.1 Normalization Pipeline
```
normalize(content):
  1. Trim leading and trailing whitespace
  2. Convert to lowercase
  3. Apply Unicode NFC normalization
  4. Collapse multiple consecutive spaces to a single space
```
6.2 Hash Generation
```
content_hash = "sha256:" + hex(SHA-256(UTF-8(normalize(content))))
```
6.3 Format
Pattern: ^sha256:[a-f0-9]{64}$
Example: "sha256:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2"
7. Memory Lifecycle
Each memory has a status field tracking its lifecycle state.
| Status | Description |
|---|---|
active | Current and valid. Default state |
superseded | Replaced by a newer memory. temporal.superseded_by SHOULD reference the replacement |
deprecated | Still valid but no longer prioritized |
retracted | Explicitly invalidated by the user |
archived | Retained for historical purposes only |
7.1 Lifecycle Transitions
```
active → superseded (new information replaces old)
active → deprecated (relevance diminished)
active → retracted (user explicitly invalidates)
active → archived (user archives for history)
superseded → archived (historical retention)
deprecated → retracted (user explicitly invalidates)
deprecated → archived (historical retention)
```
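The transitions above can be enforced mechanically. A minimal sketch in Python, assuming memories are plain dicts with a `status` field (the helper name and structure are illustrative, not part of the spec):

```python
# Allowed lifecycle transitions from Section 7.1 (illustrative helper, not normative).
ALLOWED_TRANSITIONS = {
    "active": {"superseded", "deprecated", "retracted", "archived"},
    "superseded": {"archived"},
    "deprecated": {"retracted", "archived"},
}


def can_transition(current: str, new: str) -> bool:
    """Return True if moving a memory from `current` to `new` follows Section 7.1."""
    return new in ALLOWED_TRANSITIONS.get(current, set())


# Example: an active memory may be superseded, but terminal states stay terminal.
assert can_transition("active", "superseded")
assert not can_transition("retracted", "active")
```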
8. Confidence Scoring
The confidence block contains system-computed scores. This is NOT user-defined priority.
| Field | Type | Description |
|---|---|---|
initial | number [0.0, 1.0] | Confidence at time of extraction |
current | number [0.0, 1.0] | Current confidence after decay and reinforcement |
decay_model | string|null | Decay model: "time_linear", "time_exponential", "none", or null |
last_reinforced | string|null | ISO 8601 timestamp of last reinforcement |
8.1 Decay Models
- `time_linear`: Confidence decreases linearly with time since last reinforcement
- `time_exponential`: Confidence decreases exponentially with time
- `none`: No automatic decay (e.g., identity facts)
The specific decay rate is implementation-defined. PAM records the model, not the parameters.
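As an illustration only (the decay rate is implementation-defined and deliberately not recorded in PAM), a consumer might recompute `confidence.current` along these lines; the `rate` parameter and function name are assumptions of this sketch:

```python
import math
from datetime import datetime, timezone


def apply_decay(initial: float, decay_model: str | None,
                last_reinforced: str | None, rate: float = 0.01) -> float:
    """Recompute confidence from the recorded decay model; `rate` is NOT part of PAM."""
    if decay_model in (None, "none") or last_reinforced is None:
        return initial
    reinforced = datetime.fromisoformat(last_reinforced.replace("Z", "+00:00"))
    days = (datetime.now(timezone.utc) - reinforced).days
    if decay_model == "time_linear":
        return max(0.0, initial - rate * days)
    if decay_model == "time_exponential":
        return initial * math.exp(-rate * days)
    return initial
```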
9. Access Control
The access block enables multi-agent and federation scenarios.
| Field | Type | Default | Description |
|---|---|---|---|
visibility | string | "private" | "private", "shared", or "public" |
exportable | boolean | true | Whether this memory may be included in exports |
shared_with | array | [] | List of access grants |
9.1 Access Grant
Each grant specifies an entity and its permissions:
```json
{
  "entity": "agent-work-assistant",
  "permissions": [ "read" ]
}
```
Permissions: "read", "write", "delete".
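A minimal sketch of how an importer might evaluate a grant before exposing a memory to another agent (the dict layout follows the access block above; the helper itself is illustrative):

```python
def entity_can(memory: dict, entity: str, permission: str) -> bool:
    """Return True if `entity` holds `permission` via an access grant (Section 9.1)."""
    for grant in memory.get("access", {}).get("shared_with", []):
        if grant["entity"] == entity and permission in grant["permissions"]:
            return True
    return False
```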
10. Metadata
The metadata block contains non-semantic additional properties. This block allows additionalProperties for
extensibility.
| Field | Type | Description |
|---|---|---|
language | string|null | BCP 47 language tag. Pattern: ^[a-z]{2,3}(-[A-Z][a-z]{3})?(-[A-Z]{2})?$ |
domain | string|null | Knowledge domain (e.g., "technical", "personal", "professional") |
Implementations MAY add custom fields to this block.
11. Provenance
The provenance block enables auditability and cross-platform conflict resolution.
11.1 Required Fields
| Field | Type | Description |
|---|---|---|
platform | string | Source platform identifier |
11.2 Optional Fields
| Field | Type | Description |
|---|---|---|
platform_user_id | string|null | User ID on source platform |
conversation_ref | string|null | Reference to conversations_index entry |
message_ref | string|null | Reference to specific message |
extraction_method | string|null | How the memory was extracted |
extracted_at | string|null | ISO 8601 timestamp of extraction |
extractor | string|null | System that performed extraction |
11.3 Extraction Methods
| Method | Description |
|---|---|
llm_inference | LLM inferred the memory from conversation |
explicit_user_input | User explicitly stated the information |
api_export | Extracted from platform API/export |
browser_extraction | Extracted via browser automation or extension |
manual | Manually entered by user or operator |
11.4 Platform Identifiers
Platform identifiers MUST be lowercase ASCII matching the pattern:
```
^[a-z0-9_-]{2,32}$
```
Identifiers SHOULD be registered in a public registry to prevent collisions.
The same identifier namespace MUST be used across all PAM schemas: provenance.platform in the memory store,
conversations_index[].platform, and provider.name in the conversation schema. Use product names, not company names.
Recommended identifiers (not an exhaustive list):
chatgpt, claude, gemini, grok, perplexity, copilot, local, manual
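A quick check of an identifier against the pattern above (illustrative only):

```python
import re

PLATFORM_ID = re.compile(r"^[a-z0-9_-]{2,32}$")  # pattern from Section 11.4


def is_valid_platform_id(value: str) -> bool:
    return bool(PLATFORM_ID.fullmatch(value))


assert is_valid_platform_id("chatgpt")
assert not is_valid_platform_id("OpenAI ChatGPT")  # uppercase and spaces are rejected
```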
12. Embeddings
Embeddings are OPTIONAL. They are stored in a separate embeddings.json file.
12.1 Normative Rules
- Embeddings MAY be omitted entirely from an export
- When omitted, `embedding_ref` in memory objects MUST be null
- Consumers MUST NOT fail if `embedding_ref` is null or if `embeddings.json` is missing
- Consumers MAY regenerate embeddings from the `content` field at any time using any model
- The `content` field in the memory object is ALWAYS the authoritative source of semantic content, never the embedding
- Each memory object MUST have at most one corresponding embedding in the embeddings file — the `memory_id` field MUST be unique across all embedding objects. Implementations that maintain multiple embeddings internally (e.g., for different models) SHOULD export only the most recent or preferred embedding
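A minimal consumer-side sketch of these rules, assuming the embeddings file layout from Section 12.2 (file path, helper names, and error handling are illustrative):

```python
import json
from pathlib import Path


def load_embedding_index(path: str = "embeddings.json") -> dict:
    """Map embedding id -> embedding object; empty if the optional file is absent."""
    p = Path(path)
    if not p.exists():  # embeddings are optional, so a missing file is not an error
        return {}
    data = json.loads(p.read_text(encoding="utf-8"))
    return {e["id"]: e for e in data.get("embeddings", [])}


def resolve_embedding(memory: dict, index: dict):
    """Return the embedding referenced by a memory, or None; content stays authoritative."""
    ref = memory.get("embedding_ref")
    return index.get(ref) if ref else None
```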
12.2 Embeddings File Structure
```json
{
  "schema": "portable-ai-memory-embeddings",
  "schema_version": "1.0",
  "embeddings": [
    {
      "id": "emb-uuid",
      "memory_id": "mem-uuid",
      "model": "text-embedding-3-small",
      "dimensions": 1536,
      "created_at": "2026-02-15T22:00:00Z",
      "vector": [ 0.1, 0.2, ... ],
      "storage": null
    }
  ]
}
```
12.3 Embedding Object Fields
| Field | Required | Type | Description |
|---|---|---|---|
id | Yes | string | Unique identifier. Referenced by memory.embedding_ref |
memory_id | Yes | string | ID of the associated memory object |
model | Yes | string | Embedding model identifier |
dimensions | Yes | integer | Vector dimensionality |
created_at | Yes | string | ISO 8601 timestamp |
vector | No | array|null | Inline vector. Null if stored externally |
storage | No | object|null | External storage reference |
13. Relations
The relations array defines semantic relationships between memory objects, forming a knowledge graph.
13.1 Relation Object
| Field | Required | Type | Description |
|---|---|---|---|
id | Yes | string | Unique identifier |
from | Yes | string | Source memory ID |
to | Yes | string | Target memory ID |
type | Yes | string | Relationship type |
confidence | No | number|null | Confidence in this relationship [0.0, 1.0] |
created_at | Yes | string | ISO 8601 timestamp |
13.2 Relation Types
| Type | Semantics |
|---|---|
supports | Source provides evidence for target |
contradicts | Source conflicts with target |
extends | Source adds detail to target |
supersedes | Source replaces target |
related_to | General semantic relation |
derived_from | Source was inferred from target |
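Because relations form a directed graph over memory IDs, consumers typically index them before traversal. A small sketch for following `supersedes` chains to the newest version of a memory (helper names are illustrative):

```python
from collections import defaultdict


def index_by_target(relations: list) -> dict:
    """Map (target memory id, relation type) -> list of source memory ids."""
    idx = defaultdict(list)
    for rel in relations:
        idx[(rel["to"], rel["type"])].append(rel["from"])
    return idx


def latest_supersession(memory_id: str, idx: dict) -> str:
    """Walk 'supersedes' edges (source replaces target) to the newest memory id."""
    current, seen = memory_id, {memory_id}
    while True:
        newer = idx.get((current, "supersedes"), [])
        if not newer or newer[0] in seen:  # stop at the head of the chain or on a cycle
            return current
        current = newer[0]
        seen.add(current)
```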
14. Conversations Index
The conversations index provides lightweight references to conversations without embedding full message history.
14.1 Consistency Rule
Exporters MUST ensure consistency between memory.provenance.conversation_ref and the corresponding
conversations_index[].derived_memories entry.
Importers SHOULD treat derived_memories as advisory and MAY reconstruct from provenance using:
```
for memory in memories:
    conv_id = memory.provenance.conversation_ref
    if conv_id:
        conversations_index[conv_id].derived_memories.append(memory.id)
```
14.2 Storage Reference
Full conversation data is stored externally and referenced via:
```json
{
  "storage": {
    "type": "file",
    "ref": "conversations/conv-001.json",
    "format": "json"
  }
}
```
Storage types: "file", "database", "object_storage", "vector_db", "uri".
15. Integrity Verification
The integrity block enables corruption and tampering detection.
15.1 Canonicalization
PAM uses RFC 8785 (JSON Canonicalization Scheme — JCS) for deterministic serialization. The canonicalization field
declares the method used:
| Value | Standard | Description |
|---|---|---|
RFC8785 | RFC 8785 | JSON Canonicalization Scheme. Default and currently only supported method |
This eliminates implementation ambiguity across languages and platforms. RFC 8785 defines deterministic rules for key ordering, number serialization, string escaping, and whitespace elimination.
15.2 Checksum Computation
The checksum is computed using the following deterministic pipeline:

```
1. Take the memories array
2. Sort memory objects by id ascending
3. Canonicalize per RFC 8785 (JCS):
   - Sort all object keys lexicographically (recursive)
   - Serialize numbers per ECMAScript/IEEE 754 rules (e.g., 1.0 → 1)
   - Apply RFC 8785 string escaping
   - No whitespace
   - UTF-8 encoding
4. Compute SHA-256 over the canonical UTF-8 bytes
5. Format as "sha256:<hex>"
```
IMPORTANT: Standard json.dumps() in most languages is NOT RFC 8785 compliant. Implementations MUST use a dedicated
JCS library. See Appendix C for library recommendations per language.
15.3 Integrity Block Fields
| Field | Required | Type | Description |
|---|---|---|---|
canonicalization | No | string | Canonicalization method. Default: "RFC8785" |
checksum | Yes | string | SHA-256 of canonicalized memories. Format: sha256:<hex> |
total_memories | Yes | integer | MUST equal len(memories) |
15.4 Validation
- `integrity.total_memories` MUST equal `len(memories)`
- `integrity.checksum` MUST match the computed checksum of the canonicalized memories array
- If `integrity.canonicalization` is absent, implementations MUST assume RFC8785
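A minimal validation sketch, reusing `compute_integrity_checksum` from Appendix C (the wrapper function itself is illustrative):

```python
def validate_integrity(export: dict) -> bool:
    """Check the integrity block against the memories array (Section 15.4)."""
    memories = export["memories"]
    integrity = export["integrity"]
    if integrity.get("canonicalization", "RFC8785") != "RFC8785":
        return False  # only RFC 8785 is defined in v1.0
    if integrity["total_memories"] != len(memories):
        return False
    return integrity["checksum"] == compute_integrity_checksum(memories)
```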
16. Incremental Exports
PAM supports both full and incremental (delta) exports for efficient synchronization.
16.1 Export Types
| Type | Description |
|---|---|
full | Complete memory store. Default. Self-contained |
incremental | Delta since a previous export. Contains only new or updated memories |
16.2 Incremental Export Fields
When export_type is "incremental":
| Field | Required | Description |
|---|---|---|
base_export_id | SHOULD | The export_id of the base export this delta applies to |
since | SHOULD | ISO 8601 timestamp. Only memories created or updated after this time are included |
16.3 Merge Rules
Importers processing incremental exports MUST:
- Match `base_export_id` to a previously imported full export
- For each memory in the delta: if `id` exists in the base, update it; otherwise, insert it
- Recompute `integrity.checksum` after merge
- Memories with `status: "retracted"` in the delta MUST be marked as retracted in the base
- Importers MUST NOT physically delete memories marked as `"retracted"`. They MUST preserve the memory object and update its status. This ensures auditability and enables undo operations
Importers MAY reject incremental exports if base_export_id does not match any known export.
17. Decentralized Identity (DID)
PAM supports W3C Decentralized Identifiers for universal cross-platform identity resolution.
17.1 Owner DID
The owner.did field accepts any valid DID method:
```
did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK
did:web:example.com:user:alice
did:ion:EiAnKD8...
did:pkh:eip155:1:0xab16a96D359eC26a11e2C2b3d8f8B8942d5Bfcdb
```
17.2 Normative Rules
- `owner.did` is OPTIONAL but RECOMMENDED for exports shared between systems
- When present, the DID MUST be resolvable to a DID Document per W3C DID Core (https://www.w3.org/TR/did-1.0/)
- If `signature` is present, `signature.public_key` SHOULD correspond to a verification method in the DID Document
- `owner.id` remains REQUIRED even when `did` is present, for backward compatibility
17.3 Recommended DID Methods
| Method | Use Case | Key Properties |
|---|---|---|
did:key | Self-contained, no resolution needed | Simplest. Key is the identifier |
did:web | Organization-hosted identity | DNS-based, easy to set up |
did:ion | Decentralized, Bitcoin-anchored | Maximum decentralization |
did:pkh | Blockchain wallet-based | Reuses existing crypto keys |
18. Cryptographic Signatures
PAM exports MAY be cryptographically signed to verify authenticity and detect tampering.
18.1 Signature Block
```json
{
  "signature": {
    "algorithm": "Ed25519",
    "public_key": "z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK",
    "value": "eyJhbGciOiJFZERTQSJ9..base64url-signature",
    "signed_at": "2026-02-15T22:00:01Z",
    "key_id": "did:key:z6Mk...#z6Mk..."
  }
}
```
18.2 Supported Algorithms
| Algorithm | Type | Recommended |
|---|---|---|
Ed25519 | EdDSA | Yes — fast, small keys, side-channel resistant |
ES256 | ECDSA P-256 | Yes — widely supported |
ES384 | ECDSA P-384 | Optional |
RS256 | RSA 2048+ | Legacy compatibility |
RS384 | RSA 3072+ | Legacy compatibility |
RS512 | RSA 4096+ | Legacy compatibility |
18.3 Signature Computation
The signature MUST cover not only the memories integrity but also export identity and ownership, to prevent replay attacks and export spoofing.
When signature is present (not null), the fields export_id and export_date MUST also be present and non-null.
This is enforced by the schema via a conditional dependency.
The signature MUST be computed as follows:
```
1. Compute integrity.checksum (Section 15)
2. Construct the signature payload object:
   {
     "checksum": integrity.checksum,
     "export_id": export_id,
     "export_date": export_date,
     "owner_id": owner.id
   }
3. Canonicalize the payload using RFC 8785 (JCS)
4. Sign the canonical UTF-8 bytes with the private key using the specified algorithm
5. Base64url-encode the signature (RFC 4648 §5)
6. Store in signature.value
```
This ensures that altering memories (which changes the checksum), export_id, export_date, or owner.id will
invalidate the signature. Note that changes to relations, conversations_index, or other owner fields are NOT
covered by the signature payload.
18.4 Verification
```
1. Recompute integrity.checksum from the memories array
2. Verify computed checksum matches integrity.checksum
3. Reconstruct the signature payload object from the export
4. Canonicalize with RFC 8785
5. Decode signature.value from Base64url
6. Verify the signature against the canonical payload using signature.public_key
7. If owner.did is present, optionally resolve the DID Document and verify the key matches
```
18.5 Normative Rules
- Signature is OPTIONAL but RECOMMENDED for exports shared between systems or users
- The signature payload MUST include `checksum`, `export_id`, `export_date`, and `owner_id`
- `signature.signed_at` MUST be equal to or after `export_date`
- If `signature.key_id` is present and `owner.did` is present, `key_id` SHOULD be a DID URL referencing a verification method in the owner’s DID Document
- Importers SHOULD verify signatures when present but MUST NOT reject unsigned exports
19. Type Registry
PAM provides a centralized registry for custom memory types to enable interoperability between implementations.
19.1 Registry URI
The type_registry root field specifies the registry URI:
```json
{
  "type_registry": "https://portable-ai-memory.org/types/"
}
```
19.2 Registry Structure
The registry is a publicly accessible JSON document listing registered custom types:
```json
{
  "registry_version": "1.0",
  "types": {
    "security_clearance": {
      "description": "Security clearance level held by the user",
      "proposed_by": "my-exporter/1.0.0",
      "status": "registered",
      "registered_at": "2026-03-01T00:00:00Z"
    },
    "medical_condition": {
      "description": "Known medical condition or diagnosis",
      "proposed_by": "healthai/1.0.0",
      "status": "registered",
      "registered_at": "2026-04-15T00:00:00Z"
    }
  }
}
```
19.3 Type Lifecycle
```
unregistered → registered → candidate → standard
```
| Status | Description |
|---|---|
unregistered | Custom type used locally, not in registry |
registered | Listed in registry, available for cross-platform use |
candidate | Nominated for promotion to standard taxonomy |
standard | Promoted to core taxonomy in a future spec version |
19.4 Normative Rules
- Custom types SHOULD be registered at the official registry for interoperability
- Implementations MUST accept any `custom_type` value regardless of registry status
- The registry is advisory, not prescriptive — implementations MUST NOT reject unregistered types
- Community-adopted custom types MAY be promoted to the standard taxonomy in future spec versions
20. Interoperability and Migration Compatibility Matrix
IMPORTANT: The interoperability paths described in this section reflect observed export formats and extraction strategies as of the time of publication. AI providers do not natively support PAM at the time of this specification. Implementations SHOULD treat these mappings as best-effort compatibility guidance, not guaranteed or officially supported migration paths. Provider export formats may change without notice. Importers MUST be versioned and resilient to format variations.
20.1 Observed Export Sources
| Source | Export Method | PAM Coverage |
|---|---|---|
| ChatGPT | conversations.json + memory prompt | Full: conversations, memories, preferences |
| Claude | JSON export + memory prompt + memory edits | Full: conversations, memories, instructions |
| Gemini | Google Takeout + prompt extraction | Partial: conversations; memories via prompt |
| Copilot | Privacy Dashboard CSV | Partial: conversations only |
| Grok | Data export (grok.com settings) | Full: conversations, projects, media posts, assets |
| Perplexity | Form request + prompt | Partial: limited conversation access |
| Local LLMs | Direct database access | Full: complete control |
20.2 Known Import Strategies
| Target | Method |
|---|---|
| ChatGPT | Custom instructions, conversation priming |
| Claude | Memory edits, Projects, system prompts |
| Gemini | Gems, conversation priming |
| Any LLM | System prompt injection from PAM memories |
21. Security Considerations
21.1 Data Sensitivity
PAM files contain personal information. Implementations MUST:
- Encrypt PAM files at rest when stored locally
- Use TLS for any network transmission of PAM files
- Respect the `access.exportable` flag when generating exports
- Not include memories marked `exportable: false` in exports
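A minimal exporter-side sketch of the `exportable` rule above (function name is illustrative):

```python
def exportable_memories(memories: list) -> list:
    """Drop memories whose access block marks them as exportable: false (Section 21.1)."""
    return [m for m in memories if m.get("access", {}).get("exportable", True)]
```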
21.2 Content Hash Security
The content_hash uses SHA-256 for deduplication, not for cryptographic authentication. For tamper-proof verification,
use the signature block (Section 18).
21.3 Privacy
The provenance.platform_user_id field is OPTIONAL specifically to allow privacy-preserving exports. Implementations
SHOULD allow users to strip platform identifiers before sharing.
21.4 Signature Security
When the signature block is present, implementations SHOULD verify it before processing the export. A failed
verification SHOULD result in a warning to the user, not a silent failure.
22. Extensibility
22.1 Memory Types
New types are added via the "custom" type mechanism and the type registry (Section 19). If a custom type achieves
broad adoption, it MAY be promoted to the standard taxonomy in a future version.
22.2 Metadata
The metadata block allows additionalProperties, enabling implementations to add custom fields without breaking
schema validation.
22.3 Schema Versioning
Schema versions follow semantic versioning:
- Patch (1.0.x): Documentation clarifications, no schema changes
- Minor (1.x.0): Backward-compatible additions (new optional fields, new enum values)
- Major (x.0.0): Breaking changes requiring migration
23. Reference Implementation
A reference implementation is available for PAM. It provides:
- Platform extractors — Parse exports from ChatGPT, Claude, Gemini, Copilot, Grok into PAM format
- Validator — CLI tool for schema validation (`pam validate`)
- Converter — Export PAM memories to platform-specific import formats
- Integrity checker — Verify checksums and consistency rules
- Signature tools — Sign and verify exports (`pam sign`, `pam verify`)
24. Acknowledgments
This specification was informed by:
- University of Stavanger PKG research — Krisztian Balog, Martin G. Skjæveland, and the Personal Knowledge Graph research group
- Solid Project — Tim Berners-Lee’s vision of user-owned data stores
- Mem0, Zep, Letta — Commercial memory layer implementations that demonstrated practical memory management patterns
- Samsung Personal Data Engine — Production-scale personal knowledge graph deployment
- EU Digital Markets Act — Regulatory framework driving data portability requirements
- W3C DID Core — Decentralized identity standard enabling cross-platform identity resolution
25. Normalized Conversation Format
PAM defines a companion schema for storing full conversation data referenced by conversations_index[].storage.ref.
While the main memory-store schema contains extracted knowledge, the conversation schema preserves the raw dialogue from
which memories were derived.
25.1 Purpose
The conversation schema serves as the normalized intermediate format between provider-specific exports and the PAM memory store. The import pipeline is:
```
Raw Provider Export → Parse → Normalize → Conversation Schema → Extract Memories → PAM Memory Store
```
25.2 Schema File
portable-ai-memory-conversation.schema.json — JSON Schema Draft 2020-12
25.3 Key Design Decisions
DAG support: OpenAI conversations use branching (mapping with parent/children). The schema supports parent_id and
children_ids per message to preserve this structure. Linear conversations (Claude, Gemini) set parent_id: null and
children_ids: [].
Role normalization: Each provider uses different role names (human, Request, AI, ASSISTANT, model names).
The
schema normalizes to: user, assistant, system, tool. See importer-mappings.md section 6 for the full verified
mapping table.
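For illustration, a normalization map along these lines; the entries here are examples, not the verified table from importer-mappings.md, and the fallback choice is an assumption:

```python
# Illustrative role normalization; the authoritative mapping lives in importer-mappings.md.
ROLE_MAP = {
    "human": "user",
    "Request": "user",
    "AI": "assistant",
    "ASSISTANT": "assistant",
}


def normalize_role(provider_role: str) -> str:
    """Map a provider-specific role name onto user/assistant/system/tool."""
    # Falling back to "assistant" (e.g., for model-name roles) is an assumption of this sketch.
    return ROLE_MAP.get(provider_role, "assistant")
```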
Multipart content: Messages may contain text, images, code, files. The content field supports both simple text (
type: "text") and multipart (type: "multipart" with parts[]).
Import metadata: Normalized conversations SHOULD record the importer version, source file, and source checksum. This
enables debugging, re-import, and format version tracking. The import_metadata block is OPTIONAL in the schema to
allow lightweight exports, but implementations SHOULD populate it for auditability.
raw_metadata: Provider-specific fields that don’t map to PAM are preserved verbatim in raw_metadata for lossless
round-tripping.
25.4 Provider Import Mappings
See importer-mappings.md for field-by-field mappings from each provider format to the normalized conversation schema:
- OpenAI (ChatGPT): `conversations.json` with DAG mapping
- Anthropic (Claude): `conversations.json` with `chat_messages` array
- Google (Gemini): Google Takeout `MyActivity.json` (single activity log array)
- Microsoft (Copilot): Privacy Dashboard CSV
- xAI (Grok): Data export via grok.com settings (conversations, projects, assets)
25.5 Importer Versioning Rule
Provider export formats change without notice. Importers MUST be versioned:
```
importer_version: "openai-importer/2025.01"
```
When a format changes, create a new importer version while keeping the old one for processing older exports.
26. License
26.1 Specification
This specification document (spec.md) is licensed under the Creative Commons Attribution 4.0 International License (CC
BY 4.0). You may share and adapt this document for any purpose, provided appropriate credit is given.
Full text: https://creativecommons.org/licenses/by/4.0/
26.2 Schema and Reference Implementation
The JSON Schema files (portable-ai-memory.schema.json, portable-ai-memory-conversation.schema.json,
portable-ai-memory-embeddings.schema.json) and all reference
implementation code are licensed under the Apache License, Version 2.0.
Full text: https://www.apache.org/licenses/LICENSE-2.0
Appendix A: Complete Example
- Memory store: `examples/example-memory-store.json`
- Conversation: `examples/example-conversation.json`
- Embeddings: `examples/example-embeddings.json`
Appendix B: JSON Schema Files
- `schemas/portable-ai-memory.schema.json` — Main memory store schema (JSON Schema Draft 2020-12)
- `schemas/portable-ai-memory-embeddings.schema.json` — Embeddings schema (JSON Schema Draft 2020-12)
- `schemas/portable-ai-memory-conversation.schema.json` — Normalized conversation schema (JSON Schema Draft 2020-12)
Appendix C: Content Hash Reference Implementation
```python
import hashlib
import unicodedata


def normalize_content(content: str) -> str:
    """Normalize content for deterministic hashing."""
    text = content.strip()
    text = text.lower()
    text = unicodedata.normalize("NFC", text)
    text = " ".join(text.split())
    return text


def compute_content_hash(content: str) -> str:
    """Compute SHA-256 hash of normalized content."""
    normalized = normalize_content(content)
    hash_hex = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"sha256:{hash_hex}"


def compute_integrity_checksum(memories: list) -> str:
    """Compute deterministic checksum of memories array using RFC 8785.

    IMPORTANT: This function MUST use an RFC 8785 (JCS) compliant serializer.
    Standard json.dumps() is NOT sufficient because:
    - json.dumps serializes 1.0 as "1.0", RFC 8785 requires "1"
    - json.dumps does not guarantee RFC 8785 Unicode escaping rules
    - json.dumps number formatting differs from IEEE 754/ECMAScript

    Library recommendations:
    - Python: pip install rfc8785
    - Node.js: npm install canonicalize
    - Go: github.com/nicktrav/canonicaljson
    - Java: org.erdtman:java-json-canonicalization
    """
    import rfc8785

    sorted_memories = sorted(memories, key=lambda m: m["id"])
    canonical_bytes = rfc8785.dumps(sorted_memories)
    hash_hex = hashlib.sha256(canonical_bytes).hexdigest()
    return f"sha256:{hash_hex}"
```
WARNING: Do NOT use json.dumps(..., sort_keys=True, separators=(",", ":")) for checksum computation. It produces
different output than RFC 8785 for floating-point numbers, which will result in checksum mismatches across
implementations.
Appendix D: Signature Reference Implementation
```python
import base64

import rfc8785
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def build_signature_payload(export: dict) -> bytes:
    """Construct and canonicalize the signature payload per Section 18.3.

    The payload MUST include checksum, export_id, export_date, and owner_id
    to prevent replay attacks and export spoofing.
    """
    payload = {
        "checksum": export["integrity"]["checksum"],
        "export_id": export["export_id"],
        "export_date": export["export_date"],
        "owner_id": export["owner"]["id"],
    }
    return rfc8785.dumps(payload)


def sign_export(export: dict, private_key: Ed25519PrivateKey) -> str:
    """Sign an export with Ed25519 over the canonical payload."""
    payload_bytes = build_signature_payload(export)
    signature_bytes = private_key.sign(payload_bytes)
    return base64.urlsafe_b64encode(signature_bytes).decode("ascii")


def verify_export(export: dict, signature_b64: str, public_key_bytes: bytes) -> bool:
    """Verify an Ed25519 signature over the canonical payload."""
    public_key = Ed25519PublicKey.from_public_bytes(public_key_bytes)
    payload_bytes = build_signature_payload(export)
    signature_bytes = base64.urlsafe_b64decode(signature_b64)
    try:
        public_key.verify(signature_bytes, payload_bytes)
        return True
    except Exception:
        return False
```
Appendix E: Incremental Export Merge
```python
import rfc8785  # used by compute_integrity_checksum (Appendix C)


def merge_incremental(base: dict, delta: dict) -> dict:
    """Merge an incremental export into a base export."""
    base_memories = {m["id"]: m for m in base["memories"]}

    for mem in delta["memories"]:
        base_memories[mem["id"]] = mem  # insert or update

    merged = list(base_memories.values())
    base["memories"] = merged
    base["integrity"]["total_memories"] = len(merged)
    base["integrity"]["checksum"] = compute_integrity_checksum(merged)
    return base
```
Appendix F: Provider Import Mappings
See importer-mappings.md for complete field-by-field mappings from each provider to the normalized PAM format.