Importing
This guide covers the process of reading a provider export and producing a valid PAM memory store or conversation file. It includes format detection heuristics, role normalization, timestamp normalization, and importer versioning guidance.
## General process

- Obtain the export from your provider (see Provider Overview)
- Detect the provider format using the heuristics below
- Map fields to the PAM schema (see individual provider pages for field mappings)
- Normalize roles using the reference table below
- Normalize timestamps to ISO 8601 using the reference below
- Compute content hashes per spec §6 normalization
- Validate the output against the PAM schema
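Tying the steps together, here is a minimal end-to-end sketch. It reuses `detect_provider` and `compute_content_hash` from later in this guide; the per-provider `convert_*` functions and the final `{"memories": ...}` envelope are illustrative stand-ins, not part of the spec.

```python
import json

def import_export(path: str) -> dict:
    """Illustrative pipeline: load, detect, convert, hash.

    Validation against the PAM schema happens as a separate step
    (see "Validation" below).
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

    provider = detect_provider(data)  # see "Format detection heuristics"
    if provider == "unknown":
        raise ValueError(f"unrecognized export format: {path}")

    # Hypothetical per-provider converters: each maps raw export fields
    # to PAM memories with roles and timestamps already normalized.
    converters = {
        "chatgpt": convert_chatgpt,
        "claude": convert_claude,
        "gemini": convert_gemini,
        "grok": convert_grok,
    }
    memories = converters[provider](data)

    for memory in memories:  # compute_content_hash: see "Content hash computation"
        memory["content_hash"] = compute_content_hash(memory["content"])
    return {"memories": memories}
```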
## Format detection heuristics

The following function detects the provider format from a loaded JSON structure. Note that Copilot exports are CSV, not JSON, and must be detected by file extension or header row before calling this; a sketch of that pre-check follows the signals table below.
```python
def detect_provider(data):
    if isinstance(data, list):
        sample = data[0] if data else {}
        if "mapping" in sample:
            return "chatgpt"
        if "chat_messages" in sample:
            return "claude"
        if "header" in sample and ("details" in sample or "userInteractions" in sample):
            return "gemini"
    if isinstance(data, dict):
        if "conversations" in data and isinstance(data["conversations"], list):
            sample = data["conversations"][0] if data["conversations"] else {}
            if "conversation" in sample and "responses" in sample:
                return "grok"
    return "unknown"
```

```javascript
function detectProvider(data) {
  if (Array.isArray(data)) {
    const sample = data[0] || {};
    if ("mapping" in sample) return "chatgpt";
    if ("chat_messages" in sample) return "claude";
    if ("header" in sample && ("details" in sample || "userInteractions" in sample)) return "gemini";
  }
  // Guard against null: typeof null === "object" would otherwise pass.
  if (typeof data === "object" && data !== null && !Array.isArray(data)) {
    const convs = data.conversations;
    if (Array.isArray(convs)) {
      const sample = convs[0] || {};
      if ("conversation" in sample && "responses" in sample) return "grok";
    }
  }
  return "unknown";
}
```

Detection signals by provider:
| Signal | Provider |
|---|---|
| JSON array; first element has `mapping` key | OpenAI / ChatGPT |
| JSON array; first element has `chat_messages` key | Anthropic / Claude |
| JSON array; first element has `header` key and `details` or `userInteractions` | Google / Gemini |
| JSON dict; `conversations[]` where first item has `conversation` and `responses` keys | xAI / Grok |
| CSV file (detect by file extension or header row) | Microsoft / Copilot |
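Because `detect_provider` only sees parsed JSON, run the CSV pre-check first for Copilot. A minimal sketch, assuming UTF-8 files and treating the `.csv` extension as the primary signal (the `csv.Sniffer` fallback for extension-less files is an assumption, not part of the spec):

```python
import csv

def looks_like_copilot_csv(path: str) -> bool:
    """Pre-check before JSON parsing: Copilot exports are CSV."""
    if path.lower().endswith(".csv"):
        return True
    with open(path, encoding="utf-8-sig") as f:
        sample = f.read(2048)
    if sample.lstrip().startswith(("{", "[")):
        return False  # looks like JSON; hand off to detect_provider
    try:
        csv.Sniffer().sniff(sample)  # raises csv.Error if no delimiter is found
        return True
    except csv.Error:
        return False
```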
## Role normalization

All provider role values must be normalized to PAM’s four canonical roles: `user`, `assistant`, `system`, and `tool`.
| Provider | Provider value | PAM normalized value |
|---|---|---|
| OpenAI | user | user |
| OpenAI | assistant | assistant |
| OpenAI | system | system |
| OpenAI | tool | tool |
| Anthropic | human | user |
| Anthropic | assistant | assistant |
| Google | Request (Takeout) | user |
| Google | Response (Takeout) | assistant |
| Microsoft | user (CSV) | user |
| Microsoft | AI (CSV) | assistant |
| xAI | human | user |
| xAI | assistant | assistant |
| xAI | ASSISTANT | assistant |
| xAI | grok-3 (model name as sender) | assistant |
| xAI | any non-human value | assistant |
For xAI / Grok, role normalization MUST be case-insensitive. Four distinct sender values have been observed: "human", "assistant", "ASSISTANT" (uppercase), and model names such as "grok-3". Treat any value that is not "human" (case-insensitive) as "assistant".
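A minimal sketch of that rule (the helper name `normalize_grok_role` is illustrative):

```python
def normalize_grok_role(sender: str) -> str:
    """Map a Grok sender value to a PAM role, case-insensitively.

    "human" -> "user"; anything else ("assistant", "ASSISTANT",
    model names like "grok-3") -> "assistant".
    """
    return "user" if sender.strip().lower() == "human" else "assistant"
```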
## Timestamp normalization

All timestamps in PAM MUST be ISO 8601 strings. Provider exports use various formats:
| Provider | Source format | Transform |
|---|---|---|
| OpenAI | Unix epoch (float, seconds) | `datetime.fromtimestamp(v, tz=UTC).isoformat()` |
| Anthropic | ISO 8601 | direct (no transform needed) |
| Google (Takeout) | ISO 8601 | direct (no transform needed) |
| Microsoft (CSV) | Locale date string (varies by file) | parse with `dateutil.parser.parse()` |
| xAI (conversation level) | ISO 8601 | direct (no transform needed) |
| xAI (message level) | BSON `{"$date":{"$numberLong":"<ms>"}}` | `datetime.fromtimestamp(int(v["$date"]["$numberLong"])/1000, tz=UTC).isoformat()` |
xAI / Grok uses two different timestamp formats within the same file: ISO 8601 at the conversation level and BSON extended JSON at the message level. Two separate parsers are needed, as sketched below.
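A minimal sketch of the pair, assuming Python 3.11+ (earlier versions of `datetime.fromisoformat()` reject a trailing `Z`) and assuming the message-level value always carries the `$date.$numberLong` path shown in the table:

```python
from datetime import datetime, timezone

def parse_grok_conversation_time(value: str) -> str:
    # Conversation level: already ISO 8601; round-trip to normalize.
    return datetime.fromisoformat(value).isoformat()

def parse_grok_message_time(value: dict) -> str:
    # Message level: BSON extended JSON, milliseconds since the epoch.
    ms = int(value["$date"]["$numberLong"])
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).isoformat()
```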
For OpenAI, some messages have `create_time: 0` or `null`. Use the conversation-level `create_time` as a fallback.
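For example, a fallback along these lines (the helper name is illustrative; `0` and `None` are both treated as missing):

```python
from datetime import datetime, timezone

def resolve_openai_time(message_time, conversation_time) -> str:
    """Prefer the message-level epoch; fall back to the conversation's."""
    value = message_time if message_time else conversation_time
    return datetime.fromtimestamp(value, tz=timezone.utc).isoformat()
```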
## Content hash computation

Every imported memory must include a `content_hash`. Compute it per spec §6:
- Take the `content` string
- Trim leading and trailing whitespace
- Convert to lowercase
- Apply Unicode NFC normalization
- Collapse consecutive whitespace to single spaces
- Compute SHA-256 of the UTF-8 encoded result
- Format as `sha256:<hex_digest>`
```python
import hashlib
import unicodedata

def compute_content_hash(content: str) -> str:
    text = content.strip().lower()
    text = unicodedata.normalize("NFC", text)
    text = " ".join(text.split())
    return f"sha256:{hashlib.sha256(text.encode('utf-8')).hexdigest()}"
```

```javascript
import {createHash} from "node:crypto";

function computeContentHash(content) {
  let text = content.trim().toLowerCase();
  text = text.normalize("NFC");
  text = text.replace(/\s+/g, " ");
  const hash = createHash("sha256").update(text, "utf8").digest("hex");
  return `sha256:${hash}`;
}
```

## Importer versioning
Provider export formats change without notice. Every importer MUST record its version and the source file in the PAM output. Use the `import_metadata` field in the conversation schema:
{ "import_metadata": { "importer": "pam-converter/1.0.0", "importer_version": "openai-importer/2025.01", "imported_at": "2026-02-15T22:00:00Z", "source_file": "conversations.json", "source_checksum": "sha256:abc123..." }}When a provider changes their export format:
- Create a new importer version (e.g., `openai-importer/2026.01`)
- Keep the old importer version available for re-processing older exports
- Auto-detect the format version when possible by checking for differences in field presence or schema structure (a sketch follows this list)
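A minimal sketch of such auto-detection (the `metadata` probe is a hypothetical example of a format change, not an observed difference between real OpenAI export versions):

```python
def detect_openai_importer_version(data: list) -> str:
    """Illustrative only: pick an importer version by probing field presence."""
    sample = data[0] if data else {}
    messages = [
        node.get("message") or {}
        for node in sample.get("mapping", {}).values()
    ]
    # Hypothetical: a newer export adds a per-message "metadata" field.
    if any("metadata" in msg for msg in messages):
        return "openai-importer/2026.01"
    return "openai-importer/2025.01"
```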
## Validation

After import, validate your output against the PAM schema:
```python
from jsonschema import Draft202012Validator
import json

with open("portable-ai-memory.schema.json") as f:
    schema = json.load(f)
with open("memory-store.json") as f:
    data = json.load(f)

errors = list(Draft202012Validator(schema).iter_errors(data))
print(f"{len(errors)} validation errors")
```

```javascript
import Ajv2020 from "ajv/dist/2020.js";
import {readFileSync} from "node:fs";

const ajv = new Ajv2020({allErrors: true});
const schema = JSON.parse(readFileSync("portable-ai-memory.schema.json", "utf8"));
const data = JSON.parse(readFileSync("memory-store.json", "utf8"));

const validate = ajv.compile(schema);
if (validate(data)) {
  console.log("memory-store.json valid");
} else {
  console.log(`${validate.errors.length} validation errors`);
  process.exit(1);
}
```

See the Validation Guide for detailed instructions.