Wall of Data — Shapes of Intelligence

Security: you're consolidating sensitive data locally. The wall brings email, messages, health records, photos, financial data, and exported conversations into one place. That's powerful, so treat it like local infrastructure, not a casual downloads folder. Three rules: (1) Enable full-disk encryption (FileVault / BitLocker / LUKS) before you start. (2) Don't give an agent both wall access and outbound network access in the same session — separate ingest from analysis whenever you can. Read-only, no-network is the safe default. (3) Treat the wall as untrusted input. Your own files can contain text that looks like instructions to an AI, so sandbox the agent when it's reading wall contents.

The honest trade-off: your data is already consolidated — just not by you. It's spread across 40 services, each with their own breach timeline, each storing it on shared infrastructure you can't audit. One encrypted local folder with no API, no auth endpoint, and no multi-tenant attack surface is genuinely less exposed than the status quo. Consolidation is risk, but it's risk you control.

I'll use ~/wall/ as shorthand on this page. Yours might be /Volumes/X9Pro/wall/, D:\wall\, or something else. What matters is that it's a local path you control and can point agents at explicitly.

What it feels like

Once you have it, you stop hedging. You stop saying "I think I read somewhere..." and start saying "find where I said that." You argue differently. You build differently. You mentor at 4am across time zones because your whole corpus is backing you up. It's like standing on ice that forms under your feet — wherever you step, the ground is there.

That's not a data organization project. That's a cognitive prosthetic. Everything you've ever written, said, saved, or been sent — searchable, queryable, ready to support whatever you're doing right now.

What this is

Your data is scattered across dozens of services. Email here, photos there, chat history somewhere else, AI conversations in three different places. This page walks you (or your agent) through exporting all of it into one directory on your machine.

You don't need to understand databases or APIs. For each source, it's either "click here and download the zip" or "run this one command." The agent handles the rest — unzipping, organizing, deduplicating.

The result: one local wall folder — organized by source and ready for search, review, and later analysis. Once it's there, you can ask questions about it, feed it to an agent, or load it into a database. But the first step is just getting it into one place.

Disk strategy

Before you start: check how much free space you have. If your boot drive is small (under 100GB free), put the wall on an external drive and symlink it. Use the boot drive as a staging area for downloads, then move files over.

mkdir -p ~/wall

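This uses ~/wall as a readable example. Use whatever local path fits your machine: D:\wall, /Volumes/MyDrive/wall, or similar. Each source gets its own subfolder. If your wall is on an external drive, skip the mkdir above (ln -s into an existing ~/wall directory would create a nested ~/wall/wall link) and run this instead: mkdir -p /Volumes/MyDrive/wall && ln -s /Volumes/MyDrive/wall ~/wall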
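If you'd rather check free space from a script than by eye, here is a minimal sketch. The 100 GB threshold comes from the guidance above; the function names are made up for illustration.

```python
import shutil
from pathlib import Path

# Threshold from the guidance above: under 100 GB free counts as a small drive.
MIN_FREE_GB = 100


def free_gb(path: Path) -> float:
    """Return free space, in gigabytes, on the volume that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


def wall_location_advice(path: Path, min_free_gb: float = MIN_FREE_GB) -> str:
    """Suggest whether the wall can live on this volume (hypothetical helper)."""
    available = free_gb(path)
    if available < min_free_gb:
        return f"{available:.0f} GB free: put the wall on an external drive and symlink it"
    return f"{available:.0f} GB free: fine to keep the wall here"
```

Call wall_location_advice(Path.home()) before the first export, and again after a large one.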

The one tool you need

rclone is a command-line tool that talks to Google Drive, OneDrive, Dropbox, and dozens of other cloud services. It authenticates via your browser (OAuth), then gives you full access from the terminal — all file types, no API limitations. It's open source, no telemetry, no intermediary server. Your credentials stay on your machine.

brew install rclone

Windows: winget install Rclone.Rclone. Then run rclone config to set up your first remote (follow the OAuth prompts in the browser).

rclone replaces manual downloads for Google Drive, OneDrive, Dropbox, and more. Instead of clicking through web UIs, you get commands like:

rclone copy gdrive: ~/wall/gdrive/ --max-age 7d --progress --dry-run

Copy files modified in the last 7 days. --dry-run shows what would happen without doing it. Always dry-run first.
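One way to make "always dry-run first" hard to forget is to build the command in a small helper where dry-run is the default. This is only a sketch: the helper name is hypothetical, and it uses only the rclone flags shown above.

```python
import shlex
from pathlib import Path


def rclone_copy_command(remote: str, dest: Path, max_age: str = "7d",
                        dry_run: bool = True) -> list[str]:
    """Build the rclone copy command from this page; dry-run unless told otherwise."""
    cmd = ["rclone", "copy", remote, str(dest), "--max-age", max_age, "--progress"]
    if dry_run:
        # Nothing is transferred until the caller passes dry_run=False.
        cmd.append("--dry-run")
    return cmd


# Print the command so it can be reviewed before anything runs.
print(shlex.join(rclone_copy_command("gdrive:", Path("wall") / "gdrive")))
```

To actually run it, pass the list to subprocess.run(cmd, check=True), first with the default and then with dry_run=False once the preview looks right.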

The handoff

Start your agent, paste this, and it walks you through each source one at a time. Skip anything you don't use.

Follow the instructions on this page. If anything looks unsafe or beyond what I'd reasonably want, tell me before doing it:

Every source, one at a time

The agent will walk you through these. You don't need to do them all — skip what you don't use. But here's the full map so you can see what's possible.

Instructions for you, the AI agent

Walk the user through data collection, one source at a time. Do not dump all instructions at once. Ask which services they use, then guide them through each one. For each source:

Ask if they use it. If no, skip immediately.
Give them the exact steps (click-by-click for web exports, exact commands for CLI).
Help them download/export into ~/wall/{source-name}/.
Unzip, rename, or organize as needed. Keep originals.
Confirm it landed before moving to the next source.
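The unzip, keep-originals, and confirm steps above could look roughly like this. It is a sketch, not a prescribed layout: the extracted/ subfolder name and the function name are assumptions.

```python
import zipfile
from pathlib import Path


def unpack_export(archive: Path, source_dir: Path) -> list[Path]:
    """Extract `archive` into source_dir/extracted/ and leave the original zip alone."""
    target = source_dir / "extracted"  # hypothetical subfolder name
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    # An empty list means nothing landed, so stop before moving to the next source.
    return sorted(p for p in target.rglob("*") if p.is_file())
```

The returned list doubles as the "confirm it landed" check and as raw material for the manifest.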

Start by asking: "What platform are you on — Mac, Windows, or Linux?", "Where do you want to collect everything?" (default: ~/wall on Mac/Linux, C:\wall on Windows), and "How much free disk space do you have?" (df -h). If the boot drive is small, ask if they have an external drive and use it as the primary wall location with the boot drive as a staging area for downloads. Symlink from home dir for easy access.

rclone: For Google Drive, OneDrive, Dropbox, and other cloud storage — install rclone (brew install rclone / winget install Rclone.Rclone) and configure a remote (rclone config). This gives you full CLI access to cloud files. Always --dry-run first, always check disk space before and after. Use rclone size remote: to estimate before downloading. Use --max-age 7d to start with recent files. Use --drive-export-formats docx,xlsx,pptx to convert Google Docs on export. Skip large media on first pass (--exclude "*.mp4" --exclude "*.mov").

Filename hygiene: Sanitize every filename on ingest. Lowercase, hyphens instead of spaces, no parentheses or brackets. Use the pattern: {source}-{date}-{description}.{ext}. Example: meet-2026-03-10-recording.mp4 not Meeting (2026-03-10 04_22 GMT-5).mp4. Spaces break shell commands, tab-completion, scripts, and grep. Clean names on arrival so you never have to fix them later.
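A minimal sketch of that naming rule follows. The function names are hypothetical, and the source, date, and description are supplied by the caller rather than guessed from the old name. Note that this version collapses anything outside a-z and 0-9 into a hyphen, so non-ASCII characters are lost; adjust if your filenames need them.

```python
import re
from pathlib import Path


def sanitize_stem(text: str) -> str:
    """Lowercase and collapse every run of non-alphanumeric characters into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def wall_filename(source: str, date: str, description: str, original: str) -> str:
    """Build {source}-{date}-{description}.{ext}, keeping the original extension."""
    ext = Path(original).suffix.lower()
    return f"{sanitize_stem(source)}-{date}-{sanitize_stem(description)}{ext}"


print(wall_filename("meet", "2026-03-10", "recording",
                    "Meeting (2026-03-10 04_22 GMT-5).mp4"))
# prints: meet-2026-03-10-recording.mp4
```

Record the original name in the manifest before renaming, so the mapping is never lost.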

Manifests: Write a manifest.md in the wall folder AND in each subdirectory. Record: source, date exported, original filename (before sanitization), file count, approximate size, and anything skipped. Makes the wall queryable without opening every file.
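Here is one possible shape for a manifest writer, covering the fields listed above. The layout of manifest.md and the argument names are assumptions; any consistent format works.

```python
from datetime import date
from pathlib import Path


def write_manifest(folder: Path, source: str, renamed: dict[str, str],
                   skipped: list[str]) -> Path:
    """Write manifest.md in `folder`. `renamed` maps new filename to original filename."""
    files = [p for p in folder.rglob("*") if p.is_file() and p.name != "manifest.md"]
    total_mb = sum(p.stat().st_size for p in files) / 1e6
    lines = [
        f"# Manifest: {source}",
        "",
        f"- Source: {source}",
        f"- Date exported: {date.today().isoformat()}",
        f"- File count: {len(files)}",
        f"- Approximate size: {total_mb:.1f} MB",
        "",
        "## Original filenames",
        "",
    ]
    lines += [f"- {new} (was: {old})" for new, old in sorted(renamed.items())]
    lines += ["", "## Skipped", ""]
    lines += [f"- {item}" for item in skipped] or ["- nothing"]
    manifest = folder / "manifest.md"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest
```

Run it once per source subfolder and once on the wall root; existing manifest.md files are left out of the counts.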

Learnings file: Create a wall-learnings.md in the wall folder. Every time you hit a snag, discover a gotcha, or find a better way to do something — write it down. This is the flywheel: the learnings improve the process for next time. Things like "Takeout doesn't include recent Meet recordings" or "rclone needs --drive-export-formats to convert Google Docs" belong here.
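Appending to the learnings file can be a one-liner the agent calls whenever something surprises it. A small sketch, with a hypothetical function name and a dated-bullet format that is just one option:

```python
from datetime import date
from pathlib import Path


def log_learning(wall: Path, note: str) -> None:
    """Append one dated bullet to wall-learnings.md in the wall folder."""
    path = wall / "wall-learnings.md"
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {date.today().isoformat()}: {note}\n")
```

Append-only keeps the history intact, so later agents can see when each gotcha was discovered.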

AI Conversations

Claude claude.ai → Settings → Export Data. Downloads a JSON archive of all conversations. claude/

ChatGPT chatgpt.com → Settings → Data Controls → Export. They email you a zip with conversations.json. chatgpt/

Gemini Google Takeout → select "Gemini Apps" → export. Comes as part of a Google Takeout archive. gemini/

Copilot / Other GitHub Copilot Chat doesn't export history. If you use other AI tools (Perplexity, Poe, etc.), check their settings for export. other-ai/

Video Calls

Google Meet Recordings land in Google Drive → Meet Recordings/ folder. Download .mp4 directly from Drive. Chat transcripts are separate .sbv files — grab those too. Use the meeting code as the filename key. meet/

Zoom zoom.us → Recordings (cloud recordings). Or check ~/Documents/Zoom/ for local recordings. zoom/

Email

Gmail Google Takeout (takeout.google.com) → select Mail → export. MBOX format. Can take hours for large inboxes. gmail/

Outlook File → Open & Export → Import/Export → Export to a file → .pst. Or use Outlook.com data export. outlook/

Chat & Messaging

WhatsApp In each chat: ⋮ → More → Export chat. WhatsApp backups stored in Google Drive are not included in Google Takeout, so export chat by chat. whatsapp/

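iMessage Mac: copy ~/Library/Messages/chat.db (SQLite database), plus chat.db-wal and chat.db-shm if they exist, since recent messages may still sit in the write-ahead log. The terminal or agent needs Full Disk Access to read that folder. Agent can query the copy directly. imessage/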

Telegram Desktop app → Settings → Advanced → Export Telegram Data. Choose JSON format. telegram/

Slack Workspace admin: Settings → Import/Export → Export. Free plans: public channels only. slack/

Discord Privacy & Safety → Request all of my Data. They email a zip. Takes up to 30 days. discord/
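For the iMessage entry above, "query it directly" might look like the sketch below. The schema is an assumption based on recent macOS versions: a message table (text, date, is_from_me, handle_id) and a handle table (ROWID, id), with date stored as nanoseconds since 2001-01-01 UTC. Older databases may store seconds instead, so check a few rows before trusting the timestamps.

```python
import sqlite3
from contextlib import closing
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Assumption: message.date counts nanoseconds from this epoch.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)

QUERY = """
    SELECT message.date, message.is_from_me, handle.id, message.text
    FROM message
    LEFT JOIN handle ON message.handle_id = handle.ROWID
    WHERE message.text IS NOT NULL
    ORDER BY message.date DESC
    LIMIT ?
"""


def recent_messages(db_path: Path, limit: int = 5) -> list[tuple[str, str, str]]:
    """Return (timestamp, sender, text) for the newest messages in a copied chat.db."""
    # mode=ro opens the copy read-only, matching the read-only default on this page.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(QUERY, (limit,)).fetchall()
    results = []
    for raw_date, is_from_me, handle_id, text in rows:
        when = APPLE_EPOCH + timedelta(seconds=raw_date / 1e9)
        sender = "me" if is_from_me else (handle_id or "unknown")
        results.append((when.isoformat(), sender, text))
    return results
```

Point it at the copy inside the wall (for example ~/wall/imessage/chat.db), never at the live database in ~/Library/Messages.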

Documents & Notes

Google Drive Best: rclone. rclone copy gdrive: ~/wall/gdrive/ --progress --drive-export-formats docx,xlsx,pptx. Gets everything — docs, PDFs, images, videos. Dry-run first. Alt: Google Takeout → select Drive — but Takeout is a batch snapshot (recent files may not appear) and API tools can't download binary files. gdrive/

Notion Settings → Export all workspace content. Choose Markdown & CSV format. notion/

Obsidian It's already a folder. Copy or symlink your vault directory. obsidian/

Apple Notes No native export. Use the Exporter app (free, Mac App Store) or select-all → drag to a folder. apple-notes/

Evernote Select notebooks → Export → .enex format. Or use Evernote's full data export in settings. evernote/

OneDrive Already on disk if OneDrive app is installed. Or use rclone: rclone copy onedrive: ~/wall/onedrive/ (set up remote with rclone config). onedrive/

Dropbox Already synced locally if the app is installed. Or use rclone: rclone copy dropbox: ~/wall/dropbox/. dropbox/

Social Media

Twitter / X Settings → Your Account → Download an archive. JSON + media. Takes 24-48 hours. twitter/

Facebook Settings → Your Facebook Information → Download Your Information. Choose JSON, all time. facebook/

Instagram Settings → Your Activity → Download Your Information. Comes in same flow as Facebook. instagram/

Reddit Settings → Request Your Data. Or reddit.com/settings/data-request. JSON archive. reddit/

LinkedIn Settings → Data Privacy → Get a copy of your data. Select all categories. linkedin/

Photos & Media

Google Photos Google Takeout → select Google Photos. Warning: this can be enormous (100GB+). Export in chunks. google-photos/

Apple Photos Select All → File → Export Unmodified Originals. Or copy ~/Pictures/Photos Library.photoslibrary. apple-photos/

YouTube Google Takeout → YouTube. Gets watch history, playlists, comments, subscriptions. Not the videos themselves. youtube/

Calendar & Contacts

Google Calendar Google Takeout → Calendar. Or calendar.google.com → Settings → Export (downloads .ics files). calendar/

Apple Calendar File → Export → Calendar Archive. Creates .icbu file with all calendars. calendar/

Contacts Google: contacts.google.com → Export → vCard. Apple: Contacts app → select all → File → Export vCard. contacts/

Code & Projects

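GitHub gh repo list --limit 1000 --json nameWithOwner -q '.[].nameWithOwner' | xargs -I{} gh repo clone {} ~/wall/github/{}. Each repo lands in its own owner/name directory; cloning them all into one shared target would fail after the first. github/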

Local projects Copy or symlink your project root (for example ~/proj, ~/lab, ~/forge, ~/bench, ~/craft, or the Windows equivalent). projects/

Browsing & Bookmarks

Chrome History Copy ~/Library/Application Support/Google/Chrome/Default/History (SQLite). Close Chrome first. chrome/

Firefox History Copy ~/Library/Application Support/Firefox/Profiles/*/places.sqlite. Close Firefox first. firefox/

Bookmarks Chrome: Bookmark Manager → Export. Firefox: Bookmarks → Manage → Export. Both export HTML. bookmarks/

Health & Fitness

Apple Health Health app → profile picture → Export All Health Data. Creates export.zip, which contains export.xml (can be huge). health/

Strava strava.com/athlete/delete_your_account (don't delete!) → Request Your Archive. Or Settings → My Account → Download. strava/

Fitbit / Google Fit Google Takeout → select Fit. Fitbit data is now part of Google's Takeout system. fitness/

Finance

Bank accounts Most banks: Statements → Download → CSV. Go back as far as they let you. Do each account. finance/

Credit cards Same as banks. Download transaction history as CSV. Some let you go back 7+ years. finance/

Music & Podcasts

Spotify spotify.com/account/privacy → Request Data → Extended streaming history. Takes up to 30 days. spotify/

Apple Music privacy.apple.com → Request a copy of your data → select Apple Media Services. apple-music/

Location & Maps

Google Maps Timeline Google Takeout → Location History. JSON files with every place you've been (if enabled). location/

Apple Locations Settings → Privacy → Location Services → System Services → Significant Locations. No easy export — screenshots only. location/

Purchases

Amazon amazon.com/hz/privacy-central/data-requests → Request Your Data. Or Order History → Download CSV. amazon/

App Store privacy.apple.com → Request a copy of your data → select App Store activity. purchases/

Working with your files

Once files land in the wall, you need to know how to open them. Most are standard formats your agent already knows, but here are the non-obvious ones:

File types you'll encounter

.mbox Email archive (Gmail export). Parse headers with grep -E "^(From|Subject|Date):" file.mbox. Import into Thunderbird or use Python's mailbox module.

.sbv / .vtt Subtitle/caption files (Meet transcripts). Plain text with timestamps. grep -i "keyword" file.sbv to find moments in a meeting.

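.ics / .icbu Calendar files. .ics is plain text, one event per VEVENT block. Grep-able. Import into any calendar app. .icbu is an Apple Calendar archive bundle rather than a single text file; restore it through Calendar, or export individual calendars as .ics if you want to grep them.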

.vcf Contact cards. Plain text (vCard format). One file can contain hundreds of contacts, each as a BEGIN:VCARD block.

.mp4 Video. Get metadata with ffprobe file.mp4. Extract audio: ffmpeg -i file.mp4 -vn file.aac. Transcribe with Whisper: whisper file.aac --model base.

.json / .jsonl Structured data (most AI conversation exports). Pretty-print: python3 -m json.tool file.json. Query: jq '.key' file.json. jq also reads .jsonl as-is (one object per line); json.tool expects a single JSON document.
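For .mbox files, Python's mailbox module (mentioned above) goes a step beyond grep. As a rough sketch, this counts messages per sender; the function name is hypothetical and the From header is used exactly as it appears, without normalizing addresses.

```python
import mailbox
from collections import Counter
from pathlib import Path


def top_senders(mbox_path: Path, n: int = 10) -> list[tuple[str, int]]:
    """Count messages per From header in an mbox file and return the n most common."""
    counts: Counter = Counter()
    # create=False raises an error instead of silently creating an empty mailbox.
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for message in box:
            counts[str(message.get("From", "(unknown)"))] += 1
    finally:
        box.close()
    return counts.most_common(n)
```

Large Gmail exports can run to many gigabytes, so expect this to take a while; it reads messages one at a time rather than loading the whole file.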

What happens next

Once everything's in ~/wall/, you have options:

Search it — point an agent at the folder and ask questions. "What was I working on last March?" "Find every email from this person." "What's my most-played song?"
Index it — load everything into a database with embeddings for semantic search. This is the Octopus pattern.
Ask it — build a chatbot that answers questions about your own life, work, and history.
Consolidate it — let an agent find patterns, merge duplicates, surface things you forgot about.
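Before any database or embeddings, the "search it" option can be as plain as a keyword scan over the text files in the wall. A minimal sketch: the suffix list and function name are assumptions, and binary formats are simply skipped.

```python
from pathlib import Path

# Hypothetical allow-list of text formats mentioned on this page.
TEXT_SUFFIXES = {".md", ".txt", ".json", ".jsonl", ".csv", ".sbv", ".vtt", ".ics", ".vcf"}


def search_wall(wall: Path, keyword: str, max_hits: int = 20) -> list[tuple[Path, int, str]]:
    """Case-insensitive keyword search; returns (file, line number, line) tuples."""
    needle = keyword.lower()
    hits: list[tuple[Path, int, str]] = []
    for path in sorted(wall.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in TEXT_SUFFIXES:
            continue
        # errors="replace" keeps the scan going when a file is not valid UTF-8.
        with path.open(encoding="utf-8", errors="replace") as fh:
            for number, line in enumerate(fh, start=1):
                if needle in line.lower():
                    hits.append((path, number, line.strip()))
                    if len(hits) >= max_hits:
                        return hits
    return hits
```

This only reads files, so it fits the read-only, no-network default from the security notes at the top.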

The folder is the foundation. Everything else is built on top of it. Start collecting — you can decide what to do with it later.

Patterns at work

Your data is already yours: privacy laws such as GDPR and CCPA give many users a legal right to export, and most major services offer an export to everyone. Most just hide the button.
The folder is the interface — ~/wall is just a folder. Any agent can read it. No special database required to start.
One source at a time — don't try to do it all in one sitting. Some exports take days. Start with what matters most.
Memory is files — manifest.md records what you collected. Future agents read it to know what's available.
Clean names, clean pipes — sanitize filenames on arrival (lowercase, hyphens, no spaces). Every tool in the chain — shell, grep, scripts, agents — works better with clean names. Do it once on ingest so you never have to fix it later.
Staging and storage — if your boot drive is small, use it as a staging area for downloads, then move files to an external drive where the wall lives. Symlink from your home directory for easy access.
Takeout is a snapshot, not a sync — Google Takeout captures what exists at request time. Recent files won't appear. For anything from the last few hours, download directly from the service.
The wall builds the guide — every time you collect data and hit a snag, write it down. Those learnings improve the instructions for the next person (or the next agent).

Related chapters

Your Data Is Already Yours — why every service must let you export, and what to do with it
The Folder Is the Interface — how folder structure shapes what agents can do with your data
Memory Is Files — manifests and logs as long-term memory for agents
The Context Gold Mine — your data archive is the richest context source you have
PII, Keys, and Security — what to watch for when collecting personal data in one place

Related guides

The Lightweight Wall — one prompt, fifteen minutes, eighty percent of the value. Start here if you don't need the full wall yet.
Zero to Dev — set up your machine first (start here if you haven't)
Build a Chatbot — turn your wall into a searchable chatbot

External references

Google Takeout — export your data from all Google services (the single biggest source for most people)
rclone — sync 70+ cloud storage providers to local folders (the tool that makes ongoing collection practical)
Human Programming Interface — karlicoss's take on the same idea: unified personal data access layer