What family archives are for, now that the AI can read them
I have three collections of letters sitting in my house.
The oldest is a stack from my great-grandparents, written between 1910 and 1913. The second is from my grandparents, written during World War II. The third is from my father — letters he wrote from the moment I was born until the moment his mother died.
Three generations, each one writing to the next, each one leaving behind something I’ve never fully read.
If you’ve ever inherited a box of family paper, you know the problem. A few hundred pages of handwriting is not a casual Saturday read; it’s a project. Genealogists charge by the hour. Archival services are priced for institutions. Cloud AI could do it, but I’m not putting my grandparents’ wartime letters into someone else’s data center.
So for most of my life, the answer has been someday. Someday I’ll retire and sit at the kitchen table with a magnifying glass. Someday I’ll find the time.
Something shifted in the last year. I can now read the entire archive at my own desk, on hardware I already own, with no one else in the loop. Not “kind of read it” — actually transcribe it, analyze it, and turn a thousand pages of cursive into something I can search, annotate, and print into a book. In hours. On a machine that’s been sitting in my office for years doing other work.
That’s a different kind of someday.
I built a small pipeline to do it. Three stages, all local: a vision model reads the scans and transcribes them; a text model pulls out who and when and where and drafts a year chapter; a typesetter compiles everything into a keepsake PDF with the scan on the left and the transcription on the right. No cloud calls. No API bills. No family letters leaving the house.
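For readers who think in code, the shape of those three stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual pipeline: `transcribe` and `extract` are hypothetical placeholders for whatever local vision and text models you run, the `Letter` record and every function name are invented for the example, and the typesetting step assumes LaTeX with the `graphicx` package. The year-chapter drafting is left out to keep it short.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable

# Characters that LaTeX treats specially, mapped to safe equivalents.
_LATEX_ESCAPES = str.maketrans({
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
})

PREAMBLE = "\\documentclass{book}\n\\usepackage{graphicx}\n\\begin{document}\n"


@dataclass
class Letter:
    """One scanned page plus what the models produced for it (hypothetical)."""
    scan: Path                                  # path to the page image
    text: str = ""                              # filled in by stage 1
    facts: dict = field(default_factory=dict)   # filled in by stage 2


def escape_latex(text: str) -> str:
    """Escape transcribed text so it compiles as plain LaTeX."""
    return text.translate(_LATEX_ESCAPES)


def run_pipeline(
    scans: list[Path],
    transcribe: Callable[[Path], str],
    extract: Callable[[str], dict],
) -> list[Letter]:
    """Stages 1 and 2: transcribe each scan, then pull out who/when/where."""
    letters = []
    for scan in sorted(scans):
        text = transcribe(scan)   # stage 1: local vision model (placeholder)
        facts = extract(text)     # stage 2: local text model (placeholder)
        letters.append(Letter(scan, text, facts))
    return letters


def typeset(letters: list[Letter]) -> str:
    """Stage 3: LaTeX source with the scan on the left, transcription on the right."""
    pages = []
    for letter in letters:
        pages.append(
            "\\noindent\\begin{minipage}[t]{0.48\\textwidth}\n"
            f"\\includegraphics[width=\\linewidth]{{{letter.scan.as_posix()}}}\n"
            "\\end{minipage}\\hfill\n"
            "\\begin{minipage}[t]{0.48\\textwidth}\n"
            f"{escape_latex(letter.text)}\n"
            "\\end{minipage}\n\\clearpage\n"
        )
    return PREAMBLE + "".join(pages) + "\\end{document}\n"
```

Passing the models in as plain callables keeps the sketch honest about what it doesn't know: any local model that turns an image into text, or text into a dictionary of names, dates, and places, can be dropped in without touching the rest. The returned string would still need to be written to a `.tex` file and compiled separately to get the PDF.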
The scale is around fifteen hundred pages across the larger archive. Transcription that a person would measure in months takes hours at my desk. Since the hardware is already there, the out-of-pocket cost is close to zero.
Here’s the line I keep coming back to: I want to use this technology to have an opportunity in my lifetime to dig into all this material, understand it, gain some wisdom from it, and share it with future generations.
In my lifetime. That’s the part that’s new. Not “eventually.” Not “if I’m lucky in retirement.” Now.
This thread is also part of why Tractor and Silo exists in the first place.
Tractor and Silo is the software I’m building for life tracking: a way to capture experiences, people, and places so the record of your own life is actually readable later. My exposure to the family archive shaped a lot of how it got designed. Watching what my ancestors left behind, what they didn’t, and what I couldn’t easily get to changed how I think about making a personal record that holds up across decades and as it passes from hand to hand.
My ancestors had paper, ink, and the post. I have a database and a phone. The instinct is the same; the tools are what changed. The archive in my office and the app on my phone are the same project, separated by a century — and that parallel isn’t something I stumbled on. It’s what I had in mind when I started building the software.
This work is starting to show up at institutional scale too. Last year, Imperial War Museums, Capgemini, and Google Cloud used Gemini to transcribe 20,000 hours of oral history from veterans and civilians — work that would have taken a team 22 years by hand, done in weeks. What’s new is that the same shape of work has come down to a scale an individual can run at home.
My personal mantra for writing is that I want to write about what it’s like living in the world at this time. This is a case study of that — a technology revolution where we’re suddenly capable of things that weren’t possible at any price a few years ago.
A box of letters from 1910 is a record of what it was like living in the world at that time. The fact that I can finally read them is a record of what it’s like living in the world now. Both are worth keeping.
In the next post, I’ll walk through how I run all of this on a Mac Studio that isn’t new anymore.