Hear me out... Epstein AI

Python, RAG, LLM, Self-hosted

Overview

The Epstein court archive is enormous, spans decades, and is split across motions, exhibits, depositions, and unsealed releases. Reading it end to end is not realistic. The point of this project is not to draw new conclusions about anybody, but to make the corpus queryable. Retrieval-augmented generation (RAG) is the right shape of tool for that: pull the relevant excerpts from the documents, hand them to a language model, and force it to cite the source.
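The "hand them to a language model, force it to cite the source" half might look roughly like the sketch below. The names `Excerpt`, `build_prompt`, and `ask_ollama`, the prompt wording, and the model name `llama3` are illustrative assumptions, not the project's actual code. The only things taken from the stack listed further down are that the model runs through Ollama and that file and page metadata travel with each excerpt.

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass
class Excerpt:
    file: str   # source PDF file name
    page: int   # page number inside that PDF
    text: str   # excerpt text (extracted or OCR'd)


def build_prompt(question: str, excerpts: list[Excerpt]) -> str:
    """Number every excerpt and tell the model to cite by number, file and page."""
    sources = "\n\n".join(
        f"[{i}] {e.file}, p. {e.page}\n{e.text}"
        for i, e in enumerate(excerpts, start=1)
    )
    return (
        "Answer using only the numbered excerpts below.\n"
        "After every claim, cite the excerpt as [n] with its file name and page.\n"
        "If the excerpts do not contain the answer, say so instead of guessing.\n\n"
        f"Excerpts:\n{sources}\n\n"
        f"Question: {question}\n"
    )


def ask_ollama(prompt: str, model: str = "llama3",
               host: str = "http://localhost:11434") -> str:
    """Send the prompt to a local Ollama server (model name is an assumption)."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Numbering the excerpts gives the model something short to cite, and repeating the file name and page next to each number means a citation can be checked against the original PDF by hand.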

How it's built

  • Pulled every publicly released PDF: court releases, dockets, exhibits, depositions
  • OCR'd the scanned ones; a lot of the older exhibits are image-only and unsearchable without it
  • Chunked by structural boundary where the PDF has one, by token window otherwise, with file + page metadata preserved
  • Embedded the chunks and stored them in a local vector database
  • At query time: retrieve top-K chunks, build a prompt that demands citations, run through a local LLM via Ollama
  • Wired in as a dedicated assistant inside Open WebUI, alongside the other personas
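
The chunking and retrieval steps above can be sketched in plain Python. This is a toy under stated assumptions, not the project's implementation: whitespace-separated words stand in for real tokens, a bag-of-words count stands in for a real embedding model, and a sorted list stands in for the vector database. `Chunk`, `chunk_page`, `embed`, and `top_k` are hypothetical names, and the window sizes are arbitrary.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    file: str   # source PDF file name, carried along for citations
    page: int   # page the text came from
    text: str


def chunk_page(file: str, page: int, text: str,
               window: int = 200, overlap: int = 40) -> list[Chunk]:
    """Fallback chunker: overlapping word windows, file + page kept on each chunk."""
    if overlap >= window:
        raise ValueError("overlap must be smaller than window")
    words = text.split()
    chunks = []
    for start in range(0, len(words), window - overlap):
        chunks.append(Chunk(file, page, " ".join(words[start:start + window])))
        if start + window >= len(words):
            break  # the last window already reached the end of the page
    return chunks


def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase word counts (a real setup would use a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[Chunk], k: int = 3) -> list[Chunk]:
    """Return the k chunks most similar to the query, best first."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c.text)),
                    reverse=True)
    return ranked[:k]
```

Because every `Chunk` keeps its `file` and `page`, whatever `top_k` returns can go straight into a citation-demanding prompt without a second lookup. The overlap means a sentence cut by one window boundary still appears whole in the neighbouring chunk.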

What it does well, what it doesn't

  • Finds where something is mentioned across thousands of pages, fast — the actual high-value use case
  • Surfaces the file name and page reference with every answer, so you can verify it manually
  • OCR quality on older scanned exhibits is the real ceiling on retrieval, not the model
  • Cannot draw conclusions the documents don't already support, and shouldn't pretend to — that's not RAG's job

Stack

Python, Ollama, Open WebUI, Vector DB, PyMuPDF, Tesseract

Live

Epstein AI runs inside Open WebUI on the same self-hosted stack as the rest of the projects. Access is gated through Patreon or the Telegram bot.