MindForge - My Digital Garden Project

MindForge turns raw study material — a PDF textbook chapter, a Markdown note, a DOCX handout — into things you can actually learn from: summaries, flashcards, a concept map, and quizzes with spaced repetition. Upload a document, an AI pipeline processes it, and you study the output.

Repo: github.com/marcin-jasinski/mindforge

Companion piece: Why MindForge Looks the Way It Does — the design decisions and rejected alternatives behind the architecture below.

The core value loop

MindForge exists to solve a personal problem I ran into while taking the AI_Devs 4 course: raw learning materials are hard to internalize on their own, and simply reading them is not enough to really learn something. Automating the conversion into study artifacts, and pairing that with active recall through quizzes, lets the learner focus on learning instead of rereading the same text.

Architecture, in brief

MindForge is built on Hexagonal Architecture (Ports and Adapters). The domain core (pure Java 21 — records, sealed interfaces, zero framework imports) knows nothing about databases, HTTP, or LLM providers. Everything external is reached through a port interface, with concrete adapters wired at the edges.
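To make the port/adapter split concrete, here is a minimal sketch, not code from the MindForge repository. Every name in it (Flashcard, GenerationResult, AiGateway, FlashcardGenerator, CannedAiGateway) is hypothetical and chosen for illustration only. It sticks to syntax that compiles on Java 17 and later, so it also runs on Java 21.

```java
import java.util.List;

// Domain core: plain records and a sealed interface, no framework imports.
// All type names here are illustrative assumptions, not MindForge's real ones.
record Flashcard(String question, String answer) {}

sealed interface GenerationResult {
    record Success(List<Flashcard> cards) implements GenerationResult {}
    record Failure(String reason) implements GenerationResult {}
}

// Port: the only thing the domain knows about an LLM provider.
interface AiGateway {
    String complete(String prompt);
}

// Domain service: depends on the port, never on a concrete provider.
final class FlashcardGenerator {
    private final AiGateway gateway;

    FlashcardGenerator(AiGateway gateway) {
        this.gateway = gateway;
    }

    GenerationResult generate(String text) {
        if (text == null || text.isBlank()) {
            return new GenerationResult.Failure("empty input");
        }
        String answer = gateway.complete("State the key idea of: " + text);
        return new GenerationResult.Success(
                List.of(new Flashcard("What is the key idea?", answer)));
    }
}

// Adapter at the edge: a canned test double standing in for a real provider.
final class CannedAiGateway implements AiGateway {
    @Override
    public String complete(String prompt) {
        return "canned answer";
    }
}

final class PortsSketch {
    // The sealed hierarchy has exactly two cases, so both are handled explicitly.
    static String describe(GenerationResult result) {
        if (result instanceof GenerationResult.Success success) {
            return success.cards().size() + " card(s)";
        }
        if (result instanceof GenerationResult.Failure failure) {
            return "failed: " + failure.reason();
        }
        throw new IllegalStateException("unreachable for a sealed hierarchy");
    }

    public static void main(String[] args) {
        FlashcardGenerator generator = new FlashcardGenerator(new CannedAiGateway());
        System.out.println(describe(generator.generate("Ports isolate the domain.")));
        System.out.println(describe(generator.generate("")));
    }
}
```

Swapping CannedAiGateway for an adapter that calls a real provider would not touch FlashcardGenerator at all, which is the property the architecture is after.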

Uploading a document hits a thin REST controller, which hands off to an application service and returns immediately. The real work happens on a background thread: parse the file into content blocks, run it through a pipeline of AI agents (preprocessor, relevance guard, summarizer, flashcard generator, concept mapper, quiz generator, quiz evaluator), and checkpoint each step’s output. PostgreSQL is the single source of truth; Neo4j holds a derived, rebuildable graph projection of the concept map for fast neighborhood queries; Caffeine handles in-process caching.
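The hand-off and checkpointing described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the planned implementation: Step, CheckpointStore, InMemoryCheckpointStore, and DocumentPipeline are hypothetical names, the in-memory store stands in for a PostgreSQL-backed adapter, the agents are reduced to a linear chain of string transformations, and skipping already-checkpointed steps on a re-run is my assumption about what the checkpoints are for.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.UnaryOperator;

// One pipeline step: a name plus a transformation of the previous output.
record Step(String name, UnaryOperator<String> action) {}

// Hypothetical port for persisting each step's output.
interface CheckpointStore {
    void save(String documentId, String stepName, String output);
    Map<String, String> load(String documentId);
}

// In-memory stand-in for a database-backed adapter.
final class InMemoryCheckpointStore implements CheckpointStore {
    private final Map<String, Map<String, String>> byDocument = new LinkedHashMap<>();

    @Override
    public synchronized void save(String documentId, String stepName, String output) {
        byDocument.computeIfAbsent(documentId, id -> new LinkedHashMap<>())
                  .put(stepName, output);
    }

    @Override
    public synchronized Map<String, String> load(String documentId) {
        // Return a copy so callers cannot mutate the stored checkpoints.
        return new LinkedHashMap<>(byDocument.getOrDefault(documentId, Map.of()));
    }
}

final class DocumentPipeline {
    private final List<Step> steps;
    private final CheckpointStore checkpoints;
    private final ExecutorService executor;

    DocumentPipeline(List<Step> steps, CheckpointStore checkpoints, ExecutorService executor) {
        this.steps = steps;
        this.checkpoints = checkpoints;
        this.executor = executor;
    }

    // Upload path: schedule the work and return to the caller immediately.
    CompletableFuture<String> submit(String documentId, String rawText) {
        return CompletableFuture.supplyAsync(() -> run(documentId, rawText), executor);
    }

    // Background path: run each step, reusing any output already checkpointed.
    String run(String documentId, String rawText) {
        Map<String, String> done = checkpoints.load(documentId);
        String current = rawText;
        for (Step step : steps) {
            String saved = done.get(step.name());
            if (saved != null) {
                current = saved;
                continue;
            }
            current = step.action().apply(current);
            checkpoints.save(documentId, step.name(), current);
        }
        return current;
    }
}

final class PipelineSketch {
    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            DocumentPipeline pipeline = new DocumentPipeline(
                    List.of(new Step("preprocess", String::trim),
                            new Step("summarize", s -> "summary of: " + s)),
                    new InMemoryCheckpointStore(),
                    executor);
            CompletableFuture<String> pending = pipeline.submit("doc-1", "  raw chapter text  ");
            System.out.println("upload accepted, processing in background");
            System.out.println(pending.join());
        } finally {
            executor.shutdown();
        }
    }
}
```

The controller in this picture would only call submit and respond. Because each step's output is saved before the next one starts, a failure late in the chain does not force the earlier, potentially expensive AI calls to run again.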

The reasoning behind each of those choices is covered in the companion article, Why MindForge Looks the Way It Does.

Where the project stands today

I am developing this project after hours. This is its current state:

Phase 0 — Scaffolding ✅ Maven project, hexagonal package tree, Docker Compose for local dev, StubAIGateway test double.
Phase 1 — Domain layer ✅ Every core entity, value object, and domain event as a pure Java record.
Phase 2 — Infrastructure foundation ✅ Flyway migrations, JPA entities, and repository adapters for documents and artifacts.
Phase 2b — Persistence cleanup & DTO foundation 🔄 nearly done. Persistence split into clean sub-packages with MapStruct mappers; the API-facing DTO layer and OpenAPI config are scaffolded. One checklist item remains: verifying GET /v3/api-docs returns valid OpenAPI 3.1 JSON.

Everything from Phase 3 onward — the AI gateway, document parsing, the agent pipeline itself, the Neo4j projection, the API layer, quiz/flashcard services, RAG chat, and the Angular frontend — is designed (see the implementation plan and roadmap docs) but not yet built.