Why MindForge Looks the Way It Does

MindForge has a fairly opinionated architecture for a solo-developer project. This post describes the why behind it: the decisions I’ve made based on my experience, the alternatives I rejected, and the trade-offs I’ve accepted. Every section below summarizes a full ADR living in the docs; read those for the complete record.

Companion piece: MindForge - My Digital Garden Project — what the project does and where it stands today.

Why hexagonal architecture instead of the standard Spring layered pattern

The conventional Spring Boot shape (Controller → Service → Repository) was rejected. At MindForge’s designed complexity (a few AI agents, multiple data stores, multiple planned delivery channels, an async pipeline), I knew I needed a clear separation of business logic from all external systems and dependencies. I decided to strictly enforce boundaries (with ArchUnit tests, among other things), even though that means creating a few more classes and interfaces than the traditional layered pattern needs.

What this buys: application-layer use cases are tested against port interfaces using nothing but mocks. Adding a new AI agent, document parser, or delivery channel (Discord, Slack) means writing one new adapter class; the orchestrator and domain layer are never touched.
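A minimal sketch of that shape, assuming a delivery-channel port. The names NotificationPort, PublishSummaryUseCase, and both adapters are hypothetical and are not taken from the MindForge codebase; the fake stands in for whatever mocking library the real tests use.

```java
import java.util.ArrayList;
import java.util.List;

// Port: the application layer's view of any delivery channel (hypothetical name).
interface NotificationPort {
    void send(String message);
}

// Use case: depends only on the port, never on a Discord or Slack client library.
final class PublishSummaryUseCase {
    private final NotificationPort notifications;

    PublishSummaryUseCase(NotificationPort notifications) {
        this.notifications = notifications;
    }

    void publish(String documentTitle) {
        notifications.send("New summary ready: " + documentTitle);
    }
}

// Adapter: supporting a new channel means adding one class like this.
final class ConsoleNotificationAdapter implements NotificationPort {
    @Override
    public void send(String message) {
        System.out.println(message);
    }
}

// Test double: a hand-written fake that records what the use case sent.
final class RecordingNotificationAdapter implements NotificationPort {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String message) {
        sent.add(message);
    }
}
```

A test constructs PublishSummaryUseCase with the recording fake and asserts on its sent list; no Spring context or network is involved.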

What it costs: more files and more indirection than a layered CRUD app would need — a cost worth paying here at this level of complexity.

Why Java + Spring Boot instead of Python or Node

Simply put: because this is my main tech stack. Andy Hunt and Dave Thomas give an example of this in The Pragmatic Programmer, and experienced tutors and architects from Bottega IT Minds define it as one of the core architectural drivers: you choose solutions and languages that you are familiar with (unless, of course, your goal is to learn a new language, which is not the case here).

To be honest, this is the second version of MindForge; the first one was written in Python with FastAPI. It worked quite well, in fact, but the problems started whenever something broke. Even though I know Python, I struggled to actually understand the errors and fix them myself. All I could do was ask GH Copilot to fix them for me. I decided that this is not the way I want to work.

I know that Java and Spring AI are nowhere near as developed in terms of integration with AI agents as Python is, but the truth is (and again, this comes from people much smarter than me) that 80% of an AI-agentic application does not really differ from traditional software. The remaining 20% is the integration with AI agents, the whole non-deterministic fun; other than that, all the rules I’ve learned so far apply in this context too.

Why Spring MVC with virtual threads instead of WebFlux

The pipeline’s workload is long-running, blocking I/O: LLM calls, database writes, file parsing. Reactive streams exist to manage backpressure, and this workload has none to manage. Spring WebFlux was rejected: Mono/Flux operator chains are meaningfully harder to read, stack traces run through Reactor internals and become painful to debug, and there’s no throughput ceiling at personal-tool scale that virtual threads don’t already clear. At this scale, virtual threads give the same practical throughput as WebFlux while keeping the code ordinary, straight-line, blocking Java, which matters directly for the constraint from the previous section: I need to be able to read and debug this code myself.
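To illustrate what "straight-line, blocking Java" means here, the sketch below fans out blocking work onto virtual threads using only the JDK (Java 21 or later). The summarize method is a hypothetical stand-in for a slow LLM call and is not MindForge code. In a Spring Boot 3.2+ application, request handling can be moved onto virtual threads with the spring.threads.virtual.enabled=true property; whether MindForge does it exactly that way is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BlockingPipelineSketch {

    // Hypothetical stand-in for a slow, blocking call such as an LLM request.
    static String summarize(String document) throws InterruptedException {
        Thread.sleep(50); // parks the virtual thread; the carrier OS thread is freed
        return "summary of " + document;
    }

    // Runs one blocking task per document, each on its own virtual thread.
    static List<String> summarizeAll(List<String> documents) {
        List<Callable<String>> tasks = new ArrayList<>();
        for (String doc : documents) {
            tasks.add(() -> summarize(doc));
        }
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<String> results = new ArrayList<>();
            // invokeAll returns futures in the same order as the submitted tasks.
            for (Future<String> future : executor.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
    }
}
```

The code reads top to bottom with ordinary try/catch, and a failure produces a normal stack trace rather than one threaded through operator chains.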

Why Spring AI + OpenRouter instead of a direct provider SDK

Calling a provider SDK (OpenAI’s Java SDK, Anthropic’s SDK) directly from agent code was rejected outright — it would lock every agent to one provider, and migrating between models (say, GPT-4o to Claude) would mean code changes scattered across every agent. OpenRouter already aggregates every major provider behind one API key and one OpenAI-compatible endpoint, so Spring AI’s client configuration doesn’t change between OpenRouter (production) and LM Studio (local dev, no API key needed) — switching is one environment variable (SPRING_AI_OPENAI_BASE_URL), never a code change.

LangChain4j was also considered and rejected: more verbose API, weaker structured-output ergonomics, and shallower integration with Spring’s DI compared to Spring AI.

This is also where the ModelTier abstraction (SMALL / LARGE / VISION) comes from: agents ask for a tier, never a model name, and the gateway resolves the actual model string from config. That indirection exists specifically to enforce a cost discipline — deterministic logic first, then the small/cheap model, then the large model only when synthesis actually requires it — as a rule agents can’t route around.
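A rough sketch of the tier indirection, under stated assumptions: the ModelTier values come from the text above, while ModelGateway, CostPolicy, their method signatures, and the model strings used with them are hypothetical illustrations, not the project's actual API.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

enum ModelTier { SMALL, LARGE, VISION }

// Resolves a tier to a concrete model string supplied by configuration.
final class ModelGateway {
    private final Map<ModelTier, String> modelsByTier = new EnumMap<>(ModelTier.class);

    ModelGateway(Map<ModelTier, String> configuredModels) {
        modelsByTier.putAll(configuredModels);
    }

    String resolve(ModelTier tier) {
        String model = modelsByTier.get(tier);
        if (model == null) {
            throw new IllegalStateException("No model configured for tier " + tier);
        }
        return model;
    }
}

// Hypothetical encoding of the cost ladder: deterministic logic first,
// then the small model, then the large model only when synthesis is needed.
final class CostPolicy {
    static Optional<ModelTier> tierFor(boolean solvableDeterministically, boolean needsSynthesis) {
        if (solvableDeterministically) {
            return Optional.empty(); // no model call at all
        }
        return Optional.of(needsSynthesis ? ModelTier.LARGE : ModelTier.SMALL);
    }
}
```

Because agents only ever hold a ModelTier, swapping the model behind a tier is a configuration change, and no agent can name an expensive model directly.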

Why PostgreSQL + pgvector instead of a dedicated vector database

Pinecone, Qdrant, Weaviate, and Chroma were all considered and rejected for the same underlying reason: at MindForge’s expected corpus size (hundreds to low thousands of documents), a dedicated vector store adds a service, a client library, and a backup/recovery concern that buys nothing at this scale. A managed option (Pinecone) additionally puts embedding data outside the primary datastore and adds an external SLA dependency for no benefit.

pgvector collapses retrieval to a single connection pool: embeddings live in the same tables and transaction boundary as document_artifacts, so an artifact and its embedding are written atomically. HNSW indexing is more than sufficient for the expected data volume. If the corpus ever outgrows pgvector, the escape hatch is a new adapter behind VectorStorePort, with no domain or application code changes.
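To make the escape hatch concrete, here is a sketch of what sits behind such a port. VectorStorePort is named in the text, but the method signatures are assumptions, and the adapter is a brute-force in-memory stand-in (cosine similarity over all stored vectors), not pgvector and not HNSW. It assumes non-zero vectors of equal length.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Port named in the text; these method signatures are assumed for illustration.
interface VectorStorePort {
    void upsert(String artifactId, float[] embedding);

    List<String> nearest(float[] query, int limit);
}

// Stand-in adapter: exhaustive cosine-similarity search in memory.
final class InMemoryVectorStoreAdapter implements VectorStorePort {
    private final Map<String, float[]> embeddings = new LinkedHashMap<>();

    @Override
    public void upsert(String artifactId, float[] embedding) {
        embeddings.put(artifactId, embedding.clone());
    }

    @Override
    public List<String> nearest(float[] query, int limit) {
        // Sort by descending similarity (negated so the natural order works).
        return embeddings.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, float[]> e) -> -cosine(query, e.getValue())))
                .limit(limit)
                .map(Map.Entry::getKey)
                .toList();
    }

    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

A pgvector-backed adapter or a dedicated vector database client would implement the same two methods, which is why callers would not need to change.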

Why Neo4j is a projection, not a source of truth

Making Neo4j a primary store for concepts was rejected specifically because it would require either cross-store write transactions (PostgreSQL + Neo4j in the same commit) or eventual-consistency management between two sources of truth — both are failure modes that are hard to reason about, for a graph that contains nothing not already derivable from PostgreSQL’s artifacts. Keeping concept relationships in PostgreSQL only was also rejected: WITH RECURSIVE for multi-hop concept-neighborhood queries is markedly more expensive and harder to write than Cypher (Neo4j’s declarative query language), and native graph traversal is the entire reason to include Neo4j at all.

The resolution: Neo4j is populated exclusively by consuming outbox events emitted when a concept map artifact is persisted to PostgreSQL (using Spring Modulith Event Publication Registry). It is never written to directly, never read from during writes, and can be discarded and rebuilt from PostgreSQL at any time via a backfill CLI command. If the Neo4j adapter is down, events simply queue in the outbox — no distributed transaction or saga required, and the concept graph view degrades gracefully rather than failing the pipeline.
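The behaviour described above can be sketched with plain in-memory stand-ins. This is not the Spring Modulith API: ConceptLinked, Outbox, GraphProjection, and Backfill are hypothetical names, and the available flag merely simulates the Neo4j adapter being down.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical event emitted when a concept map artifact is persisted.
record ConceptLinked(String from, String to) {}

// Stand-in for the graph view; 'available' simulates the adapter being down.
final class GraphProjection {
    final Set<String> edges = new HashSet<>();
    boolean available = true;

    void apply(ConceptLinked event) {
        if (!available) {
            throw new IllegalStateException("graph store unavailable");
        }
        edges.add(event.from() + "->" + event.to());
    }
}

// Stand-in for the outbox: events wait here until the projection accepts them.
final class Outbox {
    private final Deque<ConceptLinked> pending = new ArrayDeque<>();

    void publish(ConceptLinked event) {
        pending.addLast(event);
    }

    int pendingCount() {
        return pending.size();
    }

    // Delivers in order; stops at the first failure and leaves the rest queued.
    void drainTo(GraphProjection projection) {
        while (!pending.isEmpty()) {
            try {
                projection.apply(pending.peekFirst());
            } catch (IllegalStateException e) {
                return; // retried on the next drain; the write path is unaffected
            }
            pending.removeFirst();
        }
    }
}

// Rebuilds the projection from scratch out of the source-of-truth events.
final class Backfill {
    static GraphProjection rebuild(List<ConceptLinked> sourceOfTruth) {
        GraphProjection fresh = new GraphProjection();
        sourceOfTruth.forEach(fresh::apply);
        return fresh;
    }
}
```

Publishing never touches the projection, so an outage only delays the graph view, and Backfill.rebuild shows why the projection is disposable.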

Why Caffeine instead of Redis

Redis earns its keep when cache state must survive process restarts or be shared across multiple instances. MindForge runs as a single instance and everything the cache holds (computed aggregations, frequently-read query results) is cheap to rebuild from PostgreSQL on restart — so Redis was rejected as an extra managed service to operate, monitor, and pay for with no corresponding benefit at this scale. EhCache was also considered and rejected: more configuration overhead than Caffeine for no advantage, where Caffeine is already Spring Boot’s recommended in-process default.

The trade-off is that cache entries are lost on restart (acceptable), and horizontal scaling in the future would require either a distributed cache or a cache-aside pattern. The CachePort interface exists precisely so that a Redis adapter can be swapped in later without touching use-case logic.
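As a sketch of why the swap stays local: CachePort is named in the text, but the getOrCompute signature is an assumption, and the adapter below uses a plain ConcurrentHashMap as a stand-in for Caffeine (no eviction or expiry), purely to keep the example dependency-free.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Port named in the text; this method signature is assumed for illustration.
interface CachePort {
    <T> T getOrCompute(String key, Supplier<T> loader);
}

// Stand-in in-process adapter. A Caffeine-backed or Redis-backed adapter
// would implement the same interface.
final class InProcessCacheAdapter implements CachePort {
    private final ConcurrentHashMap<String, Object> entries = new ConcurrentHashMap<>();

    @Override
    @SuppressWarnings("unchecked")
    public <T> T getOrCompute(String key, Supplier<T> loader) {
        // The loader runs only when the key is absent; later calls reuse the value.
        return (T) entries.computeIfAbsent(key, k -> loader.get());
    }
}
```

Use cases call getOrCompute with a loader that reads from PostgreSQL, so an empty cache after a restart simply means the first call pays for the rebuild.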

Why the frontend and backend deploy separately

The Angular SPA deploys to Vercel; Spring Boot runs as a Docker container on Railway — two platforms instead of one Spring Boot container serving both (which is still used as a build-time fallback, and for local/staging). The split was chosen for production because Vercel’s global CDN, automatic PR preview URLs, and zero-config HTTPS aren’t worth re-implementing inside the backend container. Netlify was considered and rejected in favor of Vercel for tighter GitHub PR preview integration; Fly.io and Render were considered and rejected in favor of Railway because Railway offers PostgreSQL and Neo4j as managed add-ons in the same project dashboard, keeping the whole backend topology in one place.

The common thread

Nearly every rejection above has the same shape: this option is only worth its operational cost at a scale MindForge isn’t at. A dedicated vector database, a second source of truth for the graph, a distributed cache, a reactive runtime — each would be the right call for a multi-tenant production system with real concurrent load. For a personal-use tool built and reviewed by one person, the simpler option that’s still swappable later (behind a port interface) wins every time.

