RAG with Spring AI
A hands-on series building Retrieval-Augmented Generation (RAG) systems with Spring AI. Each post maps to a runnable demo in the rag-spring-ai project — from a minimal retrieval pipeline to production patterns like multi-format ingestion, conversational memory, advisor composition, structured output, and function calling. Everything runs locally with Ollama and Docker Compose.
Metadata Filtering with Spring AI: A WHERE Clause for Your Vector Store
In the last post we fixed context pollution by giving every document domain its own VectorStore. One bucket for FAQ, one for legal, one for tech, one for HR — done. The router picks one and the LLM gets a...
Multi-Document RAG with Spring AI: Multiple Collections, Smart Routing, and Cleaner Top-K
Every demo in this series so far has lived inside one cosy little vector store. We dumped a CloudFlow FAQ into it, asked some questions, and got nice answers. That’s also pretty much how every “your first RAG app” tutorial...
Function Calling in Spring AI: Letting the LLM Press the Buttons
So far in this series the LLM has been a very polite librarian — we ask it questions, it goes to the vector store, it reads us a nicely worded answer. That’s RAG. It’s great. It’s also, eventually, not enough....
Structured Output in Spring AI: Turning LLM Prose into Typed Java Records
Up to now, every demo in this series has happily returned a String from the LLM and called it a day. That’s fine when you’re building a chatbot — humans are great at reading prose. It is not fine when...
Advisors in Spring AI: Composing RAG, Memory, and Safety as a Pipeline
So far in this series we’ve quietly been using a feature without ever really stopping to look at it. Every demo — basic RAG, ingestion, vector store ops, chat memory — has been built around ChatClient and these little things...
Chat with Memory in Spring AI: Conversational RAG That Actually Remembers
So far in this series we’ve built a basic RAG pipeline, loaded a few different document formats, and poked at the vector store directly to understand what retrieval actually returns.
Vector Store Operations with Spring AI: Similarity Search, Thresholds, and Embedding Inspection
In the first post we built a basic RAG pipeline, and in the second post we explored different ways to ingest documents. Both times, we let QuestionAnswerAdvisor handle the retrieval for us — it searched the vector store, grabbed the...
Document Ingestion with Spring AI: Loading Text, JSON, and Custom Chunks into Your RAG Pipeline
In the first post we built a basic RAG system — one text file, default chunking, done. It worked great for a quick demo, but real-world documents don’t come in neat .txt files. You’ll deal with JSON exports, PDFs, maybe...
Basic RAG with Spring AI: Build a Grounded Q&A System from Scratch
Large language models are impressive, but they have a fundamental limitation: they can only work with what they learned during training. Ask about your company’s internal docs, last week’s release notes, or anything after the training cutoff — and the...