samridhlimbu.com/projects/govchat · v0.1
GovChat
RAG · 2025
Retrieval-Augmented Generation platform for natural-language querying of government information. Ask a policy question and get a grounded answer with source citations. When the answer isn't in the corpus, the system falls back to an explicit decline instead of a hallucinated answer.
Context
Built in 2025. The problem: government information is accurate but structurally inaccessible — spread across PDF documents, department portals, and static pages with no search layer that understands intent. GovChat ingests the documents, indexes them into a vector store, and answers questions with grounded, cited responses. The retrieval threshold means the model declines to answer rather than guessing when confidence is low — critical for policy queries where a wrong answer has real consequences.
Timeline
2025
Problem framing
Government information is fragmented across PDF documents, static pages, and department portals. Finding a specific ruling, policy, or eligibility condition means navigating bureaucratic structure instead of asking a question.
2025
RAG pipeline
Ingest pipeline: scrape and parse government documents → chunk → embed → store in vector DB. Query pipeline: embed question → retrieve top-k chunks → pass to LLM with context → return grounded answer.
2025
Grounding and citation
Answers cite the source document and section. If the retrieved chunks don't contain an answer, the model says so — no hallucination of policy details.
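The ingest stages in the timeline (chunk → embed → store) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not GovChat's actual code: the `Chunk` record, the word-window chunker with overlap, the toy hashed bag-of-words `embed` function, and the plain list standing in for the vector DB are all hypothetical. A real deployment would swap in a proper embedding model and vector database.

```python
import hashlib
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str          # hypothetical identifier of the source document
    index: int           # position of the chunk within that document
    text: str
    vector: list[float]  # embedding of `text`


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def embed(text: str, dim: int = 64) -> list[float]:
    """Toy stand-in for an embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.sha256(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def ingest(doc_id: str, text: str, store: list[Chunk]) -> int:
    """Chunk a parsed document, embed each chunk, and append it to the store."""
    pieces = chunk_words(text)
    for i, piece in enumerate(pieces):
        store.append(Chunk(doc_id, i, piece, embed(piece)))
    return len(pieces)


# Made-up document: 100 placeholder words under a hypothetical filename.
store: list[Chunk] = []
sample_text = " ".join(f"word{i}" for i in range(100))
count = ingest("example-policy.pdf", sample_text, store)
```

Overlapping windows are one common way to avoid cutting a rule in half at a chunk boundary. The sizes here are arbitrary.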
Key technical decisions
01. RAG over fine-tuning (rejected: fine-tuned LLM)
Government policy changes. A fine-tuned model is a snapshot; a RAG system with a re-indexed document store stays current. Retrieval also gives you source citations — essential for policy queries.
02. Chunk-level citations (rejected: document-level citations)
Citing the document is not enough — government documents are long and dense. Chunk-level retrieval lets the answer point to the specific section, so users can verify without reading hundreds of pages.
03. Explicit "I don't know" (rejected: best-effort answer)
For government policy, a confident wrong answer is worse than no answer. The retrieval threshold filters low-confidence chunks; if nothing clears the bar, the model declines to answer.
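As a rough illustration of chunk-level citations, the sketch below assumes each retrieved chunk carries its source document and section as metadata. The `RetrievedChunk` fields and both helper functions are hypothetical names chosen for this example, and the sample documents are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedChunk:
    document: str  # hypothetical: title or filename of the source document
    section: str   # hypothetical: section heading the chunk was cut from
    text: str
    score: float   # similarity score assigned at retrieval time


def format_citations(chunks: list[RetrievedChunk]) -> list[str]:
    """One citation per distinct (document, section), best-scoring first."""
    seen = set()
    citations = []
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        key = (chunk.document, chunk.section)
        if key not in seen:
            seen.add(key)
            citations.append(f"{chunk.document}, {chunk.section}")
    return citations


def answer_with_sources(answer: str, chunks: list[RetrievedChunk]) -> str:
    """Append a numbered source list to the generated answer."""
    lines = [answer, "", "Sources:"]
    lines += [f"[{n}] {c}" for n, c in enumerate(format_citations(chunks), 1)]
    return "\n".join(lines)


# Made-up chunks purely for illustration.
example_chunks = [
    RetrievedChunk("Doc A", "Section 2", "first passage", 0.80),
    RetrievedChunk("Doc A", "Section 2", "second passage", 0.90),
    RetrievedChunk("Doc B", "Section 1", "third passage", 0.85),
]
print(answer_with_sources("Example answer.", example_chunks))
```

Deduplicating on (document, section) keeps the source list short when several chunks come from the same section.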
Query pipeline
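kairos/scheduler.py
```python
def answer_question(user_question, embedder, vector_db, llm):
    """Query pipeline: embed, retrieve, gate on similarity, then generate."""
    # Embed the question into the same vector space as the indexed chunks.
    query_embedding = embedder.encode(user_question)
    # Retrieve up to 5 chunks; anything below 0.75 similarity is filtered out.
    chunks = vector_db.similarity_search(
        query_embedding, k=5, threshold=0.75
    )
    # Nothing cleared the threshold: decline rather than guess.
    if not chunks:
        return "I don't have a reliable source for that."
    # Generate an answer grounded in the retrieved chunks.
    return llm.generate(context=chunks, question=user_question)
```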
The similarity threshold is the critical gate. If no retrieved chunk clears 0.75 cosine similarity, the pipeline returns a decline without calling the LLM, rather than attempting an answer from low-confidence context. For government policy, that's the right trade-off: a non-answer is less harmful than a confident wrong one.
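To make the gate concrete, here is a minimal pure-Python stand-in for a thresholded top-k search. It is a sketch, not the vector database's real API: the function names, the `(text, vector)` index layout, and the choice to keep scores at or above the threshold (rather than strictly above it) are assumptions, and the two-dimensional vectors are toy data.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_search(query, indexed, k=5, threshold=0.75):
    """Return up to k (score, text) pairs scoring at or above `threshold`."""
    scored = [(cosine(query, vec), text) for text, vec in indexed]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:k]


# Toy index: (chunk text, embedding) pairs with made-up 2-D vectors.
indexed = [
    ("chunk about eligibility", [1.0, 0.0]),
    ("chunk about deadlines", [0.6, 0.8]),
    ("unrelated chunk", [0.0, 1.0]),
]

hits = similarity_search([1.0, 0.1], indexed)     # one chunk clears the bar
misses = similarity_search([1.0, -1.0], indexed)  # nothing clears the bar

if not misses:
    print("I don't have a reliable source for that.")
```

Filtering before truncating to `k` means a weak match is never returned just to fill the quota, which is what lets an empty result act as the decline signal.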
Stack
Language: Python
RAG: embedding model · vector database · LLM
Ingestion: document parser · chunking pipeline