● refactoringbuildingsamridhlimbu.com/projects/govchat · v0.1


GovChat

● RAG · 2025

Retrieval-Augmented Generation platform for natural-language querying of government information. Ask a policy question, get a grounded answer with source citations — no hallucination, explicit fallback when the answer isn't in the corpus.


retrieval strategy: RAG

hallucination guard: threshold

citations: chunk-level

Context

Built in 2025. The problem: government information is accurate but structurally inaccessible, spread across PDF documents, department portals, and static pages with no search layer that understands intent. GovChat ingests the documents, indexes them into a vector store, and answers questions with grounded, cited responses. When no retrieved chunk clears the similarity threshold, the system declines to answer rather than guessing, which is critical for policy queries where a wrong answer has real consequences.

Timeline

2025

Problem framing

Government information is fragmented across PDF documents, static pages, and department portals. Finding a specific ruling, policy, or eligibility condition means navigating bureaucratic structure instead of asking a question.

2025

RAG pipeline

Ingest pipeline: scrape and parse government documents → chunk → embed → store in vector DB. Query pipeline: embed question → retrieve top-k chunks → pass to LLM with context → return grounded answer.

2025

Grounding and citation

Answers cite the source document and section. If the retrieved chunks don't contain an answer, the model says so — no hallucination of policy details.
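The ingest half of the RAG pipeline in the timeline (chunk, embed, store) can be sketched roughly as below. This is a minimal illustration, not GovChat's implementation: the word-window chunker, the `toy_embed` hashed bag-of-words function, the `StoredChunk` record, and the in-memory list standing in for a vector DB are all assumptions made so the example runs without external services. A real system would use an actual embedding model and vector database.

```python
import hashlib
import math
from dataclasses import dataclass


@dataclass
class StoredChunk:
    doc_id: str
    position: int  # index of the chunk within its document
    text: str
    vector: list[float]


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text: str, dims: int = 64) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed bag of words,
    L2-normalised. It captures word overlap only, not meaning."""
    vec = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def ingest(doc_id: str, text: str, store: list[StoredChunk]) -> int:
    """Chunk a parsed document, embed each chunk, and append it to the store."""
    pieces = chunk_text(text)
    for position, piece in enumerate(pieces):
        store.append(StoredChunk(doc_id, position, piece, toy_embed(piece)))
    return len(pieces)


# Usage: a 100-word placeholder document split into overlapping 40-word windows.
store: list[StoredChunk] = []
added = ingest("example-doc", "word " * 100, store)
```

Overlapping windows are one common way to avoid cutting a sentence or condition in half at a chunk boundary; the sizes here are arbitrary and would need tuning against real documents.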

Key technical decisions

rag over fine-tuning › fine-tuned llm

Government policy changes. A fine-tuned model is a snapshot; a RAG system with a re-indexed document store stays current. Retrieval also gives you source citations — essential for policy queries.
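The page does not say how the document store is re-indexed. One common approach, shown here purely as an assumption, is to keep a content hash per document and re-embed only what has changed since the last run. The function names and the dictionary-based bookkeeping are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(current_docs: dict[str, str], indexed_hashes: dict[str, str]):
    """Compare current documents against the hashes saved at last index time.

    Returns (to_index, to_delete): ids whose text is new or changed,
    and ids that no longer exist at the source.
    """
    to_index = [
        doc_id
        for doc_id, text in current_docs.items()
        if indexed_hashes.get(doc_id) != content_hash(text)
    ]
    to_delete = [doc_id for doc_id in indexed_hashes if doc_id not in current_docs]
    return to_index, to_delete


# Usage with placeholder documents: one changed, one unchanged, one new, one removed.
indexed = {
    "doc-a": content_hash("old text"),
    "doc-b": content_hash("same text"),
    "doc-c": content_hash("withdrawn"),
}
current = {"doc-a": "new text", "doc-b": "same text", "doc-d": "fresh text"}
to_index, to_delete = plan_reindex(current, indexed)
```

Only `doc-a` and `doc-d` would be re-embedded, and `doc-c` would be dropped from the store, which is what keeps the index current without retraining anything.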

chunk-level citations › document-level citations

Citing the document is not enough — government documents are long and dense. Chunk-level retrieval lets the answer point to the specific section, so users can verify without reading hundreds of pages.
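To make chunk-level citation concrete, here is a small sketch of how each chunk might carry its own source metadata into the prompt. The `Chunk` fields, the numbering scheme, and the prompt wording are assumptions for illustration; the page does not describe GovChat's actual schema or prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    # Hypothetical metadata stored alongside each chunk at ingest time.
    doc_title: str
    section: str
    page: int
    text: str


def build_context(chunks: list[Chunk]) -> str:
    """Number each chunk so the model can cite it as [1], [2], ..."""
    blocks = []
    for i, chunk in enumerate(chunks, start=1):
        header = f"[{i}] {chunk.doc_title}, {chunk.section}, p. {chunk.page}"
        blocks.append(f"{header}\n{chunk.text}")
    return "\n\n".join(blocks)


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Combine grounding instructions, numbered sources, and the question."""
    return (
        "Answer using only the numbered sources below and cite them like [1]. "
        "If the sources do not contain the answer, say so.\n\n"
        f"{build_context(chunks)}\n\nQuestion: {question}"
    )


# Usage with placeholder content (not real policy text).
chunks = [
    Chunk("Example Policy Guide", "Section 4.2", 17, "Placeholder text of the section."),
]
prompt = build_prompt("Who is eligible?", chunks)
```

Because the section and page travel with the chunk, a cited answer can point the reader to the exact passage instead of the whole document.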

explicit "i don't know" › best-effort answer

For government policy, a confident wrong answer is worse than no answer. The retrieval threshold filters out low-similarity chunks; if nothing clears the bar, the pipeline declines to answer instead of passing weak context to the LLM.

Query pipeline

kairos/scheduler.py

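# query pipeline: embed, retrieve above threshold, then decline or answer
def answer_question(user_question, embedder, vector_db, llm):
    query_embedding = embedder.encode(user_question)
    # keep at most 5 chunks that score at least 0.75 cosine similarity
    chunks = vector_db.similarity_search(
        query_embedding, k=5, threshold=0.75
    )
    if not chunks:
        # nothing cleared the bar: decline instead of guessing
        return "I don't have a reliable source for that."
    return llm.generate(context=chunks, question=user_question)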

The similarity threshold is the critical gate. If no retrieved chunk clears 0.75 cosine similarity, the pipeline returns a fixed decline message without calling the LLM, rather than attempting an answer from low-confidence context. For government policy, that's the right trade-off: a non-answer is less harmful than a confident wrong one.
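What a thresholded `similarity_search` might do internally can be sketched in plain Python. This is an illustrative brute-force version under stated assumptions: the vector store is a list of `(vector, text)` pairs, the threshold is inclusive, and scores are cosine similarities. Real vector databases differ in their APIs, and some report distances instead of similarities.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_search(query, indexed, k=5, threshold=0.75):
    """Return up to k (score, text) pairs scoring at least `threshold`, best first."""
    scored = [(cosine_similarity(query, vec), text) for vec, text in indexed]
    passing = [pair for pair in scored if pair[0] >= threshold]
    passing.sort(key=lambda pair: pair[0], reverse=True)
    return passing[:k]


# Usage with toy 2-D vectors (placeholders, not real embeddings).
indexed = [
    ([1.0, 0.0], "chunk A"),
    ([0.8, 0.6], "chunk B"),
    ([0.0, 1.0], "chunk C"),
]
on_topic = similarity_search([1.0, 0.0], indexed)
off_topic = similarity_search([-1.0, 0.0], indexed)
```

For the on-topic query, chunks A and B pass and C is filtered out. For the off-topic query the result is an empty list, which is the case where the pipeline returns its decline message.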

Stack

Language: Python

RAG: embedding model · vector database · LLM

Ingestion: document parser · chunking pipeline

