Hacker News: ambitious_potat

Why the hell is this showing up

ambitious_potat — Mon, 09 Feb 2026 17:32:38 +0000

Article URL: https://github.com/adityaprasad-sudo/Explore-Singapore

Comments URL: https://news.ycombinator.com/item?id=46948124

Points: 4

# Comments: 1

New comment by ambitious_potat in "Show HN: The biggest achievement of my life so far"

ambitious_potat — Mon, 09 Feb 2026 11:32:13 +0000

Appreciate the words man

New comment by ambitious_potat in "Show HN: The biggest achievement of my life so far"

ambitious_potat — Mon, 09 Feb 2026 10:06:03 +0000

Oh hey, I forgot to mention, here's the live demo:- https://adityaprasad-sudo.github.io/Explore-Singapore/

New comment by ambitious_potat in "Ask HN: What are you working on? (February 2026)"

ambitious_potat — Mon, 09 Feb 2026 05:51:54 +0000

I built a legal-tech tool for Singapore with a RAG architecture and triple-LLM backup logic.

GitHub:- https://github.com/adityaprasad-sudo/Explore-Singapore

Live demo:- https://adityaprasad-sudo.github.io/Explore-Singapore/

Show HN: The biggest achievement of my life so far

ambitious_potat — Sun, 08 Feb 2026 19:21:10 +0000

Hello everyone,

I have always loved coding, and recently I was thinking of making an open-source project. It turned out to be awesome, and I hope you guys like it.

I present Explore Singapore, an open-source intelligence engine that runs retrieval-augmented generation (RAG) over Singapore's public policy documents, legal statutes, and historical archives.

The goal was to build a domain-specific search engine that helps LLMs make fewer errors by using government documents as their only information source.

What my project does:- Basically, it provides legal information faster and more reliably (thanks to RAG) without making you go through long PDFs on government websites, and it helps travellers get insights about Singapore faster.

Target Audience:- Python developers who keep hearing about "RAG" and AI agents but haven't built one yet, or who are building one and are stuck somewhere. Also Singaporeans (obviously!).

Comparison:- Raw LLM vs RAG-based LLM. To test the RAG implementation, I compared the output of the standard models (gemini/Arcee AI/groq) against the same models with my custom system instructions and RAG. The results were shocking.

Query:- "can I fly a drone in a public park"

Standard LLM response:- gave generic advice about "checking local laws" and safety guidelines.

Customized LLM with RAG:- cited the Air Navigation Act, specified the 5km no-fly zones, and linked to the CAAS permit page.

The difference was clear, and I was sure the AI was not hallucinating.

Ingestion:- The RAG architecture covers about 594 PDFs of Singaporean laws and acts, which together contain roughly 33,000 pages.

How did I do it:- I used Google Colab to build the vector database and metadata. Converting the PDFs to vectors took me nearly 1 hour.

How accurate is it:- It's still in the development phase, but it already provides near-accurate information because it uses multi-query retrieval. For example, if a user asks about "ease of doing business in Singapore", the logic breaks the query into the keywords "ease", "business", and "Singapore" and returns the matching documents from the PDFs along with their page numbers. It's a little hard to explain, but you can check it on my webpage. It's not perfect, but hey, I am still learning.

The Tech Stack:

Ingestion: Python scripts using PyPDF2 to parse various PDF formats.

Embeddings: Hugging Face BGE-M3 (1024 dimensions)

Vector Database: FAISS for similarity search.

Orchestration: LangChain.

Backend: Flask

Frontend: React and Framer.

The RAG pipeline works as follows:

Chunking: The source text is divided into chunks of 150 tokens with an overlap of 50 tokens to maintain context across boundaries.

Retrieval: When a user asks a question (e.g., "What is the policy on HDB grants?"), the system queries the vector database for the top k chunks (k=1).

Synthesis: The system adds these chunks to the LLM's prompt, and the LLM produces the final response, including citation information.

Why did I say LLMs:- Because I wanted the system to be as crash-proof as possible. I use Gemini as my primary LLM, but if it fails due to API request issues or any other reason, the backup model (Arcee AI Trinity Large) handles the request.
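To make the chunking step concrete, here is a minimal sketch of fixed-size chunking with overlap. It is not the code from the repository: the function name `chunk_tokens` is made up for illustration, and it assumes the text has already been split into a list of tokens (for example by whitespace, which is only an approximation of a real tokenizer).

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], size: int = 150, overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window slides each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, each chunk starts 100 tokens after the previous one, so the last 50 tokens of one chunk are repeated as the first 50 tokens of the next.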

Don't worry:- I have implemented different system instructions for different models, so the result stays a good-quality product whichever model answers.

Current Challenges: I am working on optimizing the ranking strategy of the RAG architecture. I would value insights from anyone who has encountered RAG returning irrelevant documents.

Feedback is the backbone of improving a platform, so it is most welcome.

Repository:- https://github.com/adityaprasad-sudo/Explore-Singapore

Comments URL: https://news.ycombinator.com/item?id=46937543

Points: 9

# Comments: 5

New comment by ambitious_potat in "Show HN: I built a RAG engine to search Singaporean laws"

ambitious_potat — Sat, 07 Feb 2026 05:09:05 +0000

Yep, I implemented triple models as a backup. So good of you that you actually read the code. Thanks dude for such kind words.

New comment by ambitious_potat in "Show HN: I built a RAG engine to search Singaporean laws"

ambitious_potat — Sat, 07 Feb 2026 05:08:18 +0000

Thank you for such great words. Now, to the question: when I built the vector database, I also created a metadata file that has all the info about pages and topics. Again, thanks dude.

Show HN: I built a RAG engine to search Singaporean laws

ambitious_potat — Sat, 07 Feb 2026 03:58:51 +0000

I built a "Triple Failover" RAG for Singapore Laws, then rewrote the logic based on your feedback.

Hi everyone!

I’m a student developer. Recently, I created Explore Singapore, a RAG-based search engine that scrapes about 20,000 pages of Singaporean government acts and laws.

I recently posted the MVP and received some tough but essential feedback about hallucinations and query depth. I took that feedback, focused on improvements, and just released Version 2.

Here is how I upgraded the system from a basic RAG to a production-grade one.

The Design & UI

I aimed to avoid the look of a dull government website.

Design: Heavily inspired by Apple’s minimalist style.

Tech: Custom frontend interacting with a Python backend.

The V2 Engineering Overhaul

The community challenged me on three main points. Here’s how I addressed them:

1. The "Personality" Fix

Issue: I use a "Triple Failover" system with three models as backup. When the main model failed, the backups sounded entirely different.

The Solution: I added Dynamic System Instructions. Now, if the backend switches to Model B, it uses a specific prompt designed for Model B’s features, making it mimic the structure and tone of the primary model. The user never notices the change.
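A rough sketch of how failover with per-model system instructions could look is below. This is an illustration only, not the repository's code: `answer_with_failover` and the `(name, system_prompt, call)` tuples are hypothetical, and each `call` stands in for whatever client function actually talks to a given model.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical model client: takes (system_prompt, question) and returns the answer text.
ModelCall = Callable[[str, str], str]


def answer_with_failover(question: str,
                         models: Sequence[Tuple[str, str, ModelCall]]) -> Tuple[str, str]:
    """Try each model in order, pairing it with its own system prompt.

    Returns (model_name, answer) from the first model that succeeds.
    """
    errors: List[str] = []
    for name, system_prompt, call in models:
        try:
            return name, call(system_prompt, question)
        except Exception as exc:  # e.g. rate limit or timeout; fall through to the next model
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Keeping the system prompt next to the model it belongs to means a switch to a backup automatically picks up the instructions tuned for that backup.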

2. The "Deep Search" Fix

Issue: A simple semantic search for "Starting a business" misses related laws like "Tax" or "Labor" acts.

The Solution: I implemented Multi-Query Retrieval (MQR). An LLM now intercepts your query. It breaks it down into sub-intents (e.g., “Business Registration,” “Corporate Tax,” “Employment Rules”). It searches for all of them at the same time and combines the results.

Result: Much richer, context-aware answers.
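The merge step of multi-query retrieval can be sketched roughly like this. It is a simplified illustration under assumptions: `decompose` stands in for the LLM call that produces sub-intents, `search` stands in for the vector search and returns `(doc_id, score)` pairs where higher is better, and the sub-queries run one after another here rather than in parallel.

```python
from typing import Callable, Dict, List, Tuple

Hit = Tuple[str, float]  # (document id, similarity score; higher is better)


def multi_query_retrieve(query: str,
                         decompose: Callable[[str], List[str]],
                         search: Callable[[str], List[Hit]],
                         k: int = 5) -> List[Hit]:
    """Search the original query plus its sub-queries and merge the hits."""
    best: Dict[str, float] = {}
    for sub_query in [query, *decompose(query)]:
        for doc_id, score in search(sub_query):
            # Keep each document once, with the best score any sub-query gave it.
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]
```

Deduplicating by document id matters because the same act often matches several sub-intents, and without it one document could crowd out the rest of the top k.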

3. The "Hallucination" Fix

Issue: Garbage In, Garbage Out. If FAISS retrieves a bad document, the LLM produces inaccurate information.

The Solution: I added a Cross-Encoder Re-Ranking layer.

Step 1: FAISS grabs the top 10 results.

Step 2: A specialized Cross-Encoder model evaluates them for relevance.

Step 3: Irrelevant parts are removed before they reach the Chat LLM.
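The three steps can be sketched as a re-rank-and-filter function with a pluggable scorer. This is only a sketch under assumptions: `rerank` and `toy_overlap_score` are hypothetical names, and the toy scorer is a simple word-overlap stand-in, not a cross-encoder. In a real setup the `score_fn` argument would wrap a cross-encoder model that scores each (query, passage) pair.

```python
from typing import Callable, List


def rerank(query: str,
           candidates: List[str],
           score_fn: Callable[[str, str], float],
           keep: int = 3,
           min_score: float = 0.0) -> List[str]:
    """Score every candidate against the query, drop weak ones, return the best `keep`."""
    scored = [(score_fn(query, text), text) for text in candidates]
    scored = [pair for pair in scored if pair[0] > min_score]  # remove irrelevant passages
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:keep]]


def toy_overlap_score(query: str, text: str) -> float:
    """Stand-in scorer (NOT a cross-encoder): fraction of query words found in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0
```

For example, `rerank("drone permit park", passages, toy_overlap_score)` keeps only passages that share at least one word with the query, ordered by how many they share.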

The Tech Stack

Embeddings: BGE-M3 (Running locally)

Vector DB: FAISS

Backend: Python + Custom Triple-Model Failover

Logic: Multi-Query + Re-Ranking (New in V2)

Try it out

I am still learning. I’d love to hear your thoughts on the new logic.

Live Demo: https://adityaprasad-sudo.github.io/Explore-Singapore/

GitHub Repo: https://github.com/adityaprasad-sudo/Explore-Singapore

Feedback, especially on the failover speed, is welcome!

Comments URL: https://news.ycombinator.com/item?id=46921180

Points: 5

# Comments: 4