Hacker News: pkhodiyar

New comment by pkhodiyar in "Microsoft builds MacBook Pro rival with NVIDIA-powered Surface Laptop Ultra"

pkhodiyar — Tue, 02 Jun 2026 13:06:05 +0000

Is this based on ARM or x64?

New comment by pkhodiyar in "Ask HN: Who wants to be hired? (June 2026)"

pkhodiyar — Tue, 02 Jun 2026 06:07:07 +0000

Not looking to be hired; I'm looking for funding for vaquill.ai and quilldraft.com. I'm a solo developer with a handful of paying users at $99, so it has PMF and a stable product.

New comment by pkhodiyar in "macOS needs its grid back"

pkhodiyar — Tue, 02 Jun 2026 04:59:01 +0000

There is a project that makes macOS alt+tab look like the Windows grid (for anyone coming from there); it's called alt_tabs or something like that.

Agent Credential Brokers in 2026

pkhodiyar — Thu, 21 May 2026 13:58:12 +0000

Article URL: https://authsome.ai/blog/top-agent-proxy-tools-what-to-know

Comments URL: https://news.ycombinator.com/item?id=48222665

Points: 4

# Comments: 0

Running AI agents without losing my keys

pkhodiyar — Mon, 18 May 2026 10:18:31 +0000

Article URL: https://zriyansh.medium.com/running-agents-without-losing-my-keys-a-month-with-authsome-039690fe5e6f

Comments URL: https://news.ycombinator.com/item?id=48177527

Points: 4

# Comments: 0

New comment by pkhodiyar in "Authsome – open-source local auth proxy for AI agents"

pkhodiyar — Tue, 28 Apr 2026 15:52:37 +0000

Every agent I've built starts the same way. Paste an API key into .env, export it, hope it doesn't end up in a log or a subprocess env dump. Then a token expires and something quietly breaks. We've all been there.

So I wrote authsome. The bit I think is actually interesting is the run command:

  authsome run -- python my_agent.py

It launches the child behind a local auth proxy, and the proxy intercepts outbound HTTPS and injects auth headers at request time. The child process never has the secret in its environment, so it can't leak through os.environ, ps -e, or anything that dumps a subprocess env, and the agent code doesn't need to change either.
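A minimal sketch of what request-time header injection looks like, to make the idea concrete. This is not authsome's actual code: the `TOKENS` table, the `inject_auth` helper, and the host names are hypothetical, and a real proxy also has to terminate TLS and refresh tokens.

```python
from urllib.parse import urlsplit

# Hypothetical token store. The real tokens are stored locally and encrypted;
# this dict only stands in for that lookup.
TOKENS = {"api.example.com": "secret-token"}


def inject_auth(url: str, headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of headers with Authorization added for known hosts."""
    host = urlsplit(url).hostname
    out = dict(headers)  # never mutate what the child sent
    token = TOKENS.get(host)
    if token is not None:
        out["Authorization"] = f"Bearer {token}"
    return out


# The child process sends a request with no credentials ...
child_headers = {"Accept": "application/json"}
# ... and the proxy adds them just before forwarding upstream.
forwarded = inject_auth("https://api.example.com/v1/items", child_headers)
```

The point of the design is that the secret exists only on the proxy side of the hop: the child's own headers and environment stay credential-free.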

The tokens are stored locally, encrypted, and refreshed before they expire. There are OAuth flows for interactive and headless use, plus a browser bridge for API-key providers. There is a CLI for pulling headers directly when you don't want the proxy.

The proxy only sees traffic that goes through it, so libraries that pin their own CA bundle slip past. Streaming uploads and long-lived connections probably also have edge cases I haven't hit. It's still alpha, v0.2.1.

Most interested in feedback on the proxy approach itself, that's the part I'm least sure about.

https://github.com/manojbajaj95/authsome

Authsome – open-source local auth proxy for AI agents

pkhodiyar — Tue, 28 Apr 2026 15:52:37 +0000

Article URL: https://github.com/manojbajaj95/authsome

Comments URL: https://news.ycombinator.com/item?id=47936190

Points: 7

# Comments: 3

Show HN: API for 13M+ Indian court cases with citation graphs and vector search

pkhodiyar — Tue, 14 Apr 2026 13:52:55 +0000

Article URL: https://www.vaquill.ai

Comments URL: https://news.ycombinator.com/item?id=47765679

Points: 2

# Comments: 0

New comment by pkhodiyar in "[dead]"

pkhodiyar — Wed, 18 Mar 2026 12:47:55 +0000

So I sat down one day thinking this sucks: there isn't any platform that solves this problem for lawyers whose work is not Supreme Court or High Court related, since most companies build for those courts (these lawyers are more like the middle kid who gets ignored).

So I built this, let me know what you guys think.

This covers:
- ITAT (Income Tax Appellate Tribunal)
- CESTAT (Customs, Excise & Service Tax Appellate Tribunal)
- GST AAR (GST Authority for Advance Rulings)
- NCLT (National Company Law Tribunal)
- IBBI (Insolvency & Bankruptcy Board of India)
- DRT (Debt Recovery Tribunal)
- SAT (Securities Appellate Tribunal)
- CCI (Competition Commission of India)
- NGT (National Green Tribunal)
- APTEL (Appellate Tribunal for Electricity)
- TDSAT (Telecom Disputes Settlement & Appellate Tribunal)
- CAT (Central Administrative Tribunal)
- AFT (Armed Forces Tribunal)
- RERA (Real Estate Regulatory Authority)

Would love to pick your brains

Show HN: LegalTech – A curated list of tools and software

pkhodiyar — Thu, 12 Mar 2026 15:11:39 +0000

Article URL: https://github.com/Vaquill-AI/awesome-legaltech

Comments URL: https://news.ycombinator.com/item?id=47351809

Points: 3

# Comments: 0

New comment by pkhodiyar in "Ask HN: What Are You Working On? (December 2025)"

pkhodiyar — Sun, 14 Dec 2025 18:50:33 +0000

Working on https://socdefenders.ai, Reddit + HN for cybersecurity.

Show HN: Reddit and HN for Cybersecurity [Free]

pkhodiyar — Sun, 14 Dec 2025 18:50:15 +0000

Article URL: https://www.socdefenders.ai/

Comments URL: https://news.ycombinator.com/item?id=46265631

Points: 1

# Comments: 0

New comment by pkhodiyar in "[dead]"

pkhodiyar — Mon, 22 Sep 2025 21:37:58 +0000

Quick TL;DR: we are doing a live 60-minute AMA on MCP with folks from Microsoft and Pinecone, plus Santiago and Alden (CEO, CustomGPT.ai). Sounds interesting? Register.

The goal is to educate about MCP, answer questions, and cover use cases: RAG + MCP, IDEs + MCP, etc. We’ll have live demos, Pinecone folks talking about what they are up to, and much more fun!

If you have been early in the MCP race, this would surely be worth your time.

Why might this interest you?

Model Context Protocol (MCP) is a low-level JSON-RPC protocol for passing structured context and tools to an LLM. Instead of gluing prompts together, you expose one JSON endpoint for a tool (and it takes care of tons of API endpoints for that tool).
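For a feel of what goes over the wire, here is a small sketch that builds a JSON-RPC 2.0 request for MCP's `tools/call` method. The `make_tool_call` helper, the `search_docs` tool name, and its arguments are made up for illustration; a real client would first discover the available tools from the server.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "search_docs" and its arguments are hypothetical.
payload = make_tool_call(1, "search_docs", {"query": "refund policy"})
print(payload)
```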

Think of MCP as REST for LLMs! It really is that simple!

We plan to show a live demo of a working MCP server, preferably a hosted one, including setting up the configs with Claude.

We will also answer any questions!

Featured Speakers:

1. Michael Kistler - Principal Program Manager at Microsoft

2. Arjun Patel - Senior Developer Advocate at Pinecone

3. Santiago (https://www.linkedin.com/in/svpino/) - Computer scientist who teaches hard-core machine learning; will walk you through why we need MCP, before MCP vs. after MCP, architecture, primitives, and advantages.

4. Alden Do Rosario - will dissect the RAG + MCP pipeline we run in prod, live demo.

Format:
- 3×10 min tech talks (protocol, integration, case study)
- 10 min panel on lessons learned
- 20 min open Q&A - bring tough questions

When: Sept 25, 2 PM ET

Registration (free, no spam): http://customgpt.ai/mcp-ama-hn

Code samples and infra diagrams will be posted after the session. AMA during and after the call - hope to see HN folks there.

New comment by pkhodiyar in "[dead]"

pkhodiyar — Thu, 04 Sep 2025 15:45:24 +0000

We've been using CustomGPT.ai's RAG API for client projects but kept rebuilding the same UI components. Today we're open-sourcing our complete implementation (think ChatGPT interface open sourced).

No vendor lock-in. No telemetry. No "premium" features.

We built this because we needed it. We're sharing it because you need it too.

What does it mean to you?
1. A free, ready-to-use UI like ChatGPT, with voice
2. 100% customizable, so you can build on top of it
3. A dev community to fix bugs and handle feature requests
4. Why create RAG from scratch when you can use free templates?

Technical details:
1. Next.js 14 + TypeScript + Zustand for state
2. Proxy architecture keeps API keys server-side
3. Proper SSE streaming with cleanup and error boundaries
4. Voice: OpenAI Whisper STT + TTS (6 voices)
5. Three deployment modes: widget.js bundle, iframe, or standalone
6. PWA support with service worker
7. Dark mode + full mobile responsiveness

Interesting challenges solved:
1. Concurrent message streams without memory leaks
2. Widget state isolation when multiple instances run on the same page
3. CORS handling for cross-domain embedding
4. Citation preview just like ChatGPT

Deployment options:
1. Vercel/Netlify (one-click)
2. Railway/Render
3. Docker
4. Google Apps Script (single file, 20k req/day free for select social RAG AI bots)

Also includes 9 social platform bots (Slack, Discord, Telegram, etc.) that connect to the same CustomGPT.ai backend.

Code: github.com/Poll-The-People/customgpt-starter-kit

Demo: starterkit.customgpt.ai (10-min free trial or BYO key)

MIT licensed. No telemetry. No premium tiers.

We built this for ourselves but figured others might find it useful. Feedback welcome.

Awesome-RAG GitHub

pkhodiyar — Thu, 10 Jul 2025 20:56:27 +0000

Article URL: https://github.com/Poll-The-People/awesome-rag

Comments URL: https://news.ycombinator.com/item?id=44525514

Points: 2

# Comments: 0

New comment by pkhodiyar in "[dead]"

pkhodiyar — Thu, 22 May 2025 17:55:48 +0000

Quick TL;DR: we are doing a live 60-minute AMA on MCP with 3 industry experts. Sounds interesting? Register.

The goal is to educate about MCP, answer questions, and cover use cases: RAG + MCP, IDEs + MCP, etc. We’ll have live demos, Pinecone folks talking about what they are up to, and much more fun!

If you have been early in the MCP race, this would surely be worth your time.

Why might this interest you?

Think of MCP as REST for LLMs! It really is that simple!

We plan to show a live demo of a working MCP server, preferably a hosted one, including setting up the configs with Claude.

We will also answer any questions!

Featured Speakers:

1. Santiago (https://www.linkedin.com/in/svpino/) - Computer scientist who teaches hard-core machine learning; will walk you through why we need MCP, before MCP vs. after MCP, architecture, primitives, and advantages.

2. Alden Do Rosario (CustomGPT.ai CEO) - will dissect the RAG + MCP pipeline we run in prod, live demo.

3. Roy Miara (https://www.linkedin.com/in/roy-miara-73776a56/), Director of Machine Learning, Pinecone - will talk about what Pinecone is up to with MCP.

Format:
- 3×10 min tech talks (protocol, integration, case study)
- 10 min panel on lessons learned
- 20 min open Q&A - bring tough questions

When: May 29, 01 PM ET | May 30 at 1:30 AM IST | Thu May 29 at 8:00 PM UTC

Registration (free, no spam): https://lu.ma/gr6eqznl

Code samples and infra diagrams will be posted after the session. AMA during and after the call - hope to see HN folks there.

101x Airbyte, 11x Estuary, Postgres to Iceberg

pkhodiyar — Thu, 08 May 2025 11:40:35 +0000

Hi HN, we've been developing OLake, an open-source connector specifically designed for replicating data from PostgreSQL into Apache Iceberg. We recently ran some detailed benchmarks comparing its performance and cost against several popular data movement tools: Fivetran, Debezium (using the memiiso setup mentioned), Estuary, and Airbyte.

We wanted to share the results, as they show OLake performing very competitively, often exceeding the speed of both open-source and commercial alternatives, while offering the cost advantages of a self-hosted open-source solution.

The benchmarks covered both full initial loads and Change Data Capture (CDC) on a large dataset (billions of rows for full load, tens of millions of changes for CDC) over a 24-hour window.

Link to the full Postgres benchmark: https://olake.io/docs/connectors/postgres/benchmarks

For full loads, OLake achieved throughput of around 46,262 rows/sec, processing over 4 billion rows in 24 hours.

This was essentially on par with Fivetran (46,395 RPS) and significantly faster than Debezium (14,839 RPS - 3.1x slower), Estuary (3,982 RPS - 11.6x slower on a smaller processed dataset), and Airbyte (457 RPS - 101x slower before it failed the long test).
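The "x slower" multipliers are just OLake's throughput divided by each tool's throughput. A quick sketch that reproduces them from the rows/sec figures quoted above (the `slowdown_vs_olake` helper name is ours):

```python
# Full-load throughput in rows/sec, as quoted in the post.
THROUGHPUT = {
    "OLake": 46_262,
    "Fivetran": 46_395,
    "Debezium": 14_839,
    "Estuary": 3_982,
    "Airbyte": 457,
}


def slowdown_vs_olake(tool: str) -> float:
    """How many times slower `tool` is than OLake, rounded to one decimal."""
    return round(THROUGHPUT["OLake"] / THROUGHPUT[tool], 1)


for tool in ("Debezium", "Estuary", "Airbyte"):
    print(f"{tool}: {slowdown_vs_olake(tool)}x slower")
```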

The most striking results were in CDC performance.

For processing 50 million changes, OLake completed the task in 22.5 minutes at 36,982 rows/sec. Fivetran took 31 minutes (1.4x slower), Debezium took 60 minutes (2.7x slower), Estuary took 4.5 hours (12x slower), and Airbyte took 23 hours (63x slower).

This indicates OLake delivers significantly lower latency for propagating changes from PostgreSQL to Iceberg.

On the cost side, OLake is open source and self-hosted. The cost is simply the infrastructure. Running the benchmarks on a substantial VM (64 vcpus, 128 GiB memory) for 24 hours cost less than $75.

Comparing this to the vendor list prices for the data synced in the tests: Fivetran's full load cost $7,446 ($1.86/M rows), Estuary's full load cost $4,462 ($12.97/M rows), Airbyte Cloud's partial full load cost $5,560 ($438.8/M rows).

For CDC, Fivetran cost $2,257 ($45.14/M rows), Estuary cost $22.72 ($0.45/M rows), and Airbyte Cloud cost $148.95 ($2.98/M rows).
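The per-million-row CDC prices follow from dividing each total by the volume synced. A small sketch, assuming all three CDC totals correspond to the 50 million changes described above (the variable and function names are ours):

```python
# Assumption: each CDC total below is for the 50 million changes in the test.
CDC_ROWS_MILLIONS = 50

# Vendor list-price totals for the CDC run, in USD, as quoted in the post.
CDC_COST_USD = {"Fivetran": 2257.00, "Estuary": 22.72, "Airbyte Cloud": 148.95}


def cost_per_million(total_usd: float, rows_millions: float) -> float:
    """USD per million rows, rounded to cents."""
    return round(total_usd / rows_millions, 2)


per_million = {
    vendor: cost_per_million(total, CDC_ROWS_MILLIONS)
    for vendor, total in CDC_COST_USD.items()
}
print(per_million)
```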

While Estuary shows a low per-row cost for CDC in this specific test, the overall picture strongly favors the predictable, infra-based cost of self-hosted OLake, especially for large-scale replication.

In summary, these benchmarks suggest OLake can match or exceed the speed of leading proprietary tools for PostgreSQL to Iceberg replication, offers superior CDC latency compared to all tested alternatives, and provides a significantly lower and more predictable cost structure due to being open source and self-hosted.

You can find more details on the benchmarks and the tool itself in our documentation.

Happy to discuss the results and our approach.

Comments URL: https://news.ycombinator.com/item?id=43925173

Points: 5

# Comments: 0

Show HN: We launched hosted MCP and RAG (no infra, we host)

pkhodiyar — Mon, 05 May 2025 15:30:39 +0000

Hey HN,

We just put a fully managed Hosted MCP Server in front of our production-grade RAG stack. It solves two things (covered at the end) that kept biting us (and most dev teams) when wiring agents to private data.

How does it work? (30-second flow)

→ CustomGPT Console → Deploy → MCP → Enable → Grab the generated endpoint + JSON schema → Add to MCP aware client

Point any MCP-aware tool (e.g. dozens of Agentic AI and workflow tools like n8n and Zapier; IDEs like Cursor; ChatGPT w/ MCP plugin, Anthropic’s Claude, etc.) at the endpoint.

So it's basically bringing RAG to MCP.

Back to the two things I mentioned above:

1. Agent answer accuracy, a.k.a. RAG accuracy – we benchmark at the top of public leaderboards for “business-doc” retrieval and for avoiding hallucinations.

2. Ops drag – no k8s, patch cycles, or 3 a.m. TLS renewals. We host, autoscale, and watch the graphs.

Included in every CustomGPT.ai plan (free-trial friendly). Happy to share perf metrics or answer architecture questions.

Ask me anything!

Comments URL: https://news.ycombinator.com/item?id=43896154

Points: 3

# Comments: 0

Debezium to olake.io – PhysicsWallah switch for CDC

pkhodiyar — Wed, 30 Apr 2025 12:44:45 +0000

We recently hosted a small online meetup at OLake where a Data Engineer at PhysicsWallah walked through why his team dropped Debezium and moved to OLake’s “MongoDB → Iceberg” pipeline.

Video (29 min): https://www.youtube.com/watch?v=qqtE_BrjVkM

If you prefer text, here’s the quick TL;DR:

Why Debezium became a drag for them:
1. Long full loads on multi-million-row MongoDB collections, and any failure meant restarting from scratch
2. Kafka and Connect infrastructure felt heavy when the end goal was “Parquet/Iceberg on S3”
3. Handling heterogeneous arrays required custom SMTs
4. Continuous streaming only; they still had to glue together ad-hoc batch pulls for some workflows
5. Ongoing schema drift demanded extra code to keep Iceberg tables aligned

What changed with OLake?
-> Writes directly from MongoDB (and friends) into Apache Iceberg, no message broker in between
-> Two modes: full load for the initial dump, then CDC for ongoing changes, exposed by a single flag in the job config
-> Automatic schema evolution: new MongoDB fields appear as nullable columns; complex sub-docs land as JSON strings you can parse later
-> Resumable, chunked full loads: a pod crash resumes instead of restarting
-> Runs as either a Kubernetes CronJob or an Airflow task; config is one YAML/JSON file.
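To illustrate the schema-evolution behavior described above, here is a rough sketch of the idea, not OLake's implementation. The function names and the sample documents are hypothetical: nested values are serialized to JSON strings, unseen fields become new columns, and rows that lack a column get NULL.

```python
import json


def flatten_for_table(doc: dict) -> dict:
    """Keep scalars as-is and serialize nested dicts/lists to JSON strings."""
    row = {}
    for key, value in doc.items():
        if isinstance(value, (dict, list)):
            row[key] = json.dumps(value, sort_keys=True)
        else:
            row[key] = value
    return row


def evolve_schema(columns: list[str], row: dict) -> list[str]:
    """Append previously unseen fields as new (nullable) columns."""
    return columns + [k for k in row if k not in columns]


def to_record(columns: list[str], row: dict) -> dict:
    """Fill columns missing from this row with None (NULL)."""
    return {c: row.get(c) for c in columns}


columns = ["_id", "name"]
doc = {"_id": "a1", "name": "Asha", "address": {"city": "Pune"}}
row = flatten_for_table(doc)
columns = evolve_schema(columns, row)  # "address" is added as a column
old_record = to_record(columns, {"_id": "a0", "name": "Ravi"})
```

Downstream jobs can then parse the `address` string back into a structure only when they need it.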

Their stack in one line: MongoDB → OLake writer → Iceberg on S3 → Spark jobs → Trino / occasional Redshift, all orchestrated by Airflow and/or K8s.

Posting here because many of us still bolt Kafka onto CDC just to land files. If you only need Iceberg tables, a simpler path might exist now. Curious to hear others’ experiences with broker-less CDC tools.

(Disclaimer: I work on OLake and hosted the meetup, but the talk is purely technical.)

Check out the GitHub repo: https://github.com/datazip-inc/olake

Comments URL: https://news.ycombinator.com/item?id=43844411

Points: 3

# Comments: 1

New comment by pkhodiyar in "Your OpenAI Project and RAG"

pkhodiyar — Fri, 25 Apr 2025 16:43:43 +0000

Hey folks, Priyansh here. I just put together a list of all the tools I could find that are OpenAI-endpoint compatible, so you can keep your OpenAI-based project and add RAG functionality.

If I missed any tools, feel free to add them below.