<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ij23</title><link>https://news.ycombinator.com/user?id=ij23</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 26 Jun 2026 06:42:54 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ij23" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by ij23 in "LiteLLM Migrates to Rust"]]></title><description><![CDATA[
<p>LiteLLM maintainer here. Some context on why we are doing this<p>Over the past year we've heard the same thing from our users and community, they want the fastest and litest AI gateway.<p>This change allows us to address two of the most common problems we hear from users latency spikes under load and memory leaks/OOM kills that take pods down<p>We believe a Rust hot path is faster and bounded in memory, so those whole classes of issues go away.<p>It will be a gradual, non-breaking change. The Python SDK and proxy stay exactly the same, under the hood they start calling the Rust binary through PyO3, one component at a time, each proven in production before the next. The sub-1ms figure is gateway overhead (what we add on top of the upstream call), and we're aiming for a sub-100MB binary. Happy to share benchmark methodology if folks want to poke at it.<p>The whole gateway will be running on Rust by December 1, 2026.<p>Full announcement: <a href="https://docs.litellm.ai/blog/litellm-rust-launch">https://docs.litellm.ai/blog/litellm-rust-launch</a></p>
]]></description><pubDate>Tue, 23 Jun 2026 16:30:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48647489</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=48647489</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48647489</guid></item><item><title><![CDATA[Show HN: LiteHarness – One SDK for Claude Agent, OpenAI Agent, Pi AI]]></title><description><![CDATA[
<p>We built this library because agent harnesses were too fragmented and we needed a simple abstraction to call multiple coding-agent SDKs.<p>lite-harness has one function - query()<p>import { query } from "@lite-harness/sdk";<p>for await (const message of query({
  prompt: "Fix the failing test",
  options: {
    // swap harness between: "claude-agent", "openai-agents", "pi-ai"
    harness: "openai-agents",
    model: "gpt-5.5",
  },
})) {
  console.log(message);
}</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48379288">https://news.ycombinator.com/item?id=48379288</a></p>
<p>Points: 2</p>
<p># Comments: 2</p>
]]></description><pubDate>Wed, 03 Jun 2026 02:42:32 +0000</pubDate><link>https://github.com/LiteLLM-Labs/lite-harness</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=48379288</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48379288</guid></item><item><title><![CDATA[LiteLLM Agent Platform: Run Claude Code/Codex On-Prem Sandboxes and Vaults]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/BerriAI/litellm-agent-platform">https://github.com/BerriAI/litellm-agent-platform</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48155595">https://news.ycombinator.com/item?id=48155595</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 16 May 2026 00:22:04 +0000</pubDate><link>https://github.com/BerriAI/litellm-agent-platform</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=48155595</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48155595</guid></item><item><title><![CDATA[New comment by ij23 in "Tell HN: Litellm 1.82.7 and 1.82.8 on PyPI are compromised"]]></title><description><![CDATA[
<p>Hi all, Ishaan from LiteLLM here (LiteLLM maintainer)<p>The compromised PyPI packages were litellm==1.82.7 and litellm==1.82.8. Those packages have now been removed from PyPI.
We have confirmed that the compromise originated from the Trivy dependency used in our CI/CD security scanning workflow.
All maintainer accounts have been rotated. The new maintainer accounts are @krrish-berri-2 and @ishaan-berri.
Customers running the official LiteLLM Proxy Docker image were not impacted. That deployment path pins dependencies in requirements.txt and does not rely on the compromised PyPI packages.
We are pausing new LiteLLM releases until we complete a broader supply-chain review and confirm the release path is safe.<p>From a customer exposure standpoint, the key distinction is deployment path. Customers running the standard LiteLLM Proxy Docker deployment path were not impacted by the compromised PyPI packages.<p>The primary risk is to any environment that installed the LiteLLM Python package directly from PyPI during the affected window, particularly versions 1.82.7 or 1.82.8. Any customer with an internal workflow that performs a direct or unpinned pip install litellm should review that path immediately.<p>We are actively investigating full scope and blast radius. Our immediate next steps include:<p>reviewing all BerriAI repositories for impact,
scanning CircleCI builds to understand blast radius and mitigate it,
hardening release and publishing controls, including maintainership and credential governance,
and strengthening our incident communication process for enterprise customers.<p>We have also engaged Google’s Mandiant security team and are actively working with them on the investigation and remediation.</p>
]]></description><pubDate>Tue, 24 Mar 2026 22:28:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47510417</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=47510417</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47510417</guid></item><item><title><![CDATA[New comment by ij23 in "Open-Swarm – use 100 LLMs on OpenAI swarm framework"]]></title><description><![CDATA[
<p>OpenAI's multi-agent framework swarm only supports models from OpenAI.<p>OpenSwarm uses LiteLLM to add support for any LLM AnthropicAI, MistralAI, Ollama, Huggingface, GroqInc, Replicate</p>
]]></description><pubDate>Sat, 12 Oct 2024 18:42:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=41821374</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=41821374</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41821374</guid></item><item><title><![CDATA[Open-Swarm – use 100 LLMs on OpenAI swarm framework]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/marcusschiesser/open-swarm">https://github.com/marcusschiesser/open-swarm</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41821373">https://news.ycombinator.com/item?id=41821373</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Sat, 12 Oct 2024 18:42:13 +0000</pubDate><link>https://github.com/marcusschiesser/open-swarm</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=41821373</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41821373</guid></item><item><title><![CDATA[New comment by ij23 in "Show HN: Self-Hostable Algolia DocSearch Replacement"]]></title><description><![CDATA[
<p>Canary is awesome! we use Canary for our doc search at LiteLLM (you can see it here: <a href="https://docs.litellm.ai/docs/" rel="nofollow">https://docs.litellm.ai/docs/</a>)<p>It's really useful to be able to specify the search space for a specific query (example: Canary allows search for the query "sagemaker" on our docs or on our github issues )</p>
]]></description><pubDate>Sat, 12 Oct 2024 09:11:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=41817712</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=41817712</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41817712</guid></item><item><title><![CDATA[New comment by ij23 in "Show HN: I built an OSS alternative to Azure OpenAI services"]]></title><description><![CDATA[
<p>hi i'm the maintainer of litellm 
- we persist rate limits, they're written to a DB: <a href="https://docs.litellm.ai/docs/proxy/virtual_keys" rel="nofollow noreferrer">https://docs.litellm.ai/docs/proxy/virtual_keys</a><p>- LiteLLM Proxy IS Exactly Compatible with the OpenAI SDK</p>
]]></description><pubDate>Thu, 14 Dec 2023 12:20:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=38640584</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=38640584</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38640584</guid></item><item><title><![CDATA[New comment by ij23 in "Are Open-Source Large Language Models Catching Up?"]]></title><description><![CDATA[
<p>I'm the LiteLLM maintainer, can you elaborate what you're looking for us to do here?</p>
]]></description><pubDate>Fri, 01 Dec 2023 18:30:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=38490382</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=38490382</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38490382</guid></item><item><title><![CDATA[FastRepl – open-source evals for RAG, Agents]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/repllabs/fastrepl">https://github.com/repllabs/fastrepl</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=37826322">https://news.ycombinator.com/item?id=37826322</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 09 Oct 2023 22:37:16 +0000</pubDate><link>https://github.com/repllabs/fastrepl</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=37826322</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37826322</guid></item><item><title><![CDATA[React Library to Build Dashboards]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.tremor.so/docs/getting-started/installation">https://www.tremor.so/docs/getting-started/installation</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=37578779">https://news.ycombinator.com/item?id=37578779</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 20 Sep 2023 00:02:34 +0000</pubDate><link>https://www.tremor.so/docs/getting-started/installation</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=37578779</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37578779</guid></item><item><title><![CDATA[Open Interpreter: Code Interpreter in your terminal, running locally(100 LLMs)]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/KillianLucas/open-interpreter">https://github.com/KillianLucas/open-interpreter</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=37447678">https://news.ycombinator.com/item?id=37447678</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 09 Sep 2023 17:14:58 +0000</pubDate><link>https://github.com/KillianLucas/open-interpreter</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=37447678</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37447678</guid></item><item><title><![CDATA[EvaDB – SQL Queries Using Hugging Face, Open AI, Ultralytics, PyTorch]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/georgia-tech-db">https://github.com/georgia-tech-db</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=37438632">https://news.ycombinator.com/item?id=37438632</a></p>
<p>Points: 6</p>
<p># Comments: 1</p>
]]></description><pubDate>Fri, 08 Sep 2023 19:58:12 +0000</pubDate><link>https://github.com/georgia-tech-db</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=37438632</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37438632</guid></item><item><title><![CDATA[Show HN: LiteLLM - Open source library A/B test LLMs in Production]]></title><description><![CDATA[
<p>Hello Hacker News,<p>Stop relying on benchmarks and easily test LLMs in production. 
Try it here: <a href="https://admin.litellm.ai/" rel="nofollow noreferrer">https://admin.litellm.ai/</a><p>LiteLLM allows you to simplify calling any LLM as a drop in replacement for gpt-3.5-turbo<p>We're launching `completion_with_split_tests` to easily A/B test all LLMs.<p>Example usage - 1 function:
completion_with_split_tests(<p><pre><code>  models={
 "claude-2": 0.4, 
 "gpt-3.5-turbo": 0.6
  }, 

  messages=messages,

  temperature=temperature</code></pre>
)<p>For each completion call we allow you to:<p>- Control/Modify LLM configs (prompt, temperature, max_tokens etc without needing to edit code)<p>- Easily swap in/out 100+ LLMs without redeploying code<p>- View Input/Outputs for each LLM on our UI<p>- Retry requests with an alternate LLM<p>Happy completion()!</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=37362967">https://news.ycombinator.com/item?id=37362967</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 02 Sep 2023 16:25:38 +0000</pubDate><link>https://litellm.vercel.app/docs/tutorials/ab_test_llms</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=37362967</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37362967</guid></item><item><title><![CDATA[New comment by ij23 in "Llama2 on Replicate faster than ChatGPT?"]]></title><description><![CDATA[
<p>Ran some testing and discovered llama2 on replicate is faster than chatgpt!<p>Code - <a href="https://github.com/BerriAI/litellm/blob/main/cookbook/Evaluating_LLMs.ipynb">https://github.com/BerriAI/litellm/blob/main/cookbook/Evalua...</a><p>Are others seeing similar results?</p>
]]></description><pubDate>Wed, 16 Aug 2023 20:30:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=37153343</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=37153343</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37153343</guid></item><item><title><![CDATA[Llama2 on Replicate faster than ChatGPT?]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/BerriAI/litellm/blob/main/cookbook/Evaluating_LLMs.ipynb">https://github.com/BerriAI/litellm/blob/main/cookbook/Evaluating_LLMs.ipynb</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=37153342">https://news.ycombinator.com/item?id=37153342</a></p>
<p>Points: 1</p>
<p># Comments: 2</p>
]]></description><pubDate>Wed, 16 Aug 2023 20:30:37 +0000</pubDate><link>https://github.com/BerriAI/litellm/blob/main/cookbook/Evaluating_LLMs.ipynb</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=37153342</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37153342</guid></item><item><title><![CDATA[New comment by ij23 in "Show HN: liteLLM Proxy Server: 50+ LLM Models, Error Handling, Caching"]]></title><description><![CDATA[
<p>What local/in-K8-cluster models servers would you recommend adding ?<p>Should we add support for llama.cpp and vllm.ai in the proxy server ? Or should we assume you can host them on your own infra and the proxy server requests your hosted model ?</p>
]]></description><pubDate>Sat, 12 Aug 2023 05:21:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=37097261</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=37097261</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37097261</guid></item><item><title><![CDATA[New comment by ij23 in "Show HN: liteLLM Proxy Server: 50+ LLM Models, Error Handling, Caching"]]></title><description><![CDATA[
<p>Yes, you use your own API keys. You can set them as env variables. 
Either set them as os.environ['OPENAI_API_KEY'] or set them in .env files:
<a href="https://litellm.readthedocs.io/en/latest/supported/" rel="nofollow noreferrer">https://litellm.readthedocs.io/en/latest/supported/</a></p>
]]></description><pubDate>Sat, 12 Aug 2023 05:19:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=37097251</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=37097251</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37097251</guid></item><item><title><![CDATA[Show HN: liteLLM Proxy Server: 50+ LLM Models, Error Handling, Caching]]></title><description><![CDATA[
<p>Hello hacker news,<p>I’m the maintainer of liteLLM() - package to simplify input/output to OpenAI, Azure, Cohere, Anthropic, Hugging face API Endpoints: <a href="https://github.com/BerriAI/litellm/">https://github.com/BerriAI/litellm/</a><p>We’re open sourcing our implementation of liteLLM proxy: <a href="https://github.com/BerriAI/litellm/blob/main/cookbook/proxy-server/readme.md">https://github.com/BerriAI/litellm/blob/main/cookbook/proxy-...</a><p>TLDR: It has one API endpoint /chat/completions and standardizes input/output for 50+ LLM models + handles logging, error tracking, caching, streaming<p>What can liteLLM proxy do?
- It’s a central place to manage all LLM provider integrations<p>- Consistent Input/Output Format
    - Call all models using the OpenAI format: completion(model, messages)
    - Text responses will always be available at ['choices'][0]['message']['content']<p>- Error Handling Using Model Fallbacks (if GPT-4 fails, try llama2)<p>- Logging - Log Requests, Responses and Errors to Supabase, Posthog, Mixpanel, Sentry, Helicone<p>- Token Usage & Spend - Track Input + Completion tokens used + Spend/model<p>- Caching - Implementation of Semantic Caching<p>- Streaming & Async Support - Return generators to stream text responses<p>You can deploy liteLLM to your own infrastructure using Railway, GCP, AWS, Azure<p>Happy completion() !</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=37095542">https://news.ycombinator.com/item?id=37095542</a></p>
<p>Points: 140</p>
<p># Comments: 34</p>
]]></description><pubDate>Sat, 12 Aug 2023 00:08:13 +0000</pubDate><link>https://github.com/BerriAI/litellm/blob/main/cookbook/proxy-server/readme.md</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=37095542</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37095542</guid></item><item><title><![CDATA[Show HN: LiteLLM -Open-Source Library for Anthropic,Azure,OpenAI, etc. API Calls]]></title><description><![CDATA[
<p>Needed a simple way to call multiple LLM providers. LiteLLM provides 2 functions - `completion` and `embedding`; and guarantees consistent input/output formats across all providers. That's it!</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=37048149">https://news.ycombinator.com/item?id=37048149</a></p>
<p>Points: 5</p>
<p># Comments: 1</p>
]]></description><pubDate>Tue, 08 Aug 2023 12:41:22 +0000</pubDate><link>https://litellm.ai/</link><dc:creator>ij23</dc:creator><comments>https://news.ycombinator.com/item?id=37048149</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37048149</guid></item></channel></rss>