<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: g023</title><link>https://news.ycombinator.com/user?id=g023</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 23 May 2026 01:41:02 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=g023" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by g023 in "DeepSeek makes the V4 Pro price discount permanent"]]></title><description><![CDATA[
<p>I use DeepSeek V4 Flash with Copilot and it works pretty well.</p>
]]></description><pubDate>Fri, 22 May 2026 21:38:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=48242017</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=48242017</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48242017</guid></item><item><title><![CDATA[New comment by g023 in "DeepSeek makes the V4 Pro price discount permanent"]]></title><description><![CDATA[
<p>If anyone is looking to hook it up to Copilot, I made a proxy script a while back to handle the connection that might be handy: <a href="https://gist.github.com/g023/c2bb7b540ffe64cee76023f18f6f9365" rel="nofollow">https://gist.github.com/g023/c2bb7b540ffe64cee76023f18f6f936...</a></p>
]]></description><pubDate>Fri, 22 May 2026 21:36:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=48241996</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=48241996</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48241996</guid></item><item><title><![CDATA[New comment by g023 in "[dead]"]]></title><description><![CDATA[
<p>A terminal-based chat application powered by *locally installed* llama.cpp, featuring an auto-managed server backend, reasoning modes, and 7 built-in filesystem tools for interactive AI assistance. For now, it restricts the agent to read-only access to the system and to offline-focused commands. Powered by the g023/g023-Qwen3.5-9B-GGUF:IQ2_M model.</p>
]]></description><pubDate>Sun, 19 Apr 2026 05:45:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=47822080</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=47822080</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47822080</guid></item><item><title><![CDATA[New comment by g023 in "Cerebras S-1"]]></title><description><![CDATA[
<p>We need more personal-level AI solutions instead of so many corporate-centered ones.</p>
]]></description><pubDate>Sat, 18 Apr 2026 00:09:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=47811953</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=47811953</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47811953</guid></item><item><title><![CDATA[Local Model Router: Ollama/OpenAI-compat bridges for local LLMs via llama.cpp]]></title><description><![CDATA[
<p>A high-performance local LLM server providing drop-in API compatibility with Ollama and OpenAI, built on llama.cpp's llama-server. Features automatic VRAM management, Hugging Face integration, and a modular architecture. Unlike Ollama, which bundles its own inference engine, LMR leverages the battle-tested llama.cpp backend while providing familiar APIs and intelligent model management.<p>https://github.com/g023/localmodelrouter</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47809550">https://news.ycombinator.com/item?id=47809550</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 17 Apr 2026 19:19:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47809550</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=47809550</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47809550</guid></item><item><title><![CDATA[New comment by g023 in "The local LLM ecosystem doesn’t need Ollama"]]></title><description><![CDATA[
<p>I've started creating <a href="https://github.com/g023/localmodelrouter/" rel="nofollow">https://github.com/g023/localmodelrouter/</a>, which offers Ollama-like functionality but as a single .py file with minimal dependencies and more focus on letting llama.cpp handle the dirty work.</p>
]]></description><pubDate>Fri, 17 Apr 2026 05:53:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=47802823</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=47802823</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47802823</guid></item><item><title><![CDATA[New comment by g023 in "[dead]"]]></title><description><![CDATA[
<p>HarnessHarvester generates executable Python harnesses from natural language task descriptions, executes them in a sandboxed environment, reviews them with multi-faceted LLM judges, and repairs failures using branching strategies. It includes two autonomous modes: autolearn (continuous discovery loop) and autoimprove (iterative enhancement of existing harnesses). The concept is an offline-first harness/scaffolding builder where you get the harness itself instead of some remote API.</p>
]]></description><pubDate>Fri, 17 Apr 2026 02:12:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=47801814</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=47801814</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47801814</guid></item><item><title><![CDATA[New comment by g023 in "Show HN: Standalone TurboQuant KV Cache Inference"]]></title><description><![CDATA[
<p>A single-file, Python-based TurboQuant playground with minimal, recognizable dependencies, barebones af, with some easy-to-access globals to experiment with at the top of 'run_tquant.py'. The test model is a 1.77B model that I made by duplicating a layer in a Qwen3 1.7B model. It will probably work fine with the regular Qwen3 1.7B model as well, but for right now I'm just working with my surgically altered one while I work on the script.</p>
]]></description><pubDate>Tue, 07 Apr 2026 00:30:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47669210</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=47669210</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47669210</guid></item><item><title><![CDATA[New comment by g023 in "Show HN: Standalone TurboQuant KV Cache Inference"]]></title><description><![CDATA[
<p>I had some issues in the original, but had to jump away for a bit here to do some backups (weak). Anyways, I updated it with the necessary fixes, added some more tweakable values at the top to play with, and dialed in the params for this specific model a bit more. I will start testing with some other models as my next step in this little experiment. Thanks for the interest. Feel free to try the latest version and run the interactive mode to chat with the model and get feedback on the results as you go. If you have any suggestions, let me know. I'm trying to keep this one as barebones as possible to make it easier for others to port to other languages or integrate into other uses.<p>edit: just added Mirostat v2 to clean up repetitive output from the model</p>
]]></description><pubDate>Tue, 07 Apr 2026 00:24:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=47669165</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=47669165</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47669165</guid></item><item><title><![CDATA[Show HN: Standalone TurboQuant KV Cache Inference]]></title><description><![CDATA[
<p>Implements TurboQuant (ICLR 2026, arXiv:2504.19874) KV cache compression
directly inside a Transformers inference script. All algorithms are self-contained. Minimal dependencies.<p>- uses <a href="https://huggingface.co/g023/Qwen3-1.77B-g023" rel="nofollow">https://huggingface.co/g023/Qwen3-1.77B-g023</a> as the demonstration model (throw model files in Qwen3-BEST folder)</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47633195">https://news.ycombinator.com/item?id=47633195</a></p>
<p>Points: 3</p>
<p># Comments: 4</p>
]]></description><pubDate>Fri, 03 Apr 2026 22:31:31 +0000</pubDate><link>https://github.com/g023/turboquant</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=47633195</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47633195</guid></item><item><title><![CDATA[Show HN: An offline first focused agentic CLI application powered by Ollama]]></title><description><![CDATA[
<p>An offline-first agentic CLI application powered by local Ollama models. Integrated advanced memory management. Minimal dependencies. MIT licensed.<p>Uses a 4B Qwen 3.5 model by default.<p><a href="https://github.com/g023/ai_cli/" rel="nofollow">https://github.com/g023/ai_cli/</a></p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47522348">https://news.ycombinator.com/item?id=47522348</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 25 Mar 2026 19:56:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47522348</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=47522348</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47522348</guid></item><item><title><![CDATA[New comment by g023 in "G023's Agentic Chat with Memory and Python Power"]]></title><description><![CDATA[
<p>A sophisticated multi-level reasoning engine with agentic memory, tool integration, and user control modes. Built as a (primarily) single-file Python program using vanilla Python and local LLM integration. Powered by the Ollama API, it uses a 1.77B Qwen3 variant that has layer 21 duplicated. It has different layers of memory for agentic processes. You can either oversee command execution or run in yolo mode. MIT licensed.</p>
]]></description><pubDate>Fri, 20 Mar 2026 19:29:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47459449</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=47459449</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47459449</guid></item><item><title><![CDATA[Show HN: G023's Agentic Chat with Memory and Python Power]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/g023/g023_agentic_chat">https://github.com/g023/g023_agentic_chat</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47459448">https://news.ycombinator.com/item?id=47459448</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Fri, 20 Mar 2026 19:29:40 +0000</pubDate><link>https://github.com/g023/g023_agentic_chat</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=47459448</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47459448</guid></item><item><title><![CDATA[New comment by g023 in "Nvidia releases 8B model with learned 8x KV cache compression"]]></title><description><![CDATA[
<p>I made a smaller version: <a href="https://huggingface.co/g023/Qwen3-8B-DMS-8x-4bit-NF4" rel="nofollow">https://huggingface.co/g023/Qwen3-8B-DMS-8x-4bit-NF4</a></p>
]]></description><pubDate>Sat, 31 Jan 2026 09:46:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=46835045</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=46835045</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46835045</guid></item><item><title><![CDATA[New comment by g023 in "Wall Street sees AI bubble coming and is betting on what pops it"]]></title><description><![CDATA[
<p>I'm thinking that the bubble will be the vortex caused by an abundance of power that becomes freely available locally due to the AI datacenters moving to space.</p>
]]></description><pubDate>Mon, 15 Dec 2025 16:58:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=46277040</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=46277040</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46277040</guid></item><item><title><![CDATA[New comment by g023 in "The Rise of Computer Games, Part I: Adventure"]]></title><description><![CDATA[
<p>Something about the modern day fails to match the feelings of when MUDs were in their prime. With text you can describe so much more than a picture can paint. You can visualize a smell, a taste, or a feeling in text, but it doesn't translate well when you have graphics painting your imagination for you.</p>
]]></description><pubDate>Mon, 15 Dec 2025 16:43:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=46276834</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=46276834</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46276834</guid></item><item><title><![CDATA[Show HN: G023's OllamaMan – Web-based OS for managing Ollama servers]]></title><description><![CDATA[
<p>g023's OllamaMan (Ollama Manager): OS-style GUI management of an Ollama server. Open source LLM management using PHP/JS/SQLite and a web browser. Integrated apps for chat, terminal viewer, and model management (can pull or delete; Hugging Face GGUF supported too). Advanced model creation is now implemented. Supports images in chat, speech to text, history, etc. Lots of quick access buttons, and it sets itself up automatically if the db doesn't exist. Adding more bells and whistles as I go. Not meant for a public-facing folder, so protect it as you see necessary. Open source, BSD 3-Clause, so have fun, and thanks for giving it a glance. I just made it yesterday, so I'm sorry if it has some rough bits; I'll hopefully fudge those into place shortly (menus on the topbar/settings tabs). Right now it's fully functional.</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46268382">https://news.ycombinator.com/item?id=46268382</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Sun, 14 Dec 2025 23:40:35 +0000</pubDate><link>https://github.com/g023/g023-OllamaMan</link><dc:creator>g023</dc:creator><comments>https://news.ycombinator.com/item?id=46268382</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46268382</guid></item></channel></rss>