<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: abdullin</title><link>https://news.ycombinator.com/user?id=abdullin</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 15 Apr 2026 00:09:53 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=abdullin" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by abdullin in "Ask HN: What Are You Working On? (April 2026)"]]></title><description><![CDATA[
<p>I built a platform to learn how to build personal AI agents and test them with fast feedback. It is free for individuals and small teams.<p>The platform deterministically generates tasks, creates environments for them, observes AI agents and then scores them (deterministic checks, not LLM-as-a-judge).<p>We just ran a worldwide hackathon (800 engineers across 80 cities). We ended up creating more than 1 million runtimes (each task runs in its own environment) and crashing the platform halfway through.<p>The 104 tasks from the challenge on building a personal and trustworthy AI agent are now open to everyone.<p><a href="https://bitgn.com/" rel="nofollow">https://bitgn.com/</a><p>To get started faster, you can use a simple SGR Next Step agent: <a href="https://github.com/bitgn/sample-agents" rel="nofollow">https://github.com/bitgn/sample-agents</a></p>
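The generate-and-score loop described above can be sketched roughly like this. All names (`make_task`, `score`) are illustrative, not the bitgn API; the point is that the same seed always yields the same task and expected state, so scoring is a deterministic state comparison rather than an LLM judgment:

```python
import random

def make_task(seed: int) -> dict:
    # deterministic task generation: the same seed always produces
    # the same task input and the same expected final state
    rng = random.Random(seed)
    items = sorted(rng.sample(range(100), k=5))
    return {"input": list(items), "expected": {"total": sum(items)}}

def score(task: dict, final_state: dict) -> float:
    # deterministic scoring: compare the environment's final state to the
    # expected state field by field -- no LLM-as-a-judge involved
    expected = task["expected"]
    hits = sum(1 for k, v in expected.items() if final_state.get(k) == v)
    return hits / len(expected)
```

An agent run would end with the environment in some `final_state`, and `score(task, final_state)` ranks it reproducibly.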
]]></description><pubDate>Mon, 13 Apr 2026 07:28:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47748867</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=47748867</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47748867</guid></item><item><title><![CDATA[New comment by abdullin in "Why I love NixOS"]]></title><description><![CDATA[
<p>I liked NixOS in the pre-LLM era, since it allowed me to manage a couple of servers in a reproducible way. The ability to reboot back to a stable configuration felt like magic.<p>Nowadays I love it, since I can let Codex manage the servers for me.<p>“Here is the flake, here is the Nix module for the server, here is the project source code. Now change all of that so that wildcard certificates work and requests land through a systemd socket on a proper Go mux endpoint. Don’t come back until you verify it as working.”<p>5 minutes later it came back.</p>
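The "requests land through a systemd socket" part refers to socket activation: systemd opens the listening socket itself and hands it to the service as file descriptor 3, advertising it via the LISTEN_FDS environment variable. A minimal Python sketch of accepting such a socket (the Go server in the comment would follow the same fd convention; the fallback port is an illustrative choice):

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first file descriptor systemd passes to the service

def get_listener(fallback_port: int = 8080) -> socket.socket:
    # under systemd socket activation, LISTEN_FDS says how many sockets
    # were handed over, numbered consecutively starting at fd 3
    # (production code should also verify LISTEN_PID matches os.getpid())
    if int(os.environ.get("LISTEN_FDS", "0")) >= 1:
        return socket.socket(fileno=SD_LISTEN_FDS_START)
    # fallback for local runs without systemd: bind our own socket
    return socket.create_server(("127.0.0.1", fallback_port))
```

The service then serves requests on the returned listener; systemd can restart the process without dropping the queued connections it holds.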
]]></description><pubDate>Mon, 23 Mar 2026 06:46:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=47486152</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=47486152</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47486152</guid></item><item><title><![CDATA[New comment by abdullin in "Ask HN: What Are You Working On? (Nov 2025)"]]></title><description><![CDATA[
<p>Yep, exactly the same concept. Except not live-streaming, but giving out a lot of multi-step tasks that require reasoning and adaptation.<p>Here is a screenshot of a test task: <a href="https://www.linkedin.com/posts/abdullin_ddd-ai-sgr-here-is-how-automated-activity-7393633105187733504-I2Vf" rel="nofollow">https://www.linkedin.com/posts/abdullin_ddd-ai-sgr-here-is-h...</a><p>Although… since I record all interactions, I could replay them all as if they were streamed.</p>
]]></description><pubDate>Mon, 10 Nov 2025 15:46:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=45877033</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=45877033</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45877033</guid></item><item><title><![CDATA[New comment by abdullin in "Ask HN: What Are You Working On? (Nov 2025)"]]></title><description><![CDATA[
<p>I’m working on a platform to run a friendly competition in “who builds the best reasoning AI agent”.<p>Each participating team (300 signups so far) will get a set of text tasks and a set of simulated APIs to solve them with.<p>For instance, a typical chatbot task could say something like: “Schedule a 30m knowledge exchange next week between the most experienced Python expert in the company and 3-5 people who are most interested in learning it.”<p>The AI agent will have to work through this by using a set of simulated APIs and playing a bit of calendar Tetris (in this case: Calendar API, Email API, SkillWill API).<p>Since the API instances are simulated and isolated (per team, per task), it becomes fairly easy to automatically check the correctness of each solution and rank different agents on a global leaderboard.<p>The code of the agents stays external, but participants fill in and submit brief questionnaires about their architectures.<p>By benchmarking different agentic implementations on the same tasks, we get to see patterns in the performance, accuracy and costs of various architectures.<p>The codebase of the platform is written mostly in Go (to support thousands of concurrent simulations). I’m using coding agents (Claude Code and Codex) for exploration and easy coding tasks, but the core still has to be handcrafted.</p>
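The "calendar Tetris" part of such a task boils down to interval arithmetic: find a slot of the required length that is free for every attendee. A simplified sketch, assuming times are integer minutes from midnight and ignoring the platform's actual Calendar API:

```python
def free_slots(busy, day_start, day_end):
    # merge a person's busy intervals into the free gaps between them
    slots, cursor = [], day_start
    for start, end in sorted(busy):
        if start > cursor:
            slots.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < day_end:
        slots.append((cursor, day_end))
    return slots

def common_slot(calendars, length, day_start=540, day_end=1020):
    # scan candidate start times on a 15-minute grid and return the first
    # one that fits inside a free gap for every attendee
    # (540 = 09:00, 1020 = 17:00)
    for t in range(day_start, day_end - length + 1, 15):
        if all(
            any(s <= t and t + length <= e
                for s, e in free_slots(busy, day_start, day_end))
            for busy in calendars
        ):
            return t
    return None
```

Because the simulated Calendar API is isolated per team and per task, the checker can compute the same answer independently and verify the agent's booking against it.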
]]></description><pubDate>Mon, 10 Nov 2025 06:55:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=45873164</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=45873164</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45873164</guid></item><item><title><![CDATA[New comment by abdullin in "Ask HN: How can ChatGPT serve 700M users when I can't run one GPT-4 locally?"]]></title><description><![CDATA[
<p>> Inference is (mostly) stateless<p>Quite the opposite. Context caching requires state (the K/V cache) kept close to the VRAM. Streaming requires state. Constrained decoding (known as Structured Outputs) also requires state.</p>
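A toy illustration of why the K/V cache makes inference stateful: every decoded token appends attention keys and values that the next step reuses, so a session is pinned to the server (and GPU memory) holding that cache. This is a conceptual sketch, not a real inference engine:

```python
class KVCache:
    # per-session store of attention keys/values; grows one entry per token
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)

    def __len__(self):
        return len(self.keys)

def decode_step(cache: KVCache, token: int) -> int:
    # a real engine would attend over cache.keys/cache.values here;
    # this sketch just records the new K/V pair and emits a dummy next token
    cache.append(("k", token), ("v", token))
    return token + 1

# every request in a session must route back to the server holding `cache`;
# losing it means recomputing the whole prefix from scratch
```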
]]></description><pubDate>Sat, 09 Aug 2025 13:20:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=44846255</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44846255</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44846255</guid></item><item><title><![CDATA[New comment by abdullin in "Show HN: Conductor, a Mac app that lets you run a bunch of Claude Codes at once"]]></title><description><![CDATA[
<p>Is it similar to what OpenAI Codex does with isolated environments per agent run?</p>
]]></description><pubDate>Sun, 20 Jul 2025 18:07:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=44627696</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44627696</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44627696</guid></item><item><title><![CDATA[New comment by abdullin in "Ask HN: Any active COBOL devs here? What are you working on?"]]></title><description><![CDATA[
<p>In systems like that you can record human interactions with the old version, replay against the new one and compare outcomes.<p>Is there a delta? Debug and add a unit test to capture the bug. Then fix and move to the next delta.</p>
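The record-and-replay loop described above can be sketched as a small harness: feed each recorded interaction to both versions and collect the deltas (the function names are illustrative, not from any particular tool):

```python
def replay(recorded, old_system, new_system):
    # recorded: inputs captured from real human interactions with the old system
    # returns every case where the two implementations disagree
    deltas = []
    for request in recorded:
        old_out = old_system(request)
        new_out = new_system(request)
        if old_out != new_out:
            deltas.append((request, old_out, new_out))
    return deltas
```

Each delta then becomes a unit test pinning down the bug; an empty delta list means the migration preserves observed behavior.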
]]></description><pubDate>Fri, 18 Jul 2025 16:48:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=44606921</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44606921</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44606921</guid></item><item><title><![CDATA[New comment by abdullin in "Ask HN: Any active COBOL devs here? What are you working on?"]]></title><description><![CDATA[
<p>I grew to like migration projects like that.<p>I’m currently working on migrating a 30-year-old ERP, written in Progress and without tests, to Kotlin+PostgreSQL.<p>AI agents don’t care which code they read or convert into tests. They just need an automated feedback loop and some human oversight.</p>
]]></description><pubDate>Fri, 18 Jul 2025 13:53:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=44604667</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44604667</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44604667</guid></item><item><title><![CDATA[Tracking PR volume from AI coding agents]]></title><description><![CDATA[
<p>Article URL: <a href="https://prarena.ai">https://prarena.ai</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44353302">https://news.ycombinator.com/item?id=44353302</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 23 Jun 2025 07:25:58 +0000</pubDate><link>https://prarena.ai</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44353302</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44353302</guid></item><item><title><![CDATA[New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"]]></title><description><![CDATA[
<p>I think there are two different layers that get frequently mixed.<p>(1) LLMs as models - just the weights and an inference engine. These are just tools like hammers. There is a wide variety of models, starting from transparent and useless IBM Granite models, to open-weights Llama/Qwen to proprietary.<p>(2) AI products that are built on top of LLMs (agents, RAG, search, reasoning etc). This is how people decide to use LLMs.<p>How these products display results - with or without citations, with or without attribution - is determined by the product design.<p>It takes more effort to design a system that properly attributes all bits of information to the sources, but it is doable. As long as product teams are willing to invest that effort.</p>
]]></description><pubDate>Sat, 21 Jun 2025 11:34:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=44336695</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44336695</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44336695</guid></item><item><title><![CDATA[New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"]]></title><description><![CDATA[
<p>Yes. I believe the experience will get better. Plus, more AI vendors will catch up with OpenAI and offer similar experiences in their products.<p>It will just take a few months.</p>
]]></description><pubDate>Sat, 21 Jun 2025 11:27:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=44336655</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44336655</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44336655</guid></item><item><title><![CDATA[New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"]]></title><description><![CDATA[
<p>Here is another way to look at the problem.<p>There is a team of 5 people who are passionate about their indigenous language and want to preserve it from disappearing. They are using AI+Coding tools to:<p>(1) Process and prepare a ton of various datasets for training custom text-to-speech, speech-to-text and wake-word models (because foundational models don't know this language), along with the pipelines and tooling for the contributors.<p>(2) Design and develop an embedded device (running ESP32-S3) to act as a smart speaker running on the edge.<p>(3) Design and develop a backend in Go to orchestrate hundreds of these speakers.<p>(4) Build a whole bunch of Python agents (essentially glorified RAGs over folklore and stories).<p>(5) Build a set of websites for teachers to create course content and exercises, making them available to these edge devices.<p>All that, just so that kids in a few hundred kindergartens and schools can practice their own native language, listen to fairy tales and songs, or ask questions.<p>This project was acknowledged by the UN (AI for Good programme). The team is now extending its help to more disappearing languages.<p>None of that was possible before. This sounds like good progress to me.<p>Edit: added newlines.</p>
]]></description><pubDate>Thu, 19 Jun 2025 21:46:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=44322806</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44322806</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44322806</guid></item><item><title><![CDATA[New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"]]></title><description><![CDATA[
<p>It is actually funny that current AI+Coding tools benefit a lot from domain context and other information along the lines of Domain-Driven Design (which was inspired by the pattern language of C. Alexander).<p>A few teams have started incorporating `CONTEXT.MD` into module descriptions to leverage this.</p>
]]></description><pubDate>Thu, 19 Jun 2025 13:17:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=44318438</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44318438</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44318438</guid></item><item><title><![CDATA[New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"]]></title><description><![CDATA[
<p>Agreed. AI is just a tool. Letting it run the show is essentially what vibe-coding is. It is a fun activity for prototyping, but it tends to accumulate problems and tech debt at an astonishing pace.<p>Code manually crafted by professionals will almost always beat AI-driven code in quality. Yet one still has to find such professionals and wait for them to get the job done.<p>I think the right balance is somewhere in between: let tools handle the mundane parts (e.g. mechanically rewriting that legacy Progress ABL/4GL code to Kotlin), while human engineers have fun with high-level tasks and shape the direction of the project.</p>
]]></description><pubDate>Thu, 19 Jun 2025 12:58:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=44318282</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44318282</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44318282</guid></item><item><title><![CDATA[New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"]]></title><description><![CDATA[
<p>Exactly!<p>This is why there has to be a "write me a detailed implementation plan" step in between: which files is it going to change, how, what are the gotchas, which tests will be affected or added, etc.<p>It is easier to review one document and point out missing bits than to chase loose ends.<p>Once the plan is done and good, it is usually a smooth path to the PR.</p>
]]></description><pubDate>Thu, 19 Jun 2025 12:52:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=44318227</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44318227</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44318227</guid></item><item><title><![CDATA[New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"]]></title><description><![CDATA[
<p>I guess it depends on the case and the approach.<p>It works really nicely with the following approach (distilled from experiences reported by multiple companies):<p>(1) Augment the codebase with explanatory texts that describe individual modules, interfaces and interactions (something that is needed for the humans anyway).<p>(2) Provide an Agent.MD that describes the approach/style/process that the AI agent must take. It should also describe how to run all tests.<p>(3) Break down the task into smaller features. For each feature, first ask for a detailed implementation plan (because it is easier to review the plan than 1000 lines of changes spread across a dozen files).<p>(4) Review the plan and ask for improvements, if needed. When ready, ask it to draft an actual pull request.<p>(5) The system will automatically use all available tests/linting/rules before writing the final PR. Verify and provide feedback if some polish is needed.<p>(6) Launch multiple instances of the "write me an implementation plan" and "implement this plan" tasks, and pick the result that looks best.<p>This is very similar to git-driven development of large codebases by distributed teams.<p>Edit: added newlines</p>
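A minimal Agent.MD along the lines of step (2) might look like the following; the contents are purely illustrative, not a prescribed format:

```markdown
# Agent guidelines

## Style
- Go code: follow the existing package layout; no new dependencies without approval.
- Keep changes small; one feature per pull request.

## Process
1. Write a detailed implementation plan first and wait for review.
2. Implement the approved plan; do not expand its scope.

## Running tests
- `make test` runs the unit tests; `make e2e` runs the full suite.
- All tests and linters must pass before drafting the PR.
```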
]]></description><pubDate>Thu, 19 Jun 2025 12:50:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=44318193</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44318193</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44318193</guid></item><item><title><![CDATA[New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"]]></title><description><![CDATA[
<p>Running tests is already an engineering problem.<p>In one of the systems (a supply chain SaaS) we invested so much effort in having good tests in a simulated environment that we could run full-stack tests at kHz rates: roughly 5k tests per second on a laptop.</p>
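One common ingredient of kHz-rate full-stack tests is a simulated clock: time-dependent logic advances virtual time instantly instead of sleeping. A minimal sketch of the idea, not the actual supply chain system:

```python
import heapq

class SimClock:
    # virtual time plus a queue of scheduled callbacks; advancing the clock
    # fires due callbacks immediately instead of sleeping in real time
    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._seq = 0  # tie-breaker so callbacks with equal deadlines stay FIFO

    def call_later(self, delay, fn):
        heapq.heappush(self._queue, (self.now + delay, self._seq, fn))
        self._seq += 1

    def advance(self, seconds):
        # jump virtual time forward, running every callback whose
        # deadline falls within the window, in deadline order
        deadline = self.now + seconds
        while self._queue and self._queue[0][0] <= deadline:
            when, _, fn = heapq.heappop(self._queue)
            self.now = when
            fn()
        self.now = deadline
```

With this, a "wait 24 hours, then retry the shipment" code path completes in microseconds of wall-clock time, which is what makes thousands of full-stack tests per second feasible.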
]]></description><pubDate>Thu, 19 Jun 2025 12:44:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=44318148</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44318148</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44318148</guid></item><item><title><![CDATA[New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"]]></title><description><![CDATA[
<p>Claude's approach is currently a bit dated.<p>Cursor.sh agents and especially OpenAI Codex illustrate that a tool doesn't need to keep stuffing the context window with irrelevant information in order to make progress on a task.<p>And if really needed, engineers report that Gemini Pro 2.5 keeps working fine within a 200k-500k token context. Above that, it is better to reset the context.</p>
]]></description><pubDate>Thu, 19 Jun 2025 12:42:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=44318129</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44318129</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44318129</guid></item><item><title><![CDATA[New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"]]></title><description><![CDATA[
<p>A simple rule applies: "No matter what tool created the code, you are still responsible for what you merge into main".<p>As such, the task of verification still falls on the engineers.<p>Given that and proper processes, modern tooling works nicely with codebases ranging from 10k LOC (mixed embedded device code with Go backends and Python DS/ML) to 700k LOC (legacy enterprise applications from the mainframe era).</p>
]]></description><pubDate>Thu, 19 Jun 2025 12:40:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=44318111</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44318111</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44318111</guid></item><item><title><![CDATA[New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"]]></title><description><![CDATA[
<p>Humans tend to lack inhumane patience.</p>
]]></description><pubDate>Thu, 19 Jun 2025 11:12:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=44317515</link><dc:creator>abdullin</dc:creator><comments>https://news.ycombinator.com/item?id=44317515</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44317515</guid></item></channel></rss>