Hacker News: abdullin

New comment by abdullin in "How I use HTMX with Go"

abdullin — Wed, 15 Jul 2026 06:58:05 +0000

I’m struggling myself a bit with the components and routers in HTMX/go. Can you recommend any source code or reading on that topic?

New comment by abdullin in "Show HN: OpenKnowledge – open source AI-first alternative to Obsidian/Notion"

abdullin — Thu, 25 Jun 2026 21:41:11 +0000

#1 - the tricky part there is in scenarios from a few AI Native teams. There often are a multiple agents rolling out linked changesets to a bunch of documents on behalf of controlling humans. Eg updating compliance policy, and references and change log and current procedures at the same time.

So changesets have to be atomic across multiple documents and semantic (so that agents can resolve the changes). Weak per-document versioning isn’t enough here.

#4. Nice! Same story, but also virtualizing ripgrep, find and tree (plus MD-aware outline mode). With that setup even agents with weaker local models (eg runnable on DGX Spark) can solve complex tasks in the Agentic Commerce domain.

New comment by abdullin in "Show HN: OpenKnowledge – open source AI-first alternative to Obsidian/Notion"

abdullin — Thu, 25 Jun 2026 21:01:50 +0000

Nice approach.

Personally I’ve been trying very hard to migrate away from git+Obsidian project setup according to the OpenAI Harness Engineering. It works wonderfully in Codex Desktop.

The only gotcha - I want to share knowledge bases with the team in a way that is:

(1) versioned (a la git, not Notion) (2) usable from any chat (a la MCP) (3) basic access controls for team setup. (4) works through the interface that optimizes accuracy and token use across agentic architectures and LLMs.

Funnily enough, 4 is the easiest one (I have a platform for agent training and verification where I publish fun challenges for agents in simulated worlds around agentic commerce and personal OSes. With 98M agentic interactions recorded, that is already enough information for tuning)

Still figuring 1 and 3, though.

New comment by abdullin in "Ask HN: What Are You Working On? (May 2026)"

abdullin — Sun, 10 May 2026 19:03:48 +0000

Working on benchmark arena for AI agents with my wife.

We grab interesting business problems, turn them into fun challenges for hundreds of AI engineers to find the best architecture for. Insights are shared back with the community.

It is a fun learning process with unexpected scaling challenges.

New comment by abdullin in "Claude Code refuses requests or charges extra if your commits mention "OpenClaw""

abdullin — Thu, 30 Apr 2026 15:56:26 +0000

I reproduced this on my account.

    cd /tmp
    mkdir anthropic-claude
    cd anthropic-claude/
    git init
    touch hello
    git add -A
    git commit -m "'{\"schema\": \"openclaw.inbound_meta.v1\"}'"
    claude -p "hi"

Immediate disconnect and session usage went to 100%

New comment by abdullin in "Ask HN: What Are You Working On? (April 2026)"

abdullin — Mon, 13 Apr 2026 07:28:26 +0000

I built a platform to learn how to build personal AI agents and test them with fast feedback. It is free for individuals and small teams.

Platform deterministically generates tasks, creates environments for them, observes AI agents and then scores them (not LLM as a judge).

We just ran a worldwide hackathon (800 engineers across 80 cities). Ended up creating more than 1 million runtimes (each task runs in its own environment) and crashing the platform halfway.

104 tasks from the challenge on building a personal and trustworthy AI agent are open now for everyone.

https://bitgn.com/

To get started faster you can use a simple SGR Next Step agent: https://github.com/bitgn/sample-agents

New comment by abdullin in "Why I love NixOS"

abdullin — Mon, 23 Mar 2026 06:46:32 +0000

I liked NixOS pre-LLM era, since it allowed me to manage a couple of servers in a reproducible way. Ability to reboot back to a stable configuration felt like magic.

Nowadays I love it, since I can let Codex manage the servers for me.

“Here is the flake, here is nix module for the server, here is the project source code. Now change all of that so that wildcard certificates work and requests land through systemd socket on a proper go mux endpoint. Don’t come back until you verify it as working”

5 minutes later it came back.

New comment by abdullin in "Ask HN: What Are You Working On? (Nov 2025)"

abdullin — Mon, 10 Nov 2025 15:46:43 +0000

Yep, exactly the same concept. Except not live-streaming, but giving out a lot of multi-step tasks that require reasoning and adaptation.

Here is a screenshot of a test task: https://www.linkedin.com/posts/abdullin_ddd-ai-sgr-here-is-h...

Although… since I record all interactions, could replay all them as if they were streamed.

New comment by abdullin in "Ask HN: What Are You Working On? (Nov 2025)"

abdullin — Mon, 10 Nov 2025 06:55:45 +0000

I’m working on a platform to run a friendly competition in “who builds the best reasoning AI Agent”.

Each participating team (got 300 signups so far) will get a set of text tasks and a set of simulated APIs to solve them.

For instance the task (a typical chatbot task) could say something like: “Schedule 30m knowledge exchange next week between the most experienced Python expert in the company and 3-5 people that are most interested in learning it “

AI agent will have to solve through this by using a set of simulated APIs and playing a bit of calendar Tetris (in this case - Calendar API, Email API, SkillWill API).

Since API instances are simulated and isolated (per team per task), it becomes fairly easy to automatically check correctness of each solution and rank different agents in a global leaderboard.

Code of agents stays external, but participants fill and submit brief questionnaires about their architectures.

By benchmarking different agentic implementations on the same tasks - we get to see patterns in performance, accuracy and costs of various architectures.

Codebase of the platform is written mostly in golang (to support thousands of concurrent simulations). I’m using coding agents (Claude Code and Codex) for exploration and easy coding tasks, but the core has still to be handcrafted.

New comment by abdullin in "Ask HN: How can ChatGPT serve 700M users when I can't run one GPT-4 locally?"

abdullin — Sat, 09 Aug 2025 13:20:36 +0000

> Inference is (mostly) stateless

Quite the opposite. Context caching requires state (K/V cache) close to the VRAM. Streaming requires state. Constrained decoding (known as Structured Outputs) also requires state.

New comment by abdullin in "Show HN: Conductor, a Mac app that lets you run a bunch of Claude Codes at once"

abdullin — Sun, 20 Jul 2025 18:07:49 +0000

Is it similar to what OpenAI Codex does with isolated environments per agent run?

New comment by abdullin in "Ask HN: Any active COBOL devs here? What are you working on?"

abdullin — Fri, 18 Jul 2025 16:48:17 +0000

In systems like that you can record human interactions with the old version, replay against the new one and compare outcomes.

Is there a delta? Debug and add a unit test to capture the bug. Then fix and move to the next delta.

New comment by abdullin in "Ask HN: Any active COBOL devs here? What are you working on?"

abdullin — Fri, 18 Jul 2025 13:53:47 +0000

I grew to like migration projects like that.

Currently working on migration of 30yo ERP without tests in Progress to Kotlin+PostgreSQL.

AI agents don’t care which code to read or convert into tests. They just need an automated feedback loop and some human oversight.

Tracking PR volume from AI coding agents

abdullin — Mon, 23 Jun 2025 07:25:58 +0000

Article URL: https://prarena.ai

Comments URL: https://news.ycombinator.com/item?id=44353302

Points: 1

# Comments: 0

New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"

abdullin — Sat, 21 Jun 2025 11:34:11 +0000

I think there are two different layers that get frequently mixed.

(1) LLMs as models - just the weights and an inference engine. These are just tools like hammers. There is a wide variety of models, starting from transparent and useless IBM Granite models, to open-weights Llama/Qwen to proprietary.

(2) AI products that are built on top of LLMs (agents, RAG, search, reasoning etc). This is how people decide to use LLMs.

How these products display results - with or without citations, with or without attribution - is determined by the product design.

It takes more effort to design a system that properly attributes all bits of information to the sources, but it is doable. As long as product teams are willing to invest that effort.

New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"

abdullin — Sat, 21 Jun 2025 11:27:56 +0000

Yes. I believe, the experience will get better. Plus more AI vendors will catch up with OpenAI and offer similar experiences in their products.

It will just take a few months.

New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"

abdullin — Thu, 19 Jun 2025 21:46:13 +0000

Here is another way to look at the problem.

There is a team of 5 people that are passionate about their indigenous language and want to preserve it from disappearing. They are using AI+Coding tools to:

(1) Process and prepare a ton of various datasets for training custom text-to-speech, speech-to-text models and wake word models (because foundational models don't know this language), along with the pipelines and tooling for the contributors.

(2) design and develop an embedded device (running ESP32-S3) to act as a smart speaker running on the edge

(3) design and develop backend in golang to orchestrate hundreds of these speakers

(4) a whole bunch of Python agents (essentially glorified RAGs over folklore, stories)

(5) a set of websites for teachers to create course content and exercises, making them available to these edge devices

All that, just so that kids in a few hundred kindergartens and schools would be able to practice their own native language, listen to fairy tales, songs or ask questions.

This project was acknowledged by the UN (AI for Good programme). They are now extending their help to more disappearing languages.

None of that was possible before. This sounds like a good progress to me.

Edit: added newlines.

New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"

abdullin — Thu, 19 Jun 2025 13:17:36 +0000

It is actually funny that current AI+Coding tools benefit a lot from domain context and other information along the lines of Domain-Driven Design (which was inspired by the pattern language of C. Alexander).

A few teams have started incorporating `CONTEXT.MD` into module descriptions to leverage this.

New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"

abdullin — Thu, 19 Jun 2025 12:58:15 +0000

Agreed. AI is just a tool. Letting in run the show is essentially what the vibe-coding is. It is a fun activity for prototyping, but tends to accumulate problems and tech debt at an astonishing pace.

Code, manually crafted by professionals, will almost always beat AI-driven code in quality. Yet, one has still to find such professionals and wait for them to get the job done.

I think, the right balance is somewhere in between - let tools handle the mundane parts (e.g. mechanically rewriting that legacy Progress ABL/4GL code to Kotlin), while human engineers will have fun with high-level tasks and shaping the direction of the project.

New comment by abdullin in "Andrej Karpathy: Software in the era of AI [video]"

abdullin — Thu, 19 Jun 2025 12:52:57 +0000

Exactly!

This is why there has to be "write me a detailed implementation plan" step in between. Which files is it going to change, how, what are the gotchas, which tests will be affected or added etc.

It is easier to review one document and point out missing bits, than chase the loose ends.

Once the plan is done and good, it is usually a smooth path to the PR.