<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: markab21</title><link>https://news.ycombinator.com/user?id=markab21</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 28 Apr 2026 16:47:55 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=markab21" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by markab21 in "Google releases Gemma 4 open models"]]></title><description><![CDATA[
<p>I'll chime in as someone working on an agentic harness project, using Mastra as the harness.<p>Nemotron3-super is, without question, my favorite model for my agentic use cases. The closest family I'd compare it to, in vibe and feel, is Qwen, but this thing can hold attention through complicated (often noisy) agentic environments; I sometimes catch myself double-checking that I'm not on a frontier model.<p>I now rent a dual B6000 full-time as the backbone of my "base" agentic workload, and I only step up to stronger models in rare situations in my pipelines.<p>The biggest thing with this model, I've found, is making sure the environment is set up correctly; the sampling temperatures and chat templates need to be exactly right. I've had hit-or-miss results with OpenRouter, but running it on a B6000 from Vast with Nvidia's native NVFP4 weights, it's really good: roughly 2,500 tokens/sec peak with batching, and about 100/s for a single request at 250k context. :)<p>On a single B6000 I can reliably run up to about 120k context, but it really screams on a dual B6000. (It's working so well I'm close to ordering a couple for myself.)<p>Good luck. Sometimes I feel like the crazy guy in the woods loving this model so much; I'm not sure why more people aren't jumping on it.</p>
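The point about getting "temps and templates exactly right" comes down to pinning the sampling parameters yourself instead of trusting provider defaults. A minimal sketch, assuming a self-hosted OpenAI-compatible endpoint (e.g. vLLM serving the NVFP4 weights); the model identifier and sampling values below are placeholders, not the model card's actual recommendations:

```typescript
// Hypothetical request against a self-hosted OpenAI-compatible server.
// Model name, URL, and sampling values are assumptions for illustration;
// the model card is authoritative for the real recommended settings.
const body = {
  model: "nvidia/nemotron-3-super", // placeholder identifier
  temperature: 0.6,                 // assumed value
  top_p: 0.95,                      // assumed value
  max_tokens: 1024,
  messages: [{ role: "user", content: "Summarize this repo's build steps." }],
};

async function chat(baseUrl: string): Promise<string> {
  // Standard OpenAI-style chat completions endpoint, as exposed by vLLM.
  const res = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  const json = await res.json();
  return json.choices[0].message.content;
}
```

Pinning these fields explicitly in every request is one way to rule out provider-side defaults as the source of hit-or-miss behavior across hosts.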
]]></description><pubDate>Thu, 02 Apr 2026 21:08:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=47620217</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=47620217</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47620217</guid></item><item><title><![CDATA[New comment by markab21 in "If DSPy is so great, why isn't anyone using it?"]]></title><description><![CDATA[
<p>I think the premise that prompts are the surface area for optimizing the application is fundamentally the wrong framing, in the same way that, in 1998, a better CPAN was not going to save CGI. DSPy is solving yesterday's problems; it was the limitations in context and model intelligence that made a tool like it necessary.<p>The only thing I'd grab DSPy for at this point is automating the edges of an agentic pipeline that could be improved with RL patterns. But if that's true, you're selling yourself short by handing your domain to DSPy: you should be building your own RL loops.<p>My experience: if you find yourself reaching for a tool like DSPy, you might be sitting on a scenario where reinforcement learning would help even further up the stack than your prompts, and you're probably missing where the real optimization win is. (Think bigger.)</p>
]]></description><pubDate>Mon, 23 Mar 2026 15:53:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47491199</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=47491199</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47491199</guid></item><item><title><![CDATA[New comment by markab21 in "Gemini 3.1 Pro"]]></title><description><![CDATA[
<p>You just articulated why I struggle to personally connect with Gemini. It feels unrelatable and exhausting to read its output. I prefer reading Opus/DeepSeek/GLM over Gemini, Qwen, and the open-source GPT models. Maybe it's RLHF that's creating my distaste for it. (I pay for Gemini and should be using it more... but the outputs bug me and take more work to turn into actionable insight.)</p>
]]></description><pubDate>Thu, 19 Feb 2026 16:55:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47075920</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=47075920</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47075920</guid></item><item><title><![CDATA[New comment by markab21 in "Experts Have World Models. LLMs Have Word Models"]]></title><description><![CDATA[
<p>And I think you basically just described the OpenAI approach to building models and serving them.</p>
]]></description><pubDate>Mon, 09 Feb 2026 15:03:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=46945989</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=46945989</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46945989</guid></item><item><title><![CDATA[New comment by markab21 in "Orchestrate teams of Claude Code sessions"]]></title><description><![CDATA[
<p><i>Shaking fist at clouds!!</i></p>
]]></description><pubDate>Thu, 05 Feb 2026 18:44:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=46903235</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=46903235</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46903235</guid></item><item><title><![CDATA[New comment by markab21 in "Qwen3-Coder-Next"]]></title><description><![CDATA[
<p>It's getting a lot easier to do this using sub-agents with tools in Claude. I have a fleet of Mastra agents (TypeScript) that I use inside my project as CLI tools for repetitive, token-gobbling tasks such as scanning code, web search, library search, and even SourceGraph traversal.<p>Overall, it's let me maintain more consistent workflows, since I'm less dependent on Opus. Now that Mastra has introduced Workspaces, which allow for more agentic development, this approach has become even more powerful.</p>
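The agents-as-CLI-tools pattern above can be sketched as a thin wrapper: the orchestrating coding agent shells out to the tool, and only the final answer lands on stdout. `runAgent` here is a stand-in stub, not Mastra's actual API:

```typescript
// Hypothetical sketch of wrapping a small agent as a CLI tool so a coding
// agent (Claude, Codex, ...) can shell out to it instead of burning its
// own context. runAgent is a stub standing in for a real Mastra agent.
async function runAgent(task: string): Promise<string> {
  // A real implementation would call an LLM with its own tools here.
  return `result for: ${task}`;
}

async function main(argv: string[]): Promise<number> {
  const task = argv.slice(2).join(" ");
  if (!task) {
    console.error("usage: research-agent <task>");
    return 1;
  }
  // Print only the final answer: the orchestrating agent gets a clean,
  // token-cheap result on stdout, not the sub-agent's whole transcript.
  console.log(await runAgent(task));
  return 0;
}
```

The design win is the interface: stdout carries the distilled result, so the expensive top-level model never sees the intermediate noise.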
]]></description><pubDate>Tue, 03 Feb 2026 16:36:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=46873238</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=46873238</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46873238</guid></item><item><title><![CDATA[New comment by markab21 in "TimeCapsuleLLM: LLM trained only on data from 1800-1875"]]></title><description><![CDATA[
<p>Basically looking for emergent behavior.</p>
]]></description><pubDate>Mon, 12 Jan 2026 17:08:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=46591283</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=46591283</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46591283</guid></item><item><title><![CDATA[New comment by markab21 in "Show HN: Mysti – Claude, Codex, and Gemini debate your code, then synthesize"]]></title><description><![CDATA[
<p>I love where you're going with this. In my experience it's not about a different persona; it's about the context the model is considering, since different activations steer toward different outcomes. You can achieve that by switching to an agent with a separate persona, of course, but you can also get it simply by injecting new context or forcing the agent to consider something new. I feel like this concept gets cargo-culted a bit.<p>I've personally moved to a pattern where I use Mastra agents in my project to achieve this. I've slowly shifted the bulk of code research and web research to internal tools built with small TypeScript agents. I can now easily bounce between tools such as Claude, Codex, and opencode, and my coding tools spend more time orchestrating work than doing the work themselves.</p>
]]></description><pubDate>Sat, 27 Dec 2025 15:11:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46402397</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=46402397</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46402397</guid></item><item><title><![CDATA[New comment by markab21 in "Apps SDK"]]></title><description><![CDATA[
<p>The skepticism is understandable given the trajectory of GPTs and custom instructions, but there's a meaningful technical difference here: the Apps SDK is built on the Model Context Protocol (MCP), which is an open specification rather than a proprietary format.<p>MCP standardizes how LLM clients connect to external tools—defining wire formats, authentication flows, and metadata schemas. This means apps you build aren't inherently ChatGPT-specific; they're MCP servers that could work with any MCP-compatible client. The protocol is transport-agnostic and self-describing, with official Python and TypeScript SDKs already available.<p>That said, the "build our platform" criticism isn't entirely off base. While the protocol is open, practical adoption still depends heavily on ChatGPT's distribution and whether other LLM providers actually implement MCP clients. The real test will be whether this becomes a genuine cross-platform standard or just another way to contribute to OpenAI's ecosystem.<p>The technical primitives (tool discovery, structured content return, embedded UI resources) are solid and address real integration problems. Whether it succeeds likely depends more on ecosystem dynamics than technical merit.</p>
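The "self-describing" property comes from MCP's JSON-RPC message shapes: a client lists tools and gets back names plus JSON Schema inputs, then calls them by name. A sketch of the two shapes from the spec (the tool itself is made up for illustration):

```typescript
// Minimal sketch of MCP's wire shapes: a tools/list result advertising
// one tool, and a tools/call request invoking it. The tool name and
// schema are hypothetical; the envelope shapes follow the MCP spec.
type Tool = {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
};

const toolsListResult = {
  jsonrpc: "2.0" as const,
  id: 1,
  result: {
    tools: [
      {
        name: "search_docs", // hypothetical tool
        description: "Full-text search over product docs",
        inputSchema: {
          type: "object",
          properties: { query: { type: "string" } },
          required: ["query"],
        },
      } satisfies Tool,
    ],
  },
};

const toolsCallRequest = {
  jsonrpc: "2.0" as const,
  id: 2,
  method: "tools/call",
  params: { name: "search_docs", arguments: { query: "rate limits" } },
};
```

Because the schema travels with the tool, any MCP-compatible client can discover and invoke it without ChatGPT-specific glue; that is the concrete difference from the GPTs-era custom formats.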
]]></description><pubDate>Mon, 06 Oct 2025 18:49:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=45494830</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=45494830</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45494830</guid></item><item><title><![CDATA[New comment by markab21 in "Mistral NeMo"]]></title><description><![CDATA[
<p>It looks like it was built jointly with nvidia:  <a href="https://huggingface.co/nvidia/Mistral-NeMo-12B-Instruct" rel="nofollow">https://huggingface.co/nvidia/Mistral-NeMo-12B-Instruct</a></p>
]]></description><pubDate>Thu, 18 Jul 2024 15:02:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=40996235</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=40996235</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40996235</guid></item><item><title><![CDATA[New comment by markab21 in "How Mandelbrot set images are affected by floating point precision"]]></title><description><![CDATA[
<p>I bet you have been waiting years to pull that one out of your pocket.<p>Well played sir! Nice shot man! :D</p>
]]></description><pubDate>Tue, 12 Mar 2024 22:49:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=39685865</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=39685865</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39685865</guid></item><item><title><![CDATA[New comment by markab21 in "Memory and new controls for ChatGPT"]]></title><description><![CDATA[
<p>I've found myself using local models more and more rather than ChatGPT; it was pretty trivial to set up Ollama + Ollama-WebUI, which is shockingly good.<p>I'm so tired of arguing with ChatGPT (or what was Bard) to get even simple things done. SOLAR-10B or Mistral works just fine for my use cases, and I've wired up a direct connection to Fireworks/OpenRouter/Together for the occasions I need more than what will run on my local hardware (Mixtral MoE, 70B code/chat models).</p>
]]></description><pubDate>Tue, 13 Feb 2024 19:25:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=39361667</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=39361667</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39361667</guid></item><item><title><![CDATA[New comment by markab21 in "Mistral CEO confirms 'leak' of new open source AI model nearing GPT4 performance"]]></title><description><![CDATA[
<p>For Llama-based progress, Reddit's /r/LocalLlama has been my top source of info, although it's been getting a little noisier lately.<p>I also hang out on a few Discord servers:
 - Nous Research
 - TogetherAI / Fireworks / OpenRouter
 - LangChain
 - TheBloke AI
 - Mistral AI<p>These, along with a couple of newsletters, basically keep a pulse on things.</p>
]]></description><pubDate>Wed, 31 Jan 2024 20:19:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=39208810</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=39208810</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39208810</guid></item><item><title><![CDATA[New comment by markab21 in "Gulf Stream weakening now 99% certain, and ramifications will be global"]]></title><description><![CDATA[
<p>[flagged]</p>
]]></description><pubDate>Wed, 18 Oct 2023 20:02:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=37934035</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=37934035</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37934035</guid></item><item><title><![CDATA[New comment by markab21 in "Obsidian 1.4.10 Desktop (Public)"]]></title><description><![CDATA[
<p>Yeah, slow news day.</p>
]]></description><pubDate>Tue, 12 Sep 2023 16:36:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=37483859</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=37483859</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37483859</guid></item><item><title><![CDATA[New comment by markab21 in "Debian celebrates 30 years"]]></title><description><![CDATA[
<p>Linux distributions, including Debian, offer a variety of desktop environments, each with its own design philosophy and user experience. If one environment doesn't suit your preferences, others might be more to your liking. It's worth exploring different desktop environments to find one that aligns with your expectations.</p>
]]></description><pubDate>Thu, 17 Aug 2023 14:35:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=37162294</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=37162294</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37162294</guid></item><item><title><![CDATA[New comment by markab21 in "Debian celebrates 30 years"]]></title><description><![CDATA[
<p>Debian and Ubuntu have similarities, but keep in mind - Ubuntu is derived from Debian, not the other way around. However, they differ in areas like release cycles, package management, and default configurations.</p>
]]></description><pubDate>Thu, 17 Aug 2023 14:33:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=37162254</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=37162254</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37162254</guid></item><item><title><![CDATA[New comment by markab21 in "Ask HN: Should HN ban ChatGPT/generated responses?"]]></title><description><![CDATA[
<p>I used it as a consultant on a development project to help me organize some of the milestones and design goals in our documentation.<p>It wasn't that I didn't know the material (I do); it was more that it helped me quickly organize and present information in a clean, well-written way. I did have to go through and rewrite the parts specific to our domain, but it saved me many hours of tedious data organization.<p>I also tested it on creating some SOPs for a new position in our very small company, even breaking the expected tasks down into daily schedules.<p>It's not perfect, but it generates a boilerplate starting point that I can then work from.</p>
]]></description><pubDate>Mon, 12 Dec 2022 12:46:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=33954074</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=33954074</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33954074</guid></item><item><title><![CDATA[New comment by markab21 in "Google and Facebook execs allegedly approved dividing ad market among themselves"]]></title><description><![CDATA[
<p>I'd be surprised if they don't have async mechanisms.</p>
]]></description><pubDate>Sat, 15 Jan 2022 15:25:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=29947013</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=29947013</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=29947013</guid></item><item><title><![CDATA[New comment by markab21 in "FBI's ability to legally access secure messaging app content and metadata [pdf]"]]></title><description><![CDATA[
<p>Assume anything sent over a cellular network carrier via normal SMS can not only be retrieved, but intercepted.</p>
]]></description><pubDate>Tue, 30 Nov 2021 20:28:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=29397053</link><dc:creator>markab21</dc:creator><comments>https://news.ycombinator.com/item?id=29397053</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=29397053</guid></item></channel></rss>