<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: mmoskal</title><link>https://news.ycombinator.com/user?id=mmoskal</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 17 Apr 2026 14:28:24 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=mmoskal" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by mmoskal in "Structured outputs on the Claude Developer Platform"]]></title><description><![CDATA[
<p>Grammars work best when aligned with the prompt. That is, if your prompt gives you the right format of answer 80% of the time, the grammar will take you to 100%. If it gives you the right answer 1% of the time, the grammar will give you syntactically correct garbage.</p>
]]></description><pubDate>Fri, 14 Nov 2025 22:26:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=45932903</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=45932903</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45932903</guid></item><item><title><![CDATA[New comment by mmoskal in "Structured outputs on the Claude Developer Platform"]]></title><description><![CDATA[
<p>OpenAI is using [0] LLGuidance [1]. You need to set strict:true in your request for schema validation to kick in though.<p>[0] <a href="https://platform.openai.com/docs/guides/function-calling#lark-cfg" rel="nofollow">https://platform.openai.com/docs/guides/function-calling#lar...</a>
[1] <a href="https://github.com/guidance-ai/llguidance" rel="nofollow">https://github.com/guidance-ai/llguidance</a></p>
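For reference, a sketch of roughly what such a tool definition looks like, following the field names in the OpenAI function-calling guide linked above; the function name and schema are hypothetical, so check the current API reference before relying on this shape:

```python
# Rough shape of an OpenAI function-calling tool with schema enforcement
# enabled. Without "strict": True the schema is advisory only.

tool = {
    "type": "function",
    "function": {
        "name": "get_weather",   # hypothetical example function
        "strict": True,          # opt in to enforced schema validation
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["C", "F"]},
            },
            "required": ["city", "unit"],
            # strict mode requires closed objects:
            "additionalProperties": False,
        },
    },
}

print(tool["function"]["strict"])  # → True
```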
]]></description><pubDate>Fri, 14 Nov 2025 22:23:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=45932866</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=45932866</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45932866</guid></item><item><title><![CDATA[New comment by mmoskal in "PCB Edge USB C Connector Library"]]></title><description><![CDATA[
<p>I had good experience with carefully spaced holes in the PCB and a 50 mil header; see <a href="https://jacdac.github.io/jacdac-docs/ddk/firmware/jac-connect/" rel="nofollow">https://jacdac.github.io/jacdac-docs/ddk/firmware/jac-connec...</a></p>
]]></description><pubDate>Sun, 26 Oct 2025 06:17:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=45709531</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=45709531</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45709531</guid></item><item><title><![CDATA[New comment by mmoskal in "How to stop AI's "lethal trifecta""]]></title><description><![CDATA[
<p>The previous article is in the same issue, in the science and technology section. This is how they typically do it: a leader article has a longer version in the paper. Leaders tend to be more opinionated.</p>
]]></description><pubDate>Fri, 26 Sep 2025 17:29:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=45388924</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=45388924</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45388924</guid></item><item><title><![CDATA[New comment by mmoskal in "AGI is an engineering problem, not a model training problem"]]></title><description><![CDATA[
<p>Consciousness (subjective experience) is possibly orthogonal to intelligence (ability to achieve complex goals). We definitely have a better handle on what intelligence is than consciousness.</p>
]]></description><pubDate>Sun, 24 Aug 2025 01:22:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=45000469</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=45000469</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45000469</guid></item><item><title><![CDATA[New comment by mmoskal in "Guid Smash"]]></title><description><![CDATA[
<p>Counting to 2^61 probably is.<p>To actually find a collision in a 128-bit cryptographic hash function, it would take closer to 2^65 hashes. Back-of-the-envelope calculations suggest that with Pollard's rho it would cost a few million dollars of CPU time at Hetzner's super-low prices. Not nearly a mere mortal's budget, but not that far off, I guess.</p>
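The 2^65 figure can be sanity-checked against the standard birthday approximation; this is just arithmetic, not a claim about any particular attack:

```python
import math

# Birthday bound for an n-bit hash: roughly 1.177 * sqrt(2^n) hashes for a
# 50% collision chance. For 128 bits that is about 2^64.2, which is why
# estimates land in the 2^64..2^65 range.

bits = 128
expected = 1.1774 * math.sqrt(2 ** bits)
log2_expected = math.log2(expected)
print(round(log2_expected, 1))  # → 64.2
```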
]]></description><pubDate>Sun, 17 Aug 2025 04:46:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=44928938</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44928938</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44928938</guid></item><item><title><![CDATA[New comment by mmoskal in "How Tesla is proving doubters right on why its robotaxi service cannot scale"]]></title><description><![CDATA[
<p>Airplanes are dirty, unsafe and unclean?</p>
]]></description><pubDate>Sun, 20 Jul 2025 17:12:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=44627166</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44627166</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44627166</guid></item><item><title><![CDATA[New comment by mmoskal in "The borrowchecker is what I like the least about Rust"]]></title><description><![CDATA[
<p>I think this is like unsafe - most of your code won’t have it, so you get the benefits of borrow checker (memory safety and race freedom) elsewhere.</p>
]]></description><pubDate>Sat, 19 Jul 2025 23:46:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=44620551</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44620551</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44620551</guid></item><item><title><![CDATA[New comment by mmoskal in "Show HN: Lambduck, a Functional Programming Brainfuck"]]></title><description><![CDATA[
<p>This seems way too readable! I think you should remove the character literals in the name of purity.<p>Also, this is likely way more compact than Brainfuck, as the lambda calculus is written essentially as usual.<p>And seriously, very cool!</p>
]]></description><pubDate>Fri, 06 Jun 2025 01:18:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=44197035</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44197035</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44197035</guid></item><item><title><![CDATA[New comment by mmoskal in "Teaching Program Verification in Dafny at Amazon (2023)"]]></title><description><![CDATA[
<p><a href="https://github.com/verus-lang/verus">https://github.com/verus-lang/verus</a> is a similar tool for Rust (developed by previous heavy users of Dafny).</p>
]]></description><pubDate>Tue, 03 Jun 2025 01:10:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=44165197</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44165197</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44165197</guid></item><item><title><![CDATA[New comment by mmoskal in "Look ma, no bubbles designing a low-latency megakernel for Llama-1B"]]></title><description><![CDATA[
<p>They are reducing the forward pass time from, say, 1.5ms to 1ms. On a bigger model you would likely reduce from 15ms to 14.2ms or something like that.</p>
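A quick check of those example numbers: the absolute saving is of similar size, but the relative speedup shrinks as the forward pass grows.

```python
# Relative speedup from shaving a roughly fixed chunk off the forward pass,
# using the example numbers above (1.5ms -> 1ms small model, 15ms -> 14.2ms big).

speedups = []
for before, after in [(1.5, 1.0), (15.0, 14.2)]:
    speedups.append(before / after)
    print(f"{before}ms -> {after}ms: {before / after:.2f}x")
```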
]]></description><pubDate>Wed, 28 May 2025 03:55:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=44112610</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44112610</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44112610</guid></item><item><title><![CDATA[New comment by mmoskal in "Look ma, no bubbles designing a low-latency megakernel for Llama-1B"]]></title><description><![CDATA[
<p>The sglang and vllm numbers are with CUDA graphs enabled.<p>Having said that, a 1B model is an extreme example, hence the 1.5x speedup. For regular models and batch sizes this would probably buy you a few percent.</p>
]]></description><pubDate>Wed, 28 May 2025 03:52:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=44112597</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44112597</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44112597</guid></item><item><title><![CDATA[New comment by mmoskal in "Pyrefly vs. Ty: Comparing Python's two new Rust-based type checkers"]]></title><description><![CDATA[
<p>As mentioned in other comments, TypeScript, which follows this gradual-typing approach, has a number of flags to disable it (gradually, so to speak). There's no reason ty couldn't do the same.</p>
]]></description><pubDate>Tue, 27 May 2025 16:52:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=44108662</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44108662</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44108662</guid></item><item><title><![CDATA[New comment by mmoskal in "Project Verona: Fearless Concurrency for Python"]]></title><description><![CDATA[
<p>If you change one letter in the prompt, however insignificant you may think it is, it will change the results in unpredictable ways, even with temperature 0 etc. The same is not true of renaming a variable in a programming language, most refactorings etc.</p>
]]></description><pubDate>Sun, 18 May 2025 19:56:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=44023919</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44023919</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44023919</guid></item><item><title><![CDATA[New comment by mmoskal in "AI is draining water from areas that need it most"]]></title><description><![CDATA[
<p>I don't know. I suspect most people rate data centers higher than almond milk...</p>
]]></description><pubDate>Sat, 10 May 2025 20:36:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=43948776</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43948776</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43948776</guid></item><item><title><![CDATA[New comment by mmoskal in "Business books are entertainment, not strategic tools"]]></title><description><![CDATA[
<p>My understanding is that printing a 300-page paperback costs something like $2 while 50 pages cost $1.50. However, you can clearly charge way more for the 300 pages, so publishers are not interested in short books, business or otherwise.</p>
]]></description><pubDate>Sat, 10 May 2025 05:57:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=43943513</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43943513</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43943513</guid></item><item><title><![CDATA[New comment by mmoskal in "Nnd – a TUI debugger alternative to GDB, LLDB"]]></title><description><![CDATA[
<p>Very typical for anything with CUDA (they tend to compile everything for 10 different architectures times hundreds of template kernel parameters).<p>Not sure about ClickHouse though.</p>
]]></description><pubDate>Tue, 06 May 2025 16:12:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=43906781</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43906781</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43906781</guid></item><item><title><![CDATA[New comment by mmoskal in "Mercury, the first commercial-scale diffusion language model"]]></title><description><![CDATA[
<p>To put this into perspective, driving for an hour in an electric car (15kW avg consumption) consumes about as much energy as 50,000 ChatGPT queries [0].
Running your laptop for an hour would be around 100 queries.<p>[0] <a href="https://epoch.ai/gradient-updates/how-much-energy-does-chatgpt-use" rel="nofollow">https://epoch.ai/gradient-updates/how-much-energy-does-chatg...</a></p>
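Checking that arithmetic, using the roughly 0.3 Wh/query estimate from the linked epoch.ai post; the 30 W laptop draw is an assumption for illustration:

```python
# Energy comparison: car vs. laptop, in ChatGPT queries.

wh_per_query = 0.3        # Wh per ChatGPT query (per the linked estimate)
car_wh = 15_000 * 1       # 15 kW average draw for one hour
laptop_wh = 30 * 1        # assumed 30 W laptop for one hour

print(round(car_wh / wh_per_query))     # → 50000
print(round(laptop_wh / wh_per_query))  # → 100
```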
]]></description><pubDate>Wed, 30 Apr 2025 23:47:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=43852009</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43852009</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43852009</guid></item><item><title><![CDATA[New comment by mmoskal in "Qwen3: Think deeper, act faster"]]></title><description><![CDATA[
<p>Spec decoding only depends on the tokenizer used. It transfers either the draft token sequence or, at most, the draft logits to the main model.</p>
]]></description><pubDate>Mon, 28 Apr 2025 23:45:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=43827323</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43827323</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43827323</guid></item><item><title><![CDATA[New comment by mmoskal in "Qwen3: Think deeper, act faster"]]></title><description><![CDATA[
<p>Just for some calibration: approximately no one runs 32-bit for LLMs on any sort of iron, big or otherwise. Some models (e.g. DeepSeek V3, and derivatives like R1) are native FP8. FP8 was also common for Llama 3 405B serving.</p>
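The precision choice translates directly into weight memory; a quick back-of-the-envelope (parameter count times bytes per parameter, ignoring KV cache and activations):

```python
# Weight memory alone for a 405B-parameter model at different precisions.

params = 405e9
footprint = {name: params * b / 1e9  # GB
             for name, b in [("FP32", 4), ("FP16/BF16", 2), ("FP8", 1)]}
for name, gb in footprint.items():
    print(f"{name}: {gb:.0f} GB")
```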
]]></description><pubDate>Mon, 28 Apr 2025 23:17:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=43827140</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43827140</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43827140</guid></item></channel></rss>