<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: jbarrow</title><link>https://news.ycombinator.com/user?id=jbarrow</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 14 Apr 2026 21:35:38 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=jbarrow" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by jbarrow in "LFM2.5-350M: No Size Left Behind"]]></title><description><![CDATA[
<p>Very cool to see a company pushing what's possible with (relatively) tiny models! A 350M-parameter model trained on 28T tokens that, per the benchmarks, is competitive with Qwen3.5-0.8B.<p>Comparing the architecture to Qwen3.5, it seems:<p>- fewer, wider layers<p>- mixing full attention and convolutions, instead of the full + linear attention of Qwen3.5<p>- the vocab is about 1/4 the size</p>
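A back-of-envelope sketch of why the vocab difference matters at this scale (the dimensions below are hypothetical round numbers, not the actual model configs):

```python
# Embedding parameters scale as vocab_size * d_model, so at 350M total
# parameters a 4x smaller vocab frees up a meaningful fraction of the
# budget for layers. Numbers are illustrative, not the real configs.
def embedding_params(vocab_size: int, d_model: int) -> int:
    return vocab_size * d_model

big_vocab   = embedding_params(64_000, 1024)   # 65,536,000 params
small_vocab = embedding_params(16_000, 1024)   # 16,384,000 params

total = 350_000_000
print(f"large vocab: {big_vocab / total:.1%} of a 350M model")   # ~18.7%
print(f"small vocab: {small_vocab / total:.1%} of a 350M model") # ~4.7%
```

At the 7B+ scale the embedding table is noise; at 350M it can be nearly a fifth of the model, which is presumably part of the motivation.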
]]></description><pubDate>Wed, 01 Apr 2026 02:31:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=47596074</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=47596074</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47596074</guid></item><item><title><![CDATA[LFM2.5-350M: No Size Left Behind]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.liquid.ai/blog/lfm2-5-350m-no-size-left-behind">https://www.liquid.ai/blog/lfm2-5-350m-no-size-left-behind</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47596047">https://news.ycombinator.com/item?id=47596047</a></p>
<p>Points: 3</p>
<p># Comments: 1</p>
]]></description><pubDate>Wed, 01 Apr 2026 02:26:32 +0000</pubDate><link>https://www.liquid.ai/blog/lfm2-5-350m-no-size-left-behind</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=47596047</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47596047</guid></item><item><title><![CDATA[New comment by jbarrow in "What every computer scientist should know about floating-point arithmetic (1991) [pdf]"]]></title><description><![CDATA[
<p>Shared this because I was having fun thinking through floating point numbers the other day.<p>I worked through what fp6 (e3m2) would look like, doing manual additions and multiplications, showing cases where the operations are non-associative, etc. and then I wanted something more rigorous to read.<p>For anyone interested in floating point numbers, I highly recommend working through fp6 as an activity! Felt like I truly came away with a much deeper understanding of floats. Anything less than fp6 felt too simple/constrained, and anything more than fp6 felt like too much to write out by hand. For fp6 you can enumerate all 64 possible values on a small sheet of paper.<p>For anyone not (yet) interested in floating point numbers, I’d still recommend giving it a shot.</p>
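The fp6 exercise is also easy to cross-check in code. Here is a minimal decoder, assuming an IEEE-754-style e3m2 convention (bias 3, subnormals, top exponent reserved for inf/NaN); note that real fp6 variants such as OCP MX E3M2 drop inf/NaN, so this is just one possible convention:

```python
# fp6 (e3m2): 1 sign bit, 3 exponent bits, 2 mantissa bits.
def fp6_value(bits: int) -> float:
    sign = -1.0 if (bits >> 5) & 1 else 1.0
    exp = (bits >> 2) & 0b111
    man = bits & 0b11
    if exp == 0:                 # subnormal: 0.mm * 2^(1 - bias)
        return sign * (man / 4) * 2.0 ** (1 - 3)
    if exp == 0b111:             # inf (man == 0) or NaN
        return sign * float("inf") if man == 0 else float("nan")
    return sign * (1 + man / 4) * 2.0 ** (exp - 3)  # normal: 1.mm * 2^(e - bias)

# All 64 code points really do fit on a small sheet of paper.
FINITE = sorted({fp6_value(b) for b in range(64)
                 if fp6_value(b) == fp6_value(b)           # drop NaN
                 and abs(fp6_value(b)) != float("inf")})   # drop inf

def fp6_round(x: float) -> float:
    """Round to the nearest representable finite fp6 value."""
    return min(FINITE, key=lambda v: (abs(v - x), abs(v)))

# Non-associativity: rounding after each add changes the answer.
a, b, c = 0.25, 12.0, -12.0
left  = fp6_round(fp6_round(a + b) + c)   # 12.25 rounds to 12, so 0.0
right = fp6_round(a + fp6_round(b + c))   # b + c is exactly 0, so 0.25
```

With `FINITE` in hand you can print the full value table or search for other non-associative triples mechanically, which makes a nice check on the by-hand work.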
]]></description><pubDate>Mon, 16 Mar 2026 14:55:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47399874</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=47399874</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47399874</guid></item><item><title><![CDATA[What every computer scientist should know about floating-point arithmetic (1991) [pdf]]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf">https://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47343902">https://news.ycombinator.com/item?id=47343902</a></p>
<p>Points: 125</p>
<p># Comments: 56</p>
]]></description><pubDate>Wed, 11 Mar 2026 23:24:22 +0000</pubDate><link>https://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=47343902</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47343902</guid></item><item><title><![CDATA[New comment by jbarrow in "Don't post generated/AI-edited comments. HN is for conversation between humans"]]></title><description><![CDATA[
<p>I've been noticing a _lot_ more AI-generated/edited content of late, both comments and stories. It's gotten to the point that I spend a lot less time on HN than I used to, and if it continues to get worse I expect I'll quit altogether.<p>At the end of the day, I'm here because of all the thoughtful commenters and people sharing interesting stories.</p>
]]></description><pubDate>Wed, 11 Mar 2026 22:34:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=47343218</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=47343218</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47343218</guid></item><item><title><![CDATA[New comment by jbarrow in "Thank HN: You helped save 33k lives"]]></title><description><![CDATA[
<p>Watsi is incredibly inspiring!<p>I’ve been a monthly donor since ~the beginning when I was just an undergraduate, and I still read the stories and emails I receive. I’m glad that you opted for the steady growth path, and that you’ve made it a sustainable thing.</p>
]]></description><pubDate>Tue, 17 Feb 2026 23:05:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=47054721</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=47054721</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47054721</guid></item><item><title><![CDATA[New comment by jbarrow in "Nano-vLLM: How a vLLM-style inference engine works"]]></title><description><![CDATA[
<p>The whole thing feels AI written, generated from the codebase.*<p>*this is incorrect per the author’s response, my apologies.<p>For instance, it goes into (nano)vLLM internals and doesn’t mention PagedAttention once (one of the core ideas that vLLM is based on)[1].<p>Also mentions that Part 2 will cover dense vs MoE’s, which is weird because nanovllm hardcodes a dense Qwen3 into the source.<p>Here are better (imo) explainers about how vLLM works:<p>- <a href="https://hamzaelshafie.bearblog.dev/paged-attention-from-first-principles-a-view-inside-vllm/" rel="nofollow">https://hamzaelshafie.bearblog.dev/paged-attention-from-firs...</a><p>- <a href="https://www.aleksagordic.com/blog/vllm" rel="nofollow">https://www.aleksagordic.com/blog/vllm</a><p>- <a href="https://huggingface.co/blog/continuous_batching" rel="nofollow">https://huggingface.co/blog/continuous_batching</a><p>Aleksa’s blog is a bit in the weeds for my taste but it’s really worth working through.<p>A lot of the magic of vLLM happens in the PagedAttention kernels, which are really succinctly implemented in nanovllm. And the codebase is great and readable by itself!<p>—<p>1. <a href="https://arxiv.org/abs/2309.06180" rel="nofollow">https://arxiv.org/abs/2309.06180</a></p>
]]></description><pubDate>Mon, 02 Feb 2026 14:18:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=46856317</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=46856317</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46856317</guid></item><item><title><![CDATA[New comment by jbarrow in "Show HN: Browser-based PDF form fields detection (YOLO-based)"]]></title><description><![CDATA[
<p>Super interesting. Would you be willing to try the Python package (<a href="https://github.com/jbarrow/commonforms" rel="nofollow">https://github.com/jbarrow/commonforms</a>) or share the PDFs?<p>For the non-ONNX models there are some inference tricks that generally improve performance, and lowering the confidence threshold could potentially help.</p>
]]></description><pubDate>Mon, 20 Oct 2025 15:18:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=45644936</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=45644936</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45644936</guid></item><item><title><![CDATA[New comment by jbarrow in "Show HN: Browser-based PDF form fields detection (YOLO-based)"]]></title><description><![CDATA[
<p>Hey, Benjamin, thanks for the attribution! Happy to field any questions HN users have.<p>It's really gratifying to see people building on the work, and I love that it's possible to do browser-side/on-device.</p>
]]></description><pubDate>Sun, 19 Oct 2025 16:39:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=45635580</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=45635580</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45635580</guid></item><item><title><![CDATA[New comment by jbarrow in "Ask HN: What are you working on? (October 2025)"]]></title><description><![CDATA[
<p>Woah, did not realize that, haha. Let me know if it works well!</p>
]]></description><pubDate>Tue, 14 Oct 2025 00:25:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=45574847</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=45574847</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45574847</guid></item><item><title><![CDATA[New comment by jbarrow in "Ask HN: What are you working on? (October 2025)"]]></title><description><![CDATA[
<p>Training ML models for PDF forms. You can try out what I’ve got so far with this service that automatically detects where fields should go and makes PDFs fillable: <a href="https://detect.semanticdocs.org/" rel="nofollow">https://detect.semanticdocs.org/</a>
Code and models are at: <a href="https://github.com/jbarrow/commonforms" rel="nofollow">https://github.com/jbarrow/commonforms</a><p>That’s built on a dataset and paper I wrote called CommonForms, where I scraped CommonCrawl for hundreds of thousands of fillable form pages and used that as a training set:<p><a href="https://arxiv.org/abs/2509.16506" rel="nofollow">https://arxiv.org/abs/2509.16506</a><p>Next step is training and releasing some DETRs, which I think will drive quality even higher. But the ultimate end goal is working on automatic form accessibility.</p>
]]></description><pubDate>Mon, 13 Oct 2025 10:40:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=45566892</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=45566892</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45566892</guid></item><item><title><![CDATA[New comment by jbarrow in "Show HN: CommonForms – open models to auto-detect PDF form fields"]]></title><description><![CDATA[
<p>Existing “auto-fillable” tools are pretty lackluster in my experience. CommonForms is tooling that can automatically detect form fields in PDFs and turn those PDFs into fillable documents. The dataset is ~500k form pages pulled from Common Crawl, which I trained the object detectors on. For being vision only, the results are pretty remarkable!<p>Releasing the dataset, paper, models, and (imo most importantly) simple/convenient tooling to automatically prepare any PDF.<p>Links: Repo: <a href="https://github.com/jbarrow/commonforms" rel="nofollow">https://github.com/jbarrow/commonforms</a> - Paper: <a href="https://arxiv.org/abs/2509.16506" rel="nofollow">https://arxiv.org/abs/2509.16506</a></p>
]]></description><pubDate>Thu, 02 Oct 2025 14:31:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=45450138</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=45450138</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45450138</guid></item><item><title><![CDATA[Show HN: CommonForms – open models to auto-detect PDF form fields]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/jbarrow/commonforms">https://github.com/jbarrow/commonforms</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45450135">https://news.ycombinator.com/item?id=45450135</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Thu, 02 Oct 2025 14:31:49 +0000</pubDate><link>https://github.com/jbarrow/commonforms</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=45450135</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45450135</guid></item><item><title><![CDATA[New comment by jbarrow in "Cloud Run GPUs, now GA, makes running AI workloads easier for everyone"]]></title><description><![CDATA[
<p>I’m personally a huge fan of Modal, and have been using their serverless scale-to-zero GPUs for a while. We’ve seen some nice cost reductions from using them, while also being able to scale WAY UP when needed. All with minimal development effort.<p>Interesting to see a big provider entering this space. Originally swapped to Modal because big providers weren’t offering this (e.g. AWS lambdas can’t run on GPU instances). Assuming all providers are going to start moving towards offering this?</p>
]]></description><pubDate>Wed, 04 Jun 2025 10:00:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=44179045</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=44179045</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44179045</guid></item><item><title><![CDATA[New comment by jbarrow in "Magic Ink: Information Software and the Graphical Interface"]]></title><description><![CDATA[
<p>If you enjoyed this essay, you should check out the author’s current project, Dynamicland[1]. It is a wonderful expression of what computing and interaction could be. Even the project website — navigating a physical shelf, and every part is hyperlinked — is joyful.<p>1. <a href="https://dynamicland.org/" rel="nofollow">https://dynamicland.org/</a></p>
]]></description><pubDate>Tue, 03 Jun 2025 06:05:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=44166862</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=44166862</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44166862</guid></item><item><title><![CDATA[New comment by jbarrow in "Show HN: Free, in-browser PDF editor"]]></title><description><![CDATA[
<p>Editing text in PDFs is _really_ hard compared to other document formats because most PDFs don't really encode the "physics" of the document. I.e. there isn't a notion of a "text block with word wrapping," it's more "glyphs inserted at location X with font Y."<p>If the PDF hasn't been made accessible, you have to do a lot of inferencing based on the layout about how things are grouped and how they should flow if you want to be able to make meaningful edits. Not impossible (Acrobat does it), but very challenging.<p>It's part of the legacy of PDF as a format for presentation and print jobs, rather than typesetting.</p>
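A toy content-stream fragment illustrates the point — glyph runs placed by explicit positioning operators, with no paragraph or wrapping structure for an editor to recover (the operands here are illustrative, not from any particular file):

```postscript
BT                      % begin text object
/F1 12 Tf               % select font F1 at 12 pt
72 712 Td               % move the text cursor to (72, 712)
(Editing text in ) Tj   % paint one run of glyphs
(PDFs is hard.) Tj      % another run; nothing marks it as the same sentence
0 -14 Td                % the "next line" is just another relative move
(Reflow is on you.) Tj
ET                      % end text object
```

Nothing ties the three `Tj` runs into one paragraph; an editor has to infer grouping and reading order from coordinates and font metrics, which is exactly the layout inference described above.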
]]></description><pubDate>Sat, 03 May 2025 19:57:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=43881724</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=43881724</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43881724</guid></item><item><title><![CDATA[New comment by jbarrow in "Show HN: Free, in-browser PDF editor"]]></title><description><![CDATA[
<p>Wonderful! Inserted form-fields show up in Preview and Acrobat, which is not a trivial task. I run a little AI-powered tool that automatically figures out where form fields should go (<a href="https://detect.penpusher.app" rel="nofollow">https://detect.penpusher.app</a>) and robustly adding form fields to the PDF was the hardest part.<p>Fwiw, I do see the issue with being unable to scroll down across both Safari and Chrome.</p>
]]></description><pubDate>Sat, 03 May 2025 19:50:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=43881637</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=43881637</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43881637</guid></item><item><title><![CDATA[New comment by jbarrow in "Show HN: Automatically Turn PDFs into Fillable Forms"]]></title><description><![CDATA[
<p>I'm sorry I missed this earlier, but I absolutely believe that it could do that. Do you have any pointers to PDF forms that work well or don't work well with screen readers? I'd be happy to take a look, and see if I can improve this tool based on that.<p>In addition, did you try the "enhanced" pipeline? It gives each field a meaningful name based on the label, which might help with accessibility.<p>PDF accessibility is a huge issue that _should_ be easily solved, but isn't, unfortunately.</p>
]]></description><pubDate>Thu, 20 Mar 2025 03:45:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=43419741</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=43419741</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43419741</guid></item><item><title><![CDATA[Show HN: Automatically Turn PDFs into Fillable Forms]]></title><description><![CDATA[
<p>My hobby project is building a form filling assistant for complex forms, but it turns out that not every PDF form is digitally fillable. A lot of PDF forms are still made with the expectation that you print and fill them by hand. To fix this, I trained a series of models that detect where form fields _should_ exist, then built this utility to turn them into interactive forms.<p>In my experience it's the best tool for the job across nearly every form. It ended up being super useful and pretty close to free to run (thanks Modal!), so I've hosted it as a free utility at <a href="https://detect.penpusher.app" rel="nofollow">https://detect.penpusher.app</a>.<p>There are some advanced settings as well, so if you're not getting good results with a specific pipeline you can try tweaking those, or let me know and I'd be happy to take a look. Plus the next round of model training is focused on improving handling of scans.<p>Hope this is useful and happy to answer any questions!</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43269590">https://news.ycombinator.com/item?id=43269590</a></p>
<p>Points: 3</p>
<p># Comments: 2</p>
]]></description><pubDate>Wed, 05 Mar 2025 17:28:23 +0000</pubDate><link>https://detect.penpusher.app</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=43269590</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43269590</guid></item><item><title><![CDATA[New comment by jbarrow in "Ingesting PDFs and why Gemini 2.0 changes everything"]]></title><description><![CDATA[
<p>> Unfortunately Gemini really seems to struggle on this, and no matter how we tried prompting it, it would generate wildly inaccurate bounding boxes<p>Qwen2.5 VL was trained on a special HTML format for doing OCR with bounding boxes. [1] The resulting boxes aren't quite as accurate as something like Textract/Surya, but I've found they're much more accurate than Gemini or any other LLM.<p>[1] <a href="https://qwenlm.github.io/blog/qwen2.5-vl/" rel="nofollow">https://qwenlm.github.io/blog/qwen2.5-vl/</a></p>
]]></description><pubDate>Wed, 05 Feb 2025 20:02:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=42954299</link><dc:creator>jbarrow</dc:creator><comments>https://news.ycombinator.com/item?id=42954299</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42954299</guid></item></channel></rss>