<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: danvk</title><link>https://news.ycombinator.com/user?id=danvk</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 08 Apr 2026 01:40:34 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=danvk" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by danvk in "AI helps add 10k more photos to OldNYC"]]></title><description><![CDATA[
<p>(author here) Just to be clear, none of the photos were ever human-located. The system this replaced was, roughly, regular expression + Google Maps geocoding API. The only photos located by hand were the ~200 I used for my test set: <a href="https://github.com/danvk/oldnyc/blob/master/data/geocode/out.csv" rel="nofollow">https://github.com/danvk/oldnyc/blob/master/data/geocode/out...</a></p>
]]></description><pubDate>Tue, 07 Apr 2026 19:56:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47680557</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=47680557</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47680557</guid></item><item><title><![CDATA[New comment by danvk in "AI helps add 10k more photos to OldNYC"]]></title><description><![CDATA[
<p>(Author here) IIUC you're saying that 707133f-a should be at 5th Ave & 9th Street, not 5th Ave & Union Street? Can you say more about why? The text on the back of the first image says "Union St. Station, 5th Ave," which is how it winds up there. On the other hand, the NYPL page[1] titles the image "Union St. - 18th St."<p>(I briefly got excited that there might be a street sign _in_ the photo, but if you zoom way in it says "DENTIST")<p>+1 to 1940s.nyc. Very different photos: those were taken for tax assessment, while the ones on OldNYC were taken to document the city as it changed. The photographer had an arrangement where he'd get tips from demolition crews, and go shoot buildings before they were gone forever.<p>[1]: <a href="https://digitalcollections.nypl.org/items/5a5e06a0-c539-012f-25d1-58d385a7bc34?canvasIndex=0" rel="nofollow">https://digitalcollections.nypl.org/items/5a5e06a0-c539-012f...</a></p>
]]></description><pubDate>Tue, 07 Apr 2026 19:06:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47679896</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=47679896</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47679896</guid></item><item><title><![CDATA[New comment by danvk in "Dependent types and how to get rid of them"]]></title><description><![CDATA[
<p>Yes. You can give pickType a type in TS. Something like pickType<B extends boolean>(b: B): B extends true ? string : number.<p>I’d love to read a post explaining how TS conditional types are or are not a form of dependent types. Or, I’d like to understand dependent types well enough to write that post.</p>
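<p>A minimal sketch of that signature, assuming a hypothetical body for pickType (the returned values "hello" and 42 are made up for illustration). TypeScript does not narrow a conditional return type from control flow, so the implementation needs a type assertion:</p>

```typescript
// Hypothetical implementation: only the signature comes from the comment above.
function pickType<B extends boolean>(b: B): B extends true ? string : number {
  // The compiler cannot verify each branch against the conditional type,
  // so the result is asserted to the declared return type.
  return (b ? "hello" : 42) as B extends true ? string : number;
}

// Callers get a return type that depends on the argument's literal type.
const s: string = pickType(true);
const n: number = pickType(false);
```

<p>Swapping the annotations on the last two lines (string for the false call, number for the true call) is rejected by the type checker, which is the part that resembles a return type depending on a value.</p>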
]]></description><pubDate>Tue, 11 Nov 2025 04:28:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=45884117</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=45884117</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45884117</guid></item><item><title><![CDATA[New comment by danvk in "Lone coder cracks 50-year puzzle to find Boggle's top-scoring board"]]></title><description><![CDATA[
<p>It's open source, take a crack at it! Or file an issue requesting it.<p>This analysis doesn't make use of the Boggle dice. It assumes that any cell can be any letter. In practice, all high-scoring boards can be rolled with the Boggle dice. My code does assume the letters are A-Z, though, so the Ñ die in Spanish Boggle would require some code changes.</p>
]]></description><pubDate>Tue, 27 May 2025 13:06:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=44106594</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=44106594</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44106594</guid></item><item><title><![CDATA[New comment by danvk in "Lone coder cracks 50-year puzzle to find Boggle's top-scoring board"]]></title><description><![CDATA[
<p>Finding the highest-scoring board with a 16- or 17-letter word is a fun, but very different problem. There are few enough "Hamiltonian paths" through all the letters on a 4x4 Boggle board (~68,000) and few enough 16-letter words (~2,000) that you can enumerate all pairs in an hour or two.<p>Depending on wordlist and whether you want a 16 or 17 letter word, you get "charitablenesses", "supernaturalised" (British spelling), "quadricentennials" or "quartermistresses". These boards all score considerably lower than the REPLASTERING board. Full results here:
<a href="https://github.com/danvk/hybrid-boggle/#highest-scoring-boards-containing-a-16--or-17-letter-word">https://github.com/danvk/hybrid-boggle/#highest-scoring-boar...</a><p>I hadn't realized until I did this "side quest" that most wordlists top out at 15-letter words. That makes sense for a Scrabble dictionary, but it's not great for Boggle.</p>
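<p>For a sense of why the path enumeration is tractable, here is a rough sketch of counting Hamiltonian paths on a Boggle grid with a plain depth-first search. This is not the code from the repo: countHamiltonianPaths is a hypothetical helper, and it counts directed paths (a path and its reverse are counted separately), which may not be the convention behind the ~68,000 figure.</p>

```typescript
// Count directed Hamiltonian paths on a rows x cols grid where cells are
// adjacent horizontally, vertically, or diagonally (Boggle adjacency).
function countHamiltonianPaths(rows: number, cols: number): number {
  const n = rows * cols;

  // Precompute the neighbor list of every cell.
  const neighbors: number[][] = [];
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      const adj: number[] = [];
      for (let dr = -1; dr <= 1; dr++) {
        for (let dc = -1; dc <= 1; dc++) {
          const nr = r + dr;
          const nc = c + dc;
          const isSelf = dr === 0 && dc === 0;
          if (!isSelf && nr >= 0 && nr < rows && nc >= 0 && nc < cols) {
            adj.push(nr * cols + nc);
          }
        }
      }
      neighbors.push(adj);
    }
  }

  const visited = new Array<boolean>(n).fill(false);

  // Extend a path that currently ends at `cell` and covers `used` cells.
  const extend = (cell: number, used: number): number => {
    if (used === n) return 1; // every cell visited exactly once
    let total = 0;
    for (const next of neighbors[cell]) {
      if (!visited[next]) {
        visited[next] = true;
        total += extend(next, used + 1);
        visited[next] = false;
      }
    }
    return total;
  };

  // Try every cell as the starting point.
  let count = 0;
  for (let start = 0; start < n; start++) {
    visited[start] = true;
    count += extend(start, 1);
    visited[start] = false;
  }
  return count;
}
```

<p>On a 2x2 grid every cell touches every other, so the count is 4! = 24. Calling countHamiltonianPaths(4, 4) gives the 4x4 count; pairing each path with each 16-letter word then yields the candidate boards.</p>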
]]></description><pubDate>Mon, 26 May 2025 12:27:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=44096801</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=44096801</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44096801</guid></item><item><title><![CDATA[New comment by danvk in "Lone coder cracks 50-year puzzle to find Boggle's top-scoring board"]]></title><description><![CDATA[
<p>There are simpler ways to calculate a bound that don’t involve trees. You can read about the sum and max bounds in the WIP paper: <a href="https://github.com/danvk/hybrid-boggle/blob/main/paper">https://github.com/danvk/hybrid-boggle/blob/main/paper</a><p>There are some examples in these old posts:<p>- <a href="https://www.danvk.org/wp/2009-08-08/breaking-3x3-boggle/index.html" rel="nofollow">https://www.danvk.org/wp/2009-08-08/breaking-3x3-boggle/inde...</a><p>- <a href="https://www.danvk.org/wp/2009-08-11/a-few-more-boggle-examples/index.html" rel="nofollow">https://www.danvk.org/wp/2009-08-11/a-few-more-boggle-exampl...</a><p>These bounds are pretty effective at finding the global max for 3x3 Boggle, but 4x4 is a lot bigger.<p>There is a mapping from Boggle optimization to ILP, but I’ve seen no evidence that this is an efficient way to solve it. I’ve been told that branch and cut is usually better than branch and bound, but I don’t know whether it’s applicable to Boggle.</p>
]]></description><pubDate>Sun, 25 May 2025 12:57:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=44087478</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=44087478</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44087478</guid></item><item><title><![CDATA[New comment by danvk in "Lone coder cracks 50-year puzzle to find Boggle's top-scoring board"]]></title><description><![CDATA[
<p>I have been surprised that the Boggle code runs about 4x slower on the GCP machine than on my M2 MacBook. I don’t have enough experience running CPU- and RAM-intensive cloud jobs to know whether this is normal.</p>
]]></description><pubDate>Sun, 25 May 2025 12:29:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=44087342</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=44087342</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44087342</guid></item><item><title><![CDATA[New comment by danvk in "Lone coder cracks 50-year puzzle to find Boggle's top-scoring board"]]></title><description><![CDATA[
<p>> “As far as I can tell, I’m the only person who is actually interested in this problem,” Vanderkam said.<p>For context, many people are interested in finding high-scoring Boggle boards, usually via simulated annealing, hillclimbing, or genetic algorithms. But so far as I can tell, I'm the only one interested in _proving_ that a particular board is best. Doing that was the new result here.</p>
]]></description><pubDate>Sat, 24 May 2025 22:08:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=44084092</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=44084092</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44084092</guid></item><item><title><![CDATA[New comment by danvk in "Lone coder cracks 50-year puzzle to find Boggle's top-scoring board"]]></title><description><![CDATA[
<p>"Lone coder" here. I reached out to Ollie (the FT reporter) because he'd written a book (Seven Games) about computers and games, so I thought the Boggle story might interest him. It did!</p>
]]></description><pubDate>Sat, 24 May 2025 21:54:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=44084022</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=44084022</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44084022</guid></item><item><title><![CDATA[New comment by danvk in "After 20 years, the globally optimal Boggle board"]]></title><description><![CDATA[
<p>Your best bet in that case is to store the dictionary in a Trie or DAWG structure that can be mmapped directly from disk.</p>
]]></description><pubDate>Thu, 24 Apr 2025 19:25:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=43786482</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=43786482</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43786482</guid></item><item><title><![CDATA[New comment by danvk in "A Computational Proof of the Highest-Scoring Boggle Board"]]></title><description><![CDATA[
<p>Sorry, but this doesn’t pass the smell test. The article mentions 200,000 random 4x4 boards/second on a single core on an M2. That’s a ~4GHz chip, so that's ~20,000 ops/board. There are 200,000 words in the dictionary. You can’t possibly do something for every word in the dictionary; it would be too slow.<p>It sounds like your Trie implementation had a bug or inefficiency.</p>
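<p>The back-of-envelope arithmetic, spelled out as a small sketch. The inputs are the round numbers quoted above; the function names are made up for illustration, and "ops" is treated as one clock cycle, which is an assumption.</p>

```typescript
// Cycles available per board at a given clock rate and throughput.
function cyclesPerBoard(clockHz: number, boardsPerSecond: number): number {
  return clockHz / boardsPerSecond;
}

// Cycles available per dictionary word if every word were checked on every board.
function cyclesPerWord(cyclesForBoard: number, dictionaryWords: number): number {
  return cyclesForBoard / dictionaryWords;
}

const perBoard = cyclesPerBoard(4e9, 200_000);     // 20,000 cycles per board
const perWord = cyclesPerWord(perBoard, 200_000);  // 0.1 cycles per word
```

<p>A budget of a tenth of a cycle per word is why a per-word scan can't be what's happening at that throughput.</p>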
]]></description><pubDate>Thu, 24 Apr 2025 02:18:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=43778699</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=43778699</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43778699</guid></item><item><title><![CDATA[New comment by danvk in "After 20 years, the globally optimal Boggle board"]]></title><description><![CDATA[
<p>You can see all the wordlists I used here:
<a href="https://github.com/danvk/hybrid-boggle/tree/main/wordlists">https://github.com/danvk/hybrid-boggle/tree/main/wordlists</a><p>The proof used ENABLE2K — repeating it for other wordlists would require another ~23,000 CPU hours each.</p>
]]></description><pubDate>Wed, 23 Apr 2025 20:33:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=43776352</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=43776352</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43776352</guid></item><item><title><![CDATA[New comment by danvk in "A Computational Proof of the Highest-Scoring Boggle Board"]]></title><description><![CDATA[
<p>Great! Feel free to reach out -- my email isn't hard to find.</p>
]]></description><pubDate>Wed, 23 Apr 2025 19:01:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=43775536</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=43775536</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43775536</guid></item><item><title><![CDATA[New comment by danvk in "After 20 years, the globally optimal Boggle board"]]></title><description><![CDATA[
<p>If you want to give it a try, I'd love to hear if that's the case! It's deleted in the repo now, but here's code to generate a spec for an ILP solver: <a href="https://github.com/danvk/hybrid-boggle/blob/62d3f01aed802734a28a119df6c59b684d38667c/boggle/z3.py">https://github.com/danvk/hybrid-boggle/blob/62d3f01aed802734...</a><p>One interesting thing about Boggle is that the number of variables (16 cells) is very small compared to the number of coefficients on how they combine (the number of possible words).</p>
]]></description><pubDate>Wed, 23 Apr 2025 18:32:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=43775242</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=43775242</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43775242</guid></item><item><title><![CDATA[New comment by danvk in "A Computational Proof of the Highest-Scoring Boggle Board"]]></title><description><![CDATA[
<p>Annealing is mentioned a few times in the post but not discussed in any detail. I found that hill climbing with an expanded "pool" of boards and exhaustive search of neighbors was the most reliable way to get from a random starting point to the highest-scoring board: <a href="https://github.com/danvk/hybrid-boggle/blob/main/boggle/hillclimb.py">https://github.com/danvk/hybrid-boggle/blob/main/boggle/hill...</a></p>
]]></description><pubDate>Wed, 23 Apr 2025 18:27:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=43775164</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=43775164</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43775164</guid></item><item><title><![CDATA[New comment by danvk in "After 20 years, the globally optimal Boggle board"]]></title><description><![CDATA[
<p>I actually did try ILP, see <a href="https://stackoverflow.com/questions/79422270/why-is-my-z3-and-or-tools-formulation-of-a-problem-slower-than-brute-force-in-py" rel="nofollow">https://stackoverflow.com/questions/79422270/why-is-my-z3-an...</a><p>I tried Z3 and OR Tools. I didn't try Gurobi. But this was enough to make me think ILP was a dead end. (There were a lot of dead ends in this project.)<p>I don't know much about integer programming, though, and I'd love to be proven wrong.</p>
]]></description><pubDate>Wed, 23 Apr 2025 18:23:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=43775112</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=43775112</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43775112</guid></item><item><title><![CDATA[After 20 years, the globally optimal Boggle board]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.danvk.org/2025/04/23/boggle-solved.html">https://www.danvk.org/2025/04/23/boggle-solved.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43774702">https://news.ycombinator.com/item?id=43774702</a></p>
<p>Points: 78</p>
<p># Comments: 23</p>
]]></description><pubDate>Wed, 23 Apr 2025 17:45:35 +0000</pubDate><link>https://www.danvk.org/2025/04/23/boggle-solved.html</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=43774702</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43774702</guid></item><item><title><![CDATA[New comment by danvk in "How far can you get in 40 minutes from each subway station in NYC?"]]></title><description><![CDATA[
<p>We built a visualization along these lines at Sidewalk Labs back in 2017. It's open source if you're interested in playing around with it: <a href="https://github.com/sidewalklabs/router">https://github.com/sidewalklabs/router</a><p>I particularly liked the multimodal comparison feature. It lets you answer questions like "where does the bus help me get to faster than the subway?" (Answer: basically nowhere.)</p>
]]></description><pubDate>Sun, 26 Jan 2025 16:58:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=42831514</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=42831514</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42831514</guid></item><item><title><![CDATA[New comment by danvk in "Llama-OCR: Document to Markdown"]]></title><description><![CDATA[
<p>I have not, but that's a great idea!</p>
]]></description><pubDate>Sun, 17 Nov 2024 13:00:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=42163981</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=42163981</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42163981</guid></item><item><title><![CDATA[New comment by danvk in "Llama-OCR: Document to Markdown"]]></title><description><![CDATA[
<p>I've had really good luck recently running OCR over a corpus of images using gpt-4o. The most important thing I realized was that non-fancy data prep is still important, even with fancy LLMs. Cropping my images to just the text (excluding any borders) and increasing the contrast of the image helped enormously. (I wrote about this in 2015 and this post still holds up well with GPT: <a href="https://www.danvk.org/2015/01/07/finding-blocks-of-text-in-an-image-using-python-opencv-and-numpy.html" rel="nofollow">https://www.danvk.org/2015/01/07/finding-blocks-of-text-in-a...</a>).<p>I also found that giving GPT at most a few paragraphs at a time worked better than giving it whole pages. Shorter text = less chance to hallucinate.</p>
]]></description><pubDate>Sat, 16 Nov 2024 14:35:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=42156646</link><dc:creator>danvk</dc:creator><comments>https://news.ycombinator.com/item?id=42156646</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42156646</guid></item></channel></rss>