<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ahaspel</title><link>https://news.ycombinator.com/user?id=ahaspel</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 22 Apr 2026 16:26:14 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ahaspel" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>I wanted to let everyone know that searching for articles from within an article is now working properly again. It was a path problem. Apologies.</p>
]]></description><pubDate>Wed, 22 Apr 2026 00:46:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47857153</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47857153</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47857153</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>Thanks, nice catch. The tables can be tricky and I appreciate the heads-up on this markup leak. It will be corrected shortly.</p>
]]></description><pubDate>Tue, 21 Apr 2026 23:58:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47856588</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47856588</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47856588</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>It would indeed. I will see about working this in; it's highly pertinent.</p>
]]></description><pubDate>Tue, 21 Apr 2026 20:04:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=47853826</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47853826</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47853826</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>I'm looking forward to it. The 9th is great in its own right and a lot of it is in the 11th. Alfred Newton's nearly 200 articles on bird species and a few classic essays by Macaulay come to mind offhand.</p>
]]></description><pubDate>Tue, 21 Apr 2026 19:59:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47853766</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47853766</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47853766</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>If you're reading an article, just go to the top and type in the left-hand search box. That will search for articles as well as text within articles. The right-hand box searches the text of the article you're reading.</p>
]]></description><pubDate>Tue, 21 Apr 2026 19:49:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47853642</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47853642</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47853642</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>I feel exactly the same way about encyclopedias and dictionaries. And Encarta really was amazing. You'd be surprised how much modern criticism of the 11th amounts to "no entry on the Great War", except in earnest.</p>
]]></description><pubDate>Tue, 21 Apr 2026 19:20:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47853276</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47853276</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47853276</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>I'm familiar with the Synopticon, which would be fun to structure.<p>I didn’t do OCR myself, except for the topic index and to fill in a few gaps. I started from existing Wikisource text and then built a pipeline around that: cleaning (headers, hyphenation, etc.), detecting article boundaries, reconstructing sections, and linking things back to the original page images. Most of the effort went into rendering the complex layouts and handling the cross-linking, not the initial ingestion.<p>Glad to go into more detail if you’re interested, but that’s the gist of it.</p>
]]></description><pubDate>Tue, 21 Apr 2026 19:16:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=47853213</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47853213</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47853213</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>Under the hood it’s not XML-TEI; it’s a relational/data-pipeline approach, with article boundaries, sections, contributors, cross-references, and source-page provenance all reconstructed into structured records. The text itself is public domain, but I haven’t released a bulk structured export yet.<p>Requests for dataset access have definitely been one of the themes of this thread. I’m taking that seriously. If I do expose it, I’d want to do it in a form that preserves the structure and doesn't just dump plain text.</p>
]]></description><pubDate>Tue, 21 Apr 2026 19:05:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=47853054</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47853054</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47853054</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>No doubt. That’s one of the reasons I find the 1911 edition interesting — the authors have more license to express their own opinions, which naturally reflect those current at the time.</p>
]]></description><pubDate>Tue, 21 Apr 2026 18:58:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852965</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852965</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852965</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>Just me. I spent a lot of time thinking about this, so I like talking about it.</p>
]]></description><pubDate>Tue, 21 Apr 2026 18:54:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852901</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852901</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852901</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>Yes, that’s one of the things I like most about it. The articles have a personal tone and are less homogenized.<p>You get that mix of geography, history, and sometimes quite opinionated description all in one place, which makes them much more readable, in my view. My introduction to this version discusses this and other related matters: <a href="https://britannica11.org/about.html" rel="nofollow">https://britannica11.org/about.html</a></p>
]]></description><pubDate>Tue, 21 Apr 2026 18:43:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852747</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852747</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852747</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>I hadn’t seen that before; it’s a great collection. I like the breadth across editions.</p>
]]></description><pubDate>Tue, 21 Apr 2026 18:39:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852700</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852700</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852700</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>That’s a fun idea — I can see the appeal of that style.<p>The underlying text is public domain, but the structured version here is something I put together for the site. I haven’t released a bulk dataset yet.<p>If you end up experimenting with it, I’d love to hear how it turns out — and I’m still figuring out what structured access might look like.</p>
]]></description><pubDate>Tue, 21 Apr 2026 18:37:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852669</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852669</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852669</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>Excellent points. There are indeed two Zurich articles. One way to get to the city is to search for Zurich and open the second one, which goes to the city directly. The xref in Zurich (canton) is indeed a disambiguation bug (identically named articles); thanks for catching that.<p>I haven't tested the article search box on the article viewer in Firefox. I'll look into that as well.<p>Making the title linkable is a great idea and it will be implemented shortly. Thanks for catching all of this.</p>
]]></description><pubDate>Tue, 21 Apr 2026 18:35:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852631</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852631</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852631</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>The 1911 text itself is public domain, so anyone is free to use it.<p>What I’ve built here is a structured edition — the parsing, reconstruction, linking, indexing, etc. I haven’t published a formal license for that yet.<p>For casual or small-scale use there’s no issue at all. For bulk use (e.g. dataset / training / redistribution), I’d prefer people get in touch so I can figure out a sensible way to support that.</p>
]]></description><pubDate>Tue, 21 Apr 2026 18:18:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852419</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852419</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852419</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>Thanks — really appreciate that, and glad it worked well for a random article.<p>That’s a great suggestion. A side-by-side text + page view would be very nice for exactly the reasons you mention (verifying the text and seeing the original layout). I haven’t built that yet, but I’ve considered it.<p>Also helpful to hear that the links to the scans weren’t immediately obvious — I should probably make them a bit clearer. This may also not be obvious, but you can click the vol:page links in the left margin and go directly to the scan of whatever page you're reading.<p>Thanks again.</p>
]]></description><pubDate>Tue, 21 Apr 2026 18:15:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852371</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852371</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852371</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>I know exactly what you mean — I had the same experience with CD-ROM encyclopedias. There’s something about just browsing and falling into articles that’s hard to replicate.<p>Part of the motivation here was to bring that kind of exploration back, but with the original 1911 text and structure.</p>
]]></description><pubDate>Tue, 21 Apr 2026 18:12:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852338</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852338</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852338</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>Try Jenghiz Khan. That's how it was spelled then. Or just search for plain Khan and scroll the results.</p>
]]></description><pubDate>Tue, 21 Apr 2026 18:09:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852278</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852278</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852278</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>That’s exactly the use case I had in mind. The 11th is full of gems like that, but they’ve never been easy to point people to.</p>
]]></description><pubDate>Tue, 21 Apr 2026 18:06:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852248</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852248</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852248</guid></item><item><title><![CDATA[New comment by ahaspel in "Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica"]]></title><description><![CDATA[
<p>That’s high praise. Those are both great projects and this one is definitely in the same spirit.</p>
]]></description><pubDate>Tue, 21 Apr 2026 18:04:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47852223</link><dc:creator>ahaspel</dc:creator><comments>https://news.ycombinator.com/item?id=47852223</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47852223</guid></item></channel></rss>