<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: awendland</title><link>https://news.ycombinator.com/user?id=awendland</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 04 Jul 2026 16:41:05 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=awendland" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Your Immune System Is Not a Muscle]]></title><description><![CDATA[
<p>Article URL: <a href="https://rachel.fast.ai/posts/2024-08-13-crowds-vs-friends/">https://rachel.fast.ai/posts/2024-08-13-crowds-vs-friends/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41369061">https://news.ycombinator.com/item?id=41369061</a></p>
<p>Points: 287</p>
<p># Comments: 197</p>
]]></description><pubDate>Tue, 27 Aug 2024 16:06:57 +0000</pubDate><link>https://rachel.fast.ai/posts/2024-08-13-crowds-vs-friends/</link><dc:creator>awendland</dc:creator><comments>https://news.ycombinator.com/item?id=41369061</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41369061</guid></item><item><title><![CDATA[New comment by awendland in "Launch HN: Metriport (YC S22) – Open-source API for healthcare data exchange"]]></title><description><![CDATA[
<p>What percentage, or how many millions, of patients are accessible on the network today?</p>
]]></description><pubDate>Fri, 24 May 2024 02:47:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=40462359</link><dc:creator>awendland</dc:creator><comments>https://news.ycombinator.com/item?id=40462359</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40462359</guid></item><item><title><![CDATA[New comment by awendland in "Show HN: Hacker Search – A semantic search engine for Hacker News"]]></title><description><![CDATA[
<p>Following @isoprohplex, I'll be the fourth comment to say I also built a variant of this: <a href="https://hnss.alexwendland.com/" rel="nofollow">https://hnss.alexwendland.com/</a><p>I built mine on top of an RSS feed I generate from Hacker News which filters out any posts linking to the top 1 million domains [1] and creates a readable version of the content. I use it to surface articles on smaller blogs/personal websites—it's become my main content source. It's generated via GitHub Actions every 4 hours and stored in a detached branch on GitHub (~2 GB of data from the past 4 years). Here's an example for posts with >= 10 upvotes [2].<p>It only took several hours to build the semantic search on top. And that included time for me to try out and learn several different vector DBs, embedding models, data pipelines, and UI frameworks! The current state of AI tooling is wonderfully simple.<p>In the end I landed on (selected in haste optimizing for developer ergonomics, so only a partial endorsement):<p><pre><code>  - BAAI/bge-small-en as an embedding model
  - Python with
    - HuggingFaceBgeEmbeddings from langchain_community for creating embeddings
    - SentenceSplitter from llama_index for chunking documents
    - ChromaDB as a vector DB + chroma-ops to prune the DB
    - sqlite3 for metadata
    - FastAPI, Pydantic, Jinja2, Tailwind for API and server-rendered webpages
  - jsdom and mozilla-readability for article extraction
</code></pre>
I generated the index locally on my M2 Mac which ripped through the ~70k articles in ~12 hours to generate all the embeddings.<p>I run the search site with Podman on a VM from Hetzner—along with other projects—for ~$8 / month. All requests are handled on CPU w/o calls to external AI providers. Query times are <200 ms, which includes embedding generation → vector DB lookup → metadata retrieval → page rendering. The server source code is here [3].<p>Nice work @jnnnthnn! What you built is fast, the rankings were solid, and the summaries are convenient.<p>[1] <a href="https://majestic.com/reports/majestic-million" rel="nofollow">https://majestic.com/reports/majestic-million</a><p>[2] <a href="https://github.com/awendland/hacker-news-small-sites/blob/generated/feeds/hn-small-sites-score-10.xml">https://github.com/awendland/hacker-news-small-sites/blob/ge...</a><p>[3] <a href="https://github.com/awendland/hacker-news-small-sites-website/blob/main/webserver.py">https://github.com/awendland/hacker-news-small-sites-website...</a></p>
]]></description><pubDate>Thu, 02 May 2024 18:36:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=40239867</link><dc:creator>awendland</dc:creator><comments>https://news.ycombinator.com/item?id=40239867</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40239867</guid></item><item><title><![CDATA[New comment by awendland in "Sociosexual orientations are not reflective of life trajectories"]]></title><description><![CDATA[
<p>I was also curious. Wikipedia addresses the question:<p>> The theory was popular in the 1970s and 1980s, when it was used as a heuristic device, but lost importance in the early 1990s, when it was criticized by several empirical studies.[5][6] A life-history paradigm has replaced the r/K selection paradigm, but continues to incorporate its important themes as a subset of life history theory.[7] Some scientists now prefer to use the terms fast versus slow life history as a replacement for, respectively, r versus K reproductive strategy.[8]</p>
]]></description><pubDate>Mon, 11 Sep 2023 23:44:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=37475183</link><dc:creator>awendland</dc:creator><comments>https://news.ycombinator.com/item?id=37475183</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37475183</guid></item><item><title><![CDATA[New comment by awendland in "Show HN: Open-Source Infrastructure for Vector Data Streams"]]></title><description><![CDATA[
<p>I’ve been looking for something like this: eventually consistent syncing of DB content -> embeddings in a vector DB.<p>So far, I’ve been dealing with a tradeoff between latency + error handling in my API endpoints. I’ll either 1.) embed content + upsert into the vector DB inside a transaction block for my main DB in the handler, which kills latency, or 2.) kick off the embedding work separately from the main handler work, which risks data desynchronizing.<p>I’d much prefer a set-it-and-forget-it approach like Retake.<p>A few questions:<p>* If the “real-time server” goes offline temporarily, will it catch up on any newly added rows in the interim?<p>* Do you intend to emit any OpenTelemetry metrics? I’d like to monitor lag in production.<p>* Will I be able to deploy this as a single container on ECS/Kubernetes?</p>
]]></description><pubDate>Wed, 19 Jul 2023 17:04:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=36789697</link><dc:creator>awendland</dc:creator><comments>https://news.ycombinator.com/item?id=36789697</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36789697</guid></item><item><title><![CDATA[New comment by awendland in "Node-Red 3.0 Released"]]></title><description><![CDATA[
<p>This is a little lengthy, but I wanted to share the tactical details of my use case to give you a full picture:<p>I use Node-Red for a few scheduled activities: archiving Reddit posts or tweets I upvote and pulling information from real estate websites that match criteria I’m interested in.<p>I like Node-Red vs. cron-managed shell/Python scripts for several reasons:<p><pre><code>  - the admin/editor UI is accessible on any device with a web browser (no git, ssh, etc. tooling required)
  - the node-based visual flow is easy to reason about and debug (so even after years of ignoring my scripts I can quickly come back to them and grok what’s going on)
  - the barrier to entry continues to be low (I can pop in and create a new flow in <1 hr)
</code></pre>
I prefer it over Zapier or IFTTT since it’s more flexible. I’ve authored arbitrary JavaScript and request logic to retrieve and filter data in ways these pre-packaged tools can’t.<p>I run it on an AWS Lightsail server for ~$4 per month. I use Ansible to manage Ubuntu with Podman + systemd running the Node-Red Docker image and TLS provided by Caddy. Roughly 4 hours to set up from scratch and something I return to once every ~18 months to update/tweak with minimal issues.<p>To sum it up, I appreciate the grok-ability + flexibility + accessibility. It just works and it scales in complexity as I need it to!</p>
]]></description><pubDate>Thu, 14 Jul 2022 12:16:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=32094945</link><dc:creator>awendland</dc:creator><comments>https://news.ycombinator.com/item?id=32094945</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=32094945</guid></item><item><title><![CDATA[New comment by awendland in "Rdrview – Firefox Reader View as a Linux command line tool"]]></title><description><![CDATA[
<p>I needed a reader view library for a side project and decided to compare the most popular options (repo at <a href="https://github.com/awendland/readable-web-extractor-comparison" rel="nofollow">https://github.com/awendland/readable-web-extractor-comparis...</a>). Among cleanview, metascraper, @postlight/mercury-parser, and mozilla/readability I thought that mozilla/readability performed the best because of its consistent extraction of the primary content and minimal mangling of the semantic structure.<p>For a quick preview of each library on a random sample of 16 articles posted to HN, see <a href="https://github.com/awendland/readable-web-extractor-comparison" rel="nofollow">https://github.com/awendland/readable-web-extractor-comparis...</a> (you’ll need to expand a row to see its results).</p>
]]></description><pubDate>Mon, 19 Oct 2020 15:25:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=24827191</link><dc:creator>awendland</dc:creator><comments>https://news.ycombinator.com/item?id=24827191</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=24827191</guid></item><item><title><![CDATA[New comment by awendland in "Elizabeth Warren for President open-sources its 2020 campaign tech"]]></title><description><![CDATA[
<p>Here’s the campaign team’s original post about what will get open-sourced: <a href="https://medium.com/@teamwarren/open-source-tools-from-the-warren-for-president-tech-team-f1f27d2c7551" rel="nofollow">https://medium.com/@teamwarren/open-source-tools-from-the-wa...</a></p>
]]></description><pubDate>Sun, 29 Mar 2020 22:02:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=22723181</link><dc:creator>awendland</dc:creator><comments>https://news.ycombinator.com/item?id=22723181</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22723181</guid></item><item><title><![CDATA[Is postMessage slow?]]></title><description><![CDATA[
<p>Article URL: <a href="https://dassur.ma/things/is-postmessage-slow/">https://dassur.ma/things/is-postmessage-slow/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=20576421">https://news.ycombinator.com/item?id=20576421</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 31 Jul 2019 17:32:41 +0000</pubDate><link>https://dassur.ma/things/is-postmessage-slow/</link><dc:creator>awendland</dc:creator><comments>https://news.ycombinator.com/item?id=20576421</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=20576421</guid></item><item><title><![CDATA[Node.js multithreading: What are Worker Threads and why do they matter?]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.logrocket.com/node-js-multithreading-what-are-worker-threads-and-why-do-they-matter-48ab102f8b10">https://blog.logrocket.com/node-js-multithreading-what-are-worker-threads-and-why-do-they-matter-48ab102f8b10</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=18991812">https://news.ycombinator.com/item?id=18991812</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 24 Jan 2019 19:02:32 +0000</pubDate><link>https://blog.logrocket.com/node-js-multithreading-what-are-worker-threads-and-why-do-they-matter-48ab102f8b10</link><dc:creator>awendland</dc:creator><comments>https://news.ycombinator.com/item?id=18991812</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=18991812</guid></item><item><title><![CDATA[Reducing Relative Paths in TypeScript and Webpack]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.johnnyreilly.com/2018/08/killing-relative-paths-with-typescript-and.html">https://blog.johnnyreilly.com/2018/08/killing-relative-paths-with-typescript-and.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=17892970">https://news.ycombinator.com/item?id=17892970</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 01 Sep 2018 17:39:54 +0000</pubDate><link>https://blog.johnnyreilly.com/2018/08/killing-relative-paths-with-typescript-and.html</link><dc:creator>awendland</dc:creator><comments>https://news.ycombinator.com/item?id=17892970</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=17892970</guid></item></channel></rss>