<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: rw</title><link>https://news.ycombinator.com/user?id=rw</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 29 Apr 2026 07:59:42 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=rw" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by rw in "CRDTs are the future"]]></title><description><![CDATA[
<p>Operational Transformation and Conflict-Free Replicated Datatypes are very different from each other.<p>As the author explains, OT relies on some ordering of system events, and CRDTs don't. That means CRDT operations need to be commutative (and probably associative), and OT operations don't.<p>So, OT is less scalable but more powerful, and CRDTs are more scalable but less powerful (in theory).<p>It's sort of like comparing Paxos/Raft to BitTorrent.<p>(I am not an expert on OT.)</p>
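<p>To make the commutativity point concrete, here is a minimal sketch of a grow-only counter, a textbook CRDT. It is my own illustration rather than anything from the article, and the function names are hypothetical. Replicas can merge in any order and still converge:</p>
<pre><code>
# Hypothetical illustration: a grow-only counter (G-Counter).
# State maps a replica id to that replica's local count.

def increment(state, replica, n=1):
    """Return a new state with the replica's count raised by n."""
    new = dict(state)
    new[replica] = new.get(replica, 0) + n
    return new

def merge(a, b):
    """Element-wise max: commutative, associative, and idempotent."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}

def value(state):
    """Total count across all replicas."""
    return sum(state.values())

x = increment({}, "A", 2)
y = increment({}, "B", 3)
z = increment(y, "C", 1)

# Merge order does not matter, so no global ordering of events is needed.
assert merge(x, y) == merge(y, x)
assert merge(merge(x, y), z) == merge(x, merge(y, z))
assert merge(x, x) == x
</code></pre>
<p>The price is expressiveness: this counter can only grow, which is the kind of "less powerful" trade-off I mean.</p>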
]]></description><pubDate>Tue, 29 Sep 2020 00:11:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=24622707</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=24622707</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=24622707</guid></item><item><title><![CDATA[New comment by rw in "Stegasuras: Neural Linguistic Steganography"]]></title><description><![CDATA[
<p>Stegasuras is convincing work and the quality looks excellent.<p>I wrote a steganographic tool in this same spirit back in 2011, called Plainsight.<p>Back then, we didn't have deep learning, and the "ImageNet moment for NLP" had yet to arrive.<p>My Python code, with examples, is here: <a href="https://github.com/rw/plainsight" rel="nofollow">https://github.com/rw/plainsight</a><p>Unlike the OP, my Plainsight algorithm is 100% invertible by construction, and accepts binary input. (I verified the inversion process with "roundtrip fuzzing", a technique I still use today.)<p>Plainsight uses each <i>bit</i> of the input message to generate tokens. Bits are used to decide how to traverse a Huffman-style n-gram tree, weighted by frequency. This tree of n-grams is the model used in both the encoding and decoding steps. The drawbacks to my method are that the output 1) can be verbose and 2) does not convince a human that it's plausible, except for short messages.<p>Stegasuras has orders-of-magnitude better output, and seems to solve the problems I couldn't solve eight years ago. I would venture that their new result has as much to do with advances in language modeling as it does with the particulars of their encoding and decoding algorithms.<p>I'll also note that I'm glad these researchers were able to use grant money to do this work. As a non-academic, I applied for an AI Grant to support me in upgrading Plainsight to use deep learning, but I was turned away at the time.<p>Finally, one of the ideas I picked up back then is that spam can be used to contain secret messages. Send enough gibberish to enough people, with your intended recipient included, and you'll look like a spammer--not a spy:<p><pre><code>   $ wget https://spamassassin.apache.org/publiccorpus/20030228_spam.tar.bz2
   $ tar -jxvf 20030228_spam.tar.bz2
   $ cat spam/0* > spam-corpus.txt

   $ echo "The Magic Words are Squeamish Ossifrage" | plainsight -m encipher -f spam-corpus.txt > spam_ciphertext
   
   $ cat spam_ciphertext
   (8.11.6/8.11.6) 3 (Normal) Internet can send e-mails until to transfer 26 10 [127.0.0.1]
   also include address from the most logical, mail business for your Car have a many our
   portals ESMTP Thu, 29 1.0 this letter on internet, <a style=3D"color: 0px; text/plain;
   cellspacing=3D"0" how quoted-printable about receiving you would like width=3D"15%"
   width=3D"15%" border="0" width="511" Date: Tue, 27 Thu, 19 26 because
   zzzz@localhost.spamassassin.taint.org for
   
   $ cat spam_ciphertext | plainsight -m decipher -f spam-corpus.txt
   Adding models:
   Model: spam-corpus.txt added in 2.57s (context == 2)
   input is "<stdin>", output is "<stdout>"   
   deciphering: 100% | 543.84  B/s | Time: 0:00:00
   
   The Magic Words are Squeamish Ossifrage</code></pre></p>
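<p>For readers who want the core idea in a few lines, here is a simplified sketch of bit-driven tree traversal. It is not the actual Plainsight code: it assumes a single context-free Huffman tree instead of n-gram contexts, takes the message as a string of "0"/"1" characters, pads the final token with zero bits, and passes the bit length out of band. All function names are hypothetical.</p>
<pre><code>
import heapq
from collections import Counter
from itertools import count

def build_tree(tokens):
    """Huffman tree over token frequencies (needs two or more distinct tokens).

    Leaves are str tokens; internal nodes are (zero_branch, one_branch) tuples.
    """
    tie = count()  # unique tie-breaker so the heap never compares subtrees
    heap = [(n, next(tie), tok) for tok, n in sorted(Counter(tokens).items())]
    heapq.heapify(heap)
    while len(heap) != 1:
        n0, _, t0 = heapq.heappop(heap)
        n1, _, t1 = heapq.heappop(heap)
        heapq.heappush(heap, (n0 + n1, next(tie), (t0, t1)))
    return heap[0][2]

def codes(tree, prefix=""):
    """Map each token to its bit path from the root."""
    if isinstance(tree, str):
        return {tree: prefix}
    table = codes(tree[0], prefix + "0")
    table.update(codes(tree[1], prefix + "1"))
    return table

def encipher(bits, tree):
    """Each bit picks a branch; reaching a leaf emits that token."""
    out, node = [], tree
    for b in bits:
        node = node[int(b)]
        if isinstance(node, str):
            out.append(node)
            node = tree
    while node is not tree:  # pad with zero bits to finish the last token
        node = node[0]
        if isinstance(node, str):
            out.append(node)
            node = tree
    return out

def decipher(tokens, tree, nbits):
    """Invert encipher: join each token's bit path, then drop the padding."""
    table = codes(tree)
    return "".join(table[t] for t in tokens)[:nbits]

# Toy corpus, made up for this example.
corpus = "buy cheap pills now buy now click here now free free pills".split()
tree = build_tree(corpus)
message = "0110100001101001"  # the bits of b"hi"
cover = encipher(message, tree)
assert decipher(cover, tree, len(message)) == message
</code></pre>
<p>Frequent tokens sit near the root, so they consume fewer bits each. That is one reason the output can get verbose. Inversion works because Huffman codes are prefix-free.</p>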
]]></description><pubDate>Fri, 06 Sep 2019 09:54:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=20894337</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=20894337</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=20894337</guid></item><item><title><![CDATA[Tutorial: Use FlatBuffers in Rust]]></title><description><![CDATA[
<p>Article URL: <a href="https://rwinslow.com/posts/use-flatbuffers-in-rust/">https://rwinslow.com/posts/use-flatbuffers-in-rust/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=20098740">https://news.ycombinator.com/item?id=20098740</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 04 Jun 2019 20:08:36 +0000</pubDate><link>https://rwinslow.com/posts/use-flatbuffers-in-rust/</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=20098740</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=20098740</guid></item><item><title><![CDATA[New comment by rw in "TimescaleDB vs. InfluxDB: built differently for time-series data"]]></title><description><![CDATA[
<p>The TimescaleDB benchmark code is a fork of code I wrote, as an independent consultant, for InfluxData in 2016 and 2017. The purpose of my project was to rigorously compare InfluxDB and InfluxDB Enterprise to Cassandra, Elasticsearch, MongoDB, and OpenTSDB. It's called influxdb-comparisons and is an actively maintained project on GitHub at [0]. I am no longer affiliated with InfluxData, and these are my own opinions.<p>I designed and built the influxdb-comparisons benchmark suite to be easy for customers to understand. From a technical perspective, it is simulation-based, verifiable, fast, fair, and extensible. In particular, I created the "use-case approach" so that, no matter how technical our benchmark reports got, customers could say to themselves: "I understand this!". For example, in the devops use-case, we generate data and queries from a realistic simulation of telemetry collected from a server fleet. Doing it this way creates benchmarking stories that appeal to a wide variety of both technical and nontechnical customers.<p>This user-first design of a benchmarking suite was novel, and was a large factor in the success of the project.<p>Another aspect of the project is that we tried to do right by the competition. That means that we spoke with experts (sometimes, the creators of the databases themselves) on how to best achieve our goals. In particular, I worked hard to make the Cassandra, Elasticsearch, MongoDB, and OpenTSDB benchmarks show their respective databases in the best light possible. Concretely, each database was configured in a way that is 1) featureful, like InfluxDB, 2) fast at writes, 3) fast at reads, and 4) efficient with disk space.<p>As an example of my diligence in implementing this benchmark suite for InfluxData, I included a mechanism by which the benchmark query results can be verified for correctness across competing databases, to within floating point tolerances. 
This is important because, when building adapters for drastically different databases, it is easy to introduce bugs that could give a false advantage to one side or the other (e.g. by accidentally throwing data away, or by executing queries that don't range over the whole dataset).<p>I don't see that TimescaleDB is using the verification functionality I created. I encourage TimescaleDB to run query verification, and write up their benchmarking methods in detail, like I did here: [1].<p>I think it's great that TimescaleDB is taking these ideas and extending them. At InfluxData, we made the code open-source so that others could build and learn from our work. In that tradition, I hope that the ongoing discussion about how to do excellent benchmarking of time-series databases keeps evolving.<p>[0] <a href="https://github.com/influxdata/influxdb-comparisons" rel="nofollow">https://github.com/influxdata/influxdb-comparisons</a> (Note that others maintain this project now.)<p>[1] <a href="https://rwinslow.com/rwinslow-benchmark-tech-paper-influxdb-vs-elasticsearch-may-2016.pdf" rel="nofollow">https://rwinslow.com/rwinslow-benchmark-tech-paper-influxdb-...</a></p>
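<p>As a rough illustration of the query verification idea above, here is a minimal Python sketch. It is not the influxdb-comparisons implementation; the function name, tolerances, and sample values are all made up. It treats two result sets as matching when they cover the same time buckets and agree to within floating point tolerance:</p>
<pre><code>
import math

def results_match(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """True when both result sets have the same keys and near-equal values."""
    if a.keys() != b.keys():
        return False  # one side dropped or invented a bucket
    return all(
        math.isclose(a[k], b[k], rel_tol=rel_tol, abs_tol=abs_tol) for k in a
    )

# Hypothetical per-hour aggregates returned by two different databases.
first = {"2016-01-01T00:00Z": 0.1 + 0.2, "2016-01-01T01:00Z": 41.5}
second = {"2016-01-01T00:00Z": 0.3, "2016-01-01T01:00Z": 41.5}
dropped = {"2016-01-01T00:00Z": 0.3}

assert results_match(first, second)       # differs only by rounding error
assert not results_match(first, dropped)  # a missing bucket is caught
</code></pre>
<p>Checking the key sets first is what catches the "accidentally throwing data away" class of adapter bugs. The tolerance only absorbs differences in floating point arithmetic.</p>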
]]></description><pubDate>Wed, 15 Aug 2018 18:37:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=17768548</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=17768548</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=17768548</guid></item><item><title><![CDATA[New comment by rw in "I'm Scott Aaronson, quantum computing/computational complexity researcher. AMA"]]></title><description><![CDATA[
<p>Hi Scott, thank you for writing your blog all these years. Your Busy Beaver essay ignited my passion for computer science, especially in algorithm analysis, logic, undecidability, and probability theory. I used to be someone who only thought in code; thanks to you, I now also think in math.</p>
]]></description><pubDate>Fri, 29 Jun 2018 18:12:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=17426377</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=17426377</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=17426377</guid></item><item><title><![CDATA[New comment by rw in "Show HN: Diamond – Full-stack web-framework in D"]]></title><description><![CDATA[
<p>Why the hard dependency on MySQL?</p>
]]></description><pubDate>Tue, 03 Apr 2018 19:48:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=16748426</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=16748426</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=16748426</guid></item><item><title><![CDATA[New comment by rw in "China using big data to detain people before crime is committed"]]></title><description><![CDATA[
<p>"Your scientists were so preoccupied with whether or not they could, that they didn't stop to think if they should."<p>- Jeff Goldblum as Dr. Ian Malcolm in Jurassic Park</p>
]]></description><pubDate>Wed, 28 Feb 2018 21:39:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=16487279</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=16487279</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=16487279</guid></item><item><title><![CDATA[The untold story of systemic gender discrimination at UC Berkeley's IT Dept]]></title><description><![CDATA[
<p>Article URL: <a href="https://pando.com/2018/02/23/bears-lair-untold-story-systemic-gender-discrimination-inside-uc-berkeleys-it-department/">https://pando.com/2018/02/23/bears-lair-untold-story-systemic-gender-discrimination-inside-uc-berkeleys-it-department/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=16466834">https://news.ycombinator.com/item?id=16466834</a></p>
<p>Points: 10</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 26 Feb 2018 17:03:42 +0000</pubDate><link>https://pando.com/2018/02/23/bears-lair-untold-story-systemic-gender-discrimination-inside-uc-berkeleys-it-department/</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=16466834</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=16466834</guid></item><item><title><![CDATA[We're building a dystopia just to make people click on ads]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.ted.com/talks/zeynep_tufekci_we_re_building_a_dystopia_just_to_make_people_click_on_ads/transcript">https://www.ted.com/talks/zeynep_tufekci_we_re_building_a_dystopia_just_to_make_people_click_on_ads/transcript</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=15582843">https://news.ycombinator.com/item?id=15582843</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Mon, 30 Oct 2017 02:56:38 +0000</pubDate><link>https://www.ted.com/talks/zeynep_tufekci_we_re_building_a_dystopia_just_to_make_people_click_on_ads/transcript</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=15582843</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=15582843</guid></item><item><title><![CDATA[New comment by rw in "Show HN: Using LDA to suggest GitHub repositories based on what you have starred"]]></title><description><![CDATA[
<p>Good idea, the READMEs would be best of all.</p>
]]></description><pubDate>Mon, 02 Oct 2017 09:05:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=15382557</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=15382557</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=15382557</guid></item><item><title><![CDATA[New comment by rw in "Show HN: Using LDA to suggest GitHub repositories based on what you have starred"]]></title><description><![CDATA[
<p>As I said, your approach is a clever way to use the GitHub API. I think you need to change the title and readme to indicate that this isn't an LDA index of GitHub descriptions. To ML practitioners, that's what you are implying with a title of "Show HN: Using LDA to suggest GitHub repositories based on what you have starred".</p>
]]></description><pubDate>Sun, 01 Oct 2017 23:16:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=15380470</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=15380470</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=15380470</guid></item><item><title><![CDATA[New comment by rw in "Show HN: Using LDA to suggest GitHub repositories based on what you have starred"]]></title><description><![CDATA[
<p>This only uses LDA on your starred repository descriptions, to find topic terms that describe your starred repositories. These topic terms are then used to query the GitHub search API to find matching repositories. The results are then sorted by star count.<p>That is a clever way to make use of a search API like GitHub's. The principled way to do this, though, is to run LDA over all descriptions on GitHub, then use that similarity index to find similar repositories. You could run LDA over code, too.<p>I'll note that there is a cold-start problem with this implementation: using LDA on such a small set of short documents will often lead to uninformative topics with words that are too specific. You need a big corpus to capture e.g. synonym relationships.</p>
]]></description><pubDate>Sun, 01 Oct 2017 23:05:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=15380424</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=15380424</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=15380424</guid></item><item><title><![CDATA[New comment by rw in "Gophersat: A SAT solver written in Go"]]></title><description><![CDATA[
<p>No, polynomial time. For reference, see these Wikipedia pages:<p><a href="https://en.wikipedia.org/wiki/Polynomial-time_reduction" rel="nofollow">https://en.wikipedia.org/wiki/Polynomial-time_reduction</a><p><a href="https://en.wikipedia.org/wiki/Karp%27s_21_NP-complete_problems" rel="nofollow">https://en.wikipedia.org/wiki/Karp%27s_21_NP-complete_proble...</a></p>
]]></description><pubDate>Thu, 28 Sep 2017 19:40:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=15360092</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=15360092</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=15360092</guid></item><item><title><![CDATA[New comment by rw in "Zuckerberg's trust problem"]]></title><description><![CDATA[
<p>Why has this article been removed from the top 250 news results? It was #1 for a few minutes, then #5, and now it's gone. We've successfully discussed much more risqué topics here on HN...<p>Why did the comment by TAForObvReasons calling out this apparent censorship get deleted?</p>
]]></description><pubDate>Thu, 28 Sep 2017 07:52:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=15355517</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=15355517</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=15355517</guid></item><item><title><![CDATA[New comment by rw in "Thoughts on OpenAI, reinforcement learning, and killer robots"]]></title><description><![CDATA[
<p>No, it's called insufficient feature engineering. Data leakage is when your test data contaminates your training data.</p>
]]></description><pubDate>Sat, 29 Jul 2017 00:45:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=14878791</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=14878791</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=14878791</guid></item><item><title><![CDATA[New comment by rw in "The Future of Go Summit – Ke Jie vs. AlphaGo"]]></title><description><![CDATA[
<p>A) How would you characterize the differences and similarities between AlphaGo and the best human players?<p>B) How has human play style changed since AlphaGo's introduction?<p>C) What is the answer to the question you most want to be asked?</p>
]]></description><pubDate>Tue, 23 May 2017 03:35:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=14398621</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=14398621</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=14398621</guid></item><item><title><![CDATA[New comment by rw in "Open sourcing Sonnet – a new library for constructing neural networks"]]></title><description><![CDATA[
<p>TensorFlow is a dataflow computation system. Keras is for building neural networks. Each exists at a different level of abstraction.</p>
]]></description><pubDate>Fri, 07 Apr 2017 19:02:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=14062564</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=14062564</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=14062564</guid></item><item><title><![CDATA[New comment by rw in "Cracking Minesweeper with Z3 SMT Solver"]]></title><description><![CDATA[
<p>How does this contrast with Answer Set Programming (using e.g. clasp)?</p>
]]></description><pubDate>Mon, 06 Mar 2017 04:24:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=13800690</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=13800690</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=13800690</guid></item><item><title><![CDATA[New comment by rw in "Introducing Keybase Chat"]]></title><description><![CDATA[
<p>How did you find these changes?</p>
]]></description><pubDate>Thu, 09 Feb 2017 04:26:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=13604655</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=13604655</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=13604655</guid></item><item><title><![CDATA[New comment by rw in "The Axiom of Choice Is Wrong (2007)"]]></title><description><![CDATA[
<p>You could have answered all of your questions with "finitely many", because, after all, we can each only perform a finite number of actions in the world.<p>In general, the infinite hierarchy of infinite sets "exists" because we can define it.</p>
]]></description><pubDate>Sun, 05 Feb 2017 10:16:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=13571713</link><dc:creator>rw</dc:creator><comments>https://news.ycombinator.com/item?id=13571713</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=13571713</guid></item></channel></rss>