<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: meehai</title><link>https://news.ycombinator.com/user?id=meehai</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 07 Apr 2026 06:00:00 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=meehai" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by meehai in "Show HN: Twitch Roulette – Find live streamers who need views the most"]]></title><description><![CDATA[
<p>Actively doing this. It indeed forces me to think things through, organize thoughts and speak them out. I open paint/miro to draw. It's good practice.</p>
]]></description><pubDate>Sat, 28 Mar 2026 11:47:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47553705</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=47553705</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47553705</guid></item><item><title><![CDATA[New comment by meehai in "RX – a new random-access JSON alternative"]]></title><description><![CDATA[
<p>and with little data (i.e. <10Mb), this matters much less than accessibility and easy understanding of the data using a simple text editor or jq in the terminal + some filters.</p>
]]></description><pubDate>Thu, 19 Mar 2026 08:49:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=47436569</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=47436569</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47436569</guid></item><item><title><![CDATA[New comment by meehai in "LLM from scratch, part 28 – training a base model from scratch on an RTX 3090"]]></title><description><![CDATA[
<p>it's skills first and then money and hardware for scale<p>A more skilled person that understands all the underlying steps will always be more efficient in scaling up due to knowing where to allocate more.<p>basically... you always need the skills and the money is the fine tuning.</p>
]]></description><pubDate>Tue, 09 Dec 2025 11:58:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=46203950</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=46203950</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46203950</guid></item><item><title><![CDATA[New comment by meehai in "Vortex: An extensible, state of the art columnar file format"]]></title><description><![CDATA[
<p>Can you append new columns to a file stored on disk without reading it all into memory? Somehow this is beyond parquet capabilities.</p>
]]></description><pubDate>Thu, 20 Nov 2025 06:05:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=45989463</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=45989463</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45989463</guid></item><item><title><![CDATA[New comment by meehai in "F3: Open-source data file format for the future [pdf]"]]></title><description><![CDATA[
<p><a href="https://stackoverflow.com/questions/31812780/append-a-new-column-to-an-existing-parquet-file" rel="nofollow">https://stackoverflow.com/questions/31812780/append-a-new-co...</a></p>
]]></description><pubDate>Thu, 02 Oct 2025 10:32:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=45448024</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=45448024</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45448024</guid></item><item><title><![CDATA[New comment by meehai in "SimpleFold: Folding proteins is simpler than you think"]]></title><description><![CDATA[
<p>Yeah, but if you can do topologies based on latencies you may get some decent tradeoffs. For example, with N=1M nodes each doing batch updates in a tree manner, i.e. the all-reduce is actually layered by latency between nodes.</p>
]]></description><pubDate>Sat, 27 Sep 2025 07:56:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=45393939</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=45393939</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45393939</guid></item><item><title><![CDATA[New comment by meehai in "An LLM is a lossy encyclopedia"]]></title><description><![CDATA[
<p>lossy encyclopedia that can also do some short-term memory (RAG) things.</p>
]]></description><pubDate>Tue, 02 Sep 2025 12:33:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=45102278</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=45102278</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45102278</guid></item><item><title><![CDATA[New comment by meehai in "Counter-Strike: A billion-dollar game built in a dorm room"]]></title><description><![CDATA[
<p>I've played years of KZ and HNS after years of playing competitive CS on local communities (old PGL in romania!). I got over 6k hours in steam CS1.6 + many more on "non-steam". That game shaped me. I even learned the basics of programming while modding a KZ plugin: <a href="https://forums.alliedmods.net/showthread.php?t=130417" rel="nofollow">https://forums.alliedmods.net/showthread.php?t=130417</a><p>Nowadays I code for a living, but for sure this is the game that started the spark for me.<p>It was a great time and I feel that I can always run this game and get back to that childhood feeling.</p>
]]></description><pubDate>Tue, 19 Aug 2025 08:09:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=44949408</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=44949408</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44949408</guid></item><item><title><![CDATA[New comment by meehai in "Do not download the app, use the website"]]></title><description><![CDATA[
<p>Tbh, the web won as the application platform mostly because it's a standard. Everybody knows html, css and a little JS.<p>On the other hand, for mobile apps, there is still a device-specific mentality.<p>Imagine web apps being built with a different flavor for all the major browsers...<p>I hope that the same level of standardization comes to mobile apps too, with the option to use more device-specific features on top of the generic UI.</p>
]]></description><pubDate>Sat, 26 Jul 2025 03:20:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=44691050</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=44691050</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44691050</guid></item><item><title><![CDATA[New comment by meehai in "My Self-Hosting Setup"]]></title><description><![CDATA[
<p>Mine is much more barebone:<p>- one single machine
- nginx proxy
- many services on the same machine; some are internal, some are supposed to be public, but all are accessible via the web!
- internal ones have a humongous password for HTTP basic auth that I store in an external password manager (the firefox built-in one)
- public ones are either public or have google oauth<p>I coded all of them from scratch as that's the point of what I'm doing with homelabbing. You want images? browsers can read them. Videos? Browsers can play them.<p>The hard part is the backend for me. The frontend is very much "90s html".</p>
]]></description><pubDate>Sat, 19 Jul 2025 06:12:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=44613004</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=44613004</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44613004</guid></item><item><title><![CDATA[New comment by meehai in "Most RESTful APIs aren't really RESTful"]]></title><description><![CDATA[
<p>the last point got me.<p>How can you idiomatically do a read only request with complex filters? For me both PUT and POST are "writable" operations, while "GET" are assumed to be read only. However, if you need to encode the state of the UI (filters or whatnot), it's preferred to use JSON rather than query params (which have length limitations).<p>So ... how does one do it?</p>
]]></description><pubDate>Wed, 09 Jul 2025 13:29:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=44509834</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=44509834</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44509834</guid></item><item><title><![CDATA[New comment by meehai in "I wrote my PhD Thesis in Typst"]]></title><description><![CDATA[
<p><a href="https://github.com/overleaf/overleaf">https://github.com/overleaf/overleaf</a> hm ?</p>
]]></description><pubDate>Mon, 23 Jun 2025 08:27:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=44353582</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=44353582</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44353582</guid></item><item><title><![CDATA[New comment by meehai in "Apple Notes Expected to Gain Markdown Support in iOS 26"]]></title><description><![CDATA[
<p>vscode with the markdown plugin works really well.<p>text on the left, render on the right pane<p>example: <a href="https://imgur.com/9rjoMa2.png" rel="nofollow">https://imgur.com/9rjoMa2.png</a></p>
]]></description><pubDate>Thu, 05 Jun 2025 05:06:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=44188465</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=44188465</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44188465</guid></item><item><title><![CDATA[New comment by meehai in "Cognitive load is what matters"]]></title><description><![CDATA[
<p>At work we have a pretty big Python monorepo. The way we scale it is by having many standalone CLI mini apps (about 80 atm), with most of them outputting json/parquet to GCS or bigquery tables. Inputs are the same.<p>I insisted a lot on this unix (ish, as it's not pipes) philosophy. It has paid off so far.<p>We can test each cli app as well as make broader integration tests.</p>
]]></description><pubDate>Thu, 26 Dec 2024 16:36:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=42516108</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=42516108</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42516108</guid></item><item><title><![CDATA[New comment by meehai in "Hackers use ZIP file concatenation to evade detection"]]></title><description><![CDATA[
<p>I meant that they should be separate tools that can be piped together.
For example: you have 1 directory of many files (1Gb in total)<p>`zip out.zip dir/`<p>This results in a single out.zip file that is, let's say, 500Mb (1:2 compression)<p>If you want to shard it, you have a separate tool, let's call it `shard`, that works on any type of byte stream:<p>`shard -I out.zip -O out_shards/ --shard_size 100Mb`<p>This results in `out_shards/1.shard, ..., out_shards/5.shard`, 100Mb each.<p>And then you have the opposite: `unshard` (back into 1 zip file) and `unzip`.<p>No need for 'sharding' to exist as a feature in the zip utility.<p>And... if you want only the shards from the get-go without the original 1 file archive, you can do something like:<p>`zip dir/ | shard -O out_shards/`<p>Now, these can be copied to the floppy disks (as discussed above) or sent via the network etc. The main thing here is that the sharding tool works on bytes only (doesn't know if it's an mp4 file, a zip file, a txt file etc.) and does no compression, and the zip tool does no sharding but optimizes compression.</p>
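<p>A minimal sketch of the byte-level `shard`/`unshard` pair described above, in Python. The function names, the `N.shard` naming scheme and the fixed-size splitting are assumptions made for illustration; this is not an existing tool.</p>

```python
from pathlib import Path


def shard(src: Path, out_dir: Path, shard_size: int) -> list[Path]:
    """Split src into numbered pieces of at most shard_size bytes each.

    Hypothetical helper: it only sees bytes and knows nothing about zip.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    index = 1
    with open(src, "rb") as f:
        # Read fixed-size chunks until the file is exhausted.
        while chunk := f.read(shard_size):
            path = out_dir / f"{index}.shard"
            path.write_bytes(chunk)
            paths.append(path)
            index += 1
    return paths


def unshard(shard_dir: Path, dst: Path) -> None:
    """Concatenate the pieces back into one file, ordered by numeric name."""
    pieces = sorted(shard_dir.glob("*.shard"), key=lambda p: int(p.stem))
    with open(dst, "wb") as out:
        for piece in pieces:
            out.write(piece.read_bytes())
```

<p>Neither function inspects the content, so the same pair would work on a zip file, an mp4 or a txt file; compression stays entirely in the archiver.</p>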
]]></description><pubDate>Sat, 16 Nov 2024 08:08:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=42155187</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=42155187</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42155187</guid></item><item><title><![CDATA[New comment by meehai in "Hackers use ZIP file concatenation to evade detection"]]></title><description><![CDATA[
<p>couldn't agree more!<p>We need to separate and design modules as unitary as possible:<p>- zip should ARCHIVE/COMPRESS, i.e. reduce the file size and create a single file from the file system point of view. The complexity should go in the compression algorithm.<p>- Sharding/sending multiple coherent pieces of the same file (zip or not) is a different module and should be handled by specialized and agnostic protocols that do this, like the ones you mentioned.<p>People are always building tools that handle 2 or more use cases instead of following the UNIX principle to create generic and good single-responsibility tools that can be combined together (thus allowing a 'whitelist' of combinations which is safe). Quite frankly it's annoying and very often leads to issues such as this that weren't even thought of in the original design because of the exponential problem of combining tools together.</p>
]]></description><pubDate>Sat, 16 Nov 2024 07:14:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=42155000</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=42155000</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42155000</guid></item><item><title><![CDATA[New comment by meehai in "Open washing – why companies pretend to be open source"]]></title><description><![CDATA[
<p>I think Open Weights is a better name for AI models that don't share the reproducible training scripts and data.</p>
]]></description><pubDate>Sat, 26 Oct 2024 18:01:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=41956503</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=41956503</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41956503</guid></item><item><title><![CDATA[New comment by meehai in "The Retreat to Muskworld"]]></title><description><![CDATA[
<p>one answer is due to the fact that humans also do this with just 2 pretty bad cameras and a lot of offloading to the cortex.<p>It also simplifies the stack a lot to have a single set of sensors, so the software becomes mostly: getting good training data (iterative loops from failing production cases) and an efficient training algorithm.<p>This scales to more than just AD and also can leverage new breakthroughs from academia</p>
]]></description><pubDate>Tue, 15 Oct 2024 07:55:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=41846004</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=41846004</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41846004</guid></item><item><title><![CDATA[New comment by meehai in "The Ultimate Guide to Error Handling in Python"]]></title><description><![CDATA[
<p>what about this pattern?
<a href="https://www.inngest.com/blog/python-errors-as-values" rel="nofollow">https://www.inngest.com/blog/python-errors-as-values</a><p>I tried it once in an sqlite DB connector with some business logic and simply checking stuff like<p><pre><code>  res: DBException | Result = db_handler.some_business_logic()
  if isinstance(res, DBException):
    return res # you can also log or even raise if this function isn't returning exceptions as values
  # guaranteed to be Result type here
</code></pre>
See here:<p>- <a href="https://gitlab.com/meehai/drpciv-flask/-/blob/main/be/db_handler.py?ref_type=heads#L186" rel="nofollow">https://gitlab.com/meehai/drpciv-flask/-/blob/main/be/db_han...</a><p>- <a href="https://gitlab.com/meehai/drpciv-flask/-/blob/main/be/app.py?ref_type=heads#L69" rel="nofollow">https://gitlab.com/meehai/drpciv-flask/-/blob/main/be/app.py...</a></p>
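<p>A self-contained sketch of the same errors-as-values pattern. `DBException`, `Result`, `fetch_rows` and `count_rows` are hypothetical stand-ins written for this example; they are not the code from the linked repository.</p>

```python
from __future__ import annotations

from dataclasses import dataclass


class DBException(Exception):
    """Hypothetical error type that is returned instead of raised."""


@dataclass
class Result:
    """Hypothetical success payload."""
    rows: list


def fetch_rows(table: str, known_tables: dict) -> DBException | Result:
    # Return the error as a value; the caller has to check for it.
    if table not in known_tables:
        return DBException(f"unknown table: {table}")
    return Result(rows=known_tables[table])


def count_rows(table: str, known_tables: dict) -> DBException | int:
    res = fetch_rows(table, known_tables)
    if isinstance(res, DBException):
        return res  # propagate; the caller decides whether to log or raise
    # a type checker narrows res to Result from here on
    return len(res.rows)
```

<p>The union return type makes the failure path visible in the signature, at the cost of an `isinstance` check at every call site.</p>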
]]></description><pubDate>Thu, 10 Oct 2024 10:29:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=41797417</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=41797417</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41797417</guid></item><item><title><![CDATA[Show HN: VRE Dataset generation for MultiTask vision models training from videos]]></title><description><![CDATA[
<p>Been working on this tool for my PhD, which involves training multi task vision models using various pre-trained models as inputs or pseudolabels in order to improve generalization. I work mostly on UAV datasets, but it should work okay on indoor scenes or self driving (at least Marigold and Mask2Former).<p>For example, this dataset was generated using this tool: <a href="https://huggingface.co/datasets/Meehai/dronescapes" rel="nofollow">https://huggingface.co/datasets/Meehai/dronescapes</a><p>I'm quite aggressively trying to "just get the nn.Module" from the public repos that other researchers put up in their overly convoluted frameworks. A simple `forward(rgb_input: torch.Tensor) -> torch.Tensor` is nice; having 100 imports from a generic framework that has version incompatibilities with everything else is not.<p>PS: most mains are standalone runnable too, e.g.
- <a href="https://gitlab.com/meehai/video-representations-extractor/-/blob/master/vre/representations/depth/marigold/marigold.py" rel="nofollow">https://gitlab.com/meehai/video-representations-extractor/-/...</a>
or
- <a href="https://gitlab.com/meehai/video-representations-extractor/-/blob/master/vre/representations/semantic_segmentation/mask2former/mask2former.py?ref_type=heads#L110" rel="nofollow">https://gitlab.com/meehai/video-representations-extractor/-/...</a></p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41790559">https://news.ycombinator.com/item?id=41790559</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 09 Oct 2024 17:39:32 +0000</pubDate><link>https://gitlab.com/meehai/video-representations-extractor</link><dc:creator>meehai</dc:creator><comments>https://news.ycombinator.com/item?id=41790559</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41790559</guid></item></channel></rss>