Hacker News: dmw_ng

New comment by dmw_ng in "The solution might be cancelling my AI subscription"

dmw_ng — Sun, 31 May 2026 17:47:41 +0000

Fixed just for you :) I always forget that tag.

The solution might be cancelling my AI subscription

dmw_ng — Sun, 31 May 2026 14:23:30 +0000

Article URL: https://thoughts.hmmz.org/2026-05-31.html

Comments URL: https://news.ycombinator.com/item?id=48345896

Points: 387

# Comments: 243

New comment by dmw_ng in "Restic: Backups done right"

dmw_ng — Sun, 13 Oct 2024 19:02:13 +0000

I'm a restic user, but have resisted the urge to attempt a bikeshed for a long time, mostly due to perf. Its index format seems to be slow and terrible and the chunking algorithm it uses (Rabin fingerprints) is very slow compared to more recent alternatives (like FastCDC). Drives me nuts to watch it chugging along backing up or listing snapshots at nowhere close to the IO rate of the system while still making the fans run. Despite that it still seems to be the best free software option around.
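
For anyone unfamiliar with content-defined chunking, here is a rough sketch of the gear-hash style of rolling hash that FastCDC builds on. It is a simplified illustration, not FastCDC proper (no normalized chunking) and not what restic does; the table seed, size limits and function name are all made up for the example.

```python
import random

# Assumption: a fixed-seed table of 256 random 64-bit values, one per byte value.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def chunk(data, min_size=2048, avg_bits=13, max_size=65536):
    """Split data into content-defined chunks using a gear rolling hash.

    A cut is made when the top avg_bits bits of the hash are all zero (after
    min_size bytes), or unconditionally once a chunk reaches max_size.
    """
    shift = 64 - avg_bits
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # One shift and one add per byte: older bytes slide out of the top.
        h = ((h << 1) + GEAR[byte]) & MASK64
        length = i - start + 1
        if (length >= min_size and (h >> shift) == 0) or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

The per-byte work is a table lookup, a shift and an add, which is where the speed advantage over polynomial fingerprinting is usually claimed to come from. A pure Python loop like this obviously says nothing about real-world throughput.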

New comment by dmw_ng in "Hetzner Object Storage"

dmw_ng — Mon, 07 Oct 2024 17:19:29 +0000

That's been a feature of S3 for quite a long time now, called S3 Select https://docs.aws.amazon.com/AmazonS3/latest/userguide/select...

Despite it being an awesome feature I've been itching to use, I've never actually found a use for it beyond messing around. Most places where S3 Select might make sense seem to be subsumed (for my uses) by Athena. Athena has a rather large amount of conceptual and actual boilerplate to get up and running with, though, whereas S3 Select requires no upfront planning beyond building a fancy query string (or using their SDK wrappers).

Where S3 Select is likely to become fiddly is anywhere multiple files are involved. Athena makes querying large collections of CSVs (etc) straightforward, and handles all the scheduling and results merging for you.
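
To make the single-file vs. multi-file point concrete, here is a minimal sketch of what that looks like through boto3's `select_object_content`. The bucket, keys, column names and helper names are hypothetical; `client` is assumed to be something like `boto3.client("s3")`.

```python
def build_select_request(bucket, key, sql):
    """Build the keyword arguments for one S3 Select call on one CSV object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        # Assumption: the object is an uncompressed CSV with a header row.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def select_rows(client, bucket, key, sql):
    """Run one query against one object and yield chunks of CSV text."""
    response = client.select_object_content(**build_select_request(bucket, key, sql))
    for event in response["Payload"]:
        # The event stream also carries Stats/Progress/End events; skip those.
        if "Records" in event:
            yield event["Records"]["Payload"].decode("utf-8")


def select_many(client, bucket, keys, sql):
    """One call covers one object, so the fan-out and merge are on the caller."""
    for key in keys:
        yield from select_rows(client, bucket, key, sql)
```

Usage would be along the lines of `"".join(select_many(client, "my-bucket", keys, "SELECT s.name FROM S3Object s WHERE s.status = 'error'"))`. The loop in `select_many` is exactly the scheduling and merging that Athena otherwise does for you, and it only gets worse once you want parallelism or aggregation across files.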

New comment by dmw_ng in "Microsoft donates the Mono Project to the Wine team"

dmw_ng — Tue, 27 Aug 2024 22:14:09 +0000

Modern .NET on Linux is lovely, you can initialize a project, pull in the S3 client and write a 1-3 line C# program that AOT compiles to a single binary with none of the perf issues or GIL hand-wringing that plague life in Python.

Given modern Python means type annotations everywhere, the convenience edge between it and modern C# (which dispenses with much of the Java-esque boilerplate) is surprisingly thin, and the capabilities of the .NET runtime are far superior in many ways, making it quite an appealing alternative, especially for perf-sensitive stuff.

New comment by dmw_ng in "Intel N100 Radxa X4 First Thoughts"

dmw_ng — Sun, 28 Jul 2024 02:19:17 +0000

I recently bought an n100 and within a matter of days got buyer's remorse and impulse-purchased an n305 to go right beside it, which is currently sitting with a wildly overpriced 48 GB stick and a 2TB SN850X installed. It's an absolute joy, both perfwise and in how little heat it generates.

The only thing I'd reserve judgement on is the tendency to throttle. I haven't got far enough to characterize it, but it's not clear how much value those extra cores will add over the n100 with TDP settings tweaked down in the BIOS, and if leaving the n305 to run at max TDP, heat/noise/cost/temperature-related instability may start to become an issue, especially when packing other hot components like a decent SSD into the tiny cases they come in.

New comment by dmw_ng in "Proton launches its own version of Google Docs"

dmw_ng — Wed, 03 Jul 2024 13:46:42 +0000

Seems like a massive distraction from their offering for a small company, wonder why they didn't consider something like tight integration with OnlyOffice or similar. Setting out to build a new office suite feels about as sensible as building a new web browser from scratch. Except at least with a browser, you have open specs helping you through most of the endless supply of compatibility problems.

New comment by dmw_ng in "120ms to 30ms: Python to Rust"

dmw_ng — Fri, 28 Jun 2024 00:33:18 +0000

> converting huge amount of xml files

> pickling

Sounds like if this is the tooling and the task at hand, about the most complex things that should be passing through the pickler are partitioned lists of filenames rather than raw data. E.g. you can have each partition generate a Parquet file for combining in a final step (pyarrow.concat_tables() looks useful), or if it were some other format you were working with, potentially sending flat arrays back to the parent process as giant bytestrings or similar.

This is not to say the limitations don't suck, just that very often there are simple approaches to avoid most of the pain
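
A rough sketch of that shape, assuming a made-up `<record id="..."><value>...</value></record>` layout and writing per-partition CSVs rather than Parquet so it stays stdlib-only. All function names here are hypothetical.

```python
import csv
import os
import tempfile
import xml.etree.ElementTree as ET
from multiprocessing import Pool


def partition(filenames, n_parts):
    """Split filenames into at most n_parts non-empty lists."""
    parts = [filenames[i::n_parts] for i in range(n_parts)]
    return [p for p in parts if p]


def convert_partition(args):
    """Worker: parse every XML file in one partition and write one CSV.

    Only the (filenames, out_dir, index) tuple and the returned path string
    go through the pickler; the parsed data never leaves the worker.
    """
    filenames, out_dir, index = args
    out_path = os.path.join(out_dir, f"part-{index:04d}.csv")
    with open(out_path, "w", newline="") as fp:
        writer = csv.writer(fp)
        for name in filenames:
            # Hypothetical schema, see the note above the code.
            for rec in ET.parse(name).getroot().iter("record"):
                writer.writerow([rec.get("id"), rec.findtext("value")])
    return out_path


def combine(part_paths, final_path):
    """Final step: concatenate the per-partition CSVs into one file."""
    with open(final_path, "w", newline="") as out:
        for path in sorted(part_paths):
            with open(path, newline="") as fp:
                out.write(fp.read())
    return final_path


def convert_all(filenames, final_path, workers=4):
    """Fan the partitions out to worker processes, then merge their outputs."""
    with tempfile.TemporaryDirectory() as out_dir:
        jobs = [(part, out_dir, i)
                for i, part in enumerate(partition(filenames, workers))]
        with Pool(workers) as pool:
            part_paths = pool.map(convert_partition, jobs)
        return combine(part_paths, final_path)
```

Called as `convert_all(list_of_xml_paths, "out.csv")` from under an `if __name__ == "__main__":` guard. Swapping the CSV writer for a Parquet writer and `combine` for something built on `pyarrow.concat_tables()` would keep the same structure.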

New comment by dmw_ng in "SSH as a Sudo Replacement"

dmw_ng — Sun, 23 Jun 2024 08:29:30 +0000

It's comical to see the sudo codebase mentioned in the same breath as increasing auditability here

New comment by dmw_ng in "Scan HTML faster with SIMD instructions – Chrome edition"

dmw_ng — Fri, 14 Jun 2024 11:34:29 +0000

Sufficiently fast software often allows leaving out whole layers of crap and needless indirection, the most common being caching. Fixing an algorithm so you can omit a dedicated database of intermediate results can be a huge maintainability/usability improvement. The same principle appears all over the place, e.g. immediate mode UIs, better networking (e.g. CSS image tiling vs. just fixing small request overhead in HTTP1 vs. QUIC), importing giant CSV files via some elaborate ETL process vs. just having a lightning fast parser+query engine, etc.

Depending on how you look at it, you could view large chunks of DOM state through this lens, as intermediate data that only exists to cache HTML parsing results. What's the point of allocating a hash table to represent element attributes if they are unchanged from the source document, and reparsing them from the source is just as fast as keeping around the parsed form? etc. These kinds of tricks only tend to present themselves after optimization work is done, which is annoying, because it's usually so damn hard to justify optimization work in the first place.

New comment by dmw_ng in "DuckDB Isn't Just Fast"

dmw_ng — Tue, 11 Jun 2024 10:41:22 +0000

I've had similar times with DuckDB, it feels nicer to use on the surface but in terms of perf and actual function I've had a better experience with clickhouse-local.

New comment by dmw_ng in "Spam blocklist SORBS closed by its owner, Proofpoint"

dmw_ng — Sun, 09 Jun 2024 15:59:01 +0000

SPF+DKIM+DMARC are a classic case of Goodhart's law: the amount of spam they stop these days (at least anecdotally) is minimal. Most spam I get seems to come via Salesforce infrastructure and a variety of similar bulk email marketing providers.

New comment by dmw_ng in "NSA Ghidra open-source reverse engineering framework"

dmw_ng — Wed, 29 May 2024 13:02:36 +0000

> Linux with HiDPI, is such a chore

There's a simple broad-spectrum fix for that. For the odd occasion I want to view high definition photos in Linux I just switch screen mode temporarily, for the rest it is sooo not worth the hassle

New comment by dmw_ng in "Cloudflare took down our website"

dmw_ng — Sun, 26 May 2024 13:54:39 +0000

Cloudflare does not provide unmetered anything. At best they provide services on a discretionary basis while trying as hard as possible to make it appear this is not the case. It's better to think of their product line as a CRM system with some CDN features on the side.

New comment by dmw_ng in "Amazon S3 will no longer charge for several HTTP error codes"

dmw_ng — Mon, 13 May 2024 20:22:10 +0000

> do lots of useful work for free

Have often wondered about this in terms of some of their control plane APIs, a read-only IAM key used as part of C&C infrastructure for a botnet might be interesting, you get DNS/ClientHello signature to a legitimate and reputable service for free, while stuffing "DDoS this blog" e.g. in some tags of a free resource. Even better if the AWS account belonged to someone else.

But certainly, ability to serve an unlimited URL space from an account with only positive hits being billed seems ripe for abuse. Would guess there's already some ticket for a "top 404ers" internal report or similar

New comment by dmw_ng in "Amazon S3 will no longer charge for several HTTP error codes"

dmw_ng — Mon, 13 May 2024 20:10:46 +0000

I prefer your version: Barr replies to a tweet before gatecrashing the next S3 planning session. "A customer is hurting, folks!". The call immediately falls silent with only occasional gasps heard from stunned engineers, and the gentle weeping of a PM. I wonder if Amazon offers free therapy following an incident like this

New comment by dmw_ng in "Amazon S3 will no longer charge for several HTTP error codes"

dmw_ng — Mon, 13 May 2024 19:13:02 +0000

Can't imagine a change like this would be made without some analysis. Would love an internal view into a decision like this. I wonder if they already have log data to compute financial loss from the change, or if they have sampling instrumentation fancy enough to write/deploy custom reports like this quickly.

In any case 2 weeks seems like an impressive turnaround for such a large service, unless they'd been internally preparing to acknowledge the problem for longer

New comment by dmw_ng in "Jon Pretty wins in court against sexual harassment claims by Scala community"

dmw_ng — Fri, 26 Apr 2024 20:25:26 +0000

He did not "win in court", a consent order means both parties agreed to a mutual resolution prior to a judgement occurring. If I had to guess it was settled this way because of the time, misery and tremendous cost that would otherwise be involved for all parties. "Wins in court" suggests some thorough process leading up to a final evidential determination made by a judge. That is not what happened here, it'd probably be more accurate to say the parties were motivated to cooperate via their solicitors under looming threat of enduring that process.

No skin in this game, but looking at it from the respondents' perspective, 20k split between 4 is incredibly attractive: 5k and an apology to put an end to any threat relating to the issue, given the alternative of potentially unbounded costs to demonstrate innocence with a tortuous and uncertain outcome. From the claimant's perspective it is more confusing, triggering such a heavyweight process then settling for so little given the claimed harms suggests all kinds of things, not least including the kind of advice he may have received given the strength of the case as his solicitors understood it.

New comment by dmw_ng in "I Used Netscape Composer in 2024"

dmw_ng — Fri, 19 Apr 2024 08:29:18 +0000

We only very briefly had that around the start of Win95, by the time 98 came around (IIRC) Explorer and definitely IE already had flat toolbar buttons that only had bevels when hovering. I remember fetishizing how clean those new style controls looked (esp IE5ish with its grey-on-grey patterned background texture!) but it definitely broke the consistent interaction idiom you mention

New comment by dmw_ng in "50% discount on OpenAI pricing if you submit a batch and give them up to 24h"

dmw_ng — Mon, 15 Apr 2024 20:09:32 +0000

I'm not promoting one way or the other, just pointing out why things are the way they are. Restarting a service with stateful networking is reason enough to avoid it where possible, watching entire buildings melt for 15 minutes because a single binary SEGV'd is a real outcome. For an extra helping of pain, add a herd of third party clients of random versions to a system that never needed the comms capabilities on offer and you have a problem to solve that never needed to exist in the first place