<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: Bartweiss</title><link>https://news.ycombinator.com/user?id=Bartweiss</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 06 Apr 2026 07:13:47 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=Bartweiss" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by Bartweiss in "How well do cars do in crash tests they're not optimized for?"]]></title><description><![CDATA[
<p>> <i>in both cases someone could accidentally optimize the test</i><p>I think this is what I disagree with.<p>The water heater story is about a viable-for-market design which was also optimized for the test. The equivalent for a car emissions test might be optimizing the transmission to reduce emissions at the specific speeds which will be tested. Those speeds could be sweet spots of the engine curve by accident, or they could be planned that way. I don't think that's necessarily <i>right</i>, but it's within the bounds of "natural" design for the product.<p>Instead of doing that, VW submitted something for testing which was fundamentally different from what went to market. Rather than being misleading, the test results were fundamentally irrelevant. Creating two completely different modes of behavior isn't something you could do by chance, and it means there's no real limit on how badly they could cheat.</p>
]]></description><pubDate>Tue, 30 Jun 2020 19:26:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=23693815</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=23693815</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=23693815</guid></item><item><title><![CDATA[New comment by Bartweiss in "How well do cars do in crash tests they're not optimized for?"]]></title><description><![CDATA[
<p>This is omnipresent even where regulators aren't involved: every graphics card benchmark out there is 'manipulated' relative to real world performance. At this point it's so universal that I don't think anyone is even fighting it - as long as everyone games benchmarks roughly the same amount, the relative scores stay usable.<p>Your point about fairness and passive design is the one that makes me view these cases differently also. In the anecdote, the product being tested was the same one being sold, and there's no sign the heater was <i>worsened</i> to improve test performance. The designers just picked the best-scoring option among some reasonable configurations. (Frankly, once they noticed that issue, what were they supposed to do? Pick the worst-scoring, or pick the spec out of a hat?)<p>In the VW story, the test-bench vehicle was fundamentally different from the market vehicle, and the road version was designed to behave <i>worse</i> on the metrics to get other gains. I happen to know someone who bought a diesel Jetta specifically because it was more eco-friendly than other options, and I think he'd draw a clear line between tuning for test metrics and VW consciously lying to their buyers.</p>
]]></description><pubDate>Tue, 30 Jun 2020 19:14:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=23693699</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=23693699</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=23693699</guid></item><item><title><![CDATA[New comment by Bartweiss in "How well do cars do in crash tests they're not optimized for?"]]></title><description><![CDATA[
<p>I do think that manipulating a purely instructive measure is less extreme than manipulating a compliance test; consumers can seek alternate tests and reviews, but the state emissions test has special status even if a dozen other tests give a different result. That said, I believe Energy Star ratings affect tax rebates and electric bills, and they're required to be printed on products - so that's not really an arbitrary test.<p>There are other differences here too, I think. The water heater trick is passive manipulation that stays in place at all times, which limits how far from "real" performance it can get. And per the story, it seems more like "teaching to the test" than "cheating". That is, Volkswagen consciously moved <i>away</i> from the mandate outside of testing. The water heater was (potentially) as energy-efficient as they could design, with the test score manipulated on top of that.<p>None of that makes it harmless - if "as good as you can make" doesn't hit standards without manipulating them, that's still a problem. But I do find it less galling than "intentionally worsens emissions outside the test bench".</p>
]]></description><pubDate>Tue, 30 Jun 2020 19:06:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=23693593</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=23693593</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=23693593</guid></item><item><title><![CDATA[New comment by Bartweiss in "How well do cars do in crash tests they're not optimized for?"]]></title><description><![CDATA[
<p>Crash test dummies have basically this problem also. They're designed for realism in certain very narrow ways, and then the very small number of approved dummies are used for testing car safety.<p>The industry has made a <i>bit</i> of progress, surprisingly unprompted by regulations - female and child dummies came into circulation before they were required in tests. But overall, testing is still run against a tiny handful of body types which move 'realistically' in only a few regulation-guided respects.</p>
]]></description><pubDate>Tue, 30 Jun 2020 18:42:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=23693323</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=23693323</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=23693323</guid></item><item><title><![CDATA[New comment by Bartweiss in "Greatest Java apps"]]></title><description><![CDATA[
<p>Citing Maven also feels a bit circular. It's an important Java application, but as a build tool it's only important because there's lots of Java out there to build.<p>Minecraft and a lot of the other apps are genuinely impressive, so it's easier to justify the ecosystem that produced them.</p>
]]></description><pubDate>Tue, 30 Jun 2020 00:34:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=23685217</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=23685217</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=23685217</guid></item><item><title><![CDATA[New comment by Bartweiss in "How South Korea Reined In The Outbreak Without Shutting Everything Down"]]></title><description><![CDATA[
<p>Wait, which countries are we referencing outside of those three?<p>Thailand looks straightforwardly exponential so far and has fairly heavy mask use, agreed. But Singapore, Taiwan, and arguably Malaysia seem too early to call: they're still plausibly on either a European curve or South Korea's ramp-then-flatline.<p>Vietnam, Cambodia, Laos, Mongolia, and Burma all seem to be below the line for meaningful data. And Hong Kong isn't broken out. So I guess my questions are: do Indonesia and the Philippines have "mask cultures" to a level comparable to South Korea and Japan, and are their testing regimes wide enough to rely on those curves?<p>I don't know the answer to that. And I agree that the "masks work" graph/meme circulating is questionable. But unless I'm missing something/somewhere, this data just looks like "too soon to call"?</p>
]]></description><pubDate>Fri, 27 Mar 2020 14:37:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=22703146</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22703146</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22703146</guid></item><item><title><![CDATA[New comment by Bartweiss in "A detailed look at the router provided by my ISP"]]></title><description><![CDATA[
<p>> <i>you had to be logged in to the web interface already with another account</i><p>Obviously I don't know specifics, but if this applies to any router which has multiple tiers of login then it could be a pretty serious problem. I suspect that might be true for routers designed specifically to broadcast multiple networks (e.g. school or shared apartment-building routers)?</p>
]]></description><pubDate>Thu, 26 Mar 2020 14:15:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=22693752</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22693752</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22693752</guid></item><item><title><![CDATA[New comment by Bartweiss in "Social distancing slowing not only Covid-19, but other diseases too"]]></title><description><![CDATA[
<p>I don't think this is uncharitable at all. I'm sure Kinsa has made a good effort at controlling for testing frequency, and I'm sure it's helped. But there's no reason to think the dynamics of COVID-motivated testing are the same as for flu-season, or new-buyer novelty, or anything else.<p>And more importantly, <i>how could we know if it is</i>? That's not just a Kinsa problem; we see this over and over again with peer-reviewed studies that "control for" certain factors like socioeconomics or health history. They're inherently limited to controlling for what they know about, and it's never perfect. Often, the entire effect is from an undiscovered variable. Take, say, the widely-promoted study finding that visiting a museum, opera, or concert just once a year is tied to a 14% decline in early death risk. The researchers tried to control for health and economic status, then concluded "over half the association is independent of all the factors we identified that could explain the link." [1]<p>Now, what seems more likely: that the unexplained half is from the profound, persistent social impact of dropping by a museum or concert once a year? Or that some of the explained factors like "civic engagement" can't be defined clearly, others are undercounted (e.g. mental health issues), and some were missed entirely?<p>I suspect Kinsa did much better than that, because they're not trying to control for such vague terms. But I think "even after controlling for" should basically <i>never</i> rule out asking "what if it's a confounder"?<p>[1] <a href="https://www.cnn.com/style/article/art-longevity-wellness/index.html" rel="nofollow">https://www.cnn.com/style/article/art-longevity-wellness/ind...</a></p>
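The "undiscovered variable" point can be made concrete with a toy simulation. It assumes a purely hypothetical model: mortality risk depends only on health, healthier people happen to visit museums more, and the analyst can only "control for" a noisy measurement of health. A spurious visit effect survives the adjustment:

```python
import random

random.seed(0)

# Toy model: risk is driven ONLY by health; museum visits merely track health.
# The analyst stratifies on a noisy health proxy ("controlling for health").
samples = []
for _ in range(20000):
    health = random.gauss(0, 1)
    visits = health + random.gauss(0, 1) > 0   # healthier people visit more
    proxy = health + random.gauss(0, 1)        # imperfect health measurement
    risk = -health                             # the true (and only) cause
    samples.append((visits, round(proxy), risk))

# Within each proxy stratum, compare mean risk for visitors vs non-visitors.
strata = {}
for visits, stratum, risk in samples:
    strata.setdefault(stratum, {True: [], False: []})[visits].append(risk)

gaps = [sum(s[True]) / len(s[True]) - sum(s[False]) / len(s[False])
        for s in strata.values() if len(s[True]) > 30 and len(s[False]) > 30]
print(sum(gaps) / len(gaps))  # negative: visitors still look lower-risk
```

Even "after controlling for health", visitors show lower risk in every well-populated stratum, because within a stratum of the noisy proxy the visitors are still healthier on average. No amount of stratifying on an imperfect measure removes that.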
]]></description><pubDate>Thu, 26 Mar 2020 13:50:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=22693521</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22693521</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22693521</guid></item><item><title><![CDATA[New comment by Bartweiss in "For Years WallStreet Spent More on Buybacks Than It Earned-Now They Want Bailout"]]></title><description><![CDATA[
<p>Good point.<p>The TARP bailout in 2008 involved buying a ton of stock from troubled companies, but the stock was sold back to them as soon as they could afford to repurchase it. And this will be the second bailout for a bunch of airlines.<p>So one of the most interesting ideas I've heard is that we shouldn't nationalize things by fiat, but when TARP-style bailouts happen, the government should just <i>keep</i> the stock, at least for a while. If it really was a one-off crisis, the shares are a good investment. And if it's a failing business, or one paying dividends and then looking for handouts, the public at least gets equity instead of just sinking money into it.</p>
]]></description><pubDate>Tue, 24 Mar 2020 14:37:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=22674995</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22674995</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22674995</guid></item><item><title><![CDATA[New comment by Bartweiss in "For Years WallStreet Spent More on Buybacks Than It Earned-Now They Want Bailout"]]></title><description><![CDATA[
<p>There's also a fairly good argument for this along the lines of trains and highways. Planes aren't physically trapped on one course, but pretty much every nation heavily regulates who can fly where, when. Airports are often state-controlled, and even private ones need state approval to add new runways or flights.<p>What we have now is one of those ridiculous "private non-market" arrangements. When airlines in Europe fly empty planes to stop the government from taking their flight slots away, that's not the fault of the companies, but it's also not a functional market we should expect efficiencies from.<p>I'm not a fan of "regulate markets into dysfunction then nationalize them", but if the fundamental restraints on travel are too severe to let the market function freely, privatization stops making much sense.</p>
]]></description><pubDate>Wed, 18 Mar 2020 17:19:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=22618958</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22618958</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22618958</guid></item><item><title><![CDATA[New comment by Bartweiss in "Why some governments appear not to be acting on the Covid-19 threat"]]></title><description><![CDATA[
<p>The third option is "because they don't want to be blamed for model error". Governments aren't necessarily competent, but you can try to get them to understand 5%/95% confidence intervals, at least in hindsight. If you publicly release a prediction, and then the real outcome is the 10% confidence line, you're probably going to be yelled at for being wrong regardless of the error bars.<p>Of course, if the center of the prediction is horrifying, "people don't understand confidence intervals" then becomes a case of avoiding societal breakdown.</p>
]]></description><pubDate>Fri, 13 Mar 2020 15:37:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=22568083</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22568083</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22568083</guid></item><item><title><![CDATA[New comment by Bartweiss in "Smart devices are eating the market for accessibly-priced watches"]]></title><description><![CDATA[
<p>It's been fascinating to see the rise, fall, and rise of digital watches among techies.<p>I remember 1990s Dilbert having an entire storyline about the engineers getting into a calculator-watch arms race. In real life, it was pretty common to laugh about how a $50 digital Casio could do far more things than a Rolex.<p>By about 2010 (or perhaps even by the iPod Touch or Palm Pilot), I stopped hearing that. Watches had lost all of their unique functions to smartphones, so their raison d'être was either "rugged and cheap" or "jewelry", and calculator watches almost vanished.<p>Circa 2015, we get Pebble gen 2, Apple Watch, and Fitbit Blaze: smart watches have phone integrations, fitness tracking, and don't look like hell anymore. Since then, they've increasingly aimed for design good enough to wear with a suit; the Galaxy Watch is always-on and analog.<p>These days, I see two splits among watch-wearing engineers: smartwatch vs not, and practical vs decorative. So the result is quadrants like:<p>Fitbit | Galaxy Watch<p>Casio | Longines</p>
]]></description><pubDate>Thu, 12 Mar 2020 17:09:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=22559046</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22559046</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22559046</guid></item><item><title><![CDATA[New comment by Bartweiss in "Jack Welch Inflicted Great Damage on Corporate America"]]></title><description><![CDATA[
<p>> <i>The CEO will be under tremendous pressure if he/she tries to optimize for a 1 year timeframe (for example) as opposed to quarter-by-quarter.</i><p>It's bizarre to talk to well-meaning execs (even below C-suite) at public companies and hear them overtly say this. "Well we know X and Y are sound investments for the company's success, but it's a question of finding a way to sell something that long-term without tanking our stock price."<p>I try not to cry market inefficiency without good evidence, but "shareholders promote good corporate governance" starts feeling pretty bizarre when the people running a company describe shareholders like corporate raiders encouraging them to destroy value for a quick payout.<p>> <i>I wish boards can come up with a compensation structure for execs which optimizes for long term.</i><p>For all the talk about "when founders should get out of the way" and "what makes a good founder doesn't always make a good CEO", it's interesting to see that research still finds companies with founder-CEOs performing substantially better. Higher share prices (which might stem from overconfidence), but also better long-term financials, more R&D spending, more influential patent filings, etc.<p>And that doesn't necessarily mean founders are super-geniuses, exceptional managers, or even unusually attuned to their market. They get less of their salaries in cash, hold options and stocks longer, and vary their behavior less in response to compensation structure. (Also, they often hold so much stock they <i>can't</i> sell in full without panicking the market.)<p>So it really does look like we just haven't found a good way to compensate non-founder CEOs: their behavior is extremely responsive to their compensation, but nobody has found a scheme that makes them act long-term to the degree a founder would.</p>
]]></description><pubDate>Tue, 10 Mar 2020 15:28:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=22536245</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22536245</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22536245</guid></item><item><title><![CDATA[New comment by Bartweiss in "Before Clearview Became a Police Tool, It Was a Secret Plaything of the Rich"]]></title><description><![CDATA[
<p>For any business with decent size, absolutely. There are a thousand ways to claw back options, and the reason they don't get used is that doing it even once would make hiring practically impossible.<p>For a small enough company? It falls in the same category as "diluting out of one guy's shares" - bad morals <i>and</i> bad business, but it still happens.</p>
]]></description><pubDate>Fri, 06 Mar 2020 22:47:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=22508098</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22508098</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22508098</guid></item><item><title><![CDATA[New comment by Bartweiss in "Before Clearview Became a Police Tool, It Was a Secret Plaything of the Rich"]]></title><description><![CDATA[
<p>Not only is an easily discovered public event poor leverage, it becomes much <i>worse</i> leverage if it comes up in an interview.<p>When companies (or governments) try to manipulate employees, they frequently rely on some kind of willful ignorance. Wells Fargo is a great example: they set impossible performance targets and turned a blind eye to fraud, then fired and blacklisted whistleblowers - ostensibly for knowing about that same fraud!<p>If a shady employer wants leverage, even public events can suffice as long as they can claim ignorance. For example, most stock option grants are immediately lost if you're fired, but even at-will employment can't be terminated specifically to deprive someone of their options. So an employer might give a generous options package, then "discover" the IG video and use it for dismissal at just the right time to prevent a profitable exercise. But if that video comes up during hiring, it's no longer a plausible reason for later dismissal, at least without committing perjury regarding the interview.<p>I can't even work out a scenario where "lots of people know about this including us" is an effective way to manipulate someone.</p>
]]></description><pubDate>Thu, 05 Mar 2020 19:48:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=22497378</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22497378</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22497378</guid></item><item><title><![CDATA[New comment by Bartweiss in "Creating a Slack Writing Etiquette Guide for Your Workplace"]]></title><description><![CDATA[
<p>I notice this pattern all the time in guides to "polite" workplace communication. Their examples are hypothetical, so they look at how positive something sounds without considering the underlying content, or go even further and <i>change</i> content to improve tone. The advice looks good on paper, but using it when there's an actual task at hand might just sound sarcastic or disingenuous. The worst example I've ever seen was something like:<p>> <i>Instead of "I need that report by the end of the day", try saying "I really appreciate you working to get that report out soon, it's a big priority right now!"</i><p>That's absolutely insane, because those are two completely different statements. The second one sounds less demanding because it's not the same request. So the tip isn't positive communication advice, it's either a schedule rework or failing to convey a deadline.<p>As for this specific example:<p>> <i>By adding an emoji below, it's clear that the sender is embarrassed to make this last-second request, and isn't trying to come across as sarcastic, rude, or overbearing</i><p>That wasn't clear to me at all. If you type in "embarrassed", Slack will only suggest <i>:flushed:</i>, although I'd also have understood <i>:sweat_smile:</i>. I guess the monkey was meant as "I'm hiding my face with shame", but Slack calls that emoji ":see_no_evil:", and at first glance it seemed like "I'm trying not to look over your shoulder, but is this done yet?". If the problem is "making a last-second request", there's no particular reason that emoji are the best way to address it - one example simply has more content than the other. So I like your direct phrasing, and I might add:<p>> <i>Hi <name>, will you be able to have the report on X ready by <time>? I'm sorry it's such short notice, thank you!</i></p>
]]></description><pubDate>Thu, 05 Mar 2020 19:16:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=22497012</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22497012</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22497012</guid></item><item><title><![CDATA[New comment by Bartweiss in "Fed cuts half point in emergency move amid spreading virus"]]></title><description><![CDATA[
<p>Eh, it probably buffers against overreaction, especially when a correction in fundamentals is being mixed with a reaction to new pressure.<p>But this is still a good point: if the market really is overheated then short-term monetary policy won't change that, and we can expect a lasting hit regardless of how disease issues play out. And it's not necessarily going to be obvious what's market movement and what's disease-related; I wouldn't be surprised if some over-hyped companies seize this as a chance to lower guidance faster than they normally could without spooking investors.</p>
]]></description><pubDate>Tue, 03 Mar 2020 18:24:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=22477253</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22477253</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22477253</guid></item><item><title><![CDATA[New comment by Bartweiss in "Varoufakis to Publish Notorious Eurogroup Recordings from 2015 Meetings"]]></title><description><![CDATA[
<p>This is why the whole idea of "in the public interest" exists.<p>If a reporter received these same recordings in the mail, they would quite likely publish them. If they received a recording of a random person discussing their medical concerns, publishing that would be an outrageous breach of ethics.<p>(Hence the Gawker/Thiel debacle also. When Ted Haggard was caught having gay extramarital affairs, it was considered fit for publication because he was an evangelical preacher fighting against gay marriage. When a random private individual is outed, it's not a public-interest matter and can be libelous even when accurate. Thiel fell somewhere in between under both legal and journalistic rules, so we got a debate.)<p>I'm pretty baffled to see the parent comment imply that private discussions between politicians should <i>inherently</i> be kept secret. We could discuss specific news stories, reporters who violate attribution rules, and whether Varoufakis was bound by privacy laws or Eurogroup confidentiality rules. We could even argue the publication is in the public interest, and yet makes Varoufakis unfit to serve by destroying his ability to function with trust.<p>But just as you say, treating "that's a private discussion" as the end of the matter would excuse Watergate also.</p>
]]></description><pubDate>Mon, 24 Feb 2020 21:09:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=22408480</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22408480</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22408480</guid></item><item><title><![CDATA[New comment by Bartweiss in "Scientists use ML to find an antibiotic able to kill superbugs in mice"]]></title><description><![CDATA[
<p>This is a novel and important result in <i>antibiotics</i>. It's also a proof-of-concept for using ML to produce vital drugs with novel mechanisms, rather than incidental alterations or discoveries in noncompetitive spaces. It <i>might</i> be an incremental speedup or computing-power advance in ML drug discovery also, but it could equally just be the result of a lucky break or a particularly large lab-test budget. (In which case, "why didn't someone do it already?" is closer to asking why nobody else bothered to win the lottery.)<p>It's <i>not</i> a major theoretical advance in ML drug-discovery techniques or the first big step in ML drug discovery. It's certainly not the invention of ML drug discovery or neural nets as an ML technique, both things I've seen implied in news stories on this work.<p>This is attention-worthy, absolutely. (I'll leave "publication-worthy methodology" to experts.) But it's newsworthy on actual merits, as a drug breakthrough and a demonstration of an increasingly-important technique. So I share the frustration when lazy or confused reporting implies this is the same style of ML-theory breakthrough as CNNs, Transformers, or even neural nets themselves.</p>
]]></description><pubDate>Mon, 24 Feb 2020 20:35:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=22408154</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22408154</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22408154</guid></item><item><title><![CDATA[New comment by Bartweiss in "Scientists use ML to find an antibiotic able to kill superbugs in mice"]]></title><description><![CDATA[
<p>I think the criticism is that it's not obvious whether success here was a function of improved performance, expanded throughput, expanded testing, or sheer luck.<p>Chess engines have clearly improved in both design and computing power over the years; doubling an engine's resources or pitting a new engine against an old one produces straightforwardly better play. But the drug-discovery technique in use here may not be "playing better" in terms of producing higher-quality predictions.<p>To extend the chess metaphor:<p>- Deep Fritz is a stronger player than Deep Blue even with 4% as much computing power. This story does not appear to be an algorithmic breakthrough of that sort.<p>- Deep Blue lost to Kasparov in 1996, then beat him in 1997 with double the computing power. That's a clear improvement in play, but not an improvement in efficiency. This story might represent such a change, modelling more prospective drugs to test higher-confidence candidates.<p>- If an AI that can only win 2% of games against humans plays 10 games, it has an 18% chance of beating someone. But over 100 games, it has an 87% chance of a win. This result might be a team with a larger testing budget claiming the 'first win' without any AI-side improvement.<p>- If a dozen grandmaster-level chess AIs play GMs, one of them will have to get the first win against a human. Labeling this result a 'breakthrough' in AI terms might be outright publication bias among equivalent projects.<p>As far as the <i>drug</i> goes, none of that really matters, except that efficiency improvements would have more potential to increase drug discovery. The drug itself is still useful, and the discovery is a proof of concept; in 1980 no possible computer would have beaten Kasparov. But this is being hailed as a breakthrough in AI in seriously questionable ways. The BBC article, for example, managed to imply that this specific project was novel and important for using neural nets to produce a significant result.</p>
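The "first win" arithmetic above is just the complement of a run of independent losses. A minimal sketch, assuming independent games at a flat 2% per-game win rate:

```python
# Chance of at least one win in n independent games at a fixed per-game win rate.
def p_first_win(p_win, games):
    return 1 - (1 - p_win) ** games

print(round(p_first_win(0.02, 10) * 100))   # ~18% over 10 games
print(round(p_first_win(0.02, 100) * 100))  # ~87% over 100 games
```

So merely buying ten times the testing budget moves a 'first win' from unlikely to near-certain, with zero improvement on the AI side.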
]]></description><pubDate>Mon, 24 Feb 2020 20:15:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=22407970</link><dc:creator>Bartweiss</dc:creator><comments>https://news.ycombinator.com/item?id=22407970</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=22407970</guid></item></channel></rss>