Hacker News: mattj

New comment by mattj in "Waymo One is now open to all in Los Angeles"

mattj — Tue, 12 Nov 2024 18:22:13 +0000

Good news: this isn't true - https://waymo.com/blog/2024/05/fleet-response/

New comment by mattj in "DIY Espresso (2020)"

mattj — Thu, 17 Aug 2023 13:40:47 +0000

I've built one of these based on the instructions and use it daily (and love it!), but the tone of the site is pretty reflective of the project overall. It's definitely a really impressive hack and I appreciate all the hard work that's gone into it, but it could really use a little more user-facing empathy.

I'm not sure if I'd recommend it to someone else - and if I was doing it again I'd probably spend a few hundred more (than the gaggia + parts cost) and just buy an off-the-shelf machine with the same feature set.

New comment by mattj in "Nearly 12M Square Feet of Vacant Office Space in S.F"

mattj — Sun, 25 Oct 2020 23:23:07 +0000

I think the grandparent is probably referring to upstate in the Hudson Valley sense of the phrase. Plenty of cute towns, but Hudson / Woodstock / Kingston are definitely in the O(tens) of great restaurant options, comparable to a slice of any single Manhattan / Brooklyn neighborhood most tech people live in.

New comment by mattj in "Attention Is All You Need"

mattj — Sat, 16 Dec 2017 19:49:21 +0000

The other answers cover the math well, but I think the “why do you need attention?” statement is worth making (and answers the more engineering-y question of “how/when?”):

DNNs typically operate on fixed-size tensors (often with a variable batch size, which you can safely ignore). In order to incorporate a non-fixed size tensor, you need some way of converting it into a fixed size. For example, processing a sentence of variable length into a single prediction value. You have many choices for methods of combining the tensors from each token in the sentence - max, min, mean, median, sum, etc etc. Attention is a weighted mean, where the weights are computed from a query and a key and then applied to a value. The query might represent something you know about the sentence or the context (“this is a sentence from a toaster review”), the key represents something you know about each token (“this is the word embedding tensor”), and the value is the tensor you want to use for the weighted mean.
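
As a rough illustration of the "weighted mean" framing, here is a minimal pure-Python sketch. The function name `attention_pool` is made up for this example, and scoring each token with a query/key dot product followed by a softmax is an assumption (one common choice), not something the comment specifies.

```python
import math

def attention_pool(query, keys, values):
    """Collapse a variable-length sequence into one fixed-size vector.

    query:  list of d floats (what you know about the context)
    keys:   one list of d floats per token
    values: one list of m floats per token
    """
    # Assumed scoring rule: dot product of the query with each token's key.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Softmax turns the scores into non-negative weights that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The output is the weighted mean of the value vectors, always m floats.
    dim = len(values[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return pooled, weights
```

However many tokens go in, the pooled output has the same length as a single value vector, which is the fixed size the rest of the network needs. With an all-zero query every token gets the same weight and this reduces to a plain mean.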

New comment by mattj in "Show HN: Kozmos – A Personal Library"

mattj — Thu, 10 Aug 2017 17:58:28 +0000

This is great - the like / heart button is really slick, and I love how it doesn't get in the way at all. I've used pinboard and others in the past, and the (relatively) heavier bookmarking flow would often stop me from saving things as I didn't want to break my flow.

Excited to see where this ends up!

New comment by mattj in "Most Winning A/B Test Results Are Illusory [pdf]"

mattj — Thu, 19 Jan 2017 23:07:26 +0000

I agree with you (and love your blog, btw), but I think you're skipping over at least a few benefits you can get out of a mature / well built a/b framework that are hard to build into a bandit approach. The biggest one I've found personally useful is days-in analysis; for example, quantifying the impact of a signup-time experiment on one-week retention. This doesn't really apply to learning ranking functions or other transactional (short-feedback loop) optimization.

That being said, building a "proper" a/b harness is really hard and will be a constant source of bugs / FUD around decision-making (don't believe me? try running an a/a experiment and see how many false positives you get). I've personally built a dead-simple bandit system when starting greenfield and would recommend the same to anyone else.
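
For a feel of what the a/a check looks like, here is a small simulation sketch. Everything in it is an assumption for illustration: the function names, the 10% conversion rate, the group sizes, and the use of a pooled two-proportion z-test. Both arms draw from the same distribution, so a sound harness should flag roughly `alpha` of the runs; a rate well above that suggests a bug in assignment, logging, or analysis.

```python
import math
import random

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_a / n_a - conv_b / n_b) / se
    # Two-sided tail probability of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

def aa_false_positive_rate(experiments=500, users_per_arm=500,
                           rate=0.1, alpha=0.05, seed=0):
    """Fraction of simulated a/a experiments that look 'significant'."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(experiments):
        # Both arms share the same true conversion rate.
        conv_a = sum(rng.random() < rate for _ in range(users_per_arm))
        conv_b = sum(rng.random() < rate for _ in range(users_per_arm))
        p = two_proportion_p_value(conv_a, users_per_arm, conv_b, users_per_arm)
        if p < alpha:
            false_positives += 1
    return false_positives / experiments

if __name__ == "__main__":
    print(aa_false_positive_rate())
```

A real a/a run would replace the simulated conversions with two identical groups served by the actual framework, which is where the surprises tend to show up.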

New comment by mattj in "Wide and Deep Learning: Better Together with TensorFlow"

mattj — Wed, 29 Jun 2016 22:56:39 +0000

I think the change here is they're learning the embeddings alongside the feature weights (eg they're part of the same loss function).

New comment by mattj in "Twilio S-1"

mattj — Thu, 26 May 2016 17:11:40 +0000

This kind of language is very standard. The risks section pretty much always contains obvious platitudes ("An earthquake might destroy all our computers," "All our employees may quit").

New comment by mattj in "Ask HN: How do I get better at CSS?"

mattj — Sun, 31 May 2015 16:05:58 +0000

Similar experience, but I focused on finding UI elements I liked from native apps or websites and attempted to clone them without looking at the source, then played around with the result to figure out how I could simplify it, how it behaved cross browser, etc etc.

New comment by mattj in "Introducing Progressive Equity – Increase employee ownership as company grows"

mattj — Wed, 08 Apr 2015 00:05:59 +0000

heads up, your math is a little buggy - 1% of $90b is $900m, not $90m

New comment by mattj in "Kythe: A pluggable ecosystem for building tools that work with code"

mattj — Tue, 27 Jan 2015 17:46:08 +0000

What does this do? I've browsed through the site for a few minutes, and still have no idea what kind of tools you could build with this that you couldn't build before.

Is this for cross-language doc generation? Refactoring tools? Something else?

Are there any concrete examples of a tool built on top of this that would otherwise be impossible / very difficult?

New comment by mattj in "Scalable A/B Experiments at Pinterest"

mattj — Thu, 21 Aug 2014 22:02:20 +0000

These are both good points. I think it's worth calling out here that this post is really about the infrastructure to perform experiments, not necessarily the means of analysis (although the ui you see in the screenshots performs that analysis).

In terms of these issues, we handle them in a few ways. In particular, experiment review catches many of these issues. Think of it like code review for your experiments.

In order to run an experiment, we require you to have an "experiment helper" sign off on your change. This involves reviewing your group sizes, verifying that you have the statistical power you need to test the magnitude of change your hypothesis expects, verifying you interact with the framework correctly, etc. Training to become an experiment helper is generally not very easy, and involves a combination of shadowing existing reviewers, performing enough reviews across the stack, and taking a test to verify you understand potential errors (the test itself being composed of many experiments where we have made mistakes).

Changes to experiments (increasing group sizes, terminating an experiment, modifying an experiment etc.) all require this review.
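
To make the group-size / statistical-power part of that review concrete, here is a back-of-the-envelope sketch. It is not Pinterest's tooling: the function name `users_per_group`, the defaults, and the normal-approximation formula for comparing two conversion rates are all assumptions for illustration.

```python
import math
from statistics import NormalDist

def users_per_group(baseline_rate, expected_rate, alpha=0.05, power=0.8):
    """Approximate users needed per group to detect baseline -> expected.

    Uses the standard normal approximation for a two-sided test
    comparing two proportions with equal group sizes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)          # desired power
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)

if __name__ == "__main__":
    # Hypothetical numbers: a 10% baseline and a hoped-for lift to 12%.
    print(users_per_group(0.10, 0.12))
```

The useful intuition is that required group size grows with the inverse square of the expected effect, so halving the change you hope to detect roughly quadruples the users you need.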

New comment by mattj in "Moving product recommendations from Hadoop to Redshift saves us time and money"

mattj — Thu, 19 Jun 2014 14:04:37 +0000

A teammate of mine wrote a post about our redshift setup a few months ago with some more details: http://engineering.pinterest.com/post/75186894499/powering-i...

New comment by mattj in "Moving product recommendations from Hadoop to Redshift saves us time and money"

mattj — Thu, 19 Jun 2014 14:00:40 +0000

Yup! My experience with redshift has actually made me curious to try out Postgres (I've always used MySQL before this). The stricter SQL dialect was a little odd at first, but I think I've become more comfortable with it over a few months.

New comment by mattj in "Moving product recommendations from Hadoop to Redshift saves us time and money"

mattj — Thu, 19 Jun 2014 13:48:58 +0000

I've gone through a similar transition (hive to redshift) in a very large scale data environment. Raw Hadoop / cascading is still very useful for more complicated workflows, but redshift is so vastly superior to hive it's not even funny. I thought I would miss adding my own UDFs, but this hasn't been an issue at all. I'm under the impression presto is a similar improvement, but I haven't spent any time with it.

One huge advantage of redshift over hive: you can connect with plain old Postgres libraries, so you can build redshift results into your admin interfaces, one off scripts, and anywhere else you're fine trading a few seconds of latency for extra data.

New comment by mattj in "Zuckerberg, Musk Invest in Artificial-Intelligence Company Vicarious"

mattj — Fri, 21 Mar 2014 15:49:19 +0000

The AI winter is very much over, and we're back to the good old days of selling the future. I bet this team is very sharp, but there's still merit to "over-promise, under-deliver."

"Phoenix, the co-founder, says Vicarious aims beyond image recognition. He said the next milestone will be creating a computer that can understand not just shapes and objects, but the textures associated with them. For example, a computer might understand “chair.” It might also comprehend “ice.” Vicarious wants to create a computer that will understand a request like “show me a chair made of ice.”

Phoenix hopes that, eventually, Vicarious’s computers will learn how to cure diseases, create cheap, renewable energy, and perform the jobs that employ most human beings. “We tell investors that right now, human beings are doing a lot of things that computers should be able to do,” he says."

New comment by mattj in "Introducing SmartStack: Service Discovery in the Cloud"

mattj — Wed, 23 Oct 2013 20:59:12 +0000

I've been hearing good stuff about this for a while - it's awesome to see it open sourced! Setting up a local haproxy to handle service failover / discovery is a really clever solution, and is an awesome approach to encapsulating a bunch of really messy logic.

New comment by mattj in "Parameterized Docker Containers"

mattj — Wed, 04 Sep 2013 18:24:52 +0000

This is cool, but probably the biggest appeal of docker to me is never having to use puppet again.

New comment by mattj in "Mrjob: A Python 2.5+ package that helps you write and run Hadoop Streaming jobs"

mattj — Wed, 24 Jul 2013 18:00:44 +0000

(original author of mrjob here)

Steve's post is 100% correct. I originally wrote mrjob as an internal tool at yelp out of my frustration with using dumbo for multi-step jobs. Specifically, I found myself writing the same incantation of "wrap a mapper / reducer function with an encoding scheme" over and over again. I tried to add protocol support into dumbo (so you could specify that your job reads json, uses pickle for intermediate data, and writes thrift), but I had a hard time working with the dumbo codebase (disclaimer: I haven't looked at it since, so it might be easy to do this now). I also wanted to represent mappers and reducers as python generators, which makes writing memory-performant steps natural (eg you normally want to rely on the shuffle / sort to perform the hard work of aggregating by key). Finally, I wanted my jobs to be easy to test both from unittest and from the command line - debugging hadoop streaming jobs is way more of a pain in the ass than it should be.
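
To show the shape of the idea (generator-based steps, an encoding between them, and something you can call from a unit test), here is a standalone sketch in plain Python. It deliberately does not use mrjob's actual API; `mapper`, `reducer`, and `run_step` are hypothetical names, and JSON stands in for whatever intermediate protocol a job might pick.

```python
import json
from itertools import groupby
from operator import itemgetter

def mapper(_, line):
    # A generator: emits one (word, 1) pair at a time, nothing is buffered.
    for word in line.split():
        yield word.lower(), 1

def reducer(word, counts):
    # counts is an iterator over the values for one key after the sort.
    yield word, sum(counts)

def run_step(mapper, reducer, lines):
    """Simulate one streaming step locally: map, shuffle/sort, reduce."""
    # Encode intermediate pairs as JSON text, as a protocol would between steps.
    encoded = [json.dumps(pair) for line in lines for pair in mapper(None, line)]
    # The sort plays the role of Hadoop's shuffle: equal keys end up adjacent.
    pairs = sorted((json.loads(item) for item in encoded), key=itemgetter(0))
    for key, group in groupby(pairs, key=itemgetter(0)):
        yield from reducer(key, (value for _, value in group))

if __name__ == "__main__":
    for word, count in run_step(mapper, reducer, ["a b a", "B c"]):
        print(word, count)
```

Because the step functions are ordinary generators, the same code can be driven by a local runner like this in a test or by a streaming wrapper on a cluster.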

New comment by mattj in "Poll: Full-time software engineers in the Bay Area, what's your annual salary?"

mattj — Sat, 01 Jun 2013 06:10:14 +0000

If they can't afford to pay a market salary, you should be getting at least 0.5% (and probably more like 1%). Still not life changing at 40mm with no dilution (200-400k), but a little more meaningful.