Hacker News: gjreda

New comment by gjreda in "Concrete Faith: The creation of the Bahá’í house of worship (2023)"

gjreda — Fri, 07 Jun 2024 18:49:56 +0000

The recent changes have gotten absurd

New comment by gjreda in "Concrete Faith: The creation of the Bahá’í house of worship (2023)"

gjreda — Fri, 07 Jun 2024 18:41:52 +0000

Grab some baked goods at Hewn (in Evanston), visit the Bahá’í temple, and then walk across the street to Gillson Park to wander the Lake Michigan shore and eat your goods from Hewn.

The easternmost portion of Northwestern's campus also has a nice walking/biking path along the lakeshore with a great view looking back towards the Chicago skyline.

New comment by gjreda in "Ask HN: Could you share your personal blog here?"

gjreda — Thu, 06 Jul 2023 14:13:40 +0000

https://gregreda.com/

I've mostly written technical, code-centric posts on Python, ML, and data science. Some of my early posts (2013) were wildly popular at the time and hit the top of HN and various subreddits.

I haven't written much recently, but I've been trying to branch outside of technical posts as I felt like my profession had started to become too much of my identity.

The post I'm most proud of:

- https://gregreda.com/2022/11/30/this-ones-for-me/ - Feeling pride and catharsis after years of bad health luck (leukemia, bad bike crash, cardiac arrest).

My most popular posts:

- https://gregreda.com/2013/03/03/web-scraping-101-with-python... - Web scraping tutorial using Python and BeautifulSoup

- http://www.gregreda.com/2015/02/15/web-scraping-finding-the-... - Another web scraping tutorial with Python, but this time for sites that dynamically load content

- https://gregreda.com/2013/10/26/intro-to-pandas-data-structu... - The start of a series of posts on Python's pandas library

- https://gregreda.com/2013/07/15/unix-commands-for-data-scien... - Some useful Unix commands for data processing

- https://gregreda.com/2015/08/23/cohort-analysis-with-python/ - Tutorial on doing cohort analysis using Python and pandas

- https://gregreda.com/2017/01/07/freelance-data-science-exper... - My experience as a freelance data scientist

- https://gregreda.com/2018/02/04/hiring-data-scientists/ - My approach to hiring data scientists (though my thoughts on this have evolved over the last five years).

New comment by gjreda in "Show HN: Lance – Alternative to Parquet for ML data"

gjreda — Thu, 01 Jun 2023 19:14:42 +0000

I initially built this same "chat with PDFs" prototype with LangChain and qdrant. I then rebuilt it from scratch for the sake of learning and comparison.

Some context: I've been a jack-of-all-trades data scientist / machine learning engineer for the past 15 years (officially titled as an MLE the last four years).

I share that only because I think it plays a role in how I'm typically accustomed to working.

1. I found LangChain to be overkill for this use case. While it might allow some to move more quickly when building, I found it cumbersome. My suspicion is this is largely because of my background - I understand how to build much of what's "under the hood" in LangChain. Because of this, it felt overly abstracted, and I found the docs difficult to navigate and sometimes incomplete.

2. I used Qdrant via their Docker image and it was simple to set up and start using. I didn't try to push the limits with it, so I can't say anything about performance. Because Qdrant runs as an HTTP service, I found that it didn't fit well into my workflow - I'm accustomed to visually inspecting my data inside the interpreter, debugging, trying out commands, and interacting and experimenting with my results. Again, my suspicion is this is my own bias in how I typically work. Qdrant otherwise seemed very nice.

3. LanceDB felt powerful yet lightweight, and fit well into my workflow. It was far more intuitive for me. It was as if SQLite, the Python data ecosystem, and a vector database had a child and named it LanceDB. Under the hood, it's built on Apache Arrow and integrates nicely with pandas, allowing me to seamlessly go from a LanceDB table on disk, to a pandas DataFrame, and into some analysis or investigation of my LanceDB query results. This line [1] is a great example of why I liked it. This feels nicer to me than the world of API params and HTTP requests.

[1] - https://github.com/gjreda/scratch-pdf-bot/blob/main/gpt_pdf_...
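
For anyone unfamiliar with the retrieval step in a "chat with PDFs" prototype, here is a minimal sketch in plain Python. It is not LanceDB's API and not the code from the linked repo: it swaps in a brute-force cosine-similarity scan over a few made-up rows, and the names `cosine_similarity`, `top_k`, and `rows` are hypothetical. A vector database handles the storage and indexing around this same kind of ranking.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query, row["vector"]), row["text"]) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data (assumption): real embeddings would come from an embedding model
# and have hundreds or thousands of dimensions.
rows = [
    {"text": "page 1: introduction", "vector": [1.0, 0.0, 0.0]},
    {"text": "page 2: methods", "vector": [0.0, 1.0, 0.0]},
    {"text": "page 3: results", "vector": [0.7, 0.7, 0.0]},
]

# Rank the stored chunks against a (made-up) query embedding.
results = top_k([1.0, 0.1, 0.0], rows, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Because the results are ordinary Python objects, they can be inspected directly in the interpreter, which is the kind of workflow the comment above describes preferring.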

New comment by gjreda in "Falcon 40B LLM (which beats Llama) now Apache 2.0"

gjreda — Thu, 01 Jun 2023 02:16:08 +0000

Not specific to this model, but beyond the large players (OpenAI, Cohere, etc.), are there any free hosted versions of the open(ish) LLMs? Even the smaller 7B parameter ones? I'm prototyping a project and using OpenAI for now, but it feels like there has to be a hosted alternative somewhere.

I spent some time today exploring HuggingFace's Inference API, but if the model is sufficiently large (> 10 GB), HF requires you to use their commercial offerings.

New comment by gjreda in "Show HN: Lance – Alternative to Parquet for ML data"

gjreda — Wed, 31 May 2023 22:33:37 +0000

I recently prototyped out a "chat over PDF documents" project.[1] I opted to use LanceDB for vector (embeddings) storage and retrieval and found it really nice to use.

I'm working on using it in a large project now.

[1] - https://github.com/gjreda/scratch-pdf-bot

New comment by gjreda in "Effect of Colonoscopy Screening on Risks of Colorectal Cancer and Related Death"

gjreda — Mon, 10 Oct 2022 17:50:37 +0000

> 5/1000 colonoscopy patients have complications (some fatal) which is way higher than the base rate for colon cancer.

Can you provide a source for this?

New comment by gjreda in "Algorithmic Curation Impacts Media Exposure in Twitter Timelines"

gjreda — Thu, 06 Oct 2022 15:06:08 +0000

At least for Twitter, I think this still happens if you move everyone into a list and only view the list.

New comment by gjreda in "Ask HN: Fed Rate Hike today – 50, 75 or 100 BPS?"

gjreda — Wed, 15 Jun 2022 14:17:35 +0000

Expectations are overwhelmingly for 75bps. Prior to yesterday, 50bps was expected.

- https://www.cmegroup.com/trading/interest-rates/countdown-to...

New comment by gjreda in "A cancer trial’s unexpected result: Remission in every patient"

gjreda — Sun, 05 Jun 2022 21:15:02 +0000

Imatinib (Gleevec) revolutionized treatment for patients with chronic myeloid leukemia (CML). Prior to the drug’s discovery, CML patients generally had seven years to live (possibly less depending on how advanced the cancer was). Now their lifespan mirrors the general population.

I’d highly recommend the book The Philadelphia Chromosome if you’re interested in learning more.

New comment by gjreda in "Goodreads plans to retire API access, disables existing API keys"

gjreda — Sun, 13 Dec 2020 18:26:19 +0000

Recently wrote some code to scrape a friend's reviews and ratings from Goodreads. Maybe it'll be useful to folks here: https://gregreda.com/2020/11/17/scraping-pages-behind-login-...

New comment by gjreda in "Grubhub sued for listing restaurants without permission"

gjreda — Thu, 29 Oct 2020 19:29:06 +0000

Here's an excerpt from their October 2019 letter to shareholders. TL;DR - they're a public company and the markets told them they needed to keep growing in order to survive.

> For restaurant inventory, we will rapidly expand our recent pilots of putting non-partnered restaurants on the platform. For reasons we’ve discussed many times, we believe non-partnered options are the wrong long-term answer for diners, restaurants and shareholders. It is expensive for everyone, a suboptimal diner experience and rife with operational challenges. With that said, it is extremely efficient and cheap to add non-partnered inventory to our platform and it can at least ensure that all of our current and potential new diners have the option to order from any of their favorite restaurants now, even if it’s not the best solution. By leveraging non-partnered options, we believe we can more than double the number of restaurants on our platform by the end of 2020.

> At the same time, because we know that partnered relationships are critical to the long-term success of this business, we will be investing aggressively in our independent restaurant sales organization to support converting as many of these non-partnered restaurants to partnered relationships as quickly as possible and to take advantage of other innovations in the restaurant space, like virtual restaurants.

https://s2.q4cdn.com/772508021/files/doc_financials/2019/q3/...

New comment by gjreda in "Yelp Is Replacing Restaurants’ Phone Numbers So Grubhub Can Take a Cut"

gjreda — Tue, 06 Aug 2019 23:46:05 +0000

This was GrubHub's original model. Restaurants don't want to pay a flat fee when they don't see enough orders to justify that fee.

[1] https://youtu.be/a2oTps7tKS4?t=893

New comment by gjreda in "National Park Typeface"

gjreda — Tue, 04 Jun 2019 18:38:43 +0000

Having recently road-tripped across the country for a move to SF, I found myself admiring the design of national park signage. It feels very timeless and iconic.

https://i.pinimg.com/originals/13/76/e2/1376e2e5690719701689...

New comment by gjreda in "Uber Revenue Slows as Quarterly Loss Surges to $1.1B"

gjreda — Wed, 14 Nov 2018 22:41:52 +0000

Does this change when the two options are perfect substitutes?

As a rider, there's no "cost" to having both apps, so I'll just check both.

Similarly, as a driver, I'll just sit with both apps open and see which one I get a rider on.

New comment by gjreda in "A cure for cancer: how to kill a killer"

gjreda — Wed, 07 Nov 2018 17:54:51 +0000

> I guess I don't really have a reason to post any of this, other than it helps to get it out there

I know that feeling well. Best of luck to you.

New comment by gjreda in "A Gentle Visual Intro to Data Analysis in Python Using Pandas"

gjreda — Thu, 01 Nov 2018 15:19:51 +0000

I agree. Hadley Wickham (a very prolific author of important R libraries) wrote a great paper about this method using one of his libraries. I'm a Python + pandas user, but his paper really helped me understand the approach better: https://vita.had.co.nz/papers/plyr.pdf

New comment by gjreda in "Ask HN: Who is hiring? (May 2018)"

gjreda — Tue, 01 May 2018 15:59:46 +0000

Sprout Social builds social media management tools for businesses of all sizes. We are built on the idea that the world is better when businesses and customers communicate freely. We exist to help streamline and enhance those conversations — with customers, prospects and enthusiasts.

Some openings:

    * Front-End Engineer (Seattle, WA)
    * Senior Front-End Engineer (Seattle, WA)
    * Senior Software Engineer - Platform (Chicago, IL)
    * Staff Software Engineer - Platform (Chicago, IL)

From an engineering perspective, we do not operate as lone wolves, cowboy coders, or "10x devs." Instead, we're building diverse, collaborative teams that get the best results sustainably. We follow Spotify's engineering model with squads being made up of platform, front-end, QA, design, and product managers, all working together to drive our product initiatives to successful outcomes.

Our platform team uses Java, Python, MySQL, and NSQ, while our front-end team uses React, Redux, Ember, ImmutableJS, and Gulp, all to build highly scalable software that is used by more than 20,000 organizations around the world. Companies like Dropbox, Zendesk, Fender, Brooks Running, Seattle Cancer Care Alliance, and Evernote rely on our products to create stronger relationships with their customers.

If you're a creative, highly motivated, and inquisitive learner, we'd love for you to come build great software with us.

https://sproutsocial.com/careers/

Hiring Data Scientists

gjreda — Tue, 06 Feb 2018 16:29:02 +0000

Article URL: http://gregreda.com/2018/02/04/hiring-data-scientists/

Comments URL: https://news.ycombinator.com/item?id=16317296

Points: 2

# Comments: 0

New comment by gjreda in "Apple, tech companies to bring back $400B in overseas cash to the US"

gjreda — Sat, 06 Jan 2018 14:39:58 +0000

Clickbait post title leaves off that this is an estimate from a research firm. Title makes it seem like a sure thing.