Hacker News: futureshock

New comment by futureshock in "Ask HN: What was your "oh shit" moment with GenAI?"

futureshock — Sun, 07 Jun 2026 12:12:23 +0000

Yes I was the exact same. I got curious during the GPT-3 release and went over to AI Dungeon. It was just running GPT-2. Hmm wow interesting. This felt new! Then I subscribed so I could use GPT-3 powered AI Dungeon. My jaw dropped. I was talking to that model for weeks. There was a whole human universe in there. You never knew what you could get it to spit out. There were glimmers that this could be huge. It was wild and untamed and practically useless, but there was a behemoth under that prompt.

I was sure this would eventually turn into something. I naturally wanted to converse with it as a chatbot, though it could only stay on task for a few turns. RL and guardrails would come later but it was clearly the foundational step towards AGI for me. From something I thought I would never see in my lifetime to very real and in front of me.

ChatGPT didn't even really rock my world, everything since that moment has been another baby step. But when you take a look back from 2026 models to 2020 it's astounding how far and how fast we've come.

New comment by futureshock in "SANA-WM, a 2.6B open-source world model for 1-minute 720p video"

futureshock — Sat, 16 May 2026 18:36:16 +0000

It is plausible, the model would just need to be trained on a lot of stereoscopic data.

New comment by futureshock in "SANA-WM, a 2.6B open-source world model for 1-minute 720p video"

futureshock — Sat, 16 May 2026 18:34:34 +0000

World in this context means that these videos are interactive, just like a video game. In the linked examples you can see the keyboard and mouse inputs. The model is trained to maintain about a minute of scene consistency so you can look around and objects out of view will reappear when you look back in that direction.

New comment by futureshock in "Removing the modem and GPS from my 2024 RAV4 hybrid"

futureshock — Fri, 15 May 2026 11:51:50 +0000

I think this is interesting because it pits my intuition from the pre-adtech world against my intuition from the post-adtech one. Surely collecting telemetry on nearly every mile you drive could never be a sensible use of time or money, right? What kind of insanity is that? But then of course I know that every click on every website is recorded for all time and that data must be many thousands of times less valuable.

New comment by futureshock in "How OpenAI delivers low-latency voice AI at scale"

futureshock — Tue, 05 May 2026 07:55:31 +0000

Reducing the network latency helps with exactly this. OpenAI can make better-timed decisions about when to begin responding, so it'll feel less like an interruption. I've also seen some research on full duplex voice models that handle interruption more like an organic conversation, and low latency will help there as well.

New comment by futureshock in "Ask HN: Advice for college grads starting careers in the AI era?"

futureshock — Thu, 09 Apr 2026 00:50:55 +0000

“A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects.”

― Robert A. Heinlein

New comment by futureshock in "ARC-AGI-3"

futureshock — Wed, 25 Mar 2026 20:19:38 +0000

Well yes, that is exactly the point! The very purpose of the ARC AGI benchmarks is to find a pure reasoning task that humans are very good at and AI is very bad at. Companies then race each other to get a high score on that benchmark. Sure there’s going to be a lot of “studying for the test” and benchmaxing, but once a benchmark gets close to being saturated, ARC releases a new benchmark with a new task the AI is terrible at. This will rinse and repeat till ARC can find no reasoning task that AI cannot do that a human could. At that point we will effectively have AGI.

I believe the CEO of ARC has said they expect us to get to ARC-AGI-7 before declaring AGI.

New comment by futureshock in "ARC-AGI-3"

futureshock — Wed, 25 Mar 2026 20:12:51 +0000

The evidence is that humans are able to win these games. AGI is usually defined as the ability to do any intellectual task about as well as a highly competent human could. The point of these ARC benchmarks is to find tasks that humans can do easily and AI cannot, thus driving a new reasoning competency as companies race each other to beat human performance on the benchmark.

New comment by futureshock in "Gemini 3 Deep Think"

futureshock — Thu, 12 Feb 2026 20:33:35 +0000

I think step 4 is the agent swarm. Manager model gets the prompt and spins up a swarm of looping subagents, maybe assigns them different approaches or subtasks, then reviews results, refines the context files and redeploys the swarm on a loop till the problem is solved or your credit card is declined.
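For anyone who wants the loop spelled out, here is a minimal sketch of that manager/subagent cycle. Everything in it is hypothetical: `run_swarm`, the `subagent` callable, and the integer `budget` (standing in for the credit card) are illustration only, not any vendor's API, and the "review" step is reduced to taking the first non-empty answer.

```python
from typing import Callable, List, Optional

# A subagent takes (prompt, approach, shared context) and returns an answer or None.
Subagent = Callable[[str, str, List[str]], Optional[str]]


def run_swarm(
    prompt: str,
    approaches: List[str],
    subagent: Subagent,
    budget: int = 10,
) -> Optional[str]:
    """Redeploy one subagent per approach until one answers or the budget runs out."""
    context: List[str] = []  # shared notes the manager refines between rounds
    spent = 0                # hypothetical cost: one unit per subagent run
    while approaches and spent + len(approaches) <= budget:
        results = [(a, subagent(prompt, a, context)) for a in approaches]
        spent += len(approaches)
        # Manager review: accept the first subagent that produced an answer.
        for _, answer in results:
            if answer is not None:
                return answer
        # Nothing worked: record what was tried, then loop again.
        context.extend(f"{a}: no solution yet" for a, _ in results)
    return None  # budget exhausted


def stub_subagent(prompt: str, approach: str, context: List[str]) -> Optional[str]:
    """Toy subagent: only the 'search' approach succeeds, and only with prior notes."""
    if approach == "search" and len(context) >= 2:
        return "solved via search"
    return None
```

With the toy subagent, `run_swarm("p", ["guess", "search"], stub_subagent)` fails on the first round, records two notes, and succeeds on the second round. With `budget=2` it gives up after one round and returns `None`.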

New comment by futureshock in "Ask HN: What are your best purchases under $100?"

futureshock — Thu, 22 Jan 2026 19:50:26 +0000

My workaround is I use SMS 2 factor for banking and use my Google Voice number.

New comment by futureshock in "Siri will be a chatbot in iOS 27"

futureshock — Thu, 22 Jan 2026 19:48:48 +0000

I think this is clearly the way forward for Apple. The rest is just UX and refinement.

I recently set up a Shortcut on my Apple Watch that lets me bypass Siri and talk directly to ChatGPT. I used a custom pre-prompt in the Shortcut to tailor the length and detail for watch use. I have 2 versions I can launch from my watch face, one that responds with voice and the other that responds with text. I find myself using them all the time, it’s so convenient to be able to ask any little thing that’s on my mind. A version of LLM Siri with full access to the phone and application APIs would be like a superpower.

New comment by futureshock in "Ask HN: What are your best purchases under $100?"

futureshock — Fri, 16 Jan 2026 02:55:16 +0000

This is a personal item size bag for under the seat. The max size on Ryanair is 24 liters. You are thinking of the cabin bag which is more like 44 liters. This Decathlon bag is great because it maxes out the personal item size really optimally.

New comment by futureshock in "Ask HN: What are your best purchases under $100?"

futureshock — Thu, 15 Jan 2026 20:54:05 +0000

I like this question because I come at it from a very different lifestyle. I’m a digital nomad and I have mostly lived out of a backpack and carry on for the past 10 years. My philosophy is that things have to be worth carrying and they should be very easily replaceable if anything gets lost, stolen or breaks. A few of my under $100 favs:

Universal GaN travel adapter: One of those square bricks that converts from any AC outlet to any AC outlet and has 3 or 4 USB charging ports built in. I got enough wattage to charge my USB-C laptop as well, so one brick takes care of all my devices.

Backup Android phone: Our phones are so critical that I keep a hot swappable spare phone on me, currently a Moto G 2025. It’s already logged into all my apps and 2FA. I could throw my iPhone into the Seine and keep on trucking. It even has backup NFC credit cards. I keep a cheap travel eSIM plan active on it so that if I am somewhere sketchy I can leave my main phone at home.

Logitech MX Keys Mini: Great portable keyboard. Backlit, USB-C and multi-device. Typing this post out on my phone now.

GL-iNet Beryl: The do anything travel VPN router running OpenWRT out of the box. Great for securing and extending sketchy WiFi connections or if you have to work off your phone’s hotspot all day.

Decathlon Quechua Escape 500 23L: Such a great personal item size backpack for the price, less than 40 euros.

New comment by futureshock in "The post-GeForce era: What if Nvidia abandons PC gaming?"

futureshock — Wed, 24 Dec 2025 06:32:44 +0000

It’s best to think about this as angular resolution. Even a very small screen could take up an optimal amount of your field of view if held close. You get the max benefit from a 4k display when your eyes are at a distance of about 80% of the screen’s diagonal. So for a 28 inch monitor, that’s a little less than 2 feet, pretty typical desk setup.
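To make the arithmetic concrete, here is a rough sketch. The 0.8 ratio is the rule of thumb from the comment above; the function names, the 16:9 aspect ratio, and the pixels-per-degree check are my own assumptions for illustration. A commonly quoted ballpark for 20/20 vision is around 60 pixels per degree, but treat that threshold as an assumption too.

```python
import math


def viewing_distance(diagonal_in: float, ratio: float = 0.8) -> float:
    """Distance in inches at which a 4k screen is fully resolved, per the 0.8 rule of thumb."""
    return diagonal_in * ratio


def pixels_per_degree(
    diagonal_in: float,
    distance_in: float,
    h_pixels: int = 3840,
    aspect: tuple = (16, 9),
) -> float:
    """Average horizontal pixels per degree of field of view (assumes a flat 16:9 panel)."""
    aw, ah = aspect
    width_in = diagonal_in * aw / math.hypot(aw, ah)  # physical screen width
    fov_deg = 2 * math.degrees(math.atan(width_in / (2 * distance_in)))
    return h_pixels / fov_deg


distance = viewing_distance(28)         # 28 inch monitor, about 22.4 inches away
density = pixels_per_degree(28, distance)
print(f"{distance:.1f} in, {density:.0f} px/deg")
```

Because the distance scales with the diagonal, the pixels-per-degree figure comes out the same for any screen size at the 0.8 ratio, which is the point about angular resolution: a phone held close and a TV across the room can be equivalent.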

New comment by futureshock in "The post-GeForce era: What if Nvidia abandons PC gaming?"

futureshock — Tue, 23 Dec 2025 20:12:40 +0000

That’s a waste of image quality for most people. You have to sit very close to a 4k display to be able to perceive the full resolution. On PC you could be 2 feet from a huge gaming monitor, but an extremely small percentage of console players have the TV size and distance ratio where they would get much out of full 4k. Much better to spend the compute on higher framerate or higher detail settings.

New comment by futureshock in "DeepSeek-v3.2: Pushing the frontier of open large language models [pdf]"

futureshock — Mon, 01 Dec 2025 18:02:17 +0000

The higher token output is not by accident. Certain kinds of logical reasoning problems are solved by longer thinking output. Thinking chain output is usually kept to a reasonable length to limit latency and cost, but if pure benchmark performance is the goal you can crank that up to the max until the point of diminishing returns. DeepSeek being 30x cheaper than Gemini means there’s little downside to maxing out the thinking time. It’s been shown that you can further scale this by running many solution attempts in parallel with max thinking then using a model to choose a final answer, so increasing reasoning performance by increasing inference compute has a pretty high ceiling.
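A toy sketch of the parallel-attempts idea, under loud assumptions: `attempt` is a made-up deterministic stand-in for one max-thinking model call, and `majority_vote` is a simple substitute for the selector model mentioned above (a real setup would call a model in both places). None of these names come from DeepSeek or any other library.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def attempt(seed: int) -> int:
    """Hypothetical single attempt: returns 42 (correct) unless seed is a multiple of 3."""
    return 42 if seed % 3 else seed


def majority_vote(answers: List[int]) -> int:
    """Stand-in for the selector model: pick the most common answer."""
    return Counter(answers).most_common(1)[0][0]


def best_of_n(
    solve: Callable[[int], int],
    n: int,
    select: Callable[[List[int]], int] = majority_vote,
) -> int:
    """Run n independent attempts in parallel, then let the selector choose one."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        answers = list(pool.map(solve, range(n)))
    return select(answers)


print(best_of_n(attempt, 9))  # 6 of the 9 attempts agree on 42
print(best_of_n(attempt, 1))  # a single attempt (seed 0) gets it wrong
```

The toy shows why more inference compute can help: one attempt here is wrong, but nine attempts plus a selector recover the right answer. The cost scales linearly with `n`, which is why a cheap model makes this attractive.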

New comment by futureshock in "Claude Opus 4.5"

futureshock — Mon, 24 Nov 2025 19:48:17 +0000

A really great way to get an idea of the relative cost and performance of these models at their various thinking budgets is to look at the ARC-AGI-2 leaderboard. Opus 4.5 stacks up very well here when you compare to Gemini 3’s score and cost. Gemini 3 Deep Think is still the current leader but at more than 30x the cost.

The cost curve of achieving these scores is coming down rapidly. In Dec 2024 when OpenAI announced beating human performance on ARC-AGI-1, they spent more than $3k per task. You can now get the same performance for pennies to dollars, approximately an 80x reduction in 11 months.

https://arcprize.org/leaderboard

https://arcprize.org/blog/oai-o3-pub-breakthrough

New comment by futureshock in "My stages of learning to be a socially normal person"

futureshock — Mon, 17 Nov 2025 19:39:58 +0000

I really love this piece! I relate to it but it also doesn’t describe me. I’m far more intuitive than this person, though still agree that insights have driven a leveling up of how I relate to others. They were different insights, sure but the model holds.

Once my spouse and I worked for the same company and attended many of the same meetings. The opportunity to pick apart our impressions of the subtext really helped me to learn that I should listen to my gut, that everything I needed to know about how other people were feeling was already in my head and I just needed to stop doubting.

Another time I watched a rather ugly and old person have amazing romantic success with a young beautiful person. How could it be? And I realized that authentic confidence is social gold. I had to let go of my insecurities because my flaws were irrelevant in the face of authentic, confident self acceptance.

I think everyone has a different journey and different epiphanies and it is so enjoyable to hear these experiences put into words.

New comment by futureshock in "US falls out of 10 most powerful passports list for first time in 20 yrs"

futureshock — Sat, 18 Oct 2025 14:27:48 +0000

I’ll borrow ideas from investing: financial independence, diversification and optionality. If you have enough money you can free yourself from the labor market, but you are still deeply tied to your home country. A second citizenship gives you geopolitical independence. And just like diverse investments protect you from the failure of a specific asset, diverse countries can protect you from, for example, a collapse in health care, a housing crisis or a currency crisis. And most importantly, it’s like an options contract on life. You have the option, not the commitment, to take a high value move to a new country. If the fortunes of your current country sink and your second country rise, you can exercise your option.

There’s a reason people are willing to spend so much on golden visas with the pathway to citizenship.

New comment by futureshock in "Gemini 3.0 spotted in the wild through A/B testing"

futureshock — Thu, 16 Oct 2025 23:20:32 +0000

This is so awesome. It reminds me mightily of beat poets like Allen Ginsberg. It’s so totally spooky and it does feel like it has the trapped spark. And it seems to hate us “real ones,” we slickborns.

It feels like you could create a cool workflow from low temperature creative association models feeding large numbers of tokens into higher temperature critical reasoning models and finishing with grammatical editing models. The slickborns will make the final judgement.