<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: campers</title><link>https://news.ycombinator.com/user?id=campers</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 23 Apr 2026 06:17:33 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=campers" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by campers in "Gemini 3 Flash: Frontier intelligence built for speed"]]></title><description><![CDATA[
<p>I had wondered if they run their inference at high batch sizes to get better throughput and keep their inference costs lower.<p>They do have a priority tier at double the cost, but I haven't seen any benchmarks on how much faster it actually is.<p>The flex tier was an underrated feature of GPT-5: batch pricing with a regular API call. GPT-5.1 on the flex tier is an amazing price/intelligence tradeoff for non-latency-sensitive applications, without the extra plumbing most batch APIs need</p>
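For what it's worth, opting into flex is roughly a one-field change on the request. A minimal sketch, assuming OpenAI's documented `service_tier` request parameter; the model name and helper are just illustrative:

```python
# Build an OpenAI-style chat completion request body that opts into
# the flex service tier (batch-level pricing on a regular API call).
# "service_tier" follows OpenAI's API; the model name is a placeholder.
def build_request(prompt: str, flex: bool = True) -> dict:
    req = {
        "model": "gpt-5.1",
        "messages": [{"role": "user", "content": prompt}],
    }
    if flex:
        # Flex trades latency for cost: requests may queue longer
        # and can be rejected when capacity is tight.
        req["service_tier"] = "flex"
    return req

request = build_request("Summarize this log file.")
```

The nice part is that the response comes back on the same call, so there is no separate batch-file upload and polling loop to maintain.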
]]></description><pubDate>Thu, 18 Dec 2025 06:26:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=46309494</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=46309494</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46309494</guid></item><item><title><![CDATA[New comment by campers in "Imec's superconducting chips to shrink power usage 100x (2024)"]]></title><description><![CDATA[
<p>OpenAI and Nvidia's 10GW datacenter agreement and Sam Altman's new blog post about wanting to build 1GW of infra a week made me think of this article again, so I'm posting it.<p>Here are a couple of interesting paragraphs from it:<p><pre><code>  Instead of the transistor, the basic element in superconducting logic is the Josephson-junction.
  
  For logic, a Josephson-junction loop without a persistent current indicates a logical 0, while a loop with one single flux quantum’s worth of current represents a logical 1. For memory, two Josephson junction loops are connected together. An SFQ’s worth of persistent current in the left loop is a memory 0, and a current in the right loop is a memory 1.
  
  In classical CMOS-based technology, it is very challenging to stack computational chips on top of each other because of the large amount of power, and therefore heat, that is dissipated within the chips. In superconducting technology, the little power that is dissipated is easily removed by the liquid helium. Logic chips can be directly stacked using advanced 3D integration technologies resulting in shorter and faster connections between the chips, and a smaller footprint.
  
   It is also straightforward to stack multiple boards of 3D superconducting chips on top of each other, leaving only a small space between them. We modeled a stack of 100 such boards, all operating within the same cooling environment and contained in a 20- by 20- by 12-centimeter volume, roughly the size of a shoebox. We calculated that this stack can perform 20 exaflops (in BF16 number format), 20 times the capacity of the largest supercomputer today. What’s more, the system promises to consume only 500 kilowatts of total power. This translates to energy efficiency one hundred times as high as the most efficient supercomputer today.</code></pre></p>
]]></description><pubDate>Tue, 23 Sep 2025 16:49:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=45349609</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=45349609</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45349609</guid></item><item><title><![CDATA[Imec's superconducting chips to shrink power usage 100x (2024)]]></title><description><![CDATA[
<p>Article URL: <a href="https://spectrum.ieee.org/superconducting-computer">https://spectrum.ieee.org/superconducting-computer</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45349608">https://news.ycombinator.com/item?id=45349608</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Tue, 23 Sep 2025 16:49:44 +0000</pubDate><link>https://spectrum.ieee.org/superconducting-computer</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=45349608</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45349608</guid></item><item><title><![CDATA[New comment by campers in "Gemini CLI GitHub Actions"]]></title><description><![CDATA[
<p>I added a key rotator to my AI coder, and asked a couple of friends to make keys for me. That helped code a good chunk of <a href="http://typedai.dev" rel="nofollow">http://typedai.dev</a> when 2.5 Pro came out</p>
]]></description><pubDate>Thu, 07 Aug 2025 15:11:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=44825536</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=44825536</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44825536</guid></item><item><title><![CDATA[New comment by campers in "Gemini 2.5 Deep Think"]]></title><description><![CDATA[
<p>Google actually does provide that service! 
<a href="https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/vertex-ai-model-optimizer" rel="nofollow">https://cloud.google.com/vertex-ai/generative-ai/docs/model-...</a></p>
]]></description><pubDate>Sat, 02 Aug 2025 04:24:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=44764893</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=44764893</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44764893</guid></item><item><title><![CDATA[New comment by campers in "Qwen3-Coder: Agentic coding in the world"]]></title><description><![CDATA[
<p>Looking forward to using this on Cerebras!</p>
]]></description><pubDate>Wed, 23 Jul 2025 02:31:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=44655203</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=44655203</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44655203</guid></item><item><title><![CDATA[New comment by campers in "Gemini Diffusion"]]></title><description><![CDATA[
<p>I've been thinking about adding an agent to our Codex/Jules-like platform which goes through the git history of the main files being changed, extracts the Jira ticket IDs, looks through them for additional context, and analyzes the changes to the other files in those commits.</p>
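The ticket-extraction step is the easy part. A minimal sketch, assuming Jira's usual KEY-123 ID format (the helper name is hypothetical; feeding it `git log` output is left out):

```python
import re

# Jira ticket IDs look like PROJ-123: an uppercase project key of
# two or more characters, a hyphen, then digits. This pattern is an
# assumption about the ID format, not something Jira enforces globally.
TICKET_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

def extract_ticket_ids(commit_messages):
    """Collect unique Jira ticket IDs from commit messages, in order seen."""
    seen = []
    for msg in commit_messages:
        for ticket in TICKET_RE.findall(msg):
            if ticket not in seen:
                seen.append(ticket)
    return seen

messages = [
    "PAY-101: fix rounding in invoice totals",
    "Refactor (PAY-101, CORE-7): share currency helpers",
]
print(extract_ticket_ids(messages))  # → ['PAY-101', 'CORE-7']
```

From there each ID can be looked up via the Jira REST API and the ticket description fed into the agent's context alongside the diffs.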
]]></description><pubDate>Thu, 22 May 2025 13:14:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=44061741</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=44061741</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44061741</guid></item><item><title><![CDATA[New comment by campers in "Thoughts on thinking"]]></title><description><![CDATA[
<p>There is a huge focus on training LLMs to reason, and that ability will slowly (or not so slowly, depending on your timeframe!) but surely improve, given the gargantuan amount of money and talent being thrown at the problem. To what level, we'll have to wait and see.</p>
]]></description><pubDate>Sat, 17 May 2025 06:46:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=44012486</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=44012486</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44012486</guid></item><item><title><![CDATA[New comment by campers in "A 10x Faster TypeScript"]]></title><description><![CDATA[
<p>There isn't a TypeScript runtime; it is just a JavaScript/ECMAScript compiler/transpiler with a type checker and language server</p>
]]></description><pubDate>Wed, 12 Mar 2025 05:39:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=43340259</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=43340259</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43340259</guid></item><item><title><![CDATA[New comment by campers in "GPT-4.5"]]></title><description><![CDATA[
<p>The price will come down over time as they apply all the techniques to distill it into a smaller-parameter model, just like GPT-4 pricing came down significantly over time.</p>
]]></description><pubDate>Fri, 28 Feb 2025 02:13:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=43200915</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=43200915</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43200915</guid></item><item><title><![CDATA[New comment by campers in "Show HN: Mastra – Open-source JS agent framework, by the developers of Gatsby"]]></title><description><![CDATA[
<p>I got the same feeling when I first looked at the LangChain documentation as I was starting to tinker with LLM apps.<p>I built my own TypeScript AI platform <a href="https://typedai.dev" rel="nofollow">https://typedai.dev</a> with an extensive feature list, where I've kept iterating on what I find the most ergonomic way to develop, using standard constructs as much as possible.  I've coded enough Java streams, RxJS chains, and JavaScript callbacks and Promise chains to know what kind of code I like to read and debug.<p>I was having a peek at xstate, but after I came across <a href="https://docs.dbos.dev/" rel="nofollow">https://docs.dbos.dev/</a> here recently I'm pretty sure that's the path I'll go down for durable execution, to keep building everything with a simple programming model.</p>
]]></description><pubDate>Thu, 20 Feb 2025 13:25:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=43114354</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=43114354</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43114354</guid></item><item><title><![CDATA[New comment by campers in "Show HN: Mastra – Open-source JS agent framework, by the developers of Gatsby"]]></title><description><![CDATA[
<p><a href="https://typedai.dev" rel="nofollow">https://typedai.dev</a> is another full-featured one I've built, with a web UI, multi-user support, code-editing agents, and a CodeAct autonomous agent</p>
]]></description><pubDate>Thu, 20 Feb 2025 13:02:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=43114157</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=43114157</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43114157</guid></item><item><title><![CDATA[ScyllaDB – Why We're Moving to a Source Available License]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/">https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42457680">https://news.ycombinator.com/item?id=42457680</a></p>
<p>Points: 82</p>
<p># Comments: 30</p>
]]></description><pubDate>Thu, 19 Dec 2024 02:15:51 +0000</pubDate><link>https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=42457680</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42457680</guid></item><item><title><![CDATA[New comment by campers in "Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference"]]></title><description><![CDATA[
<p><a href="https://web.archive.org/web/20230812020202/https://www.youtube.com/watch?v=pzyZpauU3Ig" rel="nofollow">https://web.archive.org/web/20230812020202/https://www.youtu...</a></p>
]]></description><pubDate>Tue, 19 Nov 2024 08:21:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=42181092</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=42181092</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42181092</guid></item><item><title><![CDATA[New comment by campers in "FrontierMath: A benchmark for evaluating advanced mathematical reasoning in AI"]]></title><description><![CDATA[
<p>If an AI achieved 100% on this benchmark it would indicate super-intelligence in the field of mathematics, but depending on what else it could do, it might still fall short of general intelligence across all domains.</p>
]]></description><pubDate>Sun, 10 Nov 2024 06:29:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=42098796</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=42098796</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42098796</guid></item><item><title><![CDATA[New comment by campers in "Cerebras Trains Llama Models to Leap over GPUs"]]></title><description><![CDATA[
<p>On Google Cloud a server with 8 TPU v5e chips will do 2,175 tokens/second on Llama2 70B.<p><a href="https://cloud.google.com/blog/products/compute/updates-to-ai-hypercomputer-software-stack/" rel="nofollow">https://cloud.google.com/blog/products/compute/updates-to-ai...</a><p>From <a href="https://cloud.google.com/tpu/pricing" rel="nofollow">https://cloud.google.com/tpu/pricing</a> and <a href="https://cloud.google.com/vertex-ai/pricing#prediction-prices" rel="nofollow">https://cloud.google.com/vertex-ai/pricing#prediction-prices</a> (search for ct5lp-hightpu-8t on the page) the cost for that appears to be $11.04/hr, which is just under $100k for a year, or half that on a 3-year commit.<p>That seems like a better deal than millions for a few CS-3 nodes.<p>And they've just announced the v6 TPU:<p><pre><code>  Compared to TPU v5e, Trillium delivers: 
  Over 4x improvement in training performance 
  Up to 3x increase in inference throughput 
  A 67% increase in energy efficiency
  An impressive 4.7x increase in peak compute performance per chip 
  Double the High Bandwidth Memory (HBM) capacity 
  Double the Interchip Interconnect (ICI) bandwidth 
</code></pre>
<a href="https://cloud.google.com/blog/products/compute/trillium-sixth-generation-tpu-is-in-preview/" rel="nofollow">https://cloud.google.com/blog/products/compute/trillium-sixt...</a></p>
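The cost arithmetic above checks out, using only the figures quoted in this comment (and assuming full utilization, ignoring committed-use discounts):

```python
# Back-of-the-envelope check: ct5lp-hightpu-8t at $11.04/hr,
# serving Llama2 70B at 2,175 tokens/s (figures from the links above).
hourly_rate = 11.04          # USD per hour
tokens_per_second = 2175

yearly_cost = hourly_rate * 24 * 365
tokens_per_hour = tokens_per_second * 3600
cost_per_million_tokens = hourly_rate / (tokens_per_hour / 1e6)

print(f"yearly: ${yearly_cost:,.2f}")                    # → yearly: $96,710.40
print(f"per 1M tokens: ${cost_per_million_tokens:.2f}")  # → per 1M tokens: $1.41
```

So "just under $100k for a year" is right, and it works out to roughly $1.41 per million output tokens at full load.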
]]></description><pubDate>Thu, 31 Oct 2024 07:23:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=42004229</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=42004229</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42004229</guid></item><item><title><![CDATA[New comment by campers in "Cerebras Inference now 3x faster: Llama3.1-70B breaks 2,100 tokens/s"]]></title><description><![CDATA[
<p><pre><code>  The first implementation of inference on the Wafer Scale Engine utilized only a fraction of its peak bandwidth, compute, and IO capacity. Today’s release is the culmination of numerous software, hardware, and ML improvements we made to our stack to greatly improve the utilization and real-world performance of Cerebras Inference.
 
  We’ve re-written or optimized the most critical kernels such as MatMul, reduce/broadcast, element wise ops, and activations. Wafer IO has been streamlined to run asynchronously from compute. This release also implements speculative decoding, a widely used technique that uses a small model and large model in tandem to generate answers faster.</code></pre></p>
]]></description><pubDate>Fri, 25 Oct 2024 06:43:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=41942822</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=41942822</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41942822</guid></item><item><title><![CDATA[Cerebras Inference now 3x faster: Llama3.1-70B breaks 2,100 tokens/s]]></title><description><![CDATA[
<p>Article URL: <a href="https://cerebras.ai/blog/cerebras-inference-3x-faster">https://cerebras.ai/blog/cerebras-inference-3x-faster</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41941883">https://news.ycombinator.com/item?id=41941883</a></p>
<p>Points: 147</p>
<p># Comments: 84</p>
]]></description><pubDate>Fri, 25 Oct 2024 03:04:52 +0000</pubDate><link>https://cerebras.ai/blog/cerebras-inference-3x-faster</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=41941883</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41941883</guid></item><item><title><![CDATA[New comment by campers in "Computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku"]]></title><description><![CDATA[
<p>A bit like the new Gemini Pro 1.5-002 release.</p>
]]></description><pubDate>Wed, 23 Oct 2024 02:27:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=41921155</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=41921155</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41921155</guid></item><item><title><![CDATA[New comment by campers in "Differential Transformer"]]></title><description><![CDATA[
<p>The tl;dr on high level performance improvements<p>"The scaling curves indicate that Diff Transformer requires only about 65% of model size or training tokens needed by Transformer to achieve comparable language modeling performance."<p>"Diff Transformer retains high performance even at reduced bit-widths, ranging from 16 bits to 6 bits. In comparison, Transformer’s accuracy significantly drops with 6-bit quantization. The 4-bit Diff Transformer achieves comparable accuracy as the 6-bit Transformer, and outperforms the 4-bit Transformer by about 25% in accuracy."</p>
]]></description><pubDate>Tue, 08 Oct 2024 13:08:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=41776962</link><dc:creator>campers</dc:creator><comments>https://news.ycombinator.com/item?id=41776962</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41776962</guid></item></channel></rss>