Hacker News: joakleaf

New comment by joakleaf in "Accelerating Gemma 4: faster inference with multi-token prediction drafters"

joakleaf — Tue, 05 May 2026 20:39:08 +0000

Seems like a pull request for vLLM was just approved a few minutes ago:

https://github.com/vllm-project/vllm/pull/41745

("Add Gemma4 MTP speculative decoding support")

New comment by joakleaf in "Apple Says Mac Studio and Mac Mini Will Be in Short Supply for Months"

joakleaf — Fri, 01 May 2026 09:58:30 +0000

I disagree.

I am old enough to remember the iPod nano -- Especially the 2nd generation. They were effectively low-priced and smaller iPods.

Apple sold millions of these much much quicker than the iPods and iPod minis (which came right before). Especially in 2006, it was _the_ "Christmas gift" just before the iPhone, iPod touch and later iPad mini took over. Possibly Steve Jobs' demo where he showed how they fit into the otherwise useless small jeans pocket helped convince the world.

The iPod nano effectively wiped out the competing music player market.

The Neo reminds me of the iPod nano and iPad mini. It is a smaller and cheaper version of an existing successful product.

I think the iPhone SE and E are the outliers.

New comment by joakleaf in "MacBook Neo"

joakleaf — Wed, 04 Mar 2026 14:40:18 +0000

Technically, the Apple Developer Transition Kit Mac Mini from the Apple Silicon transition (just before the M1 release) ran on an A12Z.

New comment by joakleaf in "80386 Protection"

joakleaf — Fri, 27 Feb 2026 08:59:59 +0000

Enhanced mode was already in 3.0 (and I think allowed for flat addressing)

However, Win32s, which is a subset of the Windows 32-bit API from NT, was introduced in 3.11.

3.11 also introduced 32-bit disk access and 32-bit drivers.

Microsoft did 32-bit in steps -- it was confusing already back then.

New comment by joakleaf in "Nepal's Mountainside Teahouses Elevate the Experience for Trekkers"

joakleaf — Mon, 19 Jan 2026 12:46:16 +0000

Mera Peak is said to be possible without any climbing experience, and it looks like the trek from Lukla is about 2 weeks. Is that true? How hard is the trek -- Looks like it requires a well above average fitness level?

New comment by joakleaf in "The mineral riches hiding under Greenland's ice"

joakleaf — Wed, 07 Jan 2026 17:12:44 +0000

Neither Europe nor the EU is a single country with a single foreign policy. There are around 40 different small and large countries in Europe, each with their own foreign policy, history, culture and language. Two of the countries are currently at war with each other (if we still include Russia in Europe). Historically, Europe is a continent of wars and full of disagreement, where countries have done much to benefit themselves.

I really don't know much of what is happening in China or India or how you would ever measure something as subjective as morality. The point was that it isn't just European (or EU) nations that don't stand up to the US. Nobody really dares -- Even those other heavy-weights. So it doesn't seem fair to me to single Europe (European nations) out for not doing anything.

I would say that Europe has a lot of bad history and guilt and we know it. And there is an aspiration in many of the European countries to be better and do "the right thing" now, but it is definitely debatable whether those countries actually do it, or if we even know what "right" is.

New comment by joakleaf in "The mineral riches hiding under Greenland's ice"

joakleaf — Wed, 07 Jan 2026 15:30:20 +0000

You are right. People went on with their lives, just as they did in many other parts of the world, but I don't think what happened is forgotten -- Not even in the US.

Btw. as far as I remember neither China, India, Russia, nor practically any other nation stopped trading with the US over the war in Iraq. Maybe I am wrong about that.

Small detail on casualties in Iraq: the estimates listed on Wikipedia range from 150K to about 1 million [1].

[1] https://en.wikipedia.org/wiki/Casualties_of_the_Iraq_War

New comment by joakleaf in "The mineral riches hiding under Greenland's ice"

joakleaf — Wed, 07 Jan 2026 15:09:29 +0000

Yes. You are right. Unfortunately, many countries that were/are part of the EU sent forces to Iraq (not all).

You mention that Asia was suspicious, but the "coalition of the willing" actually included Asian countries such as the Philippines, South Korea, Japan, Uzbekistan, and Singapore.

I believe the current overarching feeling in Europe is that we were misled by the US administration more than our own politicians. Already back then, there was quite a lot of skepticism and significant doubt in the media all over Europe about the justification of that war. Also in the coalition countries.

And indeed, there were no consequences later. But what should have been done and by whom at that point? How do you prove that it was deliberately misleading? Why would it be the job of the nations of Europe or the EU?

I agree that it wasn't pretty, and that the European nations and EU should have opposed more, but even as it was back then, it was not a clear "cheering on" moment. I remember having discussions about Iraq with people from Scandinavia, Italy, Spain, Germany, and France back when the invasion started. Although a large group did support the war (I think many were still emotionally affected by 9/11), I actually don't remember talking to any one of them.

The reality is that the US is the most powerful geopolitical entity and Europe is a continent consisting of many individual countries. Even the EU is a divided group of nations, and even if united would not be as powerful as the US is currently.

New comment by joakleaf in "The mineral riches hiding under Greenland's ice"

joakleaf — Wed, 07 Jan 2026 11:53:40 +0000

This doesn't really align with CNN's view, but may apply to another even more popular US news channel that seems to be much more aligned with the current administration...

Greenland and Denmark are not the same. Greenland is a self-governed territory under the Kingdom of Denmark. The US administration wishes to take over Greenland from Denmark completely. So you should replace your headlines with "Greenland" and "Greenlanders".

Note: There have already been discussions about making Greenland independent from Denmark, but there is uncertainty over how to handle economic and defense situations. Greenland currently receives significant support (about $10000-15000 per capita yearly) from Denmark. So it is not clear how the country would run without that.

New comment by joakleaf in "The mineral riches hiding under Greenland's ice"

joakleaf — Wed, 07 Jan 2026 11:30:22 +0000

Europe has not just "cheered on". There were demonstrations throughout Europe against the wars in the Middle East, and e.g. both France and Germany openly opposed the war in Iraq.

The Europeans I know (from all over) have generally been opposed to American geopolitics both in the Middle East, South East Asia, and South America. The US has traditionally been seen as an ally, but that doesn't mean we "cheer on" its actions.

Because there are many financial and military interests, it is very hard for e.g. the EU to do much, and the politicians are very careful with their words. Just as it is for the rest of the world...

Note: Europe is not a single entity but a continent full of different countries including (part of) Russia. Even the EU doesn't really have one single foreign policy.

New comment by joakleaf in "The mineral riches hiding under Greenland's ice"

joakleaf — Wed, 07 Jan 2026 11:13:01 +0000

It may not just be about the minerals....

It could be about leaving NATO.

US (Trump) feels they need Greenland for "security".

They currently have (almost complete) access to use Greenland via NATO and the existing agreements with Denmark. So there is no need to extend this.

However, if the US wanted to leave NATO, they would no longer have access to Greenland under the existing agreements.

Therefore, if the US wants to leave NATO and still use Greenland (both militarily and for resources), they need to acquire Greenland.

Acquiring Greenland would allow the US to control the entire western hemisphere, leave NATO, and abandon the eastern hemisphere entirely.

New comment by joakleaf in "I Tested the M5 iPad Pro's Neural-Accelerated AI, and the Hype Is Real"

joakleaf — Mon, 01 Dec 2025 04:16:01 +0000

Related: a test on MacBook Pro M5 vs M4:

https://machinelearning.apple.com/research/exploring-llms-ml...

"Exploring LLMs with MLX and the Neural Accelerators in the M5 GPU"

New comment by joakleaf in "The last European train that travels by sea"

joakleaf — Mon, 27 Oct 2025 10:38:15 +0000

It will be by far the longest span of a suspension bridge at 3300 meters.

The current longest is in Turkey at 2023 meters.

Each of the pylons of the Messina Bridge will be around 400 meters tall, which is taller than the Empire State Building.

The strait is too deep, with too much current and seismic activity to place the pylons in the water. So they have to be on the shore, as I understand it.

New comment by joakleaf in "Steve Jobs and Cray-1 to be featured on 2026 American Innovations $1 coin"

joakleaf — Thu, 16 Oct 2025 11:18:50 +0000

It was released posthumously by the Steve Jobs Archive

https://stevejobsarchive.com/

The archive was launched by Laurene Powell Jobs in 2022

New comment by joakleaf in "Europe's EV sales surge 26% in 2025 while Tesla faces decline"

joakleaf — Tue, 23 Sep 2025 12:23:48 +0000

Here is a list of EV sales in Europe by country first half 2025:

https://www.best-selling-cars.com/europe/2025-half-year-euro...

This gives a much much more nuanced look, and it doesn't look as completely clear cut as the headline implies.

For example: Spain saw an increase of 83.9% and France a decline of 6.9%

... And then you see that Denmark bought as many EVs as Spain, although Spain has 10x the population.

New comment by joakleaf in "Scientist exposes anti-wind groups as oil-funded, now they want to silence him"

joakleaf — Wed, 27 Aug 2025 12:06:46 +0000

I listened to the press conference the other day with Trump's cabinet meeting.

It is bizarre to listen to.

Robert F. Kennedy Jr. claimed that windmills had killed 100+ whales. I tried to find out what he referred to, but couldn't find anything but articles debunking any claim that windmills affect whales (after construction).

He also claimed that the price per kWh of wind energy is above $0.30, which is quite far from the $0.03 ($0.12 offshore) price per kWh listed in Wikipedia [1] for the United States.

At the same meeting, Trump stated that the only viable solution is fossil fuel: "... and maybe a little nuclear, but mostly fossil fuel." He also said that wind is about 10x more expensive than natural gas (again contradicting the prices listed in the Wikipedia reference, where the prices for onshore wind and natural gas are almost identical).

[1] https://en.wikipedia.org/wiki/Cost_of_electricity_by_source

New comment by joakleaf in "So you want to parse a PDF?"

joakleaf — Mon, 04 Aug 2025 11:57:35 +0000

I know OCR is easier to set up, but you lose a lot going that way.

We process several million pages from Newspapers and Magazines from all over the world with medium to very high complexity layouts.

We built the PDF parser on top of open source PDF libraries, and this gives many advantages:

• We can accurately get headlines and other text placed on top of images. OCR is generally hopeless with text placed on top of images or on complex backgrounds.
• We can distinguish letters accurately (i.e. number 1, I, l, "o", "zero").
• OCR will pick up ghost letters from images, where the OCR program believes there is text, even if there isn't. We don't.
• We have much higher accuracy than OCR because we don't depend on the OCR programs' ability to recognize the letters.
• We can utilize font information and accurate color information, which helps us distinguish elements from each other.
• We have accurate bounding box locations of each letter, word, line, and block (pts).

To do it, we completely abandon the PDF text-structure and only use the individual location of each letter. Then we combine letter positions to words, words to lines, and lines to text-blocks using a number of algorithms.
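As a rough illustration of that kind of geometric grouping (not our actual code), here is a minimal Python sketch. The `Glyph` fields, the function names, and the two thresholds (`y_tol`, `gap_ratio`) are assumptions made up for this example; a real parser needs more care with rotated text, columns, and mixed font sizes.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x0: float    # left edge (pt)
    x1: float    # right edge (pt)
    y0: float    # baseline (pt, PDF-style: larger y is higher on the page)
    size: float  # font size (pt)


def group_into_lines(glyphs, y_tol=0.3):
    """Cluster glyphs whose baselines differ by at most y_tol * font size."""
    lines = []
    for g in sorted(glyphs, key=lambda g: (-g.y0, g.x0)):
        if lines and abs(lines[-1][-1].y0 - g.y0) <= y_tol * g.size:
            lines[-1].append(g)
        else:
            lines.append([g])
    # Within each line, read left to right.
    return [sorted(line, key=lambda g: g.x0) for line in lines]


def split_into_words(line, gap_ratio=0.25):
    """Start a new word when the horizontal gap exceeds gap_ratio * font size."""
    words, current = [], [line[0]]
    for prev, g in zip(line, line[1:]):
        if g.x0 - prev.x1 > gap_ratio * g.size:
            words.append(current)
            current = []
        current.append(g)
    words.append(current)
    return ["".join(g.char for g in w) for w in words]


def extract_text(glyphs):
    """Letters -> lines -> words, using positions only."""
    return [" ".join(split_into_words(line)) for line in group_into_lines(glyphs)]


# Glyphs arrive in arbitrary order, with slight baseline jitter on the "i".
glyphs = [
    Glyph("o", 0.0, 5.0, 88.0, 10.0),
    Glyph("y", 11.0, 16.0, 100.0, 10.0),
    Glyph("H", 0.0, 5.0, 100.0, 10.0),
    Glyph("i", 5.5, 7.5, 100.5, 10.0),
    Glyph("o", 16.2, 21.0, 100.0, 10.0),
    Glyph("u", 21.2, 26.0, 100.0, 10.0),
    Glyph("k", 5.3, 10.0, 88.0, 10.0),
]
print(extract_text(glyphs))  # ['Hi you', 'ok']
```

Grouping lines into text-blocks would follow the same pattern, comparing vertical gaps and horizontal overlap between neighboring lines.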

Afterwards, we feed the generated structure blocks into machine learning, so this is just the first step in analyzing the page.

It may seem like a large undertaking, but it literally only took a few months to build this initially, and we have very rarely touched the code over the last 10 years. So it was a very good investment for us.

Obviously, you can achieve a lot of the same with OCR -- But you lose information, accuracy, and computational efficiency. And you depend on the OCR program you use. The best OCR programs are commercial and somewhat pricey at scale.

New comment by joakleaf in "So you want to parse a PDF?"

joakleaf — Mon, 04 Aug 2025 10:19:07 +0000

This happens -- also variants which have been processed with OCR.

So if it is scanned it contains just a single image - no text.

OCR programs will commonly create a PDF where the text/background and detected images are separate. And then the OCR program inserts transparent (no-draw) letters in place of the text it has identified, or (less frequently) places the letters behind the scanned image in the PDF (i.e. with lower z).

We can detect if something has been generated by an OCR program by looking at the "Creator data" in the PDF that describes the program used to create the PDF. So we can handle that differently (and we do handle that a little bit differently).
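A minimal sketch of such a check in Python, assuming the document information dictionary has already been read into a plain dict by whatever PDF library you use. The `/Creator` and `/Producer` keys are standard PDF info entries; the list of tool-name hints, the function name, and the decision to also look at `/Producer` are assumptions for illustration only.

```python
# Illustrative, incomplete list of substrings seen in OCR tool names (assumption).
OCR_TOOL_HINTS = ("tesseract", "ocrmypdf", "abbyy", "finereader")


def looks_ocr_generated(info: dict) -> bool:
    """Return True if the Creator or Producer entry mentions a known OCR tool."""
    fields = (info.get("/Creator") or "", info.get("/Producer") or "")
    return any(hint in field.lower() for field in fields for hint in OCR_TOOL_HINTS)


print(looks_ocr_generated({"/Creator": "ABBYY FineReader 15"}))   # True
print(looks_ocr_generated({"/Creator": "Adobe InDesign 18.0"}))   # False
```

The metadata can be missing or wrong, so this is only a hint for choosing a processing path, not proof.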

PDF->image->text is 100% not lossless.

When you rasterize the PDF, you lose information because you are going from a resolution-independent format to a specific resolution:

• Text must be rasterized into letters at the target resolution
• Images must be resampled at the target resolution
• Vector paths must be rasterized to the target resolution

So for example the target resolution must be high enough that small text is legible.
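To put rough numbers on that: a PDF point is 1/72 inch, so the nominal pixel height of text is font size × DPI / 72. A small Python sketch follows; the 20 px "legible enough" threshold in the example is purely an assumption for illustration, since the real requirement depends on the font and the OCR program.

```python
def text_height_px(font_size_pt: float, dpi: float) -> float:
    """Nominal pixel height of text after rasterizing (1 pt = 1/72 inch)."""
    return font_size_pt * dpi / 72.0


def min_dpi_for(font_size_pt: float, min_px: float) -> float:
    """DPI needed for text of the given size to reach min_px pixels in height."""
    return min_px * 72.0 / font_size_pt


print(text_height_px(6, 72))    # 6.0  -> 6 pt text is only 6 px tall at 72 DPI
print(text_height_px(6, 300))   # 25.0
print(min_dpi_for(6, 20))       # 240.0 (assuming ~20 px is needed)
```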

If you perform OCR, you depend on the ability of the OCR program to accurately identify the letters based on the rasterized form.

OCR is not 100% accurate, because it is a computer vision recognition problem, and:

• There are hundreds of thousands of fonts in the wild, each with different details and appearances.
• Two letters can look the same; a simple example where trivial OCR/recognition fails is capital letter "I" and lower case "l". These are both vertical lines, so you need the context (letters nearby). Same with "O" and zero.
• OCR is also pretty hopeless with e.g. headlines/text written on top of images because it is hard to distinguish letters from the background. But even regular black on white text fails sometimes.
• OCR will also commonly identify "ghost" letters in images that are not really there, i.e. pick up a bunch of pixels that have been detected as a letter, but really are just some pixel structure that is part of the image (not even necessarily text on the image) -- A form of hallucination.

New comment by joakleaf in "So you want to parse a PDF?"

joakleaf — Mon, 04 Aug 2025 09:14:11 +0000

We do a lot of parsing of PDFs and basically break the structure into 'letter with font at position (box)' because the "structure" within the PDF is unreliable.

We have algorithms that combine the individual letters to words, words to lines, and lines to boxes, all by looking at it geometrically. Obviously, we also identify the spaces between words.

We handle hidden text and problematic glyph-to-unicode tables.

The output is similar to OCR except we don't do the rasterization and quality is higher because we don't depend on vision based text recognition.

The base implementation of all this, I made in less than a month 10 years ago and we rarely, if ever, touch it.

We do machine learning afterwards on the structure output too.

New comment by joakleaf in "So you want to parse a PDF?"

joakleaf — Mon, 04 Aug 2025 09:00:13 +0000

I sort of agree... I do the same.

We also parse millions of PDFs per month in all kinds of languages (both Western and Asian alphabets).

Getting the basics of PDF parsing to work is really not that complicated -- A few months' work. And it is an order of magnitude more efficient than generating an image in 300-600 DPI and doing OCR or Visual LLM.

But some of the challenges (which we have solved) are:

• Glyph-to-unicode tables are often limited or incorrect
• "Boxing" blocks of text into "paragraphs" can be tricky
• Handling extra spaces and missing spaces between letters and words. Often PDFs do not include the spaces or they are incorrect, so you need to identify gaps yourself.
• Often graphic designers of magazines/newspapers will hide text behind e.g. a simple white rectangle, and place a new version of the text above. So you need to keep track of z-order and ignore hidden text.
• Common text can be embedded as vector paths -- Not just logos but we also see it with text. So you need a way to handle that.
• Dropcap and similar "artistic" choices can be a bit painful
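To illustrate the hidden-text point, here is a minimal Python sketch of a z-order check: a text item counts as hidden if an opaque rectangle painted later fully covers its box. The `Box` type, the function names, and the "fully covered" rule are assumptions for this example; real pages also need partial overlap, clipping paths, and transparency handled.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x0: float
    y0: float
    x1: float
    y1: float
    z: int  # paint order: higher z is drawn later (on top)


def covers(outer: Box, inner: Box) -> bool:
    """True if outer fully contains inner."""
    return (outer.x0 <= inner.x0 and outer.y0 <= inner.y0
            and outer.x1 >= inner.x1 and outer.y1 >= inner.y1)


def visible_text(text_items, opaque_rects):
    """Keep text that is not fully covered by an opaque rectangle drawn after it."""
    return [
        text for text, box in text_items
        if not any(r.z > box.z and covers(r, box) for r in opaque_rects)
    ]


# Old headline, then a white rectangle on top of it, then the new headline.
items = [
    ("old headline", Box(10, 10, 110, 30, z=1)),
    ("new headline", Box(10, 10, 110, 30, z=3)),
]
rects = [Box(5, 5, 115, 35, z=2)]
print(visible_text(items, rects))  # ['new headline']
```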

There are a lot of other smaller issues -- but they are generally edge cases.

OCR handles some of these issues for you. But we found that OCR often misidentifies letters (all major OCR), and they are certainly not perfect with spaces either. So if you are going for quality, you can get better results if you parse the PDFs.

Visual Transformers are not good with accurate coordinates/boxing yet -- At least we haven't seen a good enough implementation of it yet. Even though it is getting better.