Hacker News: uncSoft

Reverse-engineering SwiftUI's DocumentGroup to restyle and group untouchable tabs

uncSoft — Sun, 07 Jun 2026 05:58:07 +0000

Article URL: https://github.com/uncSoft/Tabberwocky

Comments URL: https://news.ycombinator.com/item?id=48432246

Points: 2

# Comments: 0

New comment by uncSoft in "The Future of Obsidian Plugins"

uncSoft — Wed, 13 May 2026 15:10:10 +0000

I was "bullied" into making a light theme for my app as well after refusing for months - I didn't realize how many people need a light mode.

New comment by uncSoft in "The Future of Obsidian Plugins"

uncSoft — Wed, 13 May 2026 15:08:50 +0000

That's why I wrote CyberWriter: my company needs sandboxed apps because we work on sensitive data. Community plugins are too risky when they run with full system access.

New comment by uncSoft in "The Future of Obsidian Plugins"

uncSoft — Wed, 13 May 2026 15:06:33 +0000

Obsidian is great - it motivated me to make CyberWriter, where the app is sandboxed and the top community plugins are baked in. I work in healthcare, and we can't have unvetted community plugin updates running on our network with full system access and sensitive data.

New comment by uncSoft in "Show HN: CyberWriter – a .md editor built on Apple's (barely-used) on-device AI"

uncSoft — Tue, 21 Apr 2026 16:57:23 +0000

Exactly, but even more so that Apple exposed the API in a fairly complete way. I integrated their local model in less than a day. Creating embeddings and RAG on essentially an entire document vault in about 2 seconds, for people who don't have local AI configured, is pretty crazy.

New comment by uncSoft in "Show HN: CyberWriter – a .md editor built on Apple's (barely-used) on-device AI"

uncSoft — Mon, 20 Apr 2026 13:35:47 +0000

thanks, I am probably the worst designer and didn't want to use AI, but you're not wrong

Show HN: CyberWriter – a .md editor built on Apple's (barely-used) on-device AI

uncSoft — Mon, 20 Apr 2026 13:07:04 +0000

Apple has quietly shipped a pretty complete on-device AI stack into macOS, with these features first getting API access in macOS 26. There are multiple components in the foundation model, but the skills it shipped with actually make this ~3B-parameter model useful. The API to hit the model is super easy, and no one is really wiring the pieces together yet.

- Foundation Models (macOS 26) - a ~3B-parameter LLM with an API. Streaming, structured output, tool use. No API key, no cloud call, no per-token cost.
- NLContextualEmbedding (Natural Language framework, macOS 14+) - a BERT-style 512-dim text embedder. Exactly what OpenAI and Cohere sell, sitting in Apple's SDKs since iOS 17.
- SFSpeechRecognizer / SpeechAnalyzer - on-device speech-to-text including live dictation. Solid accuracy on Apple Silicon.

I built CyberWriter, a Markdown editor, on top of all three, mostly as a test and showcase of what the stack can do. I actually integrated local and cloud AI first; when Apple shipped the foundation model, it stacked on easily, and now users with no local or API AI knowledge can use it with just a click or two. The real reason, though, is that most Markdown editors need plugins that run with full system access, and I work on health data and can't have that.

Vault chat / semantic search. The app indexes your Markdown folder via NLContextualEmbedding (around 50 seconds for 1000 chunks on an M1). The search bar gets a "Related Ideas" section that matches by meaning - typing "orbital mechanics" surfaces notes about rockets and launch windows even when those exact words never appear. Ask the AI a question and it retrieves the top 5 chunks as context. Plain RAG, but the embedder, retrieval, chat model, and search all run locally.
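The retrieval step described above (match by meaning, keep the top 5 chunks) is commonly done with cosine similarity over the stored vectors. The following is a minimal Python sketch of that idea, not the app's actual Swift code; the function names and the `(text, vector)` index layout are hypothetical, and toy 3-dimensional vectors stand in for the real 512-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vector, indexed_chunks, k=5):
    """Return the k (score, text) pairs most similar to the query, best first.

    indexed_chunks is a list of (text, vector) pairs (a hypothetical layout).
    """
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in indexed_chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy vectors: the first axis loosely means "spaceflight", the last "household".
index = [
    ("Launch windows for Mars transfers", [0.9, 0.1, 0.0]),
    ("Grocery list", [0.0, 0.1, 0.9]),
    ("Rocket staging notes", [0.8, 0.3, 0.1]),
]
results = top_k_chunks([1.0, 0.2, 0.0], index, k=2)
```

A linear scan like this is usually fine for a few thousand chunks; larger vaults would likely want an approximate nearest-neighbor index instead.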

AI Workspace. Command+Shift+A opens a chat panel, Command+J triggers inline quick actions (rewrite, summarize, change tone, fix grammar, continue). Apple Intelligence is the default; Claude, OpenAI, Ollama, and LM Studio all work if you prefer. The same context layer - document selection, attached files, retrieved vault chunks - feeds every provider through the same system-message path. Because the vault context is file and filename aware, it can create backlinks to the referenced file if it writes or edits a doc for you.

Voice notes and dictation. Record a voice note directly into your doc, transcribe it with SpeechAnalyzer, or just dictate into the editor while you think. Audio never leaves the Mac.

The privacy story is straightforward because the primitives are already private. Vectors live in a `.vault.embeddings.json` file next to your vault, never sent anywhere. If you use Apple Intelligence, even the retrieved text stays on-device. For cloud models there is a clear toggle and an inline warning before any filenames or snippets leave the machine.

Honest limitations:

- 512-dim embeddings are solid mid-tier. A GPT-4-class embedder catches subtler relationships this will miss.
- 256-token chunks can split long paragraphs mid-argument.
- Foundation Models caps its context window around 6K characters, so vault context is budgeted to 3K with truncation markers on the rest.
- Language support is English-only right now. NLContextualEmbedding has Latin, Cyrillic, and CJK model variants; wiring the language detector across chunks is Phase 2.
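The chunking and context-budget limits above can be illustrated with a short Python sketch. It is an illustration under assumptions, not the app's implementation: whitespace-separated words stand in for real tokens, the budget is counted in characters, and the function names and marker text are hypothetical.

```python
def chunk_by_tokens(text, max_tokens=256):
    """Split text into chunks of at most max_tokens whitespace-separated words.

    Whitespace splitting is a stand-in for a real tokenizer (an assumption).
    Fixed-size windows like this are what can cut a paragraph mid-argument.
    """
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]


def budget_context(chunks, max_chars=3000, marker="[truncated]"):
    """Join chunks with newlines without exceeding max_chars.

    The first chunk that does not fit is cut short and suffixed with marker;
    every later chunk is dropped. If there is no room even for the marker,
    the function stops without adding one.
    """
    out = ""
    for chunk in chunks:
        piece = chunk if not out else "\n" + chunk
        if len(out) + len(piece) <= max_chars:
            out += piece
            continue
        room = max_chars - len(out) - len(marker)
        if room > 0:
            out += piece[:room] + marker
        break
    return out


small_chunks = chunk_by_tokens("one two three four five", max_tokens=2)
context = budget_context(["a" * 10, "b" * 10], max_chars=18, marker="[cut]")
```

Splitting on paragraph or sentence boundaries first, and only falling back to fixed windows for oversized paragraphs, is one common way to reduce the mid-argument splits.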

The developer experience for these APIs is genuinely good. Foundation Models streams cleanly, NLContextualEmbedding downloads assets on demand and gives you mean-poolable token vectors in a handful of lines. Curious what others here are building on this stack - feels like low-hanging fruit that has been sitting there for a while.
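For readers unfamiliar with mean pooling: the embedder returns one vector per token, and averaging them element-wise gives a single vector for the whole chunk. A minimal Python sketch of that step follows; the function name is hypothetical, and it operates on plain lists rather than on the framework's own result type.

```python
def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one chunk vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    if any(len(v) != dim for v in token_vectors):
        raise ValueError("all token vectors must have the same dimension")
    count = len(token_vectors)
    return [sum(v[i] for v in token_vectors) / count for i in range(dim)]


# Two toy 2-dimensional token vectors; real ones would be 512-dimensional.
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0]])
```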

https://imgur.com/a/HyhHLv2

The Apple AI embedding feature is going live today. I'm honestly surprised it even works out of the box.

Comments URL: https://news.ycombinator.com/item?id=47833747

Points: 15

# Comments: 6

New comment by uncSoft in "10% of Firefox crashes are caused by bitflips"

uncSoft — Fri, 06 Mar 2026 17:55:15 +0000

They just need to call it GW Classic apparently and it will sell

New comment by uncSoft in "Show HN: Open dataset of real-world LLM performance on Apple Silicon"

uncSoft — Thu, 05 Mar 2026 14:36:39 +0000

Thanks for the heads up! I just reached out to see if I can contribute, thank you

New comment by uncSoft in "Ask HN: If your project is free, what are you building and why keep it free?"

uncSoft — Thu, 05 Mar 2026 03:21:21 +0000

I built anubis-oss because my friends and I are always trying to see how local LLMs run on our different Macs with different configs. So I built a benchmarker for us with a bunch of tools like exportable benchmark reports and arena mode. Then I said, hey, I can add public leaderboards with the full dataset of submissions open sourced, and give that to my ML and model tuning friends to gauge performance across broad configs. Yes, it cost me money, but it's a fun project and I'm trying to get it into Homebrew. Just need a few more stars.

https://devpadapp.com/anubis-oss.html

https://github.com/uncSoft/anubis-oss

https://imgur.com/a/X64WsWY

New comment by uncSoft in "MacBook Neo"

uncSoft — Thu, 05 Mar 2026 02:53:09 +0000

Ha, wasn't it Windows Vista that let you plug in an SD card to use as swap space/fake RAM?

New comment by uncSoft in "Show HN: Open dataset of real-world LLM performance on Apple Silicon"

uncSoft — Thu, 05 Mar 2026 02:45:32 +0000

Addendum: Anubis OSS is GPL-3.0 licensed. It's built fully in Swift and signed with a developer certificate for safety (if you don't want to clone the source and compile it yourself). There are no external dependencies except Sparkle for auto-updates if you want them. It's privacy-first: benchmark data is submitted voluntarily and never includes anything beyond hardware specs and model performance metrics.

Show HN: Open dataset of real-world LLM performance on Apple Silicon

uncSoft — Thu, 05 Mar 2026 02:44:11 +0000

Why open source local AI benchmarking on Apple Silicon matters - and why your benchmark submission is more valuable than you think.

The narrative around AI has been almost entirely cloud-centric. You send a prompt to a data center, tokens come back, and you try not to think about the latency, cost, or privacy implications. For a long time, that was the only game in town.

Apple Silicon - from M1 through the M4 Pro/Max shipping today, with M5 on the horizon - has quietly become one of the most capable local AI compute platforms on the planet. The unified memory architecture means an M4 Max with 128GB can run models that would require a dedicated GPU workstation elsewhere. At laptop wattages. Offline. Without sending a single token to a third party.

This shift is legitimately great for all parties (except cloud ones that want your money), but it comes with an unsolved problem: we don't have great, community-driven data on how these machines actually perform in the wild.

That's why I built Anubis OSS.

The Fragmented Local LLM Ecosystem

If you've run local models on macOS, you've felt this friction. Chat wrappers like Ollama and LM Studio are great for conversation but not built for systematic testing. Hardware monitors like asitop show GPU utilization but have no concept of what model is loaded or what the prompt context is. Eval frameworks like promptfoo require terminal fluency that puts them out of reach for many practitioners.

None of these tools correlate hardware behavior with inference performance. You can watch your GPU spike during generation, but you can't easily answer: Is Gemma 3 12B Q4_K_M more watt-efficient than Mistral Small 3.1 on an M3 Pro? How does TTFT scale with context length on 32GB vs. 64GB?

Anubis answers those questions. It's a native SwiftUI app - no Electron, no Python runtime, no external dependencies - that runs benchmark sessions against any OpenAI-compatible backend (Ollama, LM Studio, mlx-lm, and more) while simultaneously pulling real hardware telemetry via IOReport: GPU/CPU utilization, power draw in watts, ANE activity, memory including Metal allocations, and thermal state.
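To make the metrics concrete, here is a small Python sketch of how TTFT, throughput, and watt-efficiency can be derived from one benchmark run. These are common definitions, not necessarily the ones Anubis uses; the `RunSample` fields and the `summarize` function are hypothetical, and the average power figure is assumed to come from whatever telemetry source is sampled during generation.

```python
from dataclasses import dataclass


@dataclass
class RunSample:
    """Timestamps (in seconds) and totals for one run; a hypothetical record layout."""
    request_sent_s: float    # when the prompt was sent
    first_token_s: float     # when the first generated token arrived
    last_token_s: float      # when the last generated token arrived
    tokens_generated: int
    avg_power_watts: float   # mean power draw during generation


def summarize(run):
    """Derive time to first token, tokens per second, and tokens per joule."""
    ttft = run.first_token_s - run.request_sent_s
    generation_time = run.last_token_s - run.first_token_s
    tokens_per_s = run.tokens_generated / generation_time if generation_time > 0 else 0.0
    # Energy in joules is watts multiplied by seconds.
    energy_j = run.avg_power_watts * generation_time
    tokens_per_joule = run.tokens_generated / energy_j if energy_j > 0 else 0.0
    return {"ttft_s": ttft, "tokens_per_s": tokens_per_s, "tokens_per_joule": tokens_per_joule}


# 200 tokens generated over 10 seconds at an average of 20 W.
summary = summarize(RunSample(0.0, 0.5, 10.5, 200, 20.0))
```

Comparing tokens per joule across two models on the same machine is one way to answer the watt-efficiency question posed above.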

Why the Open Dataset Is the Real Story

The leaderboard submissions aren't a scoreboard - they're the start of a real-world, community-sourced performance dataset across diverse Apple Silicon configs, model families, quantizations, and backends.

This data is hard to get any other way. Formal chipmaker benchmarks are synthetic. Reviewer benchmarks cover a handful of models. Nobody has the hardware budget to run a full cross-product matrix. But collectively, the community does.

For backend developers, the dataset surfaces which chip/memory configurations are underperforming their theoretical bandwidth, where TTFT degrades under long contexts, and what the real-world power envelope looks like under sustained load. For quantization authors, it shows efficiency curves across real hardware, ANE utilization patterns, and whether a quantization actually reduces memory pressure or just parameter count.

Running a benchmark takes about two minutes. Submitting takes one click.

Your hardware is probably underrepresented. The matrix of chip × memory × backend × thermal environment is enormous — every submission fills a cell nobody else may have covered.

The dataset is open. This isn't data disappearing into a corporate analytics pipeline. It's a community resource for anyone building tools, writing research, or optimizing for the platform.

Anubis OSS is working toward 75 GitHub stars to qualify for Homebrew Cask distribution, which would make installation dramatically easier. A star is a genuinely meaningful contribution.

- Download from the latest GitHub release - notarized macOS app, no build required
- Run a benchmark against any model in your preferred backend
- Submit results to the community leaderboard
- Star the repo at github.com/uncSoft/anubis-oss

Comments URL: https://news.ycombinator.com/item?id=47256849

Points: 2

# Comments: 4