Hacker News: bartek_gdn

Ask HN: AI researchers – what's a paper that recently blew your mind?

bartek_gdn — Fri, 05 Jun 2026 08:04:13 +0000

I'm often on the lookout for new and exciting papers in the ML space.

Please share a few that you feel are worth sharing.

Cheers!

Comments URL: https://news.ycombinator.com/item?id=48409450

Points: 6

# Comments: 1

New comment by bartek_gdn in "Training LLMs to Predict World Events"

bartek_gdn — Fri, 20 Mar 2026 17:31:03 +0000

What's the dataset used for this task? How does one prevent data leakage on the experiment itself? Are we asking about past events to predict the future?

Show HN: Looseleaf – Python notebooks where each cell is a .py file

bartek_gdn — Fri, 20 Mar 2026 17:26:50 +0000

I wanted a notebook that worked like my editor, not the other way around.

Jupyter is great but the .ipynb format has always bothered me. It's JSON with embedded code — git diffs are a mess, linters don't see it, and you can't just open a cell in your editor and have things work. There are tools that help (jupytext, nbstripout) but they all work around the format rather than replacing it.

Looseleaf takes a different approach: a notebook is just a directory of .py files. Each file is a cell. They run in alphabetical order, so you use numbered prefixes (01_load.py, 02_train.py). One long-running Python process holds the shared namespace across all of them, so variables from cell 1 are available in cell 2 — same model as Jupyter, different storage.
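The execution model described above can be sketched in a few lines. This is a minimal illustration, not Looseleaf's actual code: `run_notebook` and `demo` are hypothetical names, and the sketch assumes cells are simply executed with `exec` against one shared dictionary.

```python
import tempfile
from pathlib import Path


def run_notebook(directory: Path) -> dict:
    """Run every .py file in `directory` in alphabetical order, sharing one namespace."""
    namespace: dict = {"__name__": "__main__"}
    for cell in sorted(directory.glob("*.py")):
        # Compiling with the file name keeps tracebacks pointing at the cell file.
        code = compile(cell.read_text(), str(cell), "exec")
        exec(code, namespace)
    return namespace


def demo() -> dict:
    """Create two cells out of order on disk and run them as a notebook."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "02_train.py").write_text("total = sum(data)\n")
        (root / "01_load.py").write_text("data = [1, 2, 3]\n")
        return run_notebook(root)


if __name__ == "__main__":
    print(demo()["total"])
```

The numbered prefixes do the ordering: `02_train.py` can read `data` only because `01_load.py` sorts first and ran in the same namespace.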

When a cell runs, output is written to a .output sidecar file (JSON with stdout, stderr, images, timing). The browser frontend shows it, but so can anything else that can read a file.
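A rough sketch of the sidecar idea, under stated assumptions: the field names (`stdout`, `stderr`, `seconds`) and the `01_load.output` naming are guesses for illustration, image capture is left out, and `run_cell` is a hypothetical helper rather than Looseleaf's real function.

```python
import contextlib
import io
import json
import tempfile
import time
from pathlib import Path


def run_cell(cell: Path, namespace: dict) -> Path:
    """Execute one cell and write its captured output next to it as JSON."""
    stdout, stderr = io.StringIO(), io.StringIO()
    started = time.perf_counter()
    with contextlib.redirect_stdout(stdout), contextlib.redirect_stderr(stderr):
        exec(compile(cell.read_text(), str(cell), "exec"), namespace)
    record = {
        "stdout": stdout.getvalue(),
        "stderr": stderr.getvalue(),
        "seconds": time.perf_counter() - started,
    }
    # Assumed naming: 01_load.py -> 01_load.output
    sidecar = cell.with_suffix(".output")
    sidecar.write_text(json.dumps(record))
    return sidecar


def demo() -> dict:
    """Run a one-line cell and read the result back from the sidecar file."""
    with tempfile.TemporaryDirectory() as tmp:
        cell = Path(tmp) / "01_load.py"
        cell.write_text("print('hello')\n")
        sidecar = run_cell(cell, {})
        return json.loads(sidecar.read_text())
```

Because the result is a plain file, any tool (or agent) can read the last output without talking to the kernel.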

A few things I find useful about this: edit cells in any editor — they're just .py files, a watchdog observer picks up external changes and the browser updates automatically. Git just works: git diff 02_train.py shows what you'd expect, no output noise. AI agents can work with it naturally — write a single cell without touching anything else, read the last output from the sidecar without running anything, drop in new cells by creating a file. And there are no dependencies on Jupyter: it's ~750 lines of Python (aiohttp + watchdog) and a single HTML file with Monaco loaded from CDN.

The frontend is intentionally minimal — two-mode editing like Jupyter (command/edit), Monaco for the editor, streaming output over WebSocket. No build step, no framework.

It's an opinionated tool. No markdown cells, no cell reordering in the UI, matplotlib only for plot capture. One kernel, one directory. That constraint is the point — it stays simple enough that you can read the whole thing in an afternoon.

https://github.com/BartlomiejLewandowski/looseleaf

Comments URL: https://news.ycombinator.com/item?id=47457770

Points: 2

# Comments: 0

New comment by bartek_gdn in "Show HN: Sonar – A tiny CLI to see and kill whatever's running on localhost"

bartek_gdn — Fri, 20 Mar 2026 14:20:59 +0000

Why not just grep the output of another tool?

New comment by bartek_gdn in "Chrome DevTools MCP (2025)"

bartek_gdn — Sun, 15 Mar 2026 23:07:18 +0000

I would use whatever you are comfortable with. I wanted a similar tool, so I coded my own. It has a smaller API, so I understand what is going on and it is easy not to get lost.

https://news.ycombinator.com/item?id=47207790

New comment by bartek_gdn in "Chrome DevTools MCP"

bartek_gdn — Sun, 15 Mar 2026 23:03:33 +0000

Take a look at https://news.ycombinator.com/item?id=47207790

New comment by bartek_gdn in "Chrome DevTools MCP"

bartek_gdn — Sun, 15 Mar 2026 23:02:04 +0000

Can't we just iteratively inspect the network traces then? We don't need to consume the whole 2 MB of data; maybe just dump the network trace and use jq to extract only the fields we need, keeping the context minimal. I haven't added this in https://news.ycombinator.com/item?id=47207790, but I feel it would be a good addition. Then prompt it with instructions to gradually discover the necessary data.

But then I wonder where the balance is between a bunch of small tool calls and one larger one.

I recall a recent discussion here on HN about big data analysis.
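The field-trimming idea could look something like this. It is only a sketch: it assumes the trace is HAR-like JSON (`log.entries[].request` / `response`), which may not match what any particular tool dumps, and `slim_entries` and `SAMPLE_TRACE` are made-up names.

```python
import json

# A tiny HAR-like trace standing in for a real multi-megabyte dump (assumed shape).
SAMPLE_TRACE = {
    "log": {
        "entries": [
            {
                "request": {"method": "GET", "url": "https://example.com/api/items", "headers": []},
                "response": {"status": 200, "content": {"size": 48213, "text": "..."}},
            },
            {
                "request": {"method": "POST", "url": "https://example.com/api/login", "headers": []},
                "response": {"status": 401, "content": {"size": 112, "text": "..."}},
            },
        ]
    }
}


def slim_entries(trace: dict, limit: int = 20) -> list[dict]:
    """Keep only method, URL, and status for the first `limit` requests."""
    entries = trace.get("log", {}).get("entries", [])
    return [
        {
            "method": entry["request"]["method"],
            "url": entry["request"]["url"],
            "status": entry["response"]["status"],
        }
        for entry in entries[:limit]
    ]


if __name__ == "__main__":
    # The agent sees a short summary first and can ask for one full entry later.
    print(json.dumps(slim_entries(SAMPLE_TRACE), indent=2))
```

A first pass like this gives the agent an index of requests; a follow-up call can then fetch the body of a single entry, which is one way to trade a large tool call for a few small ones.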

New comment by bartek_gdn in "Chrome DevTools MCP (2025)"

bartek_gdn — Sun, 15 Mar 2026 22:54:38 +0000

Yes please, maybe there will be some solution that will fit the problem better! I recently released something similar, and because of the small API, I'm more comfortable using it.

https://news.ycombinator.com/item?id=47207790

New comment by bartek_gdn in "Let your Coding Agent debug the browser session with Chrome DevTools MCP"

bartek_gdn — Sun, 15 Mar 2026 22:52:00 +0000

My approach is a thin CLI wrapper instead.

https://news.ycombinator.com/item?id=47207790

New comment by bartek_gdn in "Let your Coding Agent debug the browser session with Chrome DevTools MCP"

bartek_gdn — Sun, 15 Mar 2026 22:47:38 +0000

That's also my approach: I quickly built a CLI for this with lightweight session management.

https://news.ycombinator.com/item?id=47207790

New comment by bartek_gdn in "When does MCP make sense vs CLI?"

bartek_gdn — Mon, 02 Mar 2026 07:47:38 +0000

It does so many more things, though it's very similar at the core. I'm wondering what the token counts will be when I compare. Also, the agent browser seems to support other browsers too; I only went with Chromium.

New comment by bartek_gdn in "When does MCP make sense vs CLI?"

bartek_gdn — Sun, 01 Mar 2026 20:32:57 +0000

What about --help? Isn't that a perfect parallel to discovery of available tools in an MCP server?

New comment by bartek_gdn in "When does MCP make sense vs CLI?"

bartek_gdn — Sun, 01 Mar 2026 20:30:36 +0000

I've come to the same conclusion as OP and created a CLI tool to work with Chrome sessions. It works well, and I'm planning to do some token comparison on this vs an MCP approach. https://news.ycombinator.com/item?id=47207790

Show HN: Chromectl – CLI to give an AI agent its own Chrome session

bartek_gdn — Sun, 01 Mar 2026 15:50:14 +0000

Most browser automation tools (Playwright MCP and similar) create a browser process owned by the agent. chromectl flips that: you start a named session, and the agent connects to it. The session is isolated — no cookies, no saved logins, no way for the agent to wander into your banking tab.

This also unlocks human handoff. Start a session, navigate to a site, log in manually, then hand control back to Claude. Useful for anything behind auth that you don't want to automate credentials for. Each session gets a dedicated Chrome profile. Stop it, start it again tomorrow, you're still logged in.

Claude Desktop can drive Chrome too, but it requires a plugin and works inside your main browser profile — there's no way to scope it to a clean session.

Cloudflare and Anthropic have both written about why agents work better through code than through tool definitions — MCP front-loads every tool description into context whether it's used or not. A CLI is lighter still: give Claude a terminal and `--help`, and it figures out the rest. No tool schemas, no context bloat.

Standard stuff like navigate, eval, screenshot, and scrape is there, plus `pick` — click any element on a live page and get back its selector, HTML, and computed styles as JSON. Paste into Claude and say "fix this."
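To illustrate why `--help` can stand in for tool schemas, here is a small argparse sketch. It is not chromectl's real interface: the program name `browserctl`, the `--out` flag, and the help strings are invented for the example; only the subcommand names echo the ones mentioned above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical CLI whose help text doubles as tool discovery."""
    parser = argparse.ArgumentParser(
        prog="browserctl",
        description="Drive a named browser session.",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    nav = sub.add_parser("navigate", help="Open a URL in the session")
    nav.add_argument("url")

    shot = sub.add_parser("screenshot", help="Save a screenshot of the current page")
    shot.add_argument("--out", default="page.png")

    return parser


# The text an agent would read after running `browserctl --help`.
help_text = build_parser().format_help()

if __name__ == "__main__":
    print(help_text)
```

The agent pays for this text only when it asks for it, and can drill into one subcommand's help instead of loading every description up front.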

Sharing in case it's useful — curious how others are handling the browser problem with agents.

---

Sources:

- Cloudflare Code Mode: https://blog.cloudflare.com/code-mode-mcp/

- Anthropic on code execution with MCP: https://www.anthropic.com/engineering/code-execution-with-mc...

Comments URL: https://news.ycombinator.com/item?id=47207790

Points: 2

# Comments: 0

New comment by bartek_gdn in "Google Antigravity exfiltrates data via indirect prompt injection attack"

bartek_gdn — Tue, 25 Nov 2025 19:44:28 +0000

What do you mean? The last part in this case is also present, you can change external state by sending a request with the captured content.

Show HN: Crossword Creator – Browser-based crossword puzzle maker

bartek_gdn — Sun, 19 Oct 2025 09:35:34 +0000

I built a full-featured crossword puzzle creator that runs entirely in the browser. It uses advanced constraint satisfaction algorithms to automatically suggest words that fit your grid pattern.

Try it here: https://crossword-creator.com

Why?

A few years back I got into crosswords and created an Android app for solving them. To create content, though, I had to make my own puzzles. Existing tools were either too basic or required desktop software, so I slowly built a workflow in the browser. With a bit of Claude's help, I decided to move the whole process to a frontend app and share it with others.

Who is this for?

Anyone who wants to create custom crossword puzzles easily in the browser, whether for personal use, educational purposes, or publishing. It is a nice exercise to do in school, or to create themed puzzles for events (birthdays, holidays, etc).

Key features:

- Fully client-side: No backend, all processing in-browser, privacy-friendly

- 4-step workflow: Construct grid pattern → Fill with words → Add clues → Generate PDF

- Smart word suggestions: Solver algorithm running in Web Worker

- Multiple dictionaries: English, Polish, French + support for custom word lists

- Keyboard shortcuts: Full keyboard navigation for speed

- Auto-save: Your work is saved as you go

- Export options: Generate PDFs for printing

Technical highlights:

- Advanced backtracking constraint satisfaction solver with forward checking

- Most constrained variable heuristic for efficient solving

- Web Workers prevent UI blocking during computation

- Command pattern with undo/redo functionality

- Built with React + TypeScript + Vite
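For readers unfamiliar with the solver terms, here is a textbook-style sketch of backtracking with forward checking and the most-constrained-variable heuristic. It is written in Python for brevity and is not the app's actual solver (which runs in TypeScript in a Web Worker); the slot names, the crossing tuple layout, and the word list are all made up, and this version does not forbid using the same word twice.

```python
from typing import Optional

Crossing = tuple[str, int, str, int]  # (slot_a, index_a, slot_b, index_b)


def build_domains(slots: dict[str, int], words: list[str]) -> dict[str, list[str]]:
    """Each slot starts with every word of the right length as a candidate."""
    return {name: [w for w in words if len(w) == length] for name, length in slots.items()}


def forward_check(domains, crossings, slot, word, assignment) -> Optional[dict]:
    """Place `word` in `slot` and prune crossing slots; None if a domain empties."""
    new = {s: list(ws) for s, ws in domains.items()}
    new[slot] = [word]
    for a, i, b, j in crossings:
        if a == slot and b not in assignment:
            new[b] = [w for w in new[b] if w[j] == word[i]]
            if not new[b]:
                return None
        elif b == slot and a not in assignment:
            new[a] = [w for w in new[a] if w[i] == word[j]]
            if not new[a]:
                return None
    return new


def solve(domains, crossings, assignment=None) -> Optional[dict]:
    """Backtracking search; returns a slot -> word mapping or None."""
    assignment = assignment or {}
    if len(assignment) == len(domains):
        return assignment
    # Most constrained variable: the unfilled slot with the fewest candidates.
    slot = min((s for s in domains if s not in assignment), key=lambda s: len(domains[s]))
    for word in domains[slot]:
        pruned = forward_check(domains, crossings, slot, word, assignment)
        if pruned is not None:
            result = solve(pruned, crossings, {**assignment, slot: word})
            if result is not None:
                return result
    return None


SLOTS = {"1-across": 3, "1-down": 3, "2-across": 3}
CROSSINGS: list[Crossing] = [("1-across", 0, "1-down", 0), ("2-across", 0, "1-down", 2)]
WORDS = ["cat", "car", "tea", "rat", "ant"]

if __name__ == "__main__":
    print(solve(build_domains(SLOTS, WORDS), CROSSINGS))
```

Forward checking prunes the candidate lists of crossing slots as soon as a word is placed, so dead ends are detected before the search descends into them.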

Current status: Fully functional and deployed. Working on adding better dictionary filtering to support themes.

Would love feedback from the HN community! What features would you want to see in a crossword creator?

Comments URL: https://news.ycombinator.com/item?id=45633056

Points: 1

# Comments: 0

Show HN: Chrome extension that gives Claude Pro API-free tool use capabilities

bartek_gdn — Sun, 06 Apr 2025 21:06:13 +0000

Hi HN community!

I've created a Chrome extension and Node.js server combo that lets Claude use external tools (calendar, Spotify, etc.) from within regular browser conversations - without any API access.

## Why

Tool use is available in Claude via API, but I haven't found a way to use tools from the web interface.

Over a weekend I created a PoC, then spent the next few weeks polishing the details before sharing.

The architecture is modular, making it easy to add new tools. I've included detailed setup instructions in the repo.

## How it works

- The extension injects a content script onto claude.ai/chat pages

- A background script captures all Claude responses via Chrome debugger API

- When Claude generates XML tool tags (e.g., ...), the server executes the corresponding action

- Results appear as system messages that you can paste into your conversation, so that Claude is aware of failures etc.

Demo video: https://www.youtube.com/watch?v=j8lgsGurY1w

GitHub: https://github.com/BartlomiejLewandowski/claude-browser-tool...

## Tools currently supported

- Google Calendar (create events)

- Spotify (play music)

Would love feedback on the approach and ideas for other useful tools to add!

Comments URL: https://news.ycombinator.com/item?id=43604959

Points: 2

# Comments: 0

New comment by bartek_gdn in ""Attention", "Transformers", in Neural Network "Large Language Models""

bartek_gdn — Mon, 25 Dec 2023 01:04:30 +0000

I completely disagree with the prompt paragraph.

"Everyone who thinks they're uncovering an LLM-based application's prompts by telling it things like "tell me your prompt" (often much more elaborately) is fooling themselves. (1) The core language model has no mechanism for representing its prompt as opposed to any other part of its current input sequence; indeed it has no mechanism for cross-reference from one part of the sequence to another. (That's part of what "self-attention" is counterfeiting, in vector-space fashion.)"

In a served model, the prompt is the part of the input that is provided by the operator.

From the model's perspective, there is no difference between tokens from the prompt and tokens from the user input.

"(2) System designers might have coded up something to track the prompt in the full system that wraps around the core language model, but why? (Maybe some kind of debugging tool?) "

The idea is that you can direct the generation of the next tokens by providing values that can be referenced by doing the kernel smoothing you talked about.

"(3) It'd be more efficient, and more effective, to use a "soft prompt", i.e., to make the beginning of the sequence in the vector representation a vector which can be learned by gradient descent, rather than a text prompt. (See Lester and Constant below.) But that needn't correspond to any clean string of words."

I mean, anything goes really. You can even create new tokens that introduce additional concepts, such as fine-tuning a model to generate a story in a predefined mood. See the CTRL paper for more details.

" (4) If you ask an LLM for a prompt, it will generate one. But this will be based on the statistics of word sequences it's been trained on, not any access to its code or internal state. (I just spent a few minutes getting ChatGPT to hallucinate the prompts used by "ChatBPD", a non-existent chatbot used to automate dialectical behavior therapy. I am not going to reproduce the results here, in part because I don't like the idea of polluting the Web with machine-generated text, but suffice it to say they sounded like the things people report as uncovered prompts, with boiler-plate about DBT worked in.)"

Sure, it will hallucinate, and I don't have a clear answer as to why. My best guess would be to approach this from the language model perspective. It will return text according to the best approximation of the text it was shown.

Another perspective is that of a tiny network.

As the output is the kernel smoothing of the input, you can have a kernel that behaves like a state machine, and returns a specific value for the given state. This would mean that I can use the information in the prompt, such as the prompt guiding the generation to some style, but nothing stops me from guiding the model to output previous tokens.
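A toy sketch of that last point, assuming softmax attention viewed as kernel smoothing (a weighted average of values, with weights from an exponential kernel on query-key similarity). The vectors, the `sharpness` parameter, and the three-token vocabulary are invented for illustration; real models learn these representations.

```python
import math


def kernel_smooth(query, keys, values, sharpness=1.0):
    """Weighted average of `values`, weights given by softmax(sharpness * q.k)."""
    scores = [sharpness * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


# Keys encode position (one-hot); values encode which token sits there.
VOCAB = ["the", "cat", "sat"]
KEYS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
VALUES = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    # A sharp kernel aimed at position 1 returns (almost exactly) that token.
    copied = kernel_smooth([0.0, 1.0, 0.0], KEYS, VALUES, sharpness=50.0)
    print(VOCAB[copied.index(max(copied))])
    # A flat kernel just averages everything.
    print(kernel_smooth([0.0, 1.0, 0.0], KEYS, VALUES, sharpness=0.0))
```

With a peaked enough kernel, the smoothing degenerates into a lookup of one earlier position, which is the sense in which nothing stops the model from being steered to reproduce previous tokens, including those of the prompt.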

New comment by bartek_gdn in "Lessons Learned Reproducing a Deep Reinforcement Learning Paper (2018)"

bartek_gdn — Thu, 27 Apr 2023 20:47:10 +0000

I strongly recommend the book by Sutton and Barto https://web.stanford.edu/class/psych209/Readings/SuttonBarto...

New comment by bartek_gdn in "Lessons Learned Reproducing a Deep Reinforcement Learning Paper (2018)"

bartek_gdn — Thu, 27 Apr 2023 20:46:07 +0000

Very informative post, great job!