<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: pedrovhb</title><link>https://news.ycombinator.com/user?id=pedrovhb</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 14 Apr 2026 12:38:27 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=pedrovhb" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by pedrovhb in "Show HN: The Mog Programming Language"]]></title><description><![CDATA[
<p>Very interesting, thanks for sharing.<p>I'm curious about the focus of the language design and features with regard to agent orchestration vs. being a general-purpose/ML-architecture-oriented language. The headline examples go "Agent hook", "Async HTTP with retry", and then "FFT on tensors", and that last one seems different from the others. It's easy to imagine Mog being the backbone of agent coordination in a project that otherwise uses more standard languages, so I imagined that would be its role; but then I'd expect the primitives/abstractions to be geared more toward that role specifically. For instance, a rich subprocess interface with special handling of stdin/stderr, and maybe process interaction and lifecycle management, is something I'd expect to see before tensors and math-y stuff. Is the goal for Mog ultimately to be a general-purpose language designed for LLMs to write, or one meant for agentic harnesses and orchestration/integration?</p>
]]></description><pubDate>Tue, 10 Mar 2026 13:03:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47322725</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=47322725</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47322725</guid></item><item><title><![CDATA[New comment by pedrovhb in "Show HN: OverType – A Markdown WYSIWYG editor that's just a textarea"]]></title><description><![CDATA[
<p>Nice! Seems very useful if you can just drop it in and have everything work.<p>Nitpicking a bit: it's not so much _rendering_ markdown as _syntax highlighting_ it. Another interesting approach could be to use the CSS Custom Highlight API [0]. Then it wouldn't need the preview div, and perhaps it'd even be possible to have non-monospace fonts and varying text sizes for headers.<p>[0] <a href="https://developer.mozilla.org/en-US/docs/Web/API/CSS_Custom_Highlight_API" rel="nofollow">https://developer.mozilla.org/en-US/docs/Web/API/CSS_Custom_...</a></p>
]]></description><pubDate>Sun, 17 Aug 2025 22:38:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=44935568</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=44935568</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44935568</guid></item><item><title><![CDATA[New comment by pedrovhb in "Checking Out CPython 3.14's remote debugging protocol"]]></title><description><![CDATA[
<p>You can presumably run code that calls `sys.settrace` for that. Which makes it somewhat underwhelming to realize that you could pretty much do that before too; the convenience is that you no longer need the foresight to have set it up beforehand.</p>
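To illustrate, here's a minimal, hypothetical sketch of the kind of code one might inject: install a trace function after the fact and record calls from that point on (`trace_calls` and `target` are made-up names for the example):

```python
import sys

calls = []

def trace_calls(frame, event, arg):
    # record the name of every Python function called while tracing is active
    if event == "call":
        calls.append(frame.f_code.co_name)
    return trace_calls  # returning the function keeps tracing inside the frame

def target():
    return sum(range(5))

sys.settrace(trace_calls)   # start tracing "after the fact"
result = target()
sys.settrace(None)          # stop tracing

print(calls)   # ['target']
print(result)  # 10
```

The trace function only sees frames created after `settrace` runs, which is exactly the "no foresight needed" property: nothing about `target` had to be instrumented in advance.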
]]></description><pubDate>Thu, 24 Jul 2025 02:11:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=44666149</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=44666149</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44666149</guid></item><item><title><![CDATA[New comment by pedrovhb in "Brazil's government-run payments system has become dominant"]]></title><description><![CDATA[
<p>As a Brazilian - Pix was a pleasant surprise, especially in that for once it feels like we're not lagging behind. It's convenient, free, instant transfers across banks. You can also easily create or programmatically generate QR codes or pastable codes with preset receivers and amounts. Great UX all around, and it quickly became the de facto standard for how people send money.<p>It's technically quite impressive - it's a large-scale system and it works really well. I can think of maybe one or two times over the years where I saw downtime, and in both cases it was working again within a few minutes. The usual experience with the government building technical solutions is something that makes little sense, is slow, and goes down frequently during even the most predictable usage peaks, but with Pix they really seem to have nailed it.<p>It does feel a bit weird to have so many payments go through the government's systems, and it definitely puts them in a position of having more information than they should. There's a lot of Orwellian surveillance potential there, as every transfer is necessarily tied to both users' real identities. I don't think there's a realistic way around this, though.<p>Another concern is that people can expose some of their information without necessarily being aware of it. You can register e.g. emails and phone numbers as Pix "keys", and then anyone can initiate a transfer to those keys and your full name will pop up so they can confirm or cancel the transfer. I've seen some clever advice around this - "When using a carpooling app (details are often arranged off the platform via WhatsApp), put the driver's phone number into Pix. If a name comes up and it doesn't match the name or gender of the driver's profile, something is up." Obviously there's potential for misuse, though, and I'm sure the vast majority of people don't think about this when registering their Pix keys. You can, however, also use randomly generated UUIDs as keys, a different one for each transaction if you so desire, so with more awareness this can be a non-issue.<p>Overall, it's a very convenient thing that works surprisingly well, and the downsides are theoretical at this point. IMO it's a rare case of our government nailing something.</p>
]]></description><pubDate>Tue, 08 Apr 2025 12:05:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=43620729</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=43620729</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43620729</guid></item><item><title><![CDATA[New comment by pedrovhb in "Exposed DeepSeek database leaking sensitive information, including chat history"]]></title><description><![CDATA[
<p>> but most probably was training data to prevent deepseek from completing such prompts, evidenced by the `"finish_reason":"stop"` included in the span attributes<p>As I understand it, a finish reason of "stop" in API responses usually means the model ended its output normally. In any case, I don't see how training data could end up in production logs, nor why they'd want to prevent such data (a prompt you'd expect a normal user to write) from being responded to.<p>> [...] I'm guessing Wiz wanted to ride the current media wave with this post instead of seeing how far they could take it.<p>Security researchers are often asked not to pursue findings further than confirming their existence. Probing deeper can be unhelpful or accidentally break things. Since these researchers presumably weren't invited to test DeepSeek's systems in depth, I think this was the polite way to go about it.<p>This mistake was totally amateur hour by DeepSeek, though. I'm not too into security stuff, but if I were looking for something, the first thing I'd think to do is nmap the servers and see what's up with any interesting open ports. I wouldn't be surprised at all if others had found this too.</p>
]]></description><pubDate>Thu, 30 Jan 2025 00:22:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=42873238</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=42873238</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42873238</guid></item><item><title><![CDATA[New comment by pedrovhb in "Exposed DeepSeek database leaking sensitive information, including chat history"]]></title><description><![CDATA[
<p>I imagine it wouldn't necessarily require their opening of remote connections, just a misconfigured reverse proxy.</p>
]]></description><pubDate>Wed, 29 Jan 2025 23:57:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=42873003</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=42873003</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42873003</guid></item><item><title><![CDATA[New comment by pedrovhb in "Open Heart Protocol"]]></title><description><![CDATA[
<p>This seems centralized, though you can self-host it.</p>
]]></description><pubDate>Sat, 25 Jan 2025 17:33:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=42823084</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=42823084</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42823084</guid></item><item><title><![CDATA[New comment by pedrovhb in "Trusting clients is probably a security flaw"]]></title><description><![CDATA[
<p>> [the extensive anti-reverse engineering measures are] more annoying than any financial app I've had, and I have 5 of them on my phone<p>Ah, this reminds me of the Tuya app.<p>I've done some SSL unpinning and MITM to see the requests going in and out of my phone; it's pretty fun, and there are often really nice, easy-to-use RESTful APIs underneath. Among them I've also done a couple of banking apps, and they weren't particularly defensive either. That's great; as a user I'm empowered by it, and like TFA says, it's totally fine from a security standpoint as long as you don't trust the client to do anything it shouldn't be able to do. It shouldn't be your form validation that stops me from transferring a trillion dollars, and though I haven't tried, I'm sure that's not the case for those apps. All it does is let me get my monthly statements with a for loop rather than waiting for a laggy UI and clicking through each month.<p>Now, Tuya is a Chinese company offering a bunch of cheap IoT devices like smart power switches and IR motion detectors, all controlled through their app. That app has, for some reason, spent by far the most resources on anti-reverse-engineering of any app I've seen. <i>I already bought your hardware, mate.</i> Please let me use it on my local network. My smart home infrared motion sensors were meant to turn lights on when I enter a room, but they don't feel very smart when I'm standing in the dark for 4 seconds while they check with a server in China. I don't even need a clean API; just let me see what you do and I'll do something similar, no support or documentation necessary. But they go to extensive lengths to prevent you from interacting with hardware you bought and which is sitting in your home.<p>This was a while ago, but I think for the motion sensing in particular, I managed to just put the sensors in a subnetwork with blocked internet access and snoop on the network to catch their DHCP requests when they tried to call home. These would happen every once in a while, presumably for settings/update checks, but crucially also when motion was detected, and I didn't mind a few false positives. So in the end they were very quick, locally functioning, privacy-friendly little devices!</p>
]]></description><pubDate>Fri, 17 Jan 2025 10:44:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=42736155</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=42736155</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42736155</guid></item><item><title><![CDATA[New comment by pedrovhb in "Pigment Mixing into Digital Painting"]]></title><description><![CDATA[
<p>That's very interesting!<p>My first thought, looking at the webpage: "Huh, that's neat. I didn't know that painting software didn't even attempt to do color mixing beyond naive interpolation, though I guess it figures; the physics behind all the light stuff must be fairly gnarly, and there's a lot of information lost in RGB that probably can't just be reconstructed."<p>Scrolling down a bit: "Huh, there are some snippets for using it as a library. Wait, it does operations in RGB? What's going on here?"<p>Finally, clicking the paper link, I found the interesting bit: "We achieve this by establishing a latent color space, where RGB colors are represented as mixtures of primary pigments together with additive residuals. The latents can be manipulated with linear operations, leading to expected, plausible results."<p>That's very clever, and seems like a great use of modern machine learning techniques outside the fashionable realm of language models. It uses perceptual color spaces internally too, along with physics-based priors. All around, a very technically impressive and beautiful piece of work.<p>It rhymes with an idea that's been floating around in my head for a bit - would generative image models, or image encoder models, work better if, rather than RGB, we fed them wavelength data, or at least a perceptually uniform color space? It seems it'd be closer to the truth than arbitrarily using the wavelengths our cone cells happen to respond to (and only roughly, at that).</p>
]]></description><pubDate>Mon, 30 Dec 2024 00:10:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=42544891</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=42544891</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42544891</guid></item><item><title><![CDATA[New comment by pedrovhb in "How we made our AI code review bot stop leaving nitpicky comments"]]></title><description><![CDATA[
<p>Here's an idea: have the LLM output each comment with a "severity" score, ranging from 0 to 100, or drawn from a set of possible values ("trivial", "small", "high"). Let it get everything off its chest, outputting the nitpicks while recognizing they're minor, then filter the output to only the comments above a given threshold.<p>It's hard to avoid thinking of a pink elephant, but easy enough to consciously recognize that it's not relevant to the task at hand.</p>
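The filtering step might look something like this rough sketch, with made-up comment data and the three-label scale from above (the structure of what the LLM emits is an assumption for illustration):

```python
# Rank the labels so "above a threshold" is a simple comparison.
SEVERITY_RANK = {"trivial": 0, "small": 1, "high": 2}

def filter_comments(comments, min_severity="small"):
    """Keep only comments at or above the given severity label."""
    cutoff = SEVERITY_RANK[min_severity]
    return [c for c in comments if SEVERITY_RANK[c["severity"]] >= cutoff]

# Hypothetical LLM output: every nitpick included, each tagged with a severity.
raw = [
    {"severity": "trivial", "text": "Prefer single quotes here."},
    {"severity": "high", "text": "This query is vulnerable to SQL injection."},
    {"severity": "small", "text": "Missing error handling on the API call."},
]

for c in filter_comments(raw, min_severity="small"):
    print(c["severity"], "-", c["text"])  # the "trivial" comment is dropped
```

The point is that the model still gets to emit the nitpicks (the pink elephant), but the reviewer only ever sees what clears the bar.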
]]></description><pubDate>Sun, 22 Dec 2024 00:56:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=42483613</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=42483613</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42483613</guid></item><item><title><![CDATA[New comment by pedrovhb in "Training LLMs to Reason in a Continuous Latent Space"]]></title><description><![CDATA[
<p>That's certainly possible, but it reminds me of a similar thing I've seen in their UI that makes me think otherwise. In the code interpreter tool, you get a little preview of the "steps" it's following as it writes code. This turns out to just be the contents of the last written/streamed comment line. It's a neat UI idea, I think - pretty simple, and it works well. I wouldn't be surprised if that's what's going on with o1 too - the thought process is structured in some way, and they take the headings or section names and just display those.</p>
]]></description><pubDate>Tue, 10 Dec 2024 20:09:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=42380891</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=42380891</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42380891</guid></item><item><title><![CDATA[New comment by pedrovhb in "A Modern CSS Reset (2024 update)"]]></title><description><![CDATA[
<p>For what it's worth, as a primarily backend dev who ~recently started getting more deeply into frontend web, I have specifically noted in my head that the box model isn't very intuitive, and that in my inexperienced opinion the default was a bad one. I figured that surely, if it is the way it is, it's for <i>reasons I do not yet comprehend</i>™, so it actually feels pretty validating that someone who knows what they're talking about agrees.</p>
]]></description><pubDate>Fri, 25 Oct 2024 10:43:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=41943987</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=41943987</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41943987</guid></item><item><title><![CDATA[New comment by pedrovhb in "Faster convergence for diffusion models"]]></title><description><![CDATA[
<p>It does feel right to me, because it's not distilling the second model; in fact, the second model is not an image generation model at all, but a visual encoder. That is, it's a more "general purpose" model which specializes in extracting semantic information from images.<p>In hindsight it makes total sense - generative image models don't automatically start out with an idea of semantic meaning or the world, and so they have to implicitly learn one during training. That's a hard task by itself, and the network isn't specifically trained for it, but rather learns it on the go while also learning to create images. The idea of the paper, then, is to provide the diffusion model with a preexisting concept of the world by nudging its internal representations to be similar to the visual encoder's. As I understand it, DINO isn't even used during inference once the model is ready; it's only about the representations.<p>I wouldn't at all describe it as "a technique for transplanting an existing model onto a different architecture". It's different from distillation because, again, DINO isn't an image generation model at all. It's more like (very roughly simplifying for the sake of analogy): instead of teaching someone to cook from scratch, we're starting with a chef who already knows all about ingredients, flavors, and cooking techniques, but hasn't yet learned to create dishes. This chef would likely learn to create new recipes much faster and more effectively than someone starting from zero knowledge about food. And it's different from telling them to just copy another chef's recipes.</p>
]]></description><pubDate>Mon, 14 Oct 2024 11:23:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=41836463</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=41836463</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41836463</guid></item><item><title><![CDATA[New comment by pedrovhb in "Tbsp – treesitter-based source processing language"]]></title><description><![CDATA[
<p>You may already be aware of it, but in case not - it sounds like tree-sitter-graph could be something you'd be interested in: <a href="https://docs.rs/tree-sitter-graph/latest/tree_sitter_graph/reference/" rel="nofollow">https://docs.rs/tree-sitter-graph/latest/tree_sitter_graph/r...</a><p>I haven't gotten into it yet but it looks pretty neat, and it's an official tool.</p>
]]></description><pubDate>Mon, 02 Sep 2024 11:11:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=41424388</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=41424388</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41424388</guid></item><item><title><![CDATA[New comment by pedrovhb in "Using Fibonacci numbers to convert from miles to kilometers and vice versa"]]></title><description><![CDATA[
<p>Or, given that the ratio between consecutive Fibonacci numbers approaches phi, just multiply by 1.618? Though at that point you might as well just use the real conversion ratio.<p>In other news, π² ≈ g.</p>
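A quick sanity check that this works: phi is within about 0.5% of the real miles-to-kilometers factor, so consecutive Fibonacci numbers make decent lookup pairs.

```python
# phi ~ 1.6180 vs. the exact conversion factor 1.609344 km per mile.
PHI = (1 + 5 ** 0.5) / 2
MILE_KM = 1.609344

# Build Fibonacci numbers up past 100.
fib = [1, 1]
while fib[-1] < 100:
    fib.append(fib[-1] + fib[-2])

# Each consecutive pair (a, b) reads as "a miles is roughly b km".
for a, b in zip(fib, fib[1:]):
    print(f"{a:>3} mi ~ {b:>3} km (true: {a * MILE_KM:.1f} km)")

print(f"relative error of phi: {abs(PHI - MILE_KM) / MILE_KM:.3%}")
```

So "55 miles is about 89 km" comes out within a kilometer or so of the true 88.5 km; the trick and the multiply-by-1.618 shortcut are the same approximation.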
]]></description><pubDate>Fri, 30 Aug 2024 23:17:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=41405363</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=41405363</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41405363</guid></item><item><title><![CDATA[New comment by pedrovhb in "77% of employees report AI has increased workloads and hampered productivity"]]></title><description><![CDATA[
<p>+1 on feeling there are a lot of UX possibilities left on the table. Most people seem to have accepted chat as the only means of using LLMs. In particular, I don't think most realize that LLMs can be used in very powerful ways that just aren't possible with black-box API services as they currently exist. Google kind of has an edge in this area with recent context-caching support for Gemini, but that's just one thing. Some things that feel like they could enable new modes of interaction aren't possible at all, like grammar-constrained generation and rapid LLM-tool interactions (think a REPL or shell rather than function calls; currently you have to pay for the input tokens all over again if you want to use the results of a function call as context, and it adds up quickly).<p>On Copilot: I've been using it since it was public and have always found it useful, but it hasn't really changed much. There's a chat window now (groundbreaking, I know), and it shows a "processing steps" thing that says it's doing some distinct agentic tasks like collecting context and test run results and what have you, but it doesn't feel like it knows my codebase any better than the cursory description I'd give an LLM without context. I use the JetBrains plugin, though, and I understand the VS Code extension has some different features, so YMMV.</p>
]]></description><pubDate>Wed, 24 Jul 2024 12:24:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=41056271</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=41056271</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41056271</guid></item><item><title><![CDATA[New comment by pedrovhb in "Qimgv – Fast, simple image viewer"]]></title><description><![CDATA[
<p>It does view RAW when compiled with the right flags. JXL too, interestingly. I managed to save a bunch of space on old photos by converting them with cjxl, which I wouldn't have done if I hadn't been able to view the results somehow.</p>
]]></description><pubDate>Tue, 04 Jun 2024 04:02:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=40570573</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=40570573</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40570573</guid></item><item><title><![CDATA[New comment by pedrovhb in "The File Filesystem (2021)"]]></title><description><![CDATA[
<p>Here's an idea: recursively mount code files/projects. Use something like tree-sitter to extract class and function definitions and make each into a "file" within the directory representing the actual file. Need to get an idea for how a codebase is structured? Just `tree` it :)<p>Getting deeper into the rabbit hole, maybe imports could be resolved into symlinks and such. Plenty of interesting possibilities!</p>
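As a toy sketch of the extraction half of this idea - using the stdlib `ast` module as a stand-in for tree-sitter, with made-up names throughout - one could map a file's classes and functions to virtual directory entries:

```python
import ast

# Hypothetical source file to expose as a "directory".
SOURCE = '''
class Greeter:
    def hello(self):
        return "hi"

def main():
    print(Greeter().hello())
'''

def virtual_listing(source, filename="example.py"):
    """Return tree-style paths for the classes and functions in `source`."""
    tree = ast.parse(source)
    entries = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entries.append(f"{filename}/{node.name}()")
        elif isinstance(node, ast.ClassDef):
            # A class becomes a subdirectory; its methods become files in it.
            entries.append(f"{filename}/{node.name}/")
            for sub in node.body:
                if isinstance(sub, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    entries.append(f"{filename}/{node.name}/{sub.name}()")
    return entries

for entry in virtual_listing(SOURCE):
    print(entry)
```

A real implementation would serve these paths through something like FUSE and use tree-sitter to cover languages beyond Python, but the listing above is essentially what `tree` would show.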
]]></description><pubDate>Wed, 01 May 2024 19:17:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=40227972</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=40227972</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40227972</guid></item><item><title><![CDATA[New comment by pedrovhb in "Jamba: Production-grade Mamba-based AI model"]]></title><description><![CDATA[
<p>Have you tried asking it for a specific concrete length, like a number of words? I was also frustrated with concise answers when asking for long ones, but I found that the outputs improved significantly if I asked for, e.g., 4000 words specifically. Beyond that, have it break the piece into sections and write X words per section.</p>
]]></description><pubDate>Thu, 28 Mar 2024 21:47:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=39857749</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=39857749</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39857749</guid></item><item><title><![CDATA[New comment by pedrovhb in "“Emergent” abilities in LLMs actually develop gradually and predictably – study"]]></title><description><![CDATA[
<p>My intuition is that a significant part of LLMs' difficulty with arithmetic has to do with tokenization. For instance, `1654+73225`, per the OpenAI tokenizer tool, breaks down into `165•4•+•732•25`, meaning the LLM is incapable of considering digits individually; that is, "165" is a single "word", and its relationship to "4" - and in fact to every other token representing a numerical value - has to be learned. It can't do simple carry operations (or use the other arithmetic abstractions humans have access to) in the vast majority of cases, because its internal representation of text is not designed for this. Arithmetic is easy in base 10 or 2 or 16, but it's a whole lot harder in base ~100k where 99% of the "digits" are words like "cat" or "///////////".<p>Compare that to understanding arbitrary base64-encoded strings, which is much harder for humans to do without tools. Tokenization still isn't _the_ greatest fit for it, but it's a lot more tractable, and LLMs can do it no problem. Even understanding ASCII art is impressive, given that they have no innate idea of what any letter looks like, and they "see" fragments of each letter on each line.<p>So I'm not sure whether I agree or disagree with you here. I'd say LLMs in fact have a very impressive capability to learn logical structures. Whether grammar is the problem isn't clear to me, but their internal representation format obviously and enormously influences how hard seemingly trivial tasks become. Perhaps some effort in hand-tuning vocabularies could improve performance on some tasks, or perhaps something different altogether is necessary, but I don't think it's an insurmountable hurdle.</p>
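The misalignment can be illustrated with a toy example - greedy fixed-size chunking as a crude stand-in for BPE (real tokenizers learn their splits from data, but the effect on these two numbers happens to match):

```python
def chunk(number, size=3):
    """Greedily split a number's digits into fixed-size 'tokens'."""
    s = str(number)
    return [s[i:i + size] for i in range(0, len(s), size)]

a, b = 1654, 73225
print(chunk(a))  # ['165', '4']
print(chunk(b))  # ['732', '25']
# The '4' token in 1654 is a ones digit, while the '25' token in 73225
# spans the tens and ones places. Pairing tokens by position matches
# '165' with '732' even though they occupy different place values, so
# a digitwise carry procedure is never directly expressed in the input.
```

A human doing column addition re-aligns the digits by place value first; the tokenized representation offers no such alignment, which is the intuition for why carrying is hard to learn.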
]]></description><pubDate>Mon, 25 Mar 2024 11:27:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=39814907</link><dc:creator>pedrovhb</dc:creator><comments>https://news.ycombinator.com/item?id=39814907</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39814907</guid></item></channel></rss>