<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: reexpressionist</title><link>https://news.ycombinator.com/user?id=reexpressionist</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 09 Apr 2026 10:53:31 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=reexpressionist" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by reexpressionist in "On Sleeper Agent LLMs"]]></title><description><![CDATA[
<p>This type of behavior (and related behaviors) is primarily an issue only with unconstrained generative models. If you're the one deploying the model, or a downstream consumer, then once the network is trained it can be reexpressed (via an exogenous/secondary model or process) to derive reliable and interpretable uncertainty quantification by conditioning on reference classes (in a held-out calibration set) formed from the Similarity to Training (depth-matches to training), the Distance to Training, and a CDF-based per-class threshold on the output magnitude. If the prediction/output falls below the desired probability threshold, fail gracefully by rejecting the prediction, rather than letting silent errors accumulate.<p>For higher-risk settings, you can always turn the crank to be more conservative (i.e., more stringent parameters and/or a larger required sample size in the highest-probability, highest-reliability data partition).<p>For classification tasks, this follows directly. For generative output, it comes into play via the final verification classifier used over the output.</p>
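<p>For the classification case, the rejection step described above can be sketched in a few lines. The function name and threshold below are illustrative, and the calibrated probabilities are assumed to already come from the reference-class conditioning described above (which is not shown):

```python
import numpy as np

def selective_predict(probs, threshold=0.95):
    """Return the predicted class index, or None (reject) when the top
    calibrated probability falls below the required threshold."""
    probs = np.asarray(probs, dtype=float)
    top = int(np.argmax(probs))
    if probs[top] < threshold:
        return None  # graceful failure: no silent low-confidence errors
    return top

# A confident prediction is accepted...
print(selective_predict([0.02, 0.97, 0.01]))  # -> 1
# ...while an uncertain one is rejected. For higher-risk settings,
# turn the crank by raising the threshold.
print(selective_predict([0.40, 0.35, 0.25]))  # -> None
```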
]]></description><pubDate>Sat, 13 Jan 2024 18:58:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=38983268</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38983268</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38983268</guid></item><item><title><![CDATA[New comment by reexpressionist in "LLaMA-Pro-8B"]]></title><description><![CDATA[
<p>I believe we have two rather different settings in mind. My statement assumes the enterprise use case, where having a verifier is required. (In this context, I'm also assuming the approach of constraining against the observed data.) In such a selective classification setting, the end-user need not be exposed to lower-quality outputs, but rather sees null predictions once the model cascade has been exhausted (i.e., after progressively moving to larger models until the probability is acceptable).<p>Hopefully in 2024 we can get at least one of the benchmarks to move toward assessing non-parametric/distribution-free uncertainty for selective classification, reflecting more recent CS/Stats advances that should be used in practice. Working on it.</p>
]]></description><pubDate>Sun, 07 Jan 2024 18:25:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=38903690</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38903690</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38903690</guid></item><item><title><![CDATA[New comment by reexpressionist in "LLaMA-Pro-8B"]]></title><description><![CDATA[
<p>The alternative approach is to start with a small[er] model, but derive reliable uncertainty estimates, only moving to a larger model if necessary (i.e., if the probability of the predictions is lower than needed for the task).<p>And I agree that the leaderboards don't currently reflect the quantities of interest typically needed in practice.</p>
]]></description><pubDate>Sat, 06 Jan 2024 21:23:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=38895631</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38895631</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38895631</guid></item><item><title><![CDATA[New comment by reexpressionist in "Best 7B LLM on leaderboards made by an amateur following a medium tutorial"]]></title><description><![CDATA[
<p>If the end goal is document classification and/or semantic search, the Reexpress Fast I model (3.2 billion parameters) is a good choice. The key is that it produces reliable uncertainty estimates (for classification), so you know if you need a larger (or alternative) model. (In fact, an argument can be made that since the other models don't produce such uncertainty estimates, they are not ideal for serious use cases without adding an additional mechanism, such as ensembling with the Reexpress model.)</p>
]]></description><pubDate>Sat, 06 Jan 2024 21:04:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=38895449</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38895449</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38895449</guid></item><item><title><![CDATA[New comment by reexpressionist in "Efficient LLM fine-tuning for classification on Mac"]]></title><description><![CDATA[
<p>TL;DR: Reexpress makes it really easy (and inexpensive) to fine-tune a large language model (LLM) for typical document classification tasks. All of the processing happens on your Mac, and you also get the additional advantages of uncertainty quantification, interpretability by example/exemplar, and semantic search.</p>
]]></description><pubDate>Fri, 05 Jan 2024 15:39:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=38880176</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38880176</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38880176</guid></item><item><title><![CDATA[Efficient LLM fine-tuning for classification on Mac]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial3_financial_sentiment_comparison_to_genai_finetuning/README.md">https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial3_financial_sentiment_comparison_to_genai_finetuning/README.md</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=38880175">https://news.ycombinator.com/item?id=38880175</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Fri, 05 Jan 2024 15:39:03 +0000</pubDate><link>https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial3_financial_sentiment_comparison_to_genai_finetuning/README.md</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38880175</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38880175</guid></item><item><title><![CDATA[How to locally run a semantic search with representations fine-tuned on your Mac]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial4_semantic_search_without_labels/README.md">https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial4_semantic_search_without_labels/README.md</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=38852637">https://news.ycombinator.com/item?id=38852637</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 03 Jan 2024 10:46:39 +0000</pubDate><link>https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial4_semantic_search_without_labels/README.md</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38852637</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38852637</guid></item><item><title><![CDATA[Show HN: On-device, no-code LLMs with guardrails (for Apple Silicon)]]></title><description><![CDATA[
<p>We've been working to make uncertainty quantification and interpretability first-class properties of LLMs. Reexpress one, a macOS app, is our first effort to make these properties widely available.<p>Perhaps counter-intuitively, and contrary to common wisdom, LLMs can in fact be transformed to generate very reliable uncertainty estimates (i.e., "knowing what they do and don't know" by assigning a probability to the output).<p>Getting there is a bit complicated, with vector matching/databases, prediction-time data dependencies, complicated inference, and multiple models flying all over the place.<p>We've made it simple and efficient to use in practice with an on-device, no-code approach. Common document classification tasks can be handled with the on-device models (up to 3.2 billion parameters). Additionally, you can add these capabilities to another LLM (e.g., for QA or more complicated tasks) by simply uploading that model's output logits into the app. This works whether the existing model is an on-device Mistral AI model or a cloud-based genAI model.<p>Would be great to get feedback. Also, if you have a use case with a scale that doesn't fully fit the on-device setting, happy to discuss and collaborate.<p>And if anyone finds this interesting and wants to get more involved in building reliable AI, let us know!<p>(Note that an Apple Silicon Mac is required; ideally M1 Max or better with 64 GB of RAM. You train the model yourself, which requires labeled data. The tutorial 1 video has a link to sentiment data in the JSON Lines format; it's a good place to start: <a href="https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial1_sentiment/README.md">https://github.com/ReexpressAI/Example_Data/blob/main/tutori...</a>)</p>
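<p>As a rough sketch of what the logit-upload path could look like on the producing side (the JSON field names and file name here are illustrative assumptions, not the app's actual schema):

```python
import json
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# One JSON object per line (JSON Lines); the field names are hypothetical.
records = [
    {"document": "Shares rallied after the earnings report.",
     "label": 1,
     "logits": [-1.2, 2.7]},
]
with open("logits.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")

# The exported logits can also be inspected as probabilities:
probs = softmax([-1.2, 2.7])  # sums to 1; class 1 dominates here
```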
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=38649799">https://news.ycombinator.com/item?id=38649799</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 15 Dec 2023 01:07:41 +0000</pubDate><link>https://re.express</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38649799</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38649799</guid></item><item><title><![CDATA[New comment by reexpressionist in "AI and Trust"]]></title><description><![CDATA[
<p>Important essay and points. I want to mention that there now exist practical technical approaches that can be used to create trustworthy AI...and such approaches can be run on local models, as this comment suggests.<p>> "[...] [AI] will act trustworthy, but it will not be trustworthy. We won’t know how they are trained. We won’t know their secret instructions. We won’t know their biases, either accidental or deliberate. [...]"<p>I agree that this is true with standard deployments of generative AI models, but we can instead reframe networks as a direct connection between the observed/known data and new predictions, and tightly constrain predictions against the known labels. In this way, we can have controllable oversight of biases, out-of-distribution errors, and, more broadly, a clear relation to the task-specific training data.<p>That is to say, I believe the concerns in the essay are valid in that they reflect one possible path at the current fork in the road, but that path is not inevitable, given the potential of reliable, on-device, personal AI.</p>
]]></description><pubDate>Tue, 05 Dec 2023 01:54:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=38526023</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38526023</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38526023</guid></item><item><title><![CDATA[New comment by reexpressionist in "LLM Visualization"]]></title><description><![CDATA[
<p>Ditto. This is the most sophisticated viz of parameters I've seen...and it's also an interactive, step-through tutorial!</p>
]]></description><pubDate>Sun, 03 Dec 2023 20:06:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=38510263</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38510263</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38510263</guid></item><item><title><![CDATA[New comment by reexpressionist in "Good old-fashioned AI remains viable in spite of the rise of LLMs"]]></title><description><![CDATA[
<p>"As with most software development, modern AI work is all about knowing your tools and when it's appropriate to use them." 100% agree. Ditto with just using the easiest-to-access models as an initial proof of concept/dev/etc. to get started.<p>(I do agree with the overall sentiment of the TC article, although as noted by others below, there's some mashing of terminology in the article. E.g., I, too, associate GOFAI with symbolic AI and planning.)<p>There's another dimension, too, not mentioned in the article: even with general-purpose LLMs, for production applications, labeled data is still required to produce uncertainty estimates. (There's a sense in which any well-defined and tested production application is a 'single-task' setting, in its own way.) One of the reasons on-device/edge AI has gotten so interesting, in my opinion, is that we now know how to derive reliable uncertainty estimates with neural models (more or less independently of scale). As long as prediction uncertainty is sufficiently low, there's no particular reason to go to a larger model. That can lead to non-trivial cost/resource savings, as well as the other benefits of keeping things on-device.</p>
]]></description><pubDate>Sun, 03 Dec 2023 04:42:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=38504892</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38504892</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38504892</guid></item><item><title><![CDATA[New comment by reexpressionist in "Are Open-Source Large Language Models Catching Up?"]]></title><description><![CDATA[
<p>I like the analogy to a router and local Mixture of Experts; that's basically how I see things going, as well. (Also, agreed that Huggingface has really gone far in making it possible to build such systems across many models.)<p>There's also a related sense in which we want routing across models for efficiency reasons in the local setting, even for tasks with the same input modalities:<p>First, attempt prediction with small(er) models, and if the constrained output is not of sufficiently high probability (with the highest calibration reliability), route to progressively larger models. If the process is exhausted, kick it to a human for further adjudication/checking.</p>
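<p>A minimal sketch of that routing loop (the function names and the stand-in predictors are hypothetical; each predictor is assumed to return a label along with a reliably calibrated probability):

```python
def cascade_predict(x, models, threshold=0.95):
    """Try predictors from smallest to largest; accept the first output
    whose calibrated probability clears the threshold."""
    for predict in models:  # each: x -> (label, calibrated_probability)
        label, prob = predict(x)
        if prob >= threshold:
            return label, prob
    return None, None  # cascade exhausted: route to a human for adjudication

# Hypothetical stand-ins for a small and a large model:
small = lambda x: ("positive", 0.80)
large = lambda x: ("positive", 0.97)

print(cascade_predict("some document", [small, large]))  # -> ('positive', 0.97)
```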
]]></description><pubDate>Sat, 02 Dec 2023 00:03:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=38494312</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38494312</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38494312</guid></item></channel></rss>