<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: reexpressionist</title><link>https://news.ycombinator.com/user?id=reexpressionist</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 09 Apr 2026 10:53:31 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=reexpressionist" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by reexpressionist in "On Sleeper Agent LLMs"]]></title><description><![CDATA[
<p>This type of behavior (and related behaviors) is primarily an issue only with unconstrained generative models. If you're the one deploying the model, or a downstream consumer, then once the network is trained it can be reexpressed (via an exogenous/secondary model or process) to derive reliable and interpretable uncertainty quantification by conditioning on reference classes (in a held-out calibration set) formed from the Similarity to Training (depth-matches to training), the Distance to Training, and a CDF-based per-class threshold on the output magnitude. If the prediction/output falls below the desired probability threshold, fail gracefully by rejecting the prediction, rather than letting silent errors accumulate.<p>For higher-risk settings, you can always turn the crank to be more conservative (i.e., more stringent parameters and/or a larger required sample size in the highest-probability, highest-reliability data partition).<p>For classification tasks, this follows directly. For generative output, it comes into play via the final verification classifier used over the output.</p>
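<p>For the classification case, the rejection step described above can be sketched in a few lines. The function name and threshold below are illustrative, and the calibrated probabilities are assumed to already come from the reference-class conditioning described above (which is not shown):

```python
import numpy as np

def selective_predict(probs, threshold=0.95):
    """Return the predicted class index, or None (reject) when the top
    calibrated probability falls below the required threshold."""
    probs = np.asarray(probs, dtype=float)
    top = int(np.argmax(probs))
    if probs[top] < threshold:
        return None  # graceful failure: no silent low-confidence errors
    return top

# A confident prediction is accepted...
print(selective_predict([0.02, 0.97, 0.01]))  # -> 1
# ...while an uncertain one is rejected. For higher-risk settings,
# turn the crank by raising the threshold.
print(selective_predict([0.40, 0.35, 0.25]))  # -> None
```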
]]></description><pubDate>Sat, 13 Jan 2024 18:58:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=38983268</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38983268</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38983268</guid></item><item><title><![CDATA[New comment by reexpressionist in "LLaMA-Pro-8B"]]></title><description><![CDATA[
<p>I believe we have two rather different settings in mind. My statement assumes the enterprise use case, where having a verifier is required. (In this context, I'm also assuming the approach of constraining against the observed data.) In such a selective classification setting, the end-user need not be exposed to lower-quality outputs, but rather sees null predictions once the model cascade has been exhausted (i.e., after progressively moving to larger models until the probability is acceptable).<p>Hopefully in 2024 we can get at least one of the benchmarks to move toward assessing non-parametric/distribution-free uncertainty for selective classification, reflecting more recent CS/Stats advances that should be used in practice. Working on it.</p>
]]></description><pubDate>Sun, 07 Jan 2024 18:25:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=38903690</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38903690</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38903690</guid></item><item><title><![CDATA[New comment by reexpressionist in "LLaMA-Pro-8B"]]></title><description><![CDATA[
<p>The alternative approach is to start with a small[er] model, but derive reliable uncertainty estimates, only moving to a larger model if necessary (i.e., if the probability of the predictions is lower than needed for the task).<p>And I agree that the leaderboards don't currently reflect the quantities of interest typically needed in practice.</p>
]]></description><pubDate>Sat, 06 Jan 2024 21:23:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=38895631</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38895631</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38895631</guid></item><item><title><![CDATA[New comment by reexpressionist in "Best 7B LLM on leaderboards made by an amateur following a medium tutorial"]]></title><description><![CDATA[
<p>If the end goal is document classification and/or semantic search, the Reexpress Fast I model (3.2 billion parameters) is a good choice. The key is that it produces reliable uncertainty estimates (for classification), so you know if you need a larger (or alternative) model. (In fact, an argument can be made that since the other models don't produce such uncertainty estimates, they are not ideal for serious use cases without adding an additional mechanism, such as ensembling with the Reexpress model.)</p>
]]></description><pubDate>Sat, 06 Jan 2024 21:04:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=38895449</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38895449</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38895449</guid></item><item><title><![CDATA[New comment by reexpressionist in "Efficient LLM fine-tuning for classification on Mac"]]></title><description><![CDATA[
<p>TL;DR: Reexpress makes it really easy (and inexpensive) to fine-tune a large language model (LLM) for typical document classification tasks. All of the processing happens on your Mac, and you also get the additional advantages of uncertainty quantification, interpretability by example/exemplar, and semantic search.</p>
]]></description><pubDate>Fri, 05 Jan 2024 15:39:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=38880176</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38880176</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38880176</guid></item><item><title><![CDATA[Efficient LLM fine-tuning for classification on Mac]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial3_financial_sentiment_comparison_to_genai_finetuning/README.md">https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial3_financial_sentiment_comparison_to_genai_finetuning/README.md</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=38880175">https://news.ycombinator.com/item?id=38880175</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Fri, 05 Jan 2024 15:39:03 +0000</pubDate><link>https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial3_financial_sentiment_comparison_to_genai_finetuning/README.md</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38880175</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38880175</guid></item><item><title><![CDATA[How to locally run a semantic search with representations fine-tuned on your Mac]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial4_semantic_search_without_labels/README.md">https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial4_semantic_search_without_labels/README.md</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=38852637">https://news.ycombinator.com/item?id=38852637</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 03 Jan 2024 10:46:39 +0000</pubDate><link>https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial4_semantic_search_without_labels/README.md</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38852637</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38852637</guid></item><item><title><![CDATA[Show HN: On-device, no-code LLMs with guardrails (for Apple Silicon)]]></title><description><![CDATA[
<p>We've been working to make uncertainty quantification and interpretability first-class properties of LLMs. Reexpress one, a macOS app, is our first effort to make these properties widely available.<p>Perhaps counter-intuitively, and contrary to common wisdom, LLMs can in fact be transformed to generate very reliable uncertainty estimates (i.e., "knowing what they do and don't know" by assigning a probability to the output).<p>Getting there is a bit complicated, with vector matching/databases, prediction-time data dependencies, complicated inference, and multiple models flying all over the place.<p>We've made it simple and efficient to use in practice with an on-device, no-code approach. Common document classification tasks can be handled with the on-device models (up to 3.2 billion parameters). Additionally, you can add these capabilities to another LLM (e.g., for QA or more complicated tasks) by simply uploading that model's output logits into the app. This works whether the existing model is an on-device Mistral AI model or a cloud-based genAI model.<p>Would be great to get feedback. Also, if you have a use case with a scale that doesn't fully fit the on-device setting, happy to discuss and collaborate.<p>And if anyone finds this interesting and wants to get more involved in building reliable AI, let us know!<p>(Note that an Apple Silicon Mac is required; ideally M1 Max or better with 64 GB of RAM. You train the model yourself, which requires labeled data. The tutorial 1 video has a link to sentiment data in the JSON Lines format; it's a good place to start: <a href="https://github.com/ReexpressAI/Example_Data/blob/main/tutorials/tutorial1_sentiment/README.md">https://github.com/ReexpressAI/Example_Data/blob/main/tutori...</a>)</p>
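<p>As a rough sketch of what the logit-upload path could look like on the producing side (the JSON field names and file name here are illustrative assumptions, not the app's actual schema):

```python
import json
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# One JSON object per line (JSON Lines); the field names are hypothetical.
records = [
    {"document": "Shares rallied after the earnings report.",
     "label": 1,
     "logits": [-1.2, 2.7]},
]
with open("logits.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")

# The exported logits can also be inspected as probabilities:
probs = softmax([-1.2, 2.7])  # sums to 1; class 1 dominates here
```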
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=38649799">https://news.ycombinator.com/item?id=38649799</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 15 Dec 2023 01:07:41 +0000</pubDate><link>https://re.express</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38649799</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38649799</guid></item><item><title><![CDATA[New comment by reexpressionist in "AI and Trust"]]></title><description><![CDATA[
<p>Important essay and points. I want to mention that there now exist practical technical approaches that can be used to create trustworthy AI...and such approaches can be run on local models, as this comment suggests.<p>> "[...] [AI] will act trustworthy, but it will not be trustworthy. We won’t know how they are trained. We won’t know their secret instructions. We won’t know their biases, either accidental or deliberate. [...]"<p>I agree that this is true with standard deployments of generative AI models, but we can instead reframe networks as a direct connection between the observed/known data and new predictions, and tightly constrain predictions against the known labels. In this way, we can have controllable oversight of biases, out-of-distribution errors, and, more broadly, a clear relation to the task-specific training data.<p>That is to say, I believe the concerns in the essay are valid in that they reflect one possible path at the current fork in the road, but that path is not inevitable, given the potential of reliable, on-device, personal AI.</p>
]]></description><pubDate>Tue, 05 Dec 2023 01:54:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=38526023</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38526023</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38526023</guid></item><item><title><![CDATA[New comment by reexpressionist in "LLM Visualization"]]></title><description><![CDATA[
<p>Ditto. This is the most sophisticated viz of parameters I've seen...and it's also an interactive, step-through tutorial!</p>
]]></description><pubDate>Sun, 03 Dec 2023 20:06:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=38510263</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38510263</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38510263</guid></item><item><title><![CDATA[New comment by reexpressionist in "Good old-fashioned AI remains viable in spite of the rise of LLMs"]]></title><description><![CDATA[
<p>"As with most software development, modern AI work is all about knowing your tools and when it's appropriate to use them." 100% agree. Ditto with just using the easiest-to-access models as an initial proof of concept/dev/etc. to get started.<p>(I do agree with the overall sentiment of the TC article, although as noted by others below, there's some mashing of terminology in the article. E.g., I, too, associate GOFAI with symbolic AI and planning.)<p>There's another dimension, too, not mentioned in the article: even with general-purpose LLMs, for production applications, labeled data is still required to produce uncertainty estimates. (There's a sense in which any well-defined and tested production application is a 'single-task' setting, in its own way.) One of the reasons on-device/edge AI has gotten so interesting, in my opinion, is that we now know how to derive reliable uncertainty estimates with neural models (more or less independently of scale). As long as prediction uncertainty is sufficiently low, there's no particular reason to go to a larger model. That can lead to non-trivial cost/resource savings, as well as the other benefits of keeping things on-device.</p>
]]></description><pubDate>Sun, 03 Dec 2023 04:42:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=38504892</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38504892</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38504892</guid></item><item><title><![CDATA[New comment by reexpressionist in "Are Open-Source Large Language Models Catching Up?"]]></title><description><![CDATA[
<p>I like the analogy to a router and local Mixture of Experts; that's basically how I see things going, as well. (Also, agreed that Huggingface has really gone far in making it possible to build such systems across many models.)<p>There's also a related sense in which we want routing across models for efficiency reasons in the local setting, even for tasks with the same input modalities:<p>First, attempt prediction with small(er) models, and if the constrained output is not of sufficiently high probability (with the highest calibration reliability), route to progressively larger models. If the process is exhausted, kick it to a human for further adjudication/checking.</p>
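<p>A minimal sketch of that routing loop (the function names and the stand-in predictors are hypothetical; each predictor is assumed to return a label along with a reliably calibrated probability):

```python
def cascade_predict(x, models, threshold=0.95):
    """Try predictors from smallest to largest; accept the first output
    whose calibrated probability clears the threshold."""
    for predict in models:  # each: x -> (label, calibrated_probability)
        label, prob = predict(x)
        if prob >= threshold:
            return label, prob
    return None, None  # cascade exhausted: route to a human for adjudication

# Hypothetical stand-ins for a small and a large model:
small = lambda x: ("positive", 0.80)
large = lambda x: ("positive", 0.97)

print(cascade_predict("some document", [small, large]))  # -> ('positive', 0.97)
```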
]]></description><pubDate>Sat, 02 Dec 2023 00:03:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=38494312</link><dc:creator>reexpressionist</dc:creator><comments>https://news.ycombinator.com/item?id=38494312</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38494312</guid></item></channel></rss>