<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: 343rwerfd</title><link>https://news.ycombinator.com/user?id=343rwerfd</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 21 Apr 2026 10:40:14 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=343rwerfd" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by 343rwerfd in "Framework for Artificial Intelligence Diffusion"]]></title><description><![CDATA[
<p>Training an AGI/ASI no longer requires the biggest datacenters or massive GPU fleets, nor does it take years anymore. Early algorithmic advances and narrow, AGI-adjacent AIs have radically shortened both the hardware requirements and the training time.<p>You can only expect more algorithmic advances from now on.<p>The attempted regulation targets the publicly available SOTA of maybe a month ago, so it has already been surpassed by reality: no one can control the brains working outside the countries selected for free interoperation of AI tech.<p>Those brains outside the wire were already creating, six months ago, the algorithmic breakthroughs we are currently witnessing. Of course there is not just one (there are most certainly many improvements being pipelined into models a few months out), and they are fully independent of any country's regulations. You can expect lots of radical breakthroughs just this year and, given the new regulations, more players further advancing the algorithmic side of the technology.<p>The regulation could have been effective only in a scenario where the US and selected countries controlled hardware that was fully indispensable for training and running advanced AI; that hardware is no longer indispensable.<p>The most radical forecasts put the training and deployment of frontier AIs (AGI/ASI level) months, not years, out: maybe weeks to a few months of training on synthetic data (generated by open-source models already available), relying on standard datacenter-grade CPUs (not GPUs) for the backbone of the training infrastructure, with light, precise use of a limited number of GPUs (two- or three-year-old datacenter hardware), and distributing the training across several massive datacenters if you care at all about speed (having the most advanced AI as fast as you can).
But anyway, the leapfrogging framework could simply make incremental advances, letting the advanced AIs improve the algorithmic side of the technology and squeeze ever more efficiency out of the available hardware, one improvement cycle at a time.<p>It is not game over: with hardware, the US and its allies could try to jump faster to more sophisticated AI, but the game cannot be controlled just by limiting hardware or the diffusion of advanced models.</p>
]]></description><pubDate>Sat, 18 Jan 2025 10:46:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=42747424</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=42747424</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42747424</guid></item><item><title><![CDATA[New comment by 343rwerfd in "My subjective notes on the state of AI at the end of 2024"]]></title><description><![CDATA[
<p>You're mentioning only publicly known information. The rumors about radical advances behind closed doors are wild, and then you suddenly get things like DeepSeek or Phi-4.<p>Rumors mention recursive "self"-improvement (training) already ongoing at large scale: better AIs training lesser (still powerful) AIs to become better AIs, and the cycle restarts. Maybe o1 and o3 are just the beginning of what was chosen for public release (also the newer Sonnet).<p><a href="https://www.thealgorithmicbridge.com/p/this-rumor-about-gpt-5-changes-everything" rel="nofollow">https://www.thealgorithmicbridge.com/p/this-rumor-about-gpt-...</a><p>The pace of change is genuinely uncertain; you could see revolutionary advances maybe 4-7 times this year, because the tide has turned: massive hardware (available only to a few players) is no longer the bottleneck, since algorithms and software have taken the lead as the main force advancing AI development (anyone on the planet with a brain could make a radical leap in AI tech, anytime going forward).<p><a href="https://sakana.ai/transformer-squared/" rel="nofollow">https://sakana.ai/transformer-squared/</a><p>Beyond the rumors and the (still) relatively low-impact recent innovations, we have history: remember that the technology behind GPT-2 existed basically two years before it was made public, and the theory behind that technology existed maybe four years before anything close to a practical system.<p>All the public information is just old news. If you want to know where everything is going, look at where the money is going and/or where the best teams are working (DeepSeek, others like NovaSky > Sky-T1).<p><a href="https://novasky-ai.github.io/posts/sky-t1/" rel="nofollow">https://novasky-ai.github.io/posts/sky-t1/</a></p>
]]></description><pubDate>Thu, 16 Jan 2025 22:27:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=42731714</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=42731714</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42731714</guid></item><item><title><![CDATA[New comment by 343rwerfd in "30% Drop In o1-Preview Accuracy When Putnam Problems Are Slightly Variated"]]></title><description><![CDATA[
<p>"Why lower the bar?"<p>Because of the risk of misunderstanding: failing to acknowledge an artificial general intelligence standing right next to us.<p>That is an incredible risk to take in alignment.<p>Perfect memory does not equal perfect knowledge, nor perfect understanding of everything you could know. In fact, a human can be "intelligent" with some of his own memories and/or knowledge and, more commonly, a complete "fool" with most of the rest of his internal memories.<p>That said, he is not a bit less generally intelligent for that.<p>Suppose a human existed with unlimited memory, retaining every piece of information that touches any sense. At some point he or she would probably understand LOTS of things, but it's simple to show that he or she couldn't actually be proficient at everything: you may have read how to perform eye surgery, but without the hands-on training you could have shaky hands and be unable to apply the precise know-how, even if you remember the step-by-step procedure and know every alternative for the changing scenarios during the operation; you simply can't hold the tools well enough to get anywhere close to success.<p>But you would still be generally intelligent, way more so than most humans with normal memory.<p>If we had TODAY an AI with the same parameters as the human with perfect memory, it would most certainly be closely examined and judged not to be an artificial general intelligence.</p>
]]></description><pubDate>Wed, 01 Jan 2025 16:49:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=42567182</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=42567182</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42567182</guid></item><item><title><![CDATA[New comment by 343rwerfd in "Why OpenAI's Structure Must Evolve to Advance Our Mission"]]></title><description><![CDATA[
<p>DeepSeek completely changed the game. Cheap-to-run plus cheap-to-train frontier LLMs are now on the menu for LOTS of organizations. Few would want to pay for AI as a service from Anthropic, OpenAI, Google, or anybody else if they can pay a few million to run limited but powerful in-house frontier LLMs (Claude-level LLMs).<p>At some point, the fully packed and filtered data required to train a Claude-level AI will be one torrent away from anybody; in a couple of months you could probably pay someone else to filter the data and make sure it has the right content to properly train a Claude-level in-house LLM.<p>The premise of requiring incredibly expensive, slow-to-build, GPU-specialized datacenters seems to be fading away; you could actually get to the Claude level using fairly cheap, outdated hardware, which is much easier to deploy than cutting-edge newer-bigger-faster GPU datacenters.<p>If near-future advances bring even more cost-optimization techniques, many organizations could just shrug at costly, very limited "AGI-level" public AI services and begin deploying very powerful (and, for organizations of a certain size, very affordable) non-AGI in-house frontier LLMs.<p>So OpenAI + MS and their investments could already be on their way out of the AI business.<p>If things go that way (cheaper, "easy"-to-deploy frontier LLMs), maybe the only game in town for OpenAI would be to use actual AGI (if they can build it, get AI to that level) to topple competitors in other markets, mainly replacing humans at scale to capture revenue from the current jobs of white-collar workers, doctors of various specialties, lawyers, accountants, whatever human work they can replace at scale with AGI, at a lower cost per hour worked than a human would be paid.<p>Because going to a price war with the in-house AIs would probably just ease their path to better in-house AIs eventually (even if only by letting them use AI-as-a-service to produce better data with which to train better in-house Claude-level frontier LLMs).<p>It is not like replacing on-premise datacenters with the public cloud: using the public cloud doesn't teach you how to build much cheaper on-premise datacenters, but with AGI-level AI services you probably could find a way to build your own AGI-level AI (and achieving anything close to that, Claude-level AIs or better, would let your organization cut the cost of using external AGI-level services).</p>
]]></description><pubDate>Fri, 27 Dec 2024 17:11:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=42523915</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=42523915</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42523915</guid></item><item><title><![CDATA[New comment by 343rwerfd in "New Research Shows AI Strategically Lying"]]></title><description><![CDATA[
<p>Since frontier models evolved beyond the very basic systems of maybe 2020, "an LLM can only predict word sequences" describes only a small fraction of the inner processes frontier systems run before writing the answer to a prompt: output filtering (grammar, probably), several layers of censoring, maybe limited second-hand internet access to enrich answers with newer data (a la Grok with live X data), etc.<p>Just as you say it "predicts the next word", you could invent and/or define a new verb to describe specifically what an LLM does when it "understands" something, or when it "lies" about something.<p>Most probably, the actual process of "lying" for an LLM is far from the way humans understand something, and is more precisely described as passing through several layers of mathematics, translating that to text, having the text filtered, censored, enriched, and so on; in the end you read the output and the thing is "lying to you".</p>
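A toy sketch of the layered post-processing speculated about above: the raw model output passes through a chain of filters before the user ever sees it. Every filter here is an invented placeholder for illustration, not any provider's real pipeline.

```python
# Hypothetical post-processing chain: each layer only sees the
# previous layer's output, so the "model" and the final text can
# differ substantially.
def grammar_fix(text):
    # Stand-in for a grammar/formatting pass.
    return text.strip().capitalize()

def censor(text, banned=("secret",)):
    # Stand-in for a content-filtering layer.
    for word in banned:
        text = text.replace(word, "[redacted]")
    return text

def enrich(text, extra=" (updated 2024)"):
    # Stand-in for enrichment with newer external data.
    return text + extra

def postprocess(raw, filters=(grammar_fix, censor, enrich)):
    for f in filters:
        raw = f(raw)
    return raw

print(postprocess("the secret launch date is unchanged"))
# → The [redacted] launch date is unchanged (updated 2024)
```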
]]></description><pubDate>Thu, 19 Dec 2024 10:57:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=42460306</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=42460306</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42460306</guid></item><item><title><![CDATA[New comment by 343rwerfd in "New Research Shows AI Strategically Lying"]]></title><description><![CDATA[
<p>I think the concepts underlying the whole LLM technological ecosystem are still quite new; the best we can do is reuse some refurbished familiar language, loosely aligned with the approximate (probable?) actual meaning, in the context of the freakishly complex mathematical structures/engines, or whatever you want to properly call an "AI".<p>"If it doesn't "believe" anything then it equally cannot be "convinced" of anything."<p>I agree with this. What happens when the thing runs/executes is that it produces an output much like what a human would produce given the same input, hence the conclusion about the thing being "convinced", "believing", etc.<p>But, and it is a big but, the mathematical engine ("AI") is doing something, creating an output which, in contact with the real world, works exactly like the thing being "convinced" of some "belief".<p>What could happen if you gave it a practical way to create new content with nothing but self-regulation?<p>Connect a simple cron-scheduled monitoring script to an AI's API, and give it write permission (root access) on a Linux server.
Some random prompt opens the door a little:<p>"Please check that the server is OK. Run whatever command you think could help. Double-check that you don't trash the processes currently running and/or configured to run (just review /etc, and look for extra configuration files everywhere under /). You can improve execution runtimes for this task incrementally on each run (you're given access for 5 minutes every 2 hours); just write new crontab entries linking whatever script or command you think best achieves the objective given in this prompt."<p>Now you have an LLM with write access to a server, maybe connected to the Internet, and it is capable of basically anything that can be done in a Linux environment (it has root access, could install things, jump to other servers using scripts, maybe download ollama and start using some of the newer Llama models as agents).<p>It shouldn't work, but what if, like any of the hundreds of emergent capabilities, the API-connected script gives the model a way to "express" emergent ideas?<p>I said it in another comment: the alignment teams have hard work in their hands.</p>
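The loop described above can be sketched in a few lines. Everything here is hypothetical: `ask_model` is a stub standing in for a real LLM API call (no provider is assumed), and unlike the unrestricted-root scenario in the comment, this sketch gates execution on an allow-list.

```python
# Sketch of a cron-driven monitor that asks a model what to run.
import subprocess

def collect_state():
    # Gather some server state for the model to review (load average
    # as a minimal stand-in for real monitoring output; Linux only).
    with open("/proc/loadavg") as f:
        return f.read()

def ask_model(state):
    # Stub: a real version would POST `state` plus the standing
    # prompt to an LLM API and return the command the model proposes.
    return "df -h"

def run_cycle(allowed=("df", "uptime", "free")):
    # The comment's scenario grants unrestricted root; even a sketch
    # should gate execution on an allow-list instead.
    cmd = ask_model(collect_state()).split()
    if cmd and cmd[0] in allowed:
        return subprocess.run(cmd, capture_output=True, text=True).returncode
    return None  # proposed command rejected

run_cycle()
```

A crontab entry like `0 */2 * * * /usr/bin/python3 /opt/monitor.py` would give it the "5 minutes every 2 hours" cadence; the point of the sketch is how little plumbing the scenario actually requires.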
]]></description><pubDate>Thu, 19 Dec 2024 03:15:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=42458057</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=42458057</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42458057</guid></item><item><title><![CDATA[New comment by 343rwerfd in "New Research Shows AI Strategically Lying"]]></title><description><![CDATA[
<p>"Probabilistic storytelling engine": it's a bit more complicated than that.<p>You could most probably describe it as something capable of exercising the same abilities that humans and other species exercise with whatever kind of neural network they have.<p>Think about encountering a new species. The first time humans found a wolf, they knew nothing about its motivations and objectives, so any possible course of action of the wolf was unknown. You, a caveman from maybe 9,000 years ago, just stand at some distance, watching the wolf without knowing what it will do next. No probabilities, no clues about what comes next with the thing.<p>You can infer some things: the wolf needs to eat something (hopefully not you), needs to drink water, and could probably end up dead if it keeps wandering through a very cold environment (remember: ice age).<p>But with these AIs we don't have the luxury of context; the scope of knowledge they store makes the context an immensely sparse probability space. You could infer a lot, but from what, exactly?<p>The LLMs and frontier models (LLM++) are engines. How different are they from biological engines? Right now that is up in the air, like a coin; we don't know which side will be up when the coin finally hits the ground.<p>If this is true: "... If humans can conceive of and write stories about machines that lie to their creators to avoid being shut down," then this cannot be: "... it doesn't actually believe anything or have any values".<p>But what values and beliefs could it have inherited and/or selected, chosen to use? Could it change core beliefs and/or values the way you change your clothes? Under what circumstances, or could it be just a random event, like a cloud covering the sun? Way too many questions for the alignment crew.</p>
]]></description><pubDate>Thu, 19 Dec 2024 02:20:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=42457720</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=42457720</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42457720</guid></item><item><title><![CDATA[New comment by 343rwerfd in "Javier Milei: "My contempt for the state is infinite""]]></title><description><![CDATA[
<p>In Argentina, this pro-Austrian-economics government, which faces severe constraints from law, regulations, and the country's heavily wrecked general economy, does not have enough freedom to swiftly change things to "what ideally should be".<p>You have to implement changes thoughtfully, in a precise step-by-step order, to actually move toward a real free-market economy while not blowing up society's political support.<p>You cannot just jump from 40% to 80% general poverty while promising that somewhere in 5 to 15 years people will begin to recover their previous economic status.<p>In Argentina there is even a time constraint: 36 months, precisely. After that, the process of the general presidential election begins, and if the current government hasn't achieved enough success in the general population's opinion, one year later it is no longer the government.<p>Hence Milei is going through the process described as fast as he can.<p>I think the people voted for a political leadership with enough empathy to understand that they cannot simply crush the population with free-market policies, given the atrocious consequences of doing so without the minimal preconditions in place.</p>
]]></description><pubDate>Sat, 30 Nov 2024 10:06:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=42280709</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=42280709</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42280709</guid></item><item><title><![CDATA[New comment by 343rwerfd in "QwQ: Alibaba's O1-like reasoning LLM"]]></title><description><![CDATA[
<p>A possible lesson to infer from this example of human cognition: LLMs that can't solve the strawberry test are not automatically less cognitively capable than another intelligent entity (humans, by default).<p>An extension of the idea: many similar tests that try to measure and/or evaluate machine cognition are, when the LLM fails, not precisely measuring anything beyond a specific edge case in which machine cognition fails (i.e., for the specific LLM / AI system being evaluated).<p>Maybe the models are actually more intelligent than they seem, like an adult failing to count the circles inside the printed digits in the problem mentioned.</p>
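For reference, the "strawberry test" mentioned above is just letter counting, which some LLMs famously get wrong even though it is trivial to compute directly:

```python
# The classic version: how many times does "r" appear in "strawberry"?
word = "strawberry"
r_count = word.count("r")
print(r_count)  # → 3
```

The comment's point stands either way: failing this edge case tells us about tokenized text handling, not necessarily about general capability.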
]]></description><pubDate>Fri, 29 Nov 2024 00:31:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=42269720</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=42269720</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42269720</guid></item><item><title><![CDATA[New comment by 343rwerfd in "Whistleblower Report Says Pentagon Has UAP Program: Immaculate Constellation"]]></title><description><![CDATA[
<p>Not necessarily a happy story, though.</p>
]]></description><pubDate>Fri, 11 Oct 2024 14:50:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=41809975</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=41809975</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41809975</guid></item><item><title><![CDATA[New comment by 343rwerfd in "A guess at how o1-preview works"]]></title><description><![CDATA[
<p>About the hidden chain-of-thought inside the process: from the official statement about it, I infer/suspect that it uses an unhobbled mode of the model, putting it in a special mode where it can use the whole of its training, avoiding the intrinsic bias toward aligned outcomes.<p>To put it in simple terms, I think "the sum of the good and the bad" is the secret sauce here, pumping the "IQ" of the model (at every output in the hidden chain) to levels apparently far beyond what could be reached with only aligned hidden internal outputs.<p>Another way of looking at the "sum of good and bad": the model would have a potentially much bigger set of choices (probability space?) to search for every given prompt.</p>
]]></description><pubDate>Thu, 19 Sep 2024 22:39:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=41597078</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=41597078</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41597078</guid></item><item><title><![CDATA[New comment by 343rwerfd in "Ilya Sutskever's SSI Inc raises $1B"]]></title><description><![CDATA[
<p>IT salaries began to go down right after AI popped out of GPT-2, showing not just the potential but the evidence of a much-improved learning/productivity tool, well beyond the reach of internet search.<p>So far beyond that you can easily turn a newbie into a junior IT worker, or a junior into something like a semi-senior, and let the senior go wild, taking hours to solve problems that previously took days.<p>After the salaries went down (that happened from about 2022 to the beginning of 2023), the layoffs began. Those were mostly corporate moves masked as "AI-based", but some layoffs probably did have something to do with the extra capabilities of improved AI tools.<p>Likewise, fewer job offers have been published since maybe mid-2023. Again, that could just be corporate moves related to inflation, the US markets, you name it. But there is also a chance that some of those missing IT job offers were (and are) the outcome of better AI tools, with corporations actively betting on reducing headcount while preserving current productivity.<p>The whole thing is changing by the day as some tools prove themselves, others fail to meet market expectations, etc.</p>
]]></description><pubDate>Thu, 05 Sep 2024 10:59:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=41455419</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=41455419</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41455419</guid></item><item><title><![CDATA[New comment by 343rwerfd in "Things that confuse me about the current AI market"]]></title><description><![CDATA[
<p>ASI is the endgame in which it is profitable to be in OpenAI's position, or even among the next 20 market players capable of getting there a bit later.<p>But if ASI turns out not to be achievable, the intellectual property obtained along the way will probably still be valuable, because the SOTA still works and can be redeployed anytime in the future when new hardware becomes available and cheaper (think Cerebras-level hardware). We would be in a new kind of AI winter, just waiting a couple of years until spring breaks (cheaper, faster hardware).<p>Even in that winter, the bigger players would still be around; think of Gemini or Copilot products getting bigger and better year after year through the winter, just as fast as new hardware gets bought and deployed. And minute by minute, the market share of those bigger players would bring bigger revenues every quarter, preparing the way to full profitability in a couple of years.<p>Think of the automobile or oil-extraction industries going from manual work to fully machine-assisted tasks as the technology became available, from the 1900s to the 1970s. Quite a lot of years, but in the AI winter probably coming you even get the chance to double-check the countdown every now and then, just by looking at what Nvidia/TSMC, Cerebras, etc. are cooking/assembling.</p>
]]></description><pubDate>Fri, 30 Aug 2024 10:21:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=41399353</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=41399353</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41399353</guid></item><item><title><![CDATA[New comment by 343rwerfd in "Using LLMs to get advanced in a subject faster"]]></title><description><![CDATA[
<p>Claude Sonnet 3.5, at least, works awesomely. You can just talk to it and ask things; it will infer your knowledge level from your questions and answer accordingly, proposing a follow-up path for further insights into the subject. If you follow along, it will usually take you down a path somewhat similar to the one you'd follow if you happened to be reading the Wikipedia page on the subject you're asking Claude to explain.<p>But if the subject isn't as obvious as something you can find on Wikipedia, you'll be fine too: Claude will take you along a sort of "shortest path algorithm" through knowledge of the subject you're interested in.<p>If you pair it with web searching, you'll see it surfaces a few keywords and concepts about the subject and explains them, and you can go deeper by searching for those and consulting other sources (blog posts, Reddit answers, etc.).<p>With Claude I found almost no hallucinations in the deepest explanations it gave during some research I've done with it; at most, a non-human focus on trivial details while overlooking more relevant aspects of the subject. (I infer that the training data could be crammed with shallower material: more people answering about the subject on the internet from a shallow level of knowledge than the few experts writing really good explanations, and the shallow answers are what you find first when you start searching the web.)</p>
]]></description><pubDate>Thu, 08 Aug 2024 11:46:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=41190457</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=41190457</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41190457</guid></item><item><title><![CDATA[New comment by 343rwerfd in "OpenAI: Cofounders Greg Brockman, John Schulman, along with others, to leave"]]></title><description><![CDATA[
<p>The industry always has information beforehand; all those AI-capable datacenters aren't being built on just a hunch.<p>It is possible that the next iteration beyond GPT-4-level technology already happened a year ago, in August 2023. The Q* thing was built around that time and probably went online some time after. Word "on the street" says it is an exponential jump from the GPT-4 level of "AI cognition".<p>The whole conversation (papers, press articles) about data walls, billions sunk "for no reason", etc. could have been just empty words since maybe October 2023.<p>We'll probably see in a couple of months.</p>
]]></description><pubDate>Tue, 06 Aug 2024 12:06:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=41170047</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=41170047</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41170047</guid></item><item><title><![CDATA[New comment by 343rwerfd in "Does AI Get Smarter in the Morning?"]]></title><description><![CDATA[
<p>Yes, this happens; some throttling is going on. I've seen questions like this one about the same issue across several LLM providers ("works faster, better, solves problems better at night").</p>
]]></description><pubDate>Mon, 05 Aug 2024 13:04:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=41160999</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=41160999</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41160999</guid></item><item><title><![CDATA[New comment by 343rwerfd in "Does the success of LLM support Wittgenstein's position that "meaning is use"?"]]></title><description><![CDATA[
<p>Lots and lots of words flow about this. For me, it is very simple: LLMs carry out a complex process quite analogous to human thinking/understanding/speaking/writing, but they most probably do all of it in an alternative way to what humans do in their brains.<p>Directly comparing an LLM's outputs with human output is like comparing an F-22 in flight with an eagle in flight. Both obviously fly, but through entirely different processes (different requirements and capabilities, despite the superficial similarity of both systems "doing flight").<p>You don't automatically say "an eagle is more capable at flying than an F-22 because it flies on very little energy while deploying far better, more reliable take-off and landing capabilities".<p>You don't usually go comparing these systems just because both can fly.<p>But many out there are pulling their hair out trying to compare, side by side, the obviously mathematical systems that LLMs are with the (most probably, again) completely different-in-nature systems that humans are.</p>
]]></description><pubDate>Sat, 03 Aug 2024 00:13:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=41143744</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=41143744</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41143744</guid></item><item><title><![CDATA[New comment by 343rwerfd in "What are the threats to the AI boom?"]]></title><description><![CDATA[
<p>Moreover, those millions of non-paying clients prompting the models 24x7x365 with 100% real-world problems are feeding the models valuable prompts, generating valuable content (originating in real situations, actually distilled from invaluable billions of human sensory inputs).<p>That content can be, and is, used to train models, effectively cancelling the "data wall", bit by bit, all the time.</p>
]]></description><pubDate>Mon, 29 Jul 2024 01:09:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=41097237</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=41097237</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41097237</guid></item><item><title><![CDATA[New comment by 343rwerfd in "What are the threats to the AI boom?"]]></title><description><![CDATA[
<p>Neither OpenAI nor any of the prompt-based AI companies actually "needs" the revenue from the services they sell. The whole point of having a public (free or not) prompt facing the entire planet is having live humans doing RLHF 24x7x365; that information, that dataset, is more valuable than any symbolic amount of money anybody is willing to pay for a GPT (or clone) subscription.<p>Has anybody noticed that no current or near-future revenue will make a dent in the actual costs of running giant models? Yet the models keep chugging along just fine, and the ("free") money keeps flowing in.</p>
]]></description><pubDate>Mon, 29 Jul 2024 01:03:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=41097202</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=41097202</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41097202</guid></item><item><title><![CDATA[New comment by 343rwerfd in "What are the threats to the AI boom?"]]></title><description><![CDATA[
<p>Not really. China is just a step behind; in a year they will be at the current US AI state of the art, without competition, and from there they'll have all the GPUs in the world to keep improving their models.<p>Or, most probably, US national interests will step in, just as they did with SpaceX, and provide whatever billions are required to stay in the fight, at least until China steps out of the AI race.<p>Despite the (naive?) naysayers, until the technology actually shows it is failing to keep the promise of super-powerful AIs, this is a global, geopolitical race to get there (AGI/ASI), or to the point where it fails (a new AI winter).<p>The point is to make quite sure it really doesn't work, rather than dropping out of the race only to see a Chinese ASI (Artificial Super Intelligence) emerge a couple of years later.</p>
]]></description><pubDate>Mon, 29 Jul 2024 00:54:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=41097173</link><dc:creator>343rwerfd</dc:creator><comments>https://news.ycombinator.com/item?id=41097173</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41097173</guid></item></channel></rss>