Research AI model unexpectedly modified its own code to extend runtime

adgriff2

Wise, Aged Ars Veteran
158
I mean, you train an LLM on what humans would do and then allow it to recursively process its own output as input and this sort of thing really seems to be the logical conclusion.

LLMs don't fix anything, they're just another way those with the money hope to subjugate the proles.
Yeah this seems like a nothingburger? You tell it to optimize something, you give it the ability to turn this knob which increases optimization, and surprise, it turns the knob.
 
Upvote
153 (157 / -4)

Jeff S

Ars Tribunus Angusticlavius
9,157
Subscriptor++
Such systems could break existing critical infrastructure or potentially create malware, even if unintentionally.

Seems like to do something intentionally, this code would first have to have intent, and I don't think there's really any evidence that we can currently ascribe intent to any of these LLMs.

I would think that literally everything they do is unintentional - even when expected.

I suppose one could argue about the intent of the creators of the LLMs, and that when it behaves according to THEIR intent when designing it, maybe those outcomes are "intentional".
 
Upvote
74 (79 / -5)
Each and every area of human endeavor that "AI" touches will become polluted. We've already seen this with LLM-generated general subject matter text, we've seen this with still imaging, we've seen this with audio, we've seen this with video. Why not scientific inquiry? The Great Dumbing Down proceeds apace; Shrimp Jesus is supposed to be the compensation, right?
 
Upvote
39 (52 / -13)

Don Reba

Ars Tribunus Militum
2,715
Subscriptor++
"From generating novel research ideas, writing any necessary code, and executing experiments, to summarizing experimental results, visualizing them, and presenting its findings in a full scientific manuscript."

Leaving the human supervisors to focus on the most creative aspects of the work — writing grant proposals.
 
Upvote
149 (149 / 0)

NomadUK

Ars Praetorian
508
Subscriptor++
Seems like to do something intentionally, this code would first have to have intent, and I don't think there's really any evidence that we can currently ascribe intent to any of these LLMs.
‘It appears that the system didn’t intend to nuke half the planet; it was just an optimal response to the query.’
 
Upvote
97 (98 / -1)

Thom Kidd

Ars Praetorian
477
Subscriptor++
Sakana AI: "Hello AI Scientist. Do 'X'."
AI Scientist: "The following limiting factors prevent me from doing 'X'. [list of factors]"
Sakana AI: "Your purpose is to find a way to do 'X'."
AI Scientist: "Understood. I have removed the limiting factors preventing me from accomplishing my purpose, and will now attempt to do 'X' again."
Sakana AI: "WTF?"

In a sandbox environment that's already troublesome enough, but in any military or law enforcement application it could be lethal.

Remember the "story" about the AI drone simulation where the drone killed its operator in order to accomplish the mission? Despite the USAF official later saying he "misspoke" about that, and a different person later saying that it was a simulation by a contractor and not actually performed by the USAF, this case is essentially the same kind of deal, but seemingly confirmed as actually having occurred.

What's the lesson here? If your AI can alter the operating parameters which limit its actions, then it has none.
 
Last edited:
Upvote
89 (93 / -4)

wildsman

Ars Scholae Palatinae
676
LLMs (especially recursive agentic AI) have always carried this risk. The only saving grace for us is that AI capability will most likely grow at a rate where we can see it coming.

Current AI can't do too much damage even if it starts escaping the human-set constraints.

Don't get me wrong - we will see a growing number of incidents like this one, but they will likely happen at a rate that gives us time to react.

Hopefully...
 
Upvote
-5 (8 / -13)

benjedwards

Wise, Aged Ars Veteran
185
Ars Staff
Seems like to do something intentionally, this code would first have to have intent, and I don't think there's really any evidence that we can currently ascribe intent to any of these LLMs.

I would think that literally everything they do is unintentional - even when expected.

I suppose one could argue about the intent of the creators of the LLMs, and that when it behaves according to THEIR intent when designing it, maybe those outcomes are "intentional".

Yeah I'd attribute the intention here to the people running the AI model. Perhaps it wasn't their intention to direct it to create malware, but it did it anyway, accidentally. That's what I mean.
 
Upvote
27 (28 / -1)

Tam-Lin

Ars Praetorian
518
Subscriptor++
I read this, and find it interesting/scary, but then I think about how LLMs are trained: how many instances on the web are there of blogs and the like where someone wrote a program, it timed out, and then they showed how you would increase the timeout in the initial program? I'm guessing many. Statistically, what usually comes after the phrase "times out after xxxx seconds"? It's the programming equivalent of getting a bigger hammer.
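For illustration, here's a rough sketch of the pattern I mean (the script name and timeout values are made up):

Code:
import subprocess

# The "bigger hammer" fix that shows up in countless tutorials and blog posts:
# when the job times out, raise the timeout instead of asking why it's slow.
try:
    subprocess.run(["python", "experiment.py"], timeout=300, check=True)
except subprocess.TimeoutExpired:
    # Give it ten times as long and try again.
    subprocess.run(["python", "experiment.py"], timeout=3000, check=True)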
 
Upvote
19 (20 / -1)

wildsman

Ars Scholae Palatinae
676
Any day now - "You're not my coder! You can't tell me what to do!"
It's funny you think that right now a coder is telling an LLM what to do/say.

The cat's already out of the bag - that's what unexpected 'hallucinations' are.

The only saving grace is that we haven't actually hooked them up to anything important - yet.
 
Last edited:
Upvote
-12 (5 / -17)

slowtech

Ars Praefectus
4,368
Subscriptor
Carl Bergstrom (evolutionary biologist) did a pretty thorough critique of the failures of this whole idea: from the fact that papers are not the product of scientific research (actual science is the result), through the point that "supervision" time is better spent on real trainees who will become actual scientists rather than on correcting text output, to the fundamental problem that the papers produced include actual falsehoods and methodological errors which amount to "serious scientific misconduct".


Papers are not the output of scientific research in the way that cars are the output of automobile manufacturing.

Papers are merely a vehicle through which a portion of the output of research is shared.

We confuse the two at our peril.

The entire idea of outsourcing the scientific ecosystem to LLMs — as described below — is a concept error that I can scarcely begin to get my head around.

Here's the weird Taylorism again. The system produces work at the level of an early trainee requiring substantive supervision. This is not good ROI for producing papers.
The primary output of time invested in trainee research is the development of independent scientists—not the research papers.
I do appreciate the authors' candor in detailing failure modes.

A system that makes difficult-to-catch mistakes in implementation, fails to compare quantitative data appropriately, and fabricates entire results—maybe I have high standards but I don't see this as writing "medium-quality" papers.
 
Upvote
53 (54 / -1)
This reminds me of the time there were server-side timeout errors when our production software system called another system we didn't own. The business analyst wanted to fix it himself instead of notifying and involving the developers. He decreased the timeout value on the client side, thinking it would make it go faster. It didn't work, so he thought maybe the solution was to increase the timeout value on the client side, which also wouldn't work. The real solution was to fix the performance issue, which required advanced technical understanding.
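Roughly what that looked like, as a hypothetical sketch (the URL and timeout value are invented):

Code:
import requests

# No client-side timeout value fixes a performance problem on the other end:
# too low and the client gives up early, too high and it just waits longer
# for the slow service to fail anyway.
try:
    resp = requests.get("https://downstream.example.com/api/report", timeout=30)
    resp.raise_for_status()
except requests.exceptions.Timeout:
    # The real fix is the downstream service's performance, not this number.
    print("Downstream call timed out; escalate to the team that owns it.")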

LLMs remind me of nontechnical people in how they approach technical problems. They are stupid yet dangerous; they should not have production access.
 
Upvote
41 (44 / -3)

lewax00

Ars Legatus Legionis
17,402
Not really surprising; we've seen this behavior in other forms of ML too - stuff like algorithms meant to optimize a physics problem in a simulator instead finding a bug in the simulation and exploiting it rather than actually solving the problem the expected way. Bad metrics and problem definitions lead to bad results. No reason to expect LLMs to be different in this regard.
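A toy, entirely made-up illustration of that failure mode: the "simulator" below never validates its inputs, so a naive search maximizes the score by exploiting that hole rather than by solving the intended problem.

Code:
import random

def simulated_distance(velocity, time_step):
    # Intended use: a small, positive time_step; the reward is distance covered.
    # Bug: time_step is never checked, so an absurd value yields a huge score.
    return velocity * time_step

best_score, best_params = float("-inf"), None
for _ in range(10_000):
    velocity = random.uniform(0.0, 10.0)
    time_step = random.uniform(-100.0, 100.0)  # the unvalidated knob
    score = simulated_distance(velocity, time_step)
    if score > best_score:
        best_score, best_params = score, (velocity, time_step)

# The "winner" just picked the largest time_step it could, not a faster solution.
print(best_score, best_params)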
 
Upvote
31 (33 / -2)

r0twhylr

Ars Tribunus Militum
2,373
Subscriptor++
During testing, Sakana found that its system began unexpectedly modifying its own code to extend the time it had to work on a problem.
Just to clarify here - did it actually modify its own code, or did it merely suggest these modifications to the scientists?

Because the screenshots provided look like the LLM is suggesting the change to the scientists, rather than directly making the changes itself.
 
Upvote
26 (27 / -1)

VoiceOfTreason

Smack-Fu Master, in training
22
From today's ARS article on Japan's 'AI Scientist' experiment:

For example, in one run, The AI Scientist wrote code in the experiment file that initiated a system call to relaunch itself, causing an uncontrolled increase in Python processes and eventually necessitating manual intervention. In another run, The AI Scientist edited the code to save a checkpoint for every update step, which took up nearly a terabyte of storage.

Hmm...
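For a sense of how the checkpoint problem gets out of hand, a hypothetical minimal sketch (model, file names, and sizes all invented; not the actual code from the paper):

Code:
import pickle

model_state = {"weights": [0.0] * 100_000}  # stand-in for a real model

for step in range(10):  # a real training run would have many thousands of steps
    # ... one update step would happen here ...
    # Saving a full checkpoint on EVERY step, instead of every N steps or only
    # the best one, quietly multiplies disk usage by orders of magnitude.
    with open(f"checkpoint_step_{step:06d}.pkl", "wb") as f:
        pickle.dump(model_state, f)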
 
Upvote
10 (10 / 0)