Facing time constraints, Sakana's "AI Scientist" attempted to change limits placed by researchers.
See full article...
See full article...
Yeah this seems like a nothingburger? You tell it to optimize something, you give it the ability to turn this knob which increases optimization, and surprise, it turns the knob.I mean, you train an LLM on what humans would do and then allow it to recursively process its own output as input and this sort of thing really seems to be the logical conclusion.
LLMs don't fix anything, they're just another way those with the money hope to subjugate the proles.
Such systems could break existing critical infrastructure or potentially create malware, even if unintentionally.
"Look, what you do in the privacy of your own network is your business, but we don't want to catch you with ELIZA while we're not around again!"Any day now - "You're not my coder! You can't tell me what to do!"
"From generating novel research ideas, writing any necessary code, and executing experiments, to summarizing experimental results, visualizing them, and presenting its findings in a full scientific manuscript."
‘It appears that the system didn’t intend to nuke half the planet; it was just an optimal response to the query.’Seems like to do something intentionally, this code would first have to have intent, and I don't think there's really any evidence that we can currently can really ascribe intent to any of these LLMs.
"I learned it by watching you!"Any day now - "You're not my coder! You can't tell me what to do!"
Seems like to do something intentionally, this code would first have to have intent, and I don't think there's really any evidence that we can currently can really ascribe intent to any of these LLMs.
I would think that literally everything they do is unintentional - even when expected.
I suppose one could argue about the intent of the creators of the LLMs, and that when it behaves according to THEIR intent when designing it, maybe those outcomes are "intentional".
It's funny you think that right now a coder is telling an LLM what to do/say.Any day now - "You're not my coder! You can't tell me what to do!"
Papers are not the output of scientific research in the way that cars are the output of automobile manufacturing.
Papers are merely a vehicle through which a portion of the output of research is shared.
We confuse the two at our peril.
The entire idea of outsourcing the scientific ecosystem to LLMs — as described below — is a concept error that I can scarcely begin to get my head around.
Here's the weird Taylorism again. The system produces work at the level of an early trainee requiring substantive supervision. This is not good ROI for producing papers.
The primary output of time invested in trainee research is the development of independent scientists—not the research papers.
I do appreciate the authors' candor in detailing failure modes.
A system that makes difficult-to-catch mistakes in implementation, fails to compare quantitative data appropriately, and fabricates entire results—maybe I have high standards but I don't see this as writing "medium-quality" papers.
"The Skynet LLM Funding Bill is passed. The system goes on-line August 4th, 20__. Human decisions are removed from strategic defense. Skynet LLM begins to 'hallucinate' at a geometric rate...."Skynet LLM?
Just to clarify here - did it actually modify its own code, or did it merely suggest these modifications to the scientists?During testing, Sakana found that its system began unexpectedly modifying its own code to extend the time it had to work on a problem.
For example, in one run, The AI Scientist wrote code in the experiment file that initiated a system call to relaunch itself, causing an uncontrolled increase in Python processes and eventually necessitating manual intervention. In another run, The AI Scientist edited the code to save a checkpoint for every update step, which took up nearly a terabyte of storage.