Confusion over whether OpenAI’s o3-mini has reached the major milestone of artificial general intelligence (AGI) deepened on Monday following a post on X by CEO Sam Altman that completely contradicts what he said two weeks ago in an interview with Bloomberg.
In the post, he wrote, “twitter hype is out of control again. We are not gonna [sic] deploy AGI next month, nor have we built it. We have some very cool stuff for you but pls chill and cut your expectations 100x.”
OpenAI introduced its o3 model, the successor to the o1 model released last September, as part of its “12 Days of OpenAI” in December. A Reuters report last week stated, “the o1 models are capable of reasoning through complex tasks and can solve more challenging problems than previous models in science, coding, and math, [OpenAI] had said in a blog post. The new o3 and o3-mini models would be more powerful than the previously launched o1 models, the company had said previously.”
Three days earlier, in another post on X, Altman thanked the “external safety researchers who tested o3-mini. We have now finalized a version and are beginning the release process; planning to ship in ~a couple of weeks. Also, we hear the feedback: will launch API and ChatGPT at the same time! (it’s very good.)”
Coverage earlier this month of Altman’s interview with Bloomberg reported that OpenAI’s o3, first announced in December, was being safety tested and, according to Altman, had “passed the ARC-AGI challenge, the leading benchmark for AGI.”
Altman also said the company is now “setting its sights on superintelligence, which is leaps and bounds beyond AGI, just as AGI is to AI.”
A report written by François Chollet, an independent software engineer and AI researcher, said, “ARC-AGI serves as a critical benchmark for detecting such breakthroughs, highlighting generalization power in a way that saturated or less demanding benchmarks cannot. However, it is important to note that ARC-AGI is not an acid test for AGI — as we’ve repeated dozens of times this year. It’s a research tool designed to focus attention on the most challenging unsolved problems in AI, a role it has fulfilled well over the past five years.”
Although Chollet praised the o3 model, he also stated, “passing ARC-AGI does not equate to achieving AGI, and, as a matter of fact, I don’t think o3 is AGI yet; o3 still fails on some very easy tasks, indicating fundamental differences with human intelligence.”
Furthermore, he wrote, “early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute (while a smart human would still be able to score over 95% with no training). This demonstrates the continued possibility of creating challenging, unsaturated benchmarks without having to rely on expert domain knowledge. You’ll know AGI is here when the exercise of creating tasks that are easy for regular humans but hard for AI becomes simply impossible.”
Brian Jackson, principal research director at Info-Tech Research Group, said Monday, “we can debate about what AGI means, and whether one company or specific model has achieved it. It seems like the community expects it to be achieved this year or next. But, for CIOs, does it really matter?”
The fact, he said, is that “AI is getting more capable at a stunning rate and can solve more problems than it ever has. It’s likely that most of the use cases where organizations could benefit from AI don’t really require it to achieve ‘AGI’ at all. … CIOs don’t necessarily need to get caught up in knowing exactly which model has achieved AGI, and instead focus more on AI implementation and execution guided by a responsible AI governance framework that is aligned with the organization’s overall strategy.”
According to Jackson, “CIOs aren’t sitting in ivory tower offices discussing the virtues of one AGI benchmark over another; they’re asking their software developers to automate complex knowledge-based tasks and processes with foundation models. They’re thinking about how to set up an infrastructure environment they can trust, manage costs that scale up based on transactional demand, and monitor performance to ensure they will meet SLAs.”