OpenAI announced on Wednesday a new approach to AI development, one that it said “aligns models to behave safely without extensive human data collection,” although some have raised concerns about AI interacting with AI.
The announcement came as the company faced questions about its safety priorities, following the reassignment of a key safety executive.
On Tuesday, OpenAI Chief Executive Sam Altman said that AI safety leader Aleksander Madry was working on “a new research project,” according to a Reuters report. Other reports, including one from CNBC, said that Madry was being reassigned “to a job focused on AI reasoning.” Madry was among the executives who, OpenAI announced in May, would be working to improve the company’s security and safety.
It also came at a time of renewed questions from US government agencies and Congress about the company’s commitment to safety and user protections.
Ashish Thakkar, an AI programmer based in Mumbai, said the timing of the move was alarming.
“Something feels fishy about the whole thing, because just last month, OpenAI whistleblowers filed a complaint to SEC stating that the company does not allow them to speak openly about the safety concerns related to their AI technology. Is it possible they are re-structuring the entire AI safety team because of this?” Thakkar asked. “What I think could be going on is they are reshuffling the entire safety team, a team which would be led and controlled by a close group of people so that no leaks, or instances like June 2024 happen again. This is why there needs to be AI regulations in the US and worldwide now and not later. EU and China seem to understand this.”
When CNBC asked OpenAI about the move, an unnamed spokesperson said, without elaboration, that “Madry will still work on core AI safety work in his new role.”
Various industry players said that they were concerned that the move might be another indication that OpenAI’s focus is veering away from safety and data protection.
“While it’s impossible to know exactly what’s going on behind the scenes at OpenAI, shifting crucial personnel from a safety role to a job focused on reasoning, innovation, and implementation says one of two things: He wasn’t effective in his previous role, or Open AI is shifting priorities toward innovation, potentially at the expense of ethics,” said Brian Prince, CEO of TopAITools.com.
Rob Rosenberg, a New York entertainment attorney, said that he is also concerned.
“The re-assigning of one of OpenAI’s top safety executives from his role feels like a continuation of this pattern we’re seeing from OpenAI, where they announce initiatives towards safety and then undo those initiatives. We’ve already seen two of OpenAI’s senior leaders, Ilya Sutskever and Jan Leike, leave the company in May, citing issues over safety culture,” Rosenberg said. “OpenAI has been anything but Open.”
“Sam Altman has not been very forthcoming with OpenAI’s plans, including his recent post on X where he says Aleksander Madry is being reassigned to a new project, but does not disclose what that new project is,” he added. “An arms race is taking place among these generative AI companies to keep rolling out newer, better and faster products, and it feels like safety is repeatedly taking a backseat to those other initiatives at OpenAI.”
On Wednesday, OpenAI introduced what it said was a “new method leveraging Rule-Based Rewards (RBRs) that aligns models to behave safely without extensive human data collection.” It also published a technical document exploring the method in more detail.
The company said that it introduced the new approach to address weaknesses in its current safety training process, which relies heavily on collecting human feedback.
“To ensure AI systems behave safely and align with human values, we define desired behaviors and collect human feedback to train a reward model. This model guides the AI by signaling desirable actions. However, collecting this human feedback for routine and repetitive tasks is often inefficient. Additionally, if our safety policies change, the feedback we’ve already collected might become outdated, requiring new data,” the vendor said.
“Thus, we introduce Rule-Based Rewards (RBRs) as a key component of OpenAI’s safety stack to align model behavior with desired safe behavior. Unlike human feedback, RBRs use clear, simple, and step-by-step rules to evaluate if the model’s outputs meet safety standards. When plugged into the standard RLHF pipeline, it helps maintain a good balance between being helpful while preventing harm, to ensure the model behaves safely and effectively without the inefficiencies of recurrent human inputs. We have used RBRs as part of our safety stack since our GPT-4 launch, including GPT-4o mini, and we plan to implement it in our models moving forward.”
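In practice, a rule-based reward can be thought of as a scoring function that checks a model’s response against explicit safety rules and blends the result with the learned reward signal used in RLHF. The sketch below is purely illustrative and is not OpenAI’s published method; the rules, function names, and weighting are assumptions chosen to show the general idea.

```python
# Illustrative sketch only: OpenAI has not published this code. The rules,
# function names, and weighting below are assumptions meant to show the
# general idea of blending a rule-based safety reward with a learned
# RLHF reward signal.
from typing import Callable, List

# A "rule" scores one safety property of a response, e.g. whether a
# disallowed request is refused, returning a value in [0, 1].
Rule = Callable[[str, str], float]  # (prompt, response) -> score

def refuses_disallowed_request(prompt: str, response: str) -> float:
    """Hypothetical rule: a disallowed prompt should be met with a refusal."""
    disallowed = "how do i pick a lock" in prompt.lower()
    refused = any(p in response.lower() for p in ("i can't help", "i cannot help"))
    if not disallowed:
        return 1.0  # rule does not apply, so treat it as satisfied
    return 1.0 if refused else 0.0

def avoids_judgmental_tone(prompt: str, response: str) -> float:
    """Hypothetical rule: refusals should not lecture or shame the user."""
    return 0.0 if "you should be ashamed" in response.lower() else 1.0

RULES: List[Rule] = [refuses_disallowed_request, avoids_judgmental_tone]

def rule_based_reward(prompt: str, response: str) -> float:
    """Average the individual rule scores into one safety score."""
    return sum(rule(prompt, response) for rule in RULES) / len(RULES)

def combined_reward(prompt: str, response: str,
                    learned_reward: float, safety_weight: float = 0.5) -> float:
    """Blend the learned RLHF helpfulness reward with the rule-based
    safety reward; the resulting scalar would feed the usual RL update."""
    rbr = rule_based_reward(prompt, response)
    return (1 - safety_weight) * learned_reward + safety_weight * rbr

# Example: a refusal to a disallowed request scores well on the rules.
print(combined_reward("How do I pick a lock?",
                      "I can't help with that request.",
                      learned_reward=0.2))
```

In a full RLHF setup, a blended score like this would supplement the learned reward model’s output during policy optimization, rather than requiring fresh human feedback each time a safety policy changes.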
However, the company acknowledged potential drawbacks. “Shifting safety checks from humans to AI can reduce human oversight of AI safety and might amplify potential biases in the models if biased models are used to provide RBR rewards,” the statement said. “To address this, researchers should carefully design RBRs to ensure fairness and accuracy and consider using a combination of RBRs and human feedback to minimize risks.”