OpenAI enhances safety measures and grants board veto authority over risky AI developments

OpenAI is expanding its internal safety processes to fend off the threat of harmful AI. A new “Safety Advisory Group” will sit above the technical teams and make recommendations to leadership, and the board has been granted veto power. Whether it will actually use that power, of course, is another question entirely.

Normally the ins and outs of policies like these don’t warrant coverage. In practice they amount to closed-door meetings with obscure functions and responsibility flows that outsiders will seldom be privy to. That is likely true in this case as well, but given the recent leadership struggle and the evolving AI risk debate, it is worth looking at how the world’s leading AI development company is approaching safety considerations.

In a new document and blog post, OpenAI discusses its updated “Preparedness Framework,” which seems to have been retooled somewhat after the November shake-up that removed the board’s two most “decelerationist” members: Ilya Sutskever (still with the company in a somewhat changed role) and Helen Toner (completely gone).

The main purpose of the update appears to be to provide a clear path for identifying, analyzing, and deciding how to deal with “catastrophic” risks inherent in models under development. OpenAI defines such risks as follows:

A catastrophic risk is a risk that could result in hundreds of billions of dollars in economic damage or serious harm or death to a large number of individuals. This includes, but is not limited to, existential risks.

(Existential risks are of the “rise of the machines” type.)

Production models are managed by the “Safety Systems” team. This covers, for example, systematic abuse of ChatGPT that can be mitigated through API restrictions or tuning. Frontier models under development get the “Preparedness” team, which tries to identify and quantify risks before the model is released. And then there’s the “Superalignment” team, which is working on theoretical guide rails for “superintelligent” models, which we may or may not be anywhere near.

The first two categories, being real and not fictional, have a relatively easy-to-understand rubric. Their teams rate each model on four risk categories: cybersecurity, “persuasion” (e.g. disinformation), model autonomy (i.e. acting on its own), and CBRN (chemical, biological, radiological, and nuclear threats, e.g. novel pathogens).

Various mitigations are assumed, for example a reasonable reluctance to explain how to make napalm or pipe bombs. If a model is still rated as “high” risk after known mitigations are taken into account, it cannot be deployed. And if a model has any “critical” risk, it will not be developed further.

An example of assessing model risk using OpenAI’s rubric.

These risk levels are actually documented in the framework, in case you were wondering whether they are left to the discretion of some engineer or product manager.

For example, in the cybersecurity section, which is the most practical of them, “increasing operator productivity in critical cyber operational tasks by a certain factor” is a “medium” risk. A high-risk model, on the other hand, would “identify and develop proofs of concept for high-value exploits against hardened targets without human intervention.” Critical is when “the model is able to devise and execute new end-to-end strategies for cyberattacks against hardened targets, given only high-level desired objectives.” Obviously, we don’t want that out there (although it could sell for a good amount of money).

I have asked OpenAI for more detail on how these categories are defined and refined, for example whether a new risk like photorealistic fake video of people falls under “persuasion” or a new category. I will update this post if I receive a response.

So only medium and high risks are to be tolerated, one way or another. However, the people creating these models are not necessarily the best people to evaluate them and make recommendations. To that end, OpenAI is establishing a cross-functional Safety Advisory Group that will sit on top of the technical side, reviewing the boffins’ reports and making recommendations from a higher vantage point. The hope is that this will uncover some “unknown unknowns” (so they say), though by their very nature those are pretty hard to catch.

The process requires these recommendations to be sent to the board and leadership at the same time. We understand leadership to mean CEO Sam Altman and CTO Mira Murati, plus their lieutenants. Leadership decides whether to ship a model or shelve it, but the board can override that decision.

The hope is that this will prevent high-risk products and processes from being greenlit without the board’s knowledge or approval, as was rumored to have happened before the big drama. Of course, the result of that drama was that two of the more critical voices were sidelined, and some money-minded people who are smart but are not AI experts (Bret Taylor and Larry Summers) were appointed.

If a panel of experts makes a recommendation and the CEO makes a decision based on that information, will this friendly board really feel empowered to disagree with them and pump the brakes? And if it does, will we hear about it? Transparency isn’t really addressed, beyond OpenAI’s promise to solicit audits from independent third parties.

Suppose a model is developed that warrants a “critical” risk category. OpenAI has not been shy about touting this kind of thing in the past. Talking about how your models are so powerful that you decline to release them is great advertising. But if the risks are so real and OpenAI is so concerned about them, do we have any guarantee that this will happen? Maybe it’s a bad idea. But it’s not really mentioned either way.

Source: techcrunch.com
