Meta’s AI Safety Lead Defends Removing Guardrails From Its AI Models

Meta hopes to make its AI outputs more neutral. Cooper Neal/Zuffa LLC

Ella Irwin, head of generative AI safety at Meta, said that as the company deepens its commitment to free expression, it is rethinking the guardrails on its AI models. “It’s not a free-for-all, but we do want to move toward enabling freedom of speech,” Irwin said during a talk at SXSW yesterday (March 10). “That’s one of the reasons you see a lot of companies taking stock and starting to roll back guardrails that went a bit too far.”

Such guardrails typically filter or remove content deemed toxic, biased or inaccurate to ensure AI systems behave safely and ethically. But Irwin, who previously led X’s trust and safety team and served as senior vice president of integrity at Stability AI, said many tech players are reconsidering their effectiveness.

“If you think about the past few years, many organizations built more and more guardrails, to the point of almost overcorrecting,” said Irwin. “What you’re starting to see is companies really assessing the impact these guardrails have on the helpfulness and reliability of the products they provide.”

To that end, Meta is working to make its AI models more neutral and unbiased. According to Irwin, a system that offers advice in response to sensitive topics such as immigration is “not what we are looking for.” “We’re looking for facts, we’re looking for information. We’re not looking for advice,” she said.

Other examples of biased output include models that answer a question differently depending on whether the subject is framed positively or negatively, and models that refuse to provide information about an alternative viewpoint. “No one using our products really wants them to steer you in one direction in terms of your perception of a subject,” Irwin said.

But guardrails are still needed for explicit or illegal content, such as nonconsensual nudity or child sexual abuse material, she added.

Company-wide transformation

Earlier this year, Meta cited protecting free expression and preventing bias as motivating factors in its decision to end its nine-year-old fact-checking program. In January, CEO Mark Zuckerberg announced that the company would replace the program, which relied in part on third-party organizations, with a “community notes” crowdsourcing model (similar to the one used by X) that relies on users to flag false information. Meta also unveiled plans to reduce censorship across platforms like Facebook and Instagram and cut many of its DEI programs.

Irwin, who was at X when the platform first started working on community notes but didn’t work on the program herself, described herself as a “huge supporter” of the approach. “It helps with bias, because you just have a broad group of people assessing and providing feedback,” said Irwin, who left X in 2023 after clashing with Elon Musk over content moderation principles.

Musk has long been a proponent of loosening content moderation on social media. Grok, the AI chatbot developed by his xAI, is positioned as an alternative to other, “woke” AI products. In February, his company took this strategy a step further by releasing a new voice mode for Grok that offers an “unhinged” response option.

Other AI developers are also examining potential biases in their models’ outputs, Irwin said. “It’s not just Meta,” she noted, adding that “everyone is moving in that direction.” Last month, for example, OpenAI announced that its models would increasingly engage with controversial topics, in part to avoid the perception of promoting any one agenda.

“Sometimes, what you see as a ‘guardrail’ actually significantly affects freedom of speech,” Irwin said. “So it’s really hard to strike the right balance.”
