We may earn a commission from links on this page

OpenAI’s Trust and Safety Head Steps Down as Devs Pledge to Spend More Time Fixing ChatGPT

ChatGPT can now remember users' previous commands, and OpenAI is promising it will spend more time fine-tuning its GPT-3.5 and GPT-4 models.

Students raise their hands for a question as OpenAI Chief Executive Officer Sam Altman looks on during an event at Keio University on June 12, 2023 in Tokyo, Japan.
OpenAI CEO Sam Altman has been fielding a fair few questions recently from federal agencies about the dangers of his ChatGPT chatbot. Unfortunately, he just lost his trust and safety head.
Photo: Tomohiro Ohsumi (Getty Images)

ChatGPT-maker OpenAI is going through a bit of a shakeup. The company is adding more features that will help the chatbot remember instructions, though developers will also be spending a lot more time making sure the current version of the company’s precious language model is working as intended. But it’s one step forward and one step back as the company is losing its head of trust and safety.

On Thursday, OpenAI said Plus users paying for ChatGPT can add “custom instructions” to the chatbot. The new instructions essentially allow users to get a more consistent response from the bot. According to the company’s fact sheet, users can simply add these instructions into a special field found under their account name. Instructions could include forcing the chatbot to respond more formally or under a specified word count. Instructions can even tell ChatGPT to “have opinions on topics or remain neutral.”


Just how “opinionated” AI should appear is a constant topic of discussion in AI circles. Unfortunately, OpenAI will need to find somebody else to lead on those considerations. OpenAI’s head of trust and safety, Dave Willner, wrote in a LinkedIn post late Thursday that he would be moving into an “advisory role” at the company. He cited the job’s conflicts with his home life and with taking care of his children, especially as “OpenAI is going through a high-intensity phase in its development.”

In an email statement to Gizmodo, OpenAI said:

We thank Dave for his valuable contributions to OpenAI. His work has been foundational in operationalizing our commitment to the safe and responsible use of our technology, and has paved the way for future progress in this field. Mira Murati will directly manage the team on an interim basis, and Dave will continue to advise through the end of the year. We are seeking a technically-skilled lead to advance our mission, focusing on the design, development, and implementation of systems that ensure the safe use and scalable growth of our technology.


It’s a rather important position for a heavily scrutinized company with an ultra-popular chatbot. The Federal Trade Commission recently demanded mountains of documentation from OpenAI on its AI safeguards. This is all happening while company CEO Sam Altman is trying to position himself as a thought leader on AI regulation.

Now, OpenAI is going to be spending a lot longer fine-tuning its current AI models, rather than moving on to the next big thing. On Thursday, the company announced it would be extending support for the GPT-3.5 and GPT-4 language models on its API until at least June 13 next year. The news came just a day after a group of Stanford and UC Berkeley researchers released a study showing ChatGPT has gotten significantly worse at some tasks like math and coding compared to earlier versions.

Though some fellow academics questioned parts of the paper’s findings, the study put a damper on claims from OpenAI execs that “we haven’t made GPT-4 dumber.” Some developers questioned whether language models like GPT-4 could be useful to businesses if even minor changes could upend the model’s capabilities. OpenAI now says it will try to give developers a better idea of what changes it is making to the model. Still, it’s unlikely that OpenAI will give regular users any more clues about what’s contained within its rather opaque large language model.

“While the majority of metrics have improved, there may be some tasks where the performance gets worse,” the company wrote in its updated blog post. “We understand that model upgrades and behavior changes can be disruptive to your applications. We are working on ways to give developers more stability and visibility into how we release and deprecate models.”


Some of these improvements have focused on “factual accuracy and refusal behavior,” something the researchers’ report noted as having changed significantly in the few months since GPT-4’s release. The researchers found that GPT-4 still responds 5% of the time to “harmful” prompts, which could include misogyny or instructions on committing a crime. It’s the kind of subject a trust and safety head would need to focus on, if OpenAI can get a new one to stick.