OpenAI’s ChatGPT ‘Voice Mode’ Doesn’t Meet Safety Standards; Rollout Pushed to July

27 June 2024 at 07:41

Voice Mode, OpenAI Voice Mode

Experts are raising eyebrows after OpenAI announced a one-month delay in the rollout of its highly anticipated “Voice Mode” feature for ChatGPT, citing safety concerns. The company said it needs more time to ensure the model can “detect and refuse certain content.”

“We’re improving the model’s ability to detect and refuse certain content. We’re also working on enhancing the user experience and scaling our infrastructure to support millions of users while maintaining real-time responses.” - OpenAI

The stalling of the rollout comes a month after OpenAI announced a new safety and security committee that would oversee issues related to the company’s future projects and operations. It is unclear if this postponement was suggested by the committee or by internal stakeholders.

Features of ChatGPT’s ‘Voice Mode’

OpenAI unveiled its GPT-4o system in May, boasting significant advancements in human-computer interaction. “GPT-4o (‘o’ for ‘omni’) is a step towards much more natural human-computer interaction,” OpenAI said at the time. The omni model can respond to audio inputs at an average of 320 milliseconds, which is similar to the response time of humans. Other salient features of the “Voice Mode” promise real-time conversations with human-like emotional responses, but this also raises concerns about potential manipulation and the spread of misinformation. The May announcement gave a snippet at the model’s ability to understand nuances like tone, non-verbal cues and background noise, further blurring the lines between human and machine interaction. While OpenAI plans an alpha release for a limited group of paid subscribers in July, the broader rollout remains uncertain. The company emphasizes its commitment to a “high safety and reliability” standard but the exact timeline for wider access hinges on user feedback.

The ‘Sky’ of Controversy Surrounding ‘Voice Mode’

The rollout delay of “voice mode” feature of ChatGPT follows the controversy sparked by actress Scarlett Johansson, who accused OpenAI of using her voice without permission in demonstrations of the technology. OpenAI refuted the claim stating the controversial voice of “Sky” - one of the five voice modulation that the Voice Mode offers for responses – belonged to a voice artist and not Johansson. The company said an internal team reviewed the voices it received from over 400 artists, from a product and research perspective, and after careful consideration zeroed on five of them, namely Breeze, Cove, Ember, Juniper and Sky. OpenAI, however, did confirm that its top boss Sam Altman reached out to Johannson to integrate her voice.

“On September 11, 2023, Sam spoke with Ms. Johansson and her team to discuss her potential involvement as a sixth voice actor for ChatGPT, along with the other five voices, including Sky. She politely declined the opportunity one week later through her agent.” - OpenAI

Altman took a last chance of onboarding the Hollywood star this May, when he again contacted her team to inform the launch of GPT-4o and asked if she might reconsider joining as a future additional voice in ChatGPT. But instead, with the demo version of Sky airing through, Johannson threatened to sue the company for “stealing” her voice. Owing to the pressure from her lawyers, OpenAI removed the Sky voice sample since May 19.

“The voice of Sky is not Scarlett Johansson's, and it was never intended to resemble hers. We cast the voice actor behind Sky’s voice before any outreach to Ms. Johansson. Out of respect for Ms. Johansson, we have paused using Sky’s voice in our products. We are sorry to Ms. Johansson that we didn’t communicate better.” – Sam Altman

Although the issue seems to have resolved for the time being, this duel between Johannson and Altman brought to the fore the ethical considerations surrounding deepfakes and synthetic media.

Likely Delays in Apple AI and OpenAI Partnership Too

If the technical issues and the Sky voice mode controversy weren’t enough, adding another layer of complication to OpenAI’s woes is Apple’s recent brush with EU regulators that now casts a shadow over the future of ChatGPT integration into Apple devices. Announced earlier this month, the partnership aimed to leverage OpenAI's technology in Cupertino tech giant’s “Apple Intelligence” system. However, with Apple facing potential regulatory roadblocks under the EU’s Digital Markets Act (DMA), the integration’s fate remains unclear. This confluence of factors – safety concerns, potential for misuse, and regulatory hurdles – paints a complex picture for OpenAI's “Voice Mode.” The cybersecurity and regulatory industry will undoubtedly be watching closely as the technology evolves, with a keen eye on potential security vulnerabilities and the implications for responsible AI development.

Reading view

Features of ChatGPT’s ‘Voice Mode’

The ‘Sky’ of Controversy Surrounding ‘Voice Mode’

Likely Delays in Apple AI and OpenAI Partnership Too