
OpenAI has fully released its human-like ChatGPT voice assistant, which speaks more than 50 languages, including Chinese.

Hard AI ·  Sep 25 09:08


ChatGPT's Advanced Voice mode is rolling out this week to ChatGPT Plus and Team subscribers, starting in the US, with availability for Edu and Enterprise subscribers next week. The updated mode supports custom instructions, adds five new voices, removes the voice criticized as a knockoff of the 'Black Widow' actress, improves accents, and makes conversations faster and more fluent.

Four months after first unveiling it, OpenAI has finally rolled out the human-like advanced AI voice assistant to its paid users.

On Tuesday, September 24 (US Eastern Time), OpenAI announced that all paying subscribers to the ChatGPT Plus and Team plans will get the new ChatGPT Advanced Voice mode, which will roll out gradually over the coming days, starting in the US market. Next week, the feature will reach subscribers to OpenAI's Edu and Enterprise plans.

This week, both individual users on the Plus plan and small-business users on the Team plan can converse with ChatGPT simply by speaking, with no need to type prompts manually. When users access Advanced Voice mode in the app, a pop-up window notifies them that they have entered the advanced voice assistant mode.

OpenAI has added two functions to the voice version of ChatGPT: support for 'custom instructions' and a 'memory' that retains how users want the voice assistant to behave, similar to the memory feature introduced for the text version of ChatGPT this April. Users can use these functions to personalize the voice mode so the AI assistant responds according to their conversational preferences.

On Tuesday, OpenAI released five new voices, named Arbor, Maple, Sol, Spruce, and Vale, joining the four voices already available in the old voice mode: Breeze, Juniper, Cove, and Ember. Among these nine options, the controversial voice 'Sky,' criticized for resembling the actress who plays 'Black Widow,' has been removed. OpenAI has also improved conversation speed, fluency in some foreign languages, and accents.

OpenAI says the advanced voice assistant can say "I'm sorry I'm late" in more than 50 languages, and attached a video clip to its social media post demonstrating a user asking the assistant to apologize to their grandmother for keeping her waiting. In the video, the AI assistant first summarizes the intended message and delivers it in English as requested; after the user notes that the grandmother speaks only Mandarin, the assistant repeats the message in standard Mandarin.

The new voice feature works with OpenAI's GPT-4o model, but not with the recently released o1 preview model.

The launch of the new voice feature arrives just in time. Huasheng Securities previously noted that in May of this year, OpenAI demonstrated the voice mode during the launch of its new flagship model, GPT-4o. At the time, the GPT-4o voice sounded like an adult American woman and could respond to requests instantly. When it heard OpenAI research director Mark Chen breathing heavily during the demonstration, it seemed to sense his nervousness and told him, "Mark, you're not a vacuum cleaner," urging him to relax his breathing.

OpenAI had originally planned to launch the voice mode for a small number of Plus subscribers at the end of June, but announced in June a one-month delay to ensure the feature could handle requests from millions of users safely and effectively. At the time, OpenAI said it planned to make the feature available to all Plus users by this fall, with the exact timing depending on whether it met the company's internal bar for safety and reliability.

At the end of July, OpenAI rolled out ChatGPT's advanced voice mode to a limited number of paid Plus users, stating that the voice mode cannot imitate another person's speaking style and that new filters had been added so the software can detect and reject requests to generate copyrighted music or other protected audio. However, the new voice mode still lacks several features OpenAI demonstrated in May, such as computer vision, which let GPT give voice feedback on a user's dance moves using only the smartphone's camera.

Editor/Lambor

The translation is provided by third-party software.


The above content is for informational or educational purposes only and does not constitute any investment advice related to Futu. Although we strive to ensure the truthfulness, accuracy, and originality of all such content, we cannot guarantee it.