25 February 2024

OpenAI’s ChatGPT: Now Talks and Sees, Setting New AI Standards

OpenAI, the San Francisco artificial intelligence company, has introduced a new version of its chatbot, ChatGPT, that can both talk and see. It can now converse with users in spoken words and respond to images they upload.

Enhanced OpenAI ChatGPT Features: Voice and Image Recognition

The latest version of ChatGPT, released on Monday, has two new features that make it more human-like than before. First, it can now converse with users in synthetic voices that reportedly sound more lifelike than those of earlier digital assistants. Users can choose among five distinct voices, both male and female. Second, it can now respond to user-uploaded images. Users can, for example, upload a snapshot of the inside of their refrigerator, and ChatGPT will suggest meals based on the ingredients it sees.

ChatGPT is driven by a large language model, or LLM, which learned to produce natural language by analyzing vast amounts of online text. With the addition of speech, ChatGPT may appear comparable to voice assistants such as Siri and Alexa. It differs, however, in that its LLM lets it handle a wide range of topics and tasks without being specifically programmed for each one. On the fly, it can compose (and now read aloud) emails, poems, term papers, and jokes.

According to OpenAI, these improvements are intended to make ChatGPT more accessible and useful to everyone. The company also claims that ChatGPT's voices are more convincing than those used by popular digital assistants. The voice feature can be seen as a more natural way of communicating with the chatbot, particularly for people who are not comfortable typing or reading.

Users can start using voice by going to Settings > New features in the mobile app and opting in to voice conversations.

Meanwhile, the image feature lets users submit a photograph, chart, or diagram. ChatGPT will then describe the image in detail and answer questions about its contents.

Although OpenAI demonstrated the image tool in the spring, it said it delayed the public release over concerns about abuse. The company was worried that the tool could be used to quickly identify people in images, among other things.

Over the next two weeks, the updated version of ChatGPT will be made available to all subscribers to ChatGPT Plus, a $20-per-month service, and to ChatGPT Enterprise. The voice capability, however, is available only on iPhones, iPads, and Android smartphones. The image feature is available on both the web and mobile devices.

OpenAI’s Ongoing AI Advancements

In recent weeks, OpenAI has been releasing AI tools at a rapid pace. It had earlier announced a new version of its DALL-E image generator, which it has incorporated into ChatGPT, allowing users to ask the chatbot to generate images for them.

Since its introduction in November 2022, ChatGPT has attracted hundreds of millions of users. It has also prompted other companies, including Google with Bard and Microsoft with Bing, to develop similar services. With the new version of ChatGPT, OpenAI is pressing its position against these rivals in conversational AI while also competing with older technologies such as Alexa and Siri.

