So far, ChatGPT users have been able to use Voice Mode to talk to the chatbot and get answers to their queries. However, the average latency is around 2.8 seconds with GPT-3.5 and around 5.4 seconds with GPT-4. While Voice Mode works, it doesn’t feel as natural and intuitive as having a regular conversation, something OpenAI has improved with its latest GPT-4o model.
What Is OpenAI’s GPT-4o?
OpenAI’s GPT-4o is a multimodal model that can interact via text, visuals, or audio. According to the official release, the new model can respond to audio inputs in as little as 232 milliseconds (around 0.2 seconds), with an average of 320 milliseconds, which is similar to human response time in a conversation. The model matches GPT-4 Turbo performance on text in English and on code, with significant improvements in non-English languages.
The current Voice Mode relies on a pipeline of three separate models: the first transcribes audio to text, the second (GPT-3.5 or GPT-4) answers the query, and the third converts the text response back to audio. In the process, the main model can’t observe users’ tone, distinguish multiple speakers, or hear background noises, and it can’t express emotion, either.
While one might argue whether this is a genuine problem, OpenAI seems to have solved it with GPT-4o. The new tool is a single model trained end-to-end across text, vision, and audio, so it takes the input as text or audio, answers the query, and relays the response in the user’s desired output format. That’s how GPT-4o functions differently from the current model.
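To make the contrast concrete, here is a minimal sketch of what the old three-stage pipeline looks like when rebuilt on OpenAI’s public Python SDK. This is an illustrative assumption, not OpenAI’s internal implementation; the model names (whisper-1, gpt-4-turbo, tts-1) are just plausible stand-ins for the three stages.

```python
# Illustrative sketch of the legacy three-model Voice Mode pipeline,
# rebuilt with OpenAI's public Python SDK. NOT OpenAI's internal
# implementation; model choices here are assumptions for demonstration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def voice_mode_pipeline(audio_path: str, out_path: str = "reply.mp3") -> str:
    # Stage 1: transcribe the user's speech to text. Tone, multiple
    # speakers, and background noise are all lost at this step.
    with open(audio_path, "rb") as f:
        transcript = client.audio.transcriptions.create(
            model="whisper-1", file=f
        )

    # Stage 2: answer the transcribed query with a text-only model,
    # which never sees the original audio.
    answer = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": transcript.text}],
    ).choices[0].message.content

    # Stage 3: synthesize the text answer back into speech. The TTS
    # model cannot mirror the user's emotion, since it only gets text.
    speech = client.audio.speech.create(
        model="tts-1", voice="alloy", input=answer
    )
    speech.stream_to_file(out_path)
    return out_path
```

Every hop in that chain adds latency and throws away information, which is precisely what a single audio-in, audio-out model like GPT-4o avoids.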
GPT-4o New Features
Since GPT-4o handles text, audio, and visuals natively, it opens up new ways of interacting with ChatGPT. For example, you could upload an image and discuss it with the AI model (see the API sketch after the list below), or you could ask it to recognize something on the screen and provide more information about it. Here’s a list of all the features that GPT-4o will provide.
- GPT-4 level intelligence
- Responses from the model and the web
- Analyze data and create charts
- Chat about photos
- Upload files for assistance in summarizing, writing, or analyzing
- Discover and use GPTs
- Build a more helpful experience with Memory
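As an illustration of the “chat about photos” capability, here is a minimal sketch using the public Chat Completions API, which accepts image inputs for GPT-4o. The image URL and the prompt are placeholders, not part of OpenAI’s announcement.

```python
# Minimal sketch: discussing a photo with GPT-4o through the public
# Chat Completions API. The image URL and prompt are placeholders.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this photo?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```

The same messages array can mix multiple text and image parts, so a follow-up question about the photo is simply another turn in the conversation.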
During the Spring Update launch event, the company showcased GPT-4o in several demo videos. In these videos, the model, running on a smartphone, recognized real-world objects, people, and their surroundings while answering users’ queries. However, not all of GPT-4o’s abilities will immediately make it to users’ phones; for now, OpenAI is rolling out the upgraded text and image capabilities.
In the coming days, OpenAI will release the audio and vision capabilities. What’s important is that unlike GPT-4, GPT-4o will be available to all ChatGPT users without a subscription fee. Even so, ChatGPT Plus users will get a message limit up to five times higher.
ChatGPT Gets A New Desktop App For Simplified Usage
Apart from GPT-4o, OpenAI also released a new ChatGPT desktop app, starting with macOS. Per CTO Mira Murati, the app features refreshed UI elements that aim to make interactions more natural. It also supports a new keyboard shortcut (Option + Space) that lets users ask ChatGPT a question instantly. “You can now have voice conversations with ChatGPT directly from your computer, starting with Voice Mode that has been available in ChatGPT at launch,” reads the official blog post.