ChatGPT Can Now See, Hear, And Speak As OpenAI Introduces Voice & Image Features


New Delhi: OpenAI has announced that it is rolling out new voice and image capabilities in ChatGPT to offer a more intuitive interface. The features will let users have a voice conversation with ChatGPT or show it what they are talking about. Until now, ChatGPT has been limited to text, accepting information only as text input.

“Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it,” the OpenAI blog said.

OpenAI is rolling out the update over the next two weeks for ChatGPT Plus and Enterprise users. The voice feature will be available only on iOS and Android, while image support will be available on all platforms.

How to start a voice conversation on your phone

Step 1: To get started with voice, head to Settings → New Features on the mobile app and opt into voice conversations.

Step 2: Then, tap the headphone button located in the top-right corner of the home screen and choose your preferred voice from five different voices.

The new voice capability is powered by a new text-to-speech model, capable of generating human-like audio from just text and a few seconds of sample speech.

OpenAI says it collaborated with professional voice actors to create each of the voices. It also uses Whisper, its open-source speech recognition system, to transcribe spoken words into text.

Chat about images

You can now show ChatGPT one or more images. Troubleshoot why your grill won’t start, explore the contents of your fridge to plan a meal, or analyze a complex graph for work-related data. To focus on a specific part of the image, you can use the drawing tool in the mobile app.

How to get started with images

Step 1: To get started, tap the photo button to capture or choose an image. If you’re on iOS or Android, tap the plus button first.

Step 2: You can also discuss multiple images or use the drawing tool to guide your assistant.

Image understanding is powered by multimodal GPT-3.5 and GPT-4. These models apply their language reasoning skills to a wide range of images, such as photographs, screenshots, and documents containing both text and images.
