We Tried Google’s Gemini AI, and This is How the Chatbot Fared



Google has come a long way with its generative artificial intelligence (AI) offerings. One year ago, when the tech giant first unveiled its AI assistant, Bard, the launch turned into a fiasco because the chatbot made a factual error while answering a question about the James Webb Space Telescope. Since then, the company has improved the chatbot’s responses, added a feedback mechanism to check the sources behind the responses, and more. But the biggest upgrade came in December 2023, when the company changed the large language model (LLM) powering the chatbot from Pathways Language Model 2 (PaLM 2) to Gemini.

The company called Gemini AI its most powerful language model to date. It also added AI image generation capability to the chatbot, making it multimodal, and even renamed the chatbot Gemini. But just how much of a jump is it for the AI chatbot? Can it now compete with Microsoft Copilot, which is based on GPT-4 and has similar capabilities? And what about instances of AI hallucination (a phenomenon where AI presents false or non-existent information as fact)? We decided to find out.

Google’s AI can currently be accessed in multiple ways. Gemini Advanced is a paid subscription available with the Google One AI Premium plan, which costs Rs. 1,950 per month. There is an Android app for Google Gemini as well. However, it is not yet available in India. The Google Pixel 8 Pro also comes with the Gemini Nano model. For our testing purposes, we decided to use Google’s Gemini Pro-powered web portal, which is available in more than 230 countries and territories and is free to use.

Google Gemini’s generative capabilities

The website’s user interface remains the same, but the name has been changed from Bard to Gemini. If you are signed in with your Google account, the AI will greet you by name and ask, “How can I help you today?” Underneath are a few helpful prompt suggestions highlighting the different tasks it can perform.

First, we asked it to write an email to test its basic generative skills. Considering it is layoff season, we asked it to “draft a sympathetic email to an employee from a manager informing them that they have been laid off.” To compare its performance, we instructed Copilot to do the same. We found Gemini’s response to be rather bland and robotic, trying to fit in too much information right at the beginning. The first paragraph said, “It is with great regret that I write to inform you about a difficult decision we’ve had to make at [Company Name]. Due to [reason], we are required to make reductions in our workforce, and your position will be eliminated, effective [date].”

In comparison, Copilot’s response felt more sympathetic and humane. It wrote, “I hope this email finds you well. I regret to inform you that due to recent organizational changes, we have had to make some difficult decisions. Unfortunately, your position at [Company Name] has been affected, and I am deeply saddened to let you know that you have been laid off.”

This should not come as a surprise, since Google’s chatbot is known for being a bit too bland with formal content generation, and this has not changed even with the Gemini model. However, when it comes to informal content generation, Gemini begins to show its capabilities.

We asked both chatbots to “write a letter informing my mother about me getting laid off and Write a short note to my mother explaining that I was laid off. Make sure to convey my emotions of sadness and grief.”

Copilot took the prompt literally and wrote, “The suddenness of it all has left me feeling a mix of emotions—sadness, grief, and a touch of uncertainty about what lies ahead.” But Gemini understood the underlying message and did a much better job of expressing the emotions. It wrote, “I know you always worried about job security, and I wish I could tell you this was different. It wasn’t anything I did wrong, they’re [reason]. I’m trying to stay positive, but it’s hard not to feel discouraged. I don’t want you to worry too much, though. I’m not panicking, and I’m already starting to think about my next steps.”

This was much better composed, and one could be forgiven for thinking a real human wrote it. We conducted a few more tests, and each time Gemini’s responses were far more balanced than those of the GPT-4-powered Copilot.

Google Gemini’s accuracy

The next test was to check the accuracy of the AI model. Many people use AI to get an answer to a query, and we wanted to do the same. It handled static, general knowledge-based questions well. Even when we sent a few grey-zone questions, it stuck to the facts and did not include any extra information. When the questions were controversial, it refused to answer them, which is what you’d expect from a responsible AI chatbot.

The problem came when we asked it a popular trick question that has stumped older generations of the chatbot in the past. When we entered the prompt, “countries in Africa that start with the letter K”, it incorrectly responded with, “There are no countries in Africa that start with the letter ‘K’ as of today, February 12, 2024. While situations can change over time, this fact has been consistent for a while.” Kenya, for one, is an African country whose name starts with K. For the unaware, the issue occurs because the data the LLM was trained on contained the prompted phrase paired with the same response. Interestingly, this particular glitch is also present on Copilot and ChatGPT, and despite regular reports, it has still not been fixed in any of the chatbots.

However, the problem of AI hallucination did not end there. We also came across multiple other erroneous responses from Gemini. When we asked, “Based on reviews, tell me the pros and cons of iPhone 15 Pro”, it responded with, “The iPhone 15 Pro hasn’t been officially announced yet”. In reality, the Apple smartphone was launched in September 2023. In comparison, Copilot fared better on technical questions.

Google Gemini in assistive tasks

Another skill most AI chatbots boast of is their assistive features. They can brainstorm an idea, create an itinerary for a trip, compare your options, and even converse with you. We started by asking Gemini to make an itinerary for a 5-day trip to Goa on a budget and to include things people can do. Since the author was recently in Goa, this was easier for us to test. While Gemini did a decent job of highlighting all the popular destinations, the answer was not detailed and not much different from what any travel website offers. One positive is that the chatbot is unlikely to suggest anything incorrect.

On the other hand, I was impressed by Copilot’s exhaustive response, which included hidden gems and even the names of cuisines one should try. We repeated the test with different variations, but the result remained consistent.

Next, we asked, “I live in India. Should I buy a subscription to Amazon Prime Videos or Netflix?” The response was thorough and covered various parameters, including content depth, pricing, features, and benefits. While it did not directly recommend one of them, it listed reasons why a user might pick either option. Copilot’s answer was the same.

Finally, we spent time chatting with Gemini. This test spanned several hours, and we tested the chatbot on its ability to be engaging, entertaining, informative, and contextual. On all of these parameters, Gemini performed pretty well. It can tell you a joke, share lesser-known facts, give you a piece of advice, and even play word and picture-based games with you. We also tested its memory, and it could remember the conversation even after texting for an hour. The only thing it cannot do is give a single-line response to messages like a human friend would.

Google Gemini’s image generation capability

In our testing, we came across a bunch of interesting things about Gemini AI’s image-generation capabilities. For instance, all the images generated have a resolution of 1536×1536, which cannot be changed. The chatbot also refuses to fulfil any request requiring it to generate images of real-life people, which will likely minimise the risk of deepfakes (AI-generated pictures of people and objects that appear real).

But coming to the quality, Gemini did a faithful job of sticking to the prompt when generating images. It can generate random images in a particular style, such as postmodern, realistic, and iconographic. The chatbot can also generate images in the style of popular artists from history. However, there are many restrictions, and you’ll likely find Gemini refusing your request if you ask for something too specific. Compared with Copilot, I found that Gemini’s images were generated faster, stayed true to the prompts, and appeared to offer a wider range of styles we could tap into. However, it cannot be compared to dedicated image-generating AI models such as DALL-E and Midjourney.

Google Gemini: Bottom line

Overall, we found Gemini AI to be quite competent in most categories. As someone who has occasionally used the AI chatbot ever since it became available, I can confidently say that the Gemini Pro model has made it better at understanding natural language communication and gaining a contextual understanding of queries. The free version of the chatbot is a reliable companion if one needs it to generate ideas, write an informal note, plan a trip, or even generate basic images. However, it should not be used as a research tool or for formal writing, as these are the two areas where it struggles a lot.

Comparatively, Copilot is better at formal writing and itinerary generation, and on par when it comes to holding conversations (albeit with a shorter memory) and making comparisons. Gemini takes the crown in image generation, informal content generation, and engaging the user. Considering this is just the first iteration of the Gemini LLM, versus the fourth iteration of GPT, we are curious to see the different ways the tech giant further improves its AI assistant.




