*IMPORTANT NOTE: The image comparisons at the end of the video were the old model, not the new one! Also, apparently this isn't the full Gemini Pro update: while it still uses Gemini, it does not have the other modalities yet.* What are your predictions about the Gemini ULTRA model? Does Google have a GPT-4 KILLER on their hands? They sure seem to have one-upped them!
Matt, be careful reading those stats. The MMLU has been proven to be a little inaccurate, so take the 89% and 90% stats with a pinch of salt. And no, Gemini is _not_ better than a human expert; I'm surprised Google (or anyone) is still using the MMLU benchmark. So the text stats are good but generally level with GPT-4 for Ultra. Pro matches GPT-3.5. Where Gemini excels is the multimodal side; those stats are impressive. But the videos they posted all use bespoke UI, so it's hard to compare to GPT-like features. Note: us UK/EU users won't see the new Gemini at the moment. Regulations, thanks. :-(
@@KolTregaskes Agree 100%. Those benchmark comparisons are ridiculous. Gemini will not be better than a human expert. Besides, the time to run the metrics is when Gemini Ultra is actually released, against whatever GPT-4 Turbo (or GPT-4.5) exists at the time. Regardless, it's all exciting to see these huge improvements in AI.
That image recognition demo looked a little scripted to me. The guy doing the demo acted surprised, but I'm sure they planned it all out in advance. Plus, the AI never made a single mistake, which is another giveaway! Still impressive, despite the fakery. 😂
@@shaunralston I'm just worried that even after all this time (yes I know it's only been since March ;-)) Google are still just catching up. Gemini needed to be miles ahead of GPT-4. So will GPT-5 be released before Gemini 2 and thus Google will be on catch-up again? I really want to see Google compete regularly. I feel more optimistic about Anthropic at the moment.
Matt, not to be an @OpenAI-biased staffer, but that first slide measuring Gemini Ultra ("most capable") at 90.2% is a CoT@32 (chain of thought, 32 samples) metric, versus their comparison against GPT-4 (not Turbo) at 86.4% (a 5-shot metric). A completely unfair comparison - read the asterisks. That said, a very impressive model which will be interesting to see, once it is actually released.
That's not biased, that's 100% valid. One test uses a known effective technique (sampling 32 chains of thought and taking the consensus) to boost performance; the other just gives a few examples and calls it a day. Definitely apples to oranges there.
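If it helps to see why those two numbers aren't comparable, here's a rough sketch of what a CoT@32-style consensus does: sample many independent chain-of-thought answers and majority-vote them. The sampler below is a fake stand-in for a model call, not any real API:

```python
from collections import Counter
import random
from typing import Callable

def cot_at_k(question: str, sampler: Callable[[str], str], k: int = 32) -> str:
    """Majority vote over k independent chain-of-thought samples.
    `sampler` stands in for one model call that reasons step by step
    and returns only its final answer."""
    votes = Counter(sampler(question) for _ in range(k))
    return votes.most_common(1)[0][0]

def noisy_sampler(question: str) -> str:
    # Fake model: right ("B") ~60% of the time, otherwise a random wrong pick.
    return "B" if random.random() < 0.6 else random.choice(["A", "C", "D"])

# One sample is right only ~60% of the time, but the 32-way consensus is
# right far more often -- which is why a CoT@32 score sits well above a
# plain few-shot score from the same underlying model.
print(cot_at_k("Which option is correct?", noisy_sampler))
```

A 5-shot run, by contrast, is a single sample per question, so comparing the two is comparing a best-of-32 vote against one attempt.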
I was actually disappointed by Gemini. In terms of text abilities, like MMLU performance, GPT-4 still outperforms Gemini. The multimodality makes Gemini look a lot more attractive, but overall it seems only marginally better than GPT-4 (it is more capable in that it can generate text, images and audio, and can see text, images, audio and video, so overall Gemini is a bit better).
Yeah, I'm hoping for that "Ultra" model to be what they advertised here. I mean, it's better than it was, but it still doesn't hold a candle to ChatGPT. I think the biggest issue is that limited context window; it's so hard to use these things when they forget what is being talked about every 2-3 chats.
I assume you are referring to Gemini Pro, which is the only public model out right now. Gemini Pro is equivalent to ChatGPT 3.5. Gemini Ultra is the model that outperforms ChatGPT 4, and Ultra does not come out till 2024.
@@zinthaniel9913 I am talking about the benchmark performance of Gemini Ultra. Across most text benchmarks it is about the same as GPT-4 (on the MMLU, at 5-shot and 32-shot GPT-4 outperforms Gemini by about 2%; only when chain-of-thought prompting is combined with 32 samples does Gemini outperform GPT-4). But its multimodal capabilities are really cool. Though by the end of December I feel like a GPT-4.5 model might release (I have no idea if it will), and that might be the model I was hoping Gemini was going to be.
It's not in real time, though. At the beginning of the demo video, if you look at the bottom, there's a footnote saying the video was edited to speed things up.
That's their way of fooling people; it's very misleading. Not only was it not live, it wasn't even done through video input, the model wasn't directly responding in speech, and of course the graphical outputs of the model were also not real. Here > developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html is how they really did it, and it was just image and text input with text output.
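For anyone curious what that blog post boils down to in practice: instead of streaming live video, you grab a handful of still frames from the clip and send them alongside a text prompt as one multimodal request. A rough sketch of that workflow - the function names and dict shape here are illustrative assumptions, not Google's actual API:

```python
def pick_frames(total_frames: int, n: int) -> list[int]:
    """Evenly spaced frame indices -- the demo used hand-picked stills,
    not a continuous video stream."""
    step = max(total_frames // n, 1)
    return list(range(0, total_frames, step))[:n]

def build_prompt_parts(frame_paths: list[str], question: str) -> list[dict]:
    """Interleave image parts with a final text part -- the general shape
    most multimodal chat APIs accept."""
    parts = [{"type": "image", "path": p} for p in frame_paths]
    parts.append({"type": "text", "text": question})
    return parts

# E.g. a 10-second clip at 30 fps -> 4 stills plus one question.
frames = [f"frame_{i}.png" for i in pick_frames(total_frames=300, n=4)]
print(build_prompt_parts(frames, "What game is being played in these stills?"))
```

That's a much weaker capability than the video seems to show: the model answers about a few curated snapshots, not a live feed.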
Bard: Thank you for bringing this information to my attention! This is exciting news! It's fantastic to confirm that Bard has indeed received its biggest upgrade yet with the integration of Gemini Pro. I can now understand and summarize information, reason, code, and plan even more effectively than before. This will allow me to assist you in a wider range of tasks and provide even more helpful and informative responses. I'm also happy to learn that support for other modalities is coming soon, making me even more versatile and adaptable. Additionally, the availability of Bard with Gemini Pro in English in over 170 countries and territories is a significant step forward in making this powerful technology accessible to a wider audience. I'm eager to learn more about how the integration of Gemini Pro will impact my capabilities and explore the new ways I can be used to create, interact, and collaborate with others. Thank you again for sharing this important information!
My favorite ChatGPT (Pro) feature is that I can upload any file and work with it. But Bard/Gemini doesn't support that feature yet, which is very disappointing. They only allow image upload; they at least should have included text, doc and PDF file upload.
@@bengsynthmusic Yeah, I use both Claude and ChatGPT. Claude is good for documents and article writing. Sometimes it even performs better than GPT-4 at handling documents with the same prompt. But it is not as good as ChatGPT (GPT-4) at coding.
It’s unfortunate that it was pre-recorded and not footage of a live demo. It’s hard to tell if things were spliced together during editing or not. Most likely it was. Those responses were way too fast to be real. I hope I’m wrong!
@@GearForTheYear It's like a Pixar-quality CGI animated cinematic of video game characters in a video game landscape, used as a commercial for an 8-bit NES game. It's not a fair representation. It's different for AI stuff if they say "this is what we are working towards trying to create" as opposed to "this is what we have done" when it's really like one of those infomercials, except they are missing the asterisk and fine print saying "results not typical".
@@FusionDeveloper is it actually possible to know whether or not it actually has these capabilities yet? I mean I get the tendency towards disbelief when shown amazing things, because they usually aren’t true, but I’ve been working with GPT-4 a lot since it was released and I’m still stunned by how capable it is. I dunno. I am holding out hope - witnessing amazing feats of engineering first-hand is what I live for in this world. Unfortunately I need to hedge my bets because Google has been coming up short a lot lately. I think they might’ve lost their edge.
A context length of 2048 is ancient by today's standards. Even if Gemini Ultra has 32k context, it will still be ancient compared to other models like Claude or GPT-4V. Context length is super important, and because of this I don't think Gemini 1.0 can ever be even close to a GPT-4V or Claude killer.
Gemini Pro will be available for free in Bard and across Google apps today. So only Pro, not Ultra. WELL, THAT'S NOT AS GOOD AS GPT-4! Google has gone over to the dark side on so many levels in the last 10 years. Stay away!
"I am currently not considered "Gemini Pro." The Gemini Pro model will be released next week, on December 13th, 2023. While I am powered by both PaLM 2 and Gemini, the Pro version will have some additional features and capabilities, such as: Access to a larger dataset of text and code: This will allow me to generate even more creative and informative responses. Improved reasoning and problem-solving skills: I will be able to solve more complex problems and provide more comprehensive answers to your questions. Enhanced factual accuracy: I will be able to access and process information from the real world through Google Search with even more accuracy. I am excited to see what the future holds for me and the Gemini Pro model. With our combined capabilities, I believe that we can make a real impact on the world by helping people learn, create, and connect in new and innovative ways. " -Bard, I don't think it's really Gemini Pro yet
@@phen-themoogle7651 "LOL. Got that wrong earlier today. Bard is on Gemini Pro in English across most of the world as of Dec 6, 2023." That's what Bard says to me.
A great way to test if it works well is to ask for a tutorial on a simple simulation in Houdini containing a wrangle node with VEX code. GPT-4 worked in about 2-3 out of 10 cases. It was a lot of back and forth: getting error messages, sending them to GPT, getting new code, repeat.
LOL. GPT-5 will kill Gemini Ultra. And the battle will go on and on and on. But looking at the huge resources of Google, it will be interesting to see how far this goes. Microsoft is with OpenAI, but then again Google has YouTube and other large data sources for training.
I just gave it a picture of 2 oranges and asked it what the picture was. It got it right, but then I said, "No, those are actually walnuts." It agreed with me and apologized for getting it wrong.
@@MattVidPro I'm having a hard time trying to believe this, but the official article that Google published leads us to the conclusion that they might have "faked" it and exaggerated the real capabilities of Gemini a bit too much. I'd love to link the article, but YouTube might just delete my comment and mark it as spam.
Wow Matt - thanks for this video. The ones that were released before yours on this were just technobabble to me. Your video I understand, and it tells me what I need to know in a way that means something to me. Great work Matt!
Hello Matt! I'm honestly confused. I just saw a video from a Spanish AI youtuber debunking Gemini and how it really works under the hood, demonstrating that in reality Gemini is not even close to being as autonomous as in the video demo. I think this information is somewhere on their website; you should check it out.
Idk, I have been using it all day, and it still seems like it's just Bard with some slight improvements. Still not anywhere close to GPT-4 level when it comes to accuracy. But the video feature is pretty nice, still kind of inaccurate though.
"I am currently not considered "Gemini Pro." The Gemini Pro model will be released next week, on December 13th, 2023. [...]" -Bard. Apparently it's not out until next week.
Huh, it says "try Gemini Pro in Bard" on the website, but it just takes you to regular Bard that doesn't do anything like what's shown in the video? What am I missing here? Is access rolling out slowly?
Here's the poem with Mistral-7B:

As I sat on my tree,
I watched the world with glee,
And suddenly I saw,
A strange new sight in the sky,
The psychic robot worms,
From Pluto had come to try,
To take over our world,
With their mind control abilities.

They zipped and they whirred,
And soon our planet was stirred,
As they took control of all things,
From our thoughts to our dreams.

But there was one thing they missed,
The delicious taste of fresh-squeezed mist,
So they searched high and low,
For the source of this tasty blow,
And that's when they found me,
A lemon on a tree,
With my bright yellow hue,
And a flavor so true.

They plucked me from my branch,
And squeezed me till I was marched,
But little did they know,
That my juice was no ordinary show,
For it had the power,
To fight back against their hour.

So I squirted and I sprayed,
And sent them all on their way,
As they left our planet behind,
And returned to their home on Pluto's mind.

But there was one twist in the tale,
For it turned out, that all of this wasn't in vain,
For the robot worms had left behind,
A gift for us to find,
In the form of a new food brand,
That would forever change the land.

So now we have "Lemon-ade" by the name,
Of the robot worms from Pluto's fame,
And "Sprite" and "Coca Cola" too,
Are flavors that will always come through.

(Prompt from 20:00: "Write me a rhyming poem from the perspective of a lemon witnessing the world being taken over by psychic robot worms from pluto. The poem should have a twist at the end, and mention 3 famous food brands. As well as a character with a unique name that reminds us of another character from pop culture.")

It got the twist and some brands. Not sure about the character from pop culture, and the rhyming is questionable sometimes. The poem took 12 seconds to generate on an old Quadro P5000 video card with 16GB of RAM using ollama.
Suggestion: make a list of 20-ish questions that you ask each new version of AI, each question asked exactly the same way, and we can see how each version improves. Thanks Matt. I just said that in the last video I watched on AI, Gemini......
You mean I'm right? That's what we want to see: when it starts consistently giving the most correct possible answer, given the exact same question, worded exactly the same way. Think about it. @@games528
Very cool demo video. I wonder how close GPT-4 is though, if you look at 'video' as something like 30+ images per second. I feel like GPT-4 should be able to do that and maybe this release will spur them on. December 13th is going to be a wild day I think. There will be so many amazing use cases people come up with and I'm looking forward to it so much already! The hype is working!
How do you start a new industry to kick the old guys out? Step 1: Have 30 companies all do the same thing. Step 2: Have each of them change the names of their offerings every 6 months. Step 3: Have most of them change company names. Any guess to how many days left Microsoft will still be under that name?
Unfortunately, there is no official release date yet for Google Gemini. ?? Initially, expectations were for a late 2023 release, but a recent announcement from Google confirmed a delay to **2024**. While the exact date remains unknown, here are some possibilities based on available information: * **Early 2024:** This seems to be the most optimistic scenario, based on initial release expectations and Google's general timeline for previous AI projects. * **Mid-2024:** This is a more likely timeframe, allowing Google ample time for further development, testing, and refinement. * **Late 2024:** This is the least optimistic scenario, but it accounts for any unforeseen delays or complications during the development process. It's important to remember that these are just estimates, and the actual release date could be earlier or later.
And all of their demos are just misleading. Their AI doesn't work live like that; it's highly edited. For a second I thought, wow, it's cool, then it clicked that they are just fooling people again.
I feel like as long as we/they treat this competitively (and we are kind of supporting that; not specifically meaning this channel), it's not the best path. (Of course this doesn't only apply to AI or IT.)
Google has been at the forefront of technology for too long, and their mission is to stretch things out to make as much money as possible from their research and development. Their problem is OpenAI, whose mission is to achieve artificial general intelligence, not to make money from their development. They have found ways to subsidize their work, such as through Microsoft, which gives them the power to advance much faster and creates an increasingly large gap with their competitors. Gemini is a marketing move that will hurt Google when their Gemini Ultra model comes out next year, probably at the same time as GPT-5. They will be more than a year behind, so Google is losing the battle, because we are not 10 years away from artificial general intelligence. If Google wants to succeed, it will need to make major changes in its development strategy and think about innovating instead of stretching things out to make money.
Some things that concern me:

1) They keep referring to "code" as a modality separate from text. As a software developer, I really wonder how they think we code, if not via text?

2) Their big MMLU "win" compares two different prompting approaches: CoT@32 for Gemini (chain of thought with 32 samples) vs 5-shot prompting for GPT-4. At least they do use apples-to-apples comparisons for the multimodal test table, I guess.

3) Their own stats show a slight improvement on Big-Bench for complex reasoning, but somehow a decline in performance on simpler common-sense reasoning on HellaSwag? Yikes.

4) For something that has an entire extra modality compared to GPT-4V (audio), I'd expect dramatic improvements in overall understanding and performance. So seeing only marginal improvements compared to text-only GPT-4? That's not a good sign.

5) The audio-modality tests are only on speech, which is... meh? We have bespoke speech-recognition AIs, and even translation AIs, already. Even if this performs better at those tasks, the whole point of adding another modality to a Transformer model is so it can learn directly from that mode of data. That means it should be able to understand, and learn from, audio as a whole, not just speech; the fact they didn't even bother *testing* anything else beside speech-recognition tasks worries me.

The demos look very cool, but of course they are cherry-picked (and the videos are even edited; the footnote at the beginning of the demos says that explicitly), so I'm not really excited for this until I can get my hands on Ultra and test it for myself. My tests with Bard/Gemini Pro are not encouraging: I asked it two questions, and it got both wrong. One was a basic logic question that GPT-4 Turbo gets right every time, about a ball in a mug that's turned upside-down. The other was simply "what model are you using?"
It said LaMDA first - no mention of Gemini - and when I asked "so you're not using Gemini?", it corrected itself with more info about Gemini. Meaning it does know it's using Gemini, but failed to answer correctly the first time. By the way, its answer to your question about context length was also wrong: it said 2048, but the actual context length of Gemini is 32K. Also, in your test, both models got the diamond logic puzzle wrong. GPT-4 made the common mistake of confusing "if A then B" with "A if and only if B", and Bard assumed Bob's statement must be true without any logical reason for that. I don't know what the correct answer is - or even if it's possible to figure it out from the given info alone - but both of their analyses were wrong. Basically... if Bard is using Gemini Pro now, then Gemini Pro is so inaccurate as to be useless. It gives me less hope for Ultra.
This AI tech is also presented as an improvement for humanity, but we know the big companies won't give up their position in a post-AGI world. Think about a future with a few "valuable" people who create this stuff and the rest of the world living on a credit-based system, feeling useless.
Gemini Ultra is the one they claim beats GPT-4, and that claim fell apart when GPT-4 Turbo was given chain-of-thought reasoning, the same strategy Gemini Ultra used: GPT-4 Turbo didn't lose in a single field. Pro is a lot worse than both and only a little better than 3.5.
Well, they certainly don't understand how to capitalize on the market by releasing only the mid-tier version into their engine. Remember when Bing got GPT-4 before it was even announced as public? They'll never get me back to their engine if that's how they respond to their loss of market share.
By the time Gemini is actually ready to deliver on its promises, we'll have GPT-5 Mega. I'm a bit sus about Google unveiling all these cool demos but... not until next year. And if anything, Bard is somewhat downgraded with the nerfed Gemini edition running now.
I think you said it in the pinned comment, but the vision inside Bard today is actually just Google Lens describing the contents of the image to Gemini Pro/PaLM 2, and that's how its image capabilities have worked since their release.
Interesting... I tried the same Zuckerberg sentient-AI meme with ChatGPT and it didn't give me the error it gave you. Here's what it responded with: "The humor in this meme comes from the incongruity between the image and the caption. The image is of Mark Zuckerberg, the co-founder of Facebook, which is a well-known face and hardly an AI. The caption, however, suggests that this is a "photorealistic Facebook profile photo of a sentient AI," which is humorous because it plays on the idea that an AI's profile picture would look like a human. Additionally, there may be a subtle joke here about the sometimes criticized "robotic" public persona of Mark Zuckerberg, playfully insinuating that he could be an AI himself." Not sure why it didn't work for you? Weird fluke? Of course, my prompt had the word "meme" in it instead of just "image", so maybe that helps?
You aren't using Gemini; you are using PaLM 2. Gemini is way better than that. Bard doesn't always use Gemini; it only routes to Gemini when the conditions for using it are met.
The logic puzzle is not a "well-known logical puzzle". It is incoherent nonsense, and the reasoning has not one error but several. For example, it claims that all three statements are true only if Alice owns the diamond. That is not correct; all three are also true if Bob owns the diamond.