
Google Gemma 2B on LM Studio Inference Server: Real Testing 

VideotronicMaker
723 subscribers
2.8K views

In my candid review of Google's Gemma 2B, I run the model on LM Studio's Local Inference Server. My evaluation is based on tasks that align with my typical usage of Large Language Models (LLMs). Without relying on benchmarks or scores, I go through some personal use cases to determine the practical utility of this model. Join me as I share my exploration and offer a genuine perspective on the effectiveness of Google Gemma.
#google #gemma #lmstudio #huggingface #llm #2bparameterllm
#googledeepmind :
"Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is inspired by Gemini, and the name reflects the Latin gemma, meaning “precious stone.” Accompanying our model weights, we’re also releasing tools to support developer innovation, foster collaboration, and guide responsible use of Gemma models."
blog.google/technology/develo...
Is Google Gemma the answer we've been waiting for? Has Google finally risen to the level of OpenAI? In this video, I'll also touch on my findings and provide insights into LM Studio's latest app update. I'll share candid experiences, shedding light on both the promise of Google's AI advancements and the areas where enhancements are eagerly awaited. Whether you're a tech aficionado, a developer, or simply intrigued by the current state of AI, join me as we delve into the intricate landscape of Google's AI progress and the nuanced dynamics of working with Google Gemma.
🌐 My website VideotronicMaker :
videotronicmaker.com

Science

Published: 20 Feb 2024

Comments: 21
@nashad6142 · 3 months ago
Bro, make more content, don't give up 😢
@errorgradov8050 · 5 months ago
OMG bro, I wouldn't have noticed such a thing between versions. Thanks a lot, I was really sad that I wasn't able to run the 2B model 💀 now everything is okay
@Beauty.and.FashionPhotographer · 3 months ago
Grok's LLM, 334 GB on an external SSD, on an M2 Pro Mac, can it be used with LM Studio?
@videotronicmaker · 3 months ago
I haven't tried it yet. I see some GGUF Grok models on Hugging Face. Give it a shot. But I would not use a file that size on an SD card. I would use a file smaller than 15 GB on an internal SSD, otherwise you will have to wait two days to get a response to "Hello."
@Beauty.and.FashionPhotographer · 3 months ago
@@videotronicmaker External Thunderbolt 4, newest version, with a throughput of 40 Gb/sec, and an external 2 TB NVMe at 20 GB/sec... not an SD... LOL
@Artificial-Cognition · 20 days ago
Typo? If you have a 334 GB LLM then you'll need to run it on cloud hardware, because otherwise you'd need 334 GB of VRAM or RAM to run it locally. I tried a 60 GB model and it was generating one token per minute. Not per second, per minute 😂
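The commenter's point can be put in numbers. A back-of-the-envelope sketch (an illustrative rule of thumb, not a measurement): to run a model at full speed you need roughly its weight size in RAM or VRAM, plus some overhead for the KV cache and runtime.

```python
# Rough memory-footprint estimate for running an LLM locally.
# Rule of thumb (an assumption, not a spec): resident memory is about
# parameter count x bytes per parameter, plus ~15% runtime overhead.

def estimated_ram_gb(params_billions: float, bytes_per_param: float,
                     overhead: float = 1.15) -> float:
    """Approximate resident memory in GB for a given parameter count
    and quantization level (bytes per parameter)."""
    return params_billions * bytes_per_param * overhead

# fp16 weights: 2 bytes/param; 4-bit (Q4) quantization: ~0.5 bytes/param
print(round(estimated_ram_gb(314, 2.0), 1))  # Grok-1's published 314B params, fp16
print(round(estimated_ram_gb(2, 0.5), 2))    # Gemma 2B at Q4
```

Even quantized to 4 bits, a 314B-parameter model lands around 180 GB, which is why a 334 GB checkpoint is cloud-hardware territory, while a Q4 Gemma 2B fits comfortably on a laptop.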
@Alex29196 · 5 months ago
I tried it yesterday on the Hugging Face playground, the 8K model, and the result was pretty bad.
@videotronicmaker · 5 months ago
I definitely understand. The way I look at it, it's not going to perform at the level of GPT-3.5 and 4, or at the level of Gemini Pro and Ultra. However, out of all the small and fast models you can find on Hugging Face, I really believe this is the absolute best one. It has the ability to brainstorm and hold conversations, but it's definitely no GPT-3.5. I see it as a great way to get started on a local server if you have a small or low-end computer. It's great for testing, and I believe there is great potential for it to be trained and fine-tuned. The short of it is that it may be the best starting point so far for models of that size. I would be interested to hear some details of your experience, because I am getting ready to make a part two where I go a little more in depth, take my time, and make a more organized video. I made this video last night at 3:00 am, scrambling because it was a breaking-news situation. Today I want to give it a little more time and use the Q8 version.
@Techn0man1ac · 2 months ago
It needs more context
@madushandissanayake96 · 5 months ago
Ask the Gemma 2B model this question: "Who is the first person to set foot on the moon?" You will get an amazing answer 😂
@videotronicmaker · 5 months ago
That was funny! Thanks for the laugh. Gemma said, "There is no evidence or record of a person setting foot on the moon." Well...at least it's consistent. 🤣
@madushandissanayake96 · 5 months ago
@@videotronicmaker yeah 😂, maybe Google has begun to use data from an alternate universe to train these AI models. That's why Google's AI products have been behaving so weirdly recently.
@videotronicmaker · 5 months ago
No matter what I change in the Hugging Face model's hyperparameters via LM Studio, it responds the same. So that is probably my error, because when I go on Vertex AI, choose the Gemma model, and use the settings on the right where it says "Try out Gemma" (temperature = 0.8, top-p = 0.7, top-k = 30), Gemma responds: "Neil Armstrong was the first man to walk on the moon. So the answer is Neil Armstrong. I hope this helps! Please let me know if you have any further questions. Best regards, The Answer Guy. Additional information: Neil Armstrong was born in 1930 and was an American astronaut. He was the first human to set foot on the moon on July 20, 1969. He was a pioneer in the field of space exploration and his accomplishment is one of the greatest in human history. I hope this additional information is helpful." Here is the link to where I got that response: console.cloud.google.com/vertex-ai/publishers/google/model-garden/
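For anyone reproducing this locally: LM Studio's inference server exposes an OpenAI-compatible chat endpoint (by default at http://localhost:1234/v1). A minimal sketch of sending the same sampling settings, assuming the default port; the model identifier is hypothetical (use whatever your LM Studio instance lists), and since top-k is not part of the standard OpenAI request schema, it would be set in the LM Studio UI rather than in the request:

```python
import json
import urllib.request

# Sampling settings matching the Vertex AI "Try out Gemma" values above.
payload = {
    "model": "gemma-2b-it",  # hypothetical identifier; check LM Studio's model list
    "messages": [
        {"role": "user", "content": "Who is the first person to set foot on the moon?"}
    ],
    "temperature": 0.8,
    "top_p": 0.7,
}

def ask(url: str = "http://localhost:1234/v1/chat/completions") -> str:
    """POST the request to LM Studio's OpenAI-compatible endpoint
    and return the assistant's reply text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# ask()  # requires the server running with a model loaded in LM Studio
```

This makes it easy to check whether the hosted and local copies of the model diverge under identical temperature/top-p settings.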
@madushandissanayake96 · 5 months ago
@@videotronicmaker Thank you so much for the answer. I still haven't tried it on Vertex AI, so I will try it now. Thanks!
@videotronicmaker · 5 months ago
Nope. I was wrong. I think I had the 7B selected in Vertex AI. I tried again to verify, and no matter what, it said "...no evidence..." For now, here is my conclusion. Google states:

USE CASES: Content Creation and Communication
- Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts.
- Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications.
- Text Summarization: Generate concise summaries of a text corpus, research papers, or reports.

I don't think the 2B models are trained on enough data to know the answer. The 7B does. I'm guessing this is an example of the compromise we make when choosing a small model. I am making a video of my user experience now.
@Artificial-Cognition · 20 days ago
How do you feel about Gemma 2 (9B and 27B)? They're way better than Gemma 2B and 7B; I haven't found a better LLM.
@videotronicmaker · 20 days ago
I recently broke my main PC and have been preoccupied with figuring out how to use non-quantized models on my old MacBook Pro. I have still been stuck on Gemma 2B and Stable LM Zephyr 1.6B. I tried to run inference on the base Gemma 9B with no success, but I hope to very soon. I will say that because I was very impressed with 2B, I am sure that 9B and 27B will be impressive. I am GPU-poor these days, so... in time. Also, Google has really started to impress me lately with Gemini. All good things! I will be sure to edit this in a few days after I get a chance to try them. I guess I will try it out on Vertex AI in the meantime. Thanks for the question. I now have a next project to get into!