
Microsoft's Phi 3.5 - The latest SLMs 

Sam Witteveen
66K subscribers
14K views

Published: 18 Sep 2024

Comments: 31
@thenoblerot 28 days ago
Thanks Sam! You always have good content in a sea of clickbait nonsense :)
@samwitteveenai 27 days ago
Thanks, this is what I'm trying to go for. This whole space has gotten so hype-focused over the past couple of years.
@supercurioTube 28 days ago
Thanks for the coverage, I'd be interested in a tool use / RAG and other utilities comparison with Llama 3.1 8B quantized aggressively to bridge the gap in RAM and performance!
@thmo_ 26 days ago
The MoE wasn't wrong; the correct answer for that calculation was exactly 9.9996, and rounding _is_ the next step. So I'd say it did better on that specific question.
@Alex29196 28 days ago
Phi 3.5 is mind-blowing. It works crazy fast and accurately for function calling, and for JSON answers too.
@NoidoDev 25 days ago
Which version, what functions?
@mukilanru 21 days ago
Is it faster than Llama-3.1-8B-Instruct float16 for JSON responses? Also, which model: mini, right?
@blossom_rx 26 days ago
Unfortunately, every Phi model I've tested so far had a model collapse after 3 to 5 queries. I only see this with Microsoft models OR models I truncated on my own. I don't understand the hype and don't trust the benchmarks. Just to be clear: I have about 15 different official models running locally that were not tampered with, and NONE except the Microsoft models have this issue.
@jeremybristol4374 27 days ago
Surprisingly good. Better than v3. But it still gets stuck in loops as the response context length grows. Experimenting with prompts to avoid this.
@user-th7cu9ll4j 20 days ago
What are some different use cases for Mini and MoE? For example, if you want to build a RAG application, which would be more suitable?
@erniea5843 28 days ago
Nice overview!
@0cano 28 days ago
Always top-notch content, Sam!
@Diego_UG 28 days ago
Is there any cheap way to fine-tune these small models with proprietary data?
@samwitteveenai 27 days ago
Yeah, you can do fine-tunes with Unsloth etc. quite easily for these.
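
For anyone wondering what that looks like in practice, below is a minimal QLoRA fine-tuning sketch following Unsloth's published notebook pattern; the model mirror name, dataset file, and hyperparameters are assumptions, not Sam's exact setup:

```python
# Minimal LoRA fine-tune sketch with Unsloth on Phi 3.5 mini.
# Dataset file, LoRA rank, and training args are placeholder choices.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Phi-3.5-mini-instruct",  # Unsloth's mirror of the model
    max_seq_length=2048,
    load_in_4bit=True,  # QLoRA: small enough for a single consumer GPU
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)

# Assumes each row of proprietary.jsonl has a pre-formatted "text" field.
dataset = load_dataset("json", data_files="proprietary.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=100,
        learning_rate=2e-4,
        output_dir="phi35-ft",
    ),
)
trainer.train()
```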
@NetZeroEarth 28 days ago
🔥 🔥 🔥
@WillJohnston-wg9ew 28 days ago
Does anyone know of a source for community/conversation on LLMs and business? I'm a technologist developing an app and would really like to find a good source for discussing ideas and what's working/not working.
@xthesayuri5756 28 days ago
It's funny: every time a new Phi model comes out I get insanely bearish on LLMs, because they always suck. They just game the benchmarks but are horrendous to use.
@hidroman1993 28 days ago
100% agreed, just ask a slightly different question and Phi goes NUTS.
@Spathever 28 days ago
This is what I noticed too. Went crazy on the 2nd time. There was no 3rd. Maybe newer bigger ones would work. Probably will need to fine-tune.
@Alex29196 28 days ago
These kinds of models are like gold for people working in NLP.
@SavinaAzzahra-i9k 28 days ago
😂
@samwitteveenai 27 days ago
Can I ask what you're using it for where you're finding it sucks? Curious whether it's a chat kind of app, etc.
@hidroman1993 28 days ago
Definitely first
@ArianeQube 28 days ago
0 fucks given.
@IdPreferNot1 28 days ago
How much longer are we going to pretend that these are in any way practical? There's no on-prem running for anyone except large corps, and many of the privacy issues open source was supposed to address come back once you start using someone else's hardware. I guess it's great to see smaller models improve and push foundation models, but if you want to do anything with these, especially with agentic processes gobbling thousands of tokens, latency and performance demand a hosted service... might as well go with free Flash or mini, with no setup or hosting issues.
@pwinowski 27 days ago
Well, you actually can run a crew of Phi models on a MacBook Pro. The M3 Pro with 36 GB of system memory can allocate around 27 GB of that pool solely to the GPU for inference.
@IdPreferNot1 27 days ago
@pwinowski It's not about can/can't. What is the tokens/sec doing that locally? Now consider hitting the Gemini Flash API with 128k tokens 15 times a minute for free.
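
For readers who want to answer the tokens/sec question on their own machine, here is a rough benchmark sketch using llama-cpp-python with a quantized Phi 3.5 mini GGUF on Apple Silicon; the file name and generation settings are assumptions:

```python
# Rough sketch: measure local tokens/sec for a quantized Phi 3.5 mini GGUF
# with llama-cpp-python. On macOS, n_gpu_layers=-1 offloads all layers to
# the GPU via Metal. The model path is a hypothetical local file.
import time
from llama_cpp import Llama

llm = Llama(
    model_path="Phi-3.5-mini-instruct-Q4_K_M.gguf",  # any mini GGUF quant
    n_gpu_layers=-1,  # offload everything to the GPU
    n_ctx=4096,
    verbose=False,
)

start = time.time()
out = llm.create_chat_completion(
    messages=[{"role": "user",
               "content": "Explain retrieval-augmented generation briefly."}],
    max_tokens=256,
)
elapsed = time.time() - start

n_tokens = out["usage"]["completion_tokens"]
print(out["choices"][0]["message"]["content"])
print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tok/s")
```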