
LLAMA-3.1 405B: Open Source AI Is the Path Forward 

Prompt Engineering
172K subscribers · 6K views

Published: 27 Sep 2024

Comments: 54
@henkhbit5748 · 2 months ago
Hopefully we get a multi-modal Llama soon. 👋 Meta. Thanks for the update 👍
@paul1979uk2000 · 2 months ago
For about a year now, I have held the view that the future of AI is open, not because of morals or because closed models are bad, but because of privacy, security and related concerns. Closed AI models could become very dangerous in the future if two or three corporations or governments control them; it would give them a massive advantage over everyone else. Because of all that, I feel open source is the only way to level the playing field, and even though it carries its own risks, those risks are much lower than AI being controlled by so few people. I like to think that with AI being open, we can make the right decisions on it together; the risk to us all is the same, so we have a collective incentive to get it right.

I welcome what Meta is doing, but we should remember it's not fully open source in the usual sense of the term. Still, it's close enough to do the job, and we need counterweights to what the online closed models are doing. Long term, I think most of us will want to run AI locally on our own hardware for privacy and security reasons. That will especially be the case as AI becomes better and more useful in more areas, and once AI starts having long-term memory so it can change and adapt to the user, I highly doubt most of us would feel comfortable with that data going back and forth to a centralised service. And if that isn't frightening enough, wait until we have robotics around the house: those being run by online AI services would be a massive privacy invasion that would make the likes of Google and Facebook look tame in comparison. Open source and locally run is the only way forward over the long run, I think.
@pylanookesh8227 · 2 months ago
Great video as always ❤. GPU requirements were exactly what I was looking for with these models, and you covered them well.
@xspydazx · 2 months ago
It's a way of working with the hardware companies to help push their upgrades! We still haven't seen a metric to prove when it's right to push the parameter count past 1B or 3B. It seems they are generating large models to displace people from the technology, since you can't train them unless you have the money to upgrade. It also gives them time to add serial-number identification to graphics processors, giving them the ability to track GPUs! Oh my gosh!
@xspydazx · 2 months ago
Stay at 7B. Keep training it for the tasks you need; stay with your model and keep training it. You will find your personal model can function better than these commercial models! My Mistral outperforms the original Mistral, and they haven't caught up yet either, so it's only fine-tuning. Get to it! Specialize your own model and adapt your code base for it, integrating your input/output structure so you can enter and obtain any media.
@azaph328 · 2 months ago
Hello, roughly how much money would I need to fine-tune Llama 3 70B on GCP, please?
@MeinDeutschkurs · 2 months ago
You cannot use Groq, because it is restricted to 16,000 tokens: a 128,000-token context with only a 16,000-token usable window. OMG, disappointing!
@N4X_Blocks500 · 2 months ago
Thanks for the walkthrough, quite helpful. It's just hard to believe that Mr. Zuckerberg has people's best interests at heart, given his contribution to the decline in young people's wellbeing.
@paul1979uk2000 · 2 months ago
He probably doesn't, but this is a win-win for Meta and for the open community. Meta has the resources to create these big models, but the open-source community has far more resources when it comes to testing and fine-tuning them to get better results over time. Meta benefits from the community doing its own thing with these models, which Meta can take advantage of in its own products, so both sides benefit.
@finalfan321 · 2 months ago
Wow, great video, dude.
@StraussBR · 2 months ago
Calm down, guys, it's just out. I'm very happy with this. I also think that in the future we'll all want our own model rather than sending our data to a third party; that requires too much trust and is an unsustainable model.
@MeinDeutschkurs · 2 months ago
It did not catch up. Far behind! GPT-4o / Claude 3.5 Sonnet: 70,000 tokens in, prompt "Write a summary by chapter", and you get it. Llama 3.1? It just outputs gibberish. (Tested on an M2 Ultra, 192 GB unified RAM, unquantized 8B + Q4, 70B Q4.) In comparison, the 3.0 gradient version was able to, but it didn't stop; it then hallucinated further chapters. I cannot test 405B, not enough VRAM.
@kalilinux8682 · 2 months ago
The Llama 3 family is extremely sensitive to quantization. Try to run it in Q5 or Q8, or else use an API. This isn't limited to the Llama 3 family: any LLM trained on a very large number of tokens is sensitive to it. Only the smaller LLMs like Mistral and the older Llama models hold up well under quantization.
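To put the Q4/Q5/Q8 advice above in concrete terms, here is a back-of-the-envelope sketch of the weight-only memory footprint at each precision. It assumes llama.cpp-style quant formats are roughly 4/5/8 bits per weight and ignores their small metadata overhead as well as the KV cache, so treat the numbers as lower bounds.

```python
# Rough weight-only memory footprint (decimal GB) for common Llama 3
# sizes at typical quantization levels. Ignores KV cache, activations,
# and quantization-format overhead.

def weight_gb(n_params: float, bits: float) -> float:
    """Decimal gigabytes needed to store n_params weights at `bits` precision."""
    return n_params * bits / 8 / 1e9

for name, params in {"8B": 8e9, "70B": 70e9}.items():
    for quant, bits in {"Q4": 4, "Q5": 5, "Q8": 8, "FP16": 16}.items():
        print(f"{name} {quant}: ~{weight_gb(params, bits):.1f} GB")
```

By this estimate, a 70B model needs about 70 GB for its weights at Q8 versus about 35 GB at Q4, which is why the higher-quality quants are often out of reach on consumer hardware.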
@paul1979uk2000 · 2 months ago
Trying to compare the 8B and 70B to OpenAI's best offering is kind of laughable, when really you should be comparing it to the 405B model.
@kalilinux8682 · 2 months ago
@paul1979uk2000 Even the 70B can hold up fine against 4o, but you have to run it in FP16 or at the very least 8-bit.
@MeinDeutschkurs · 2 months ago
@kalilinux8682 The output was the same for unquantized.
@MeinDeutschkurs · 2 months ago
@paul1979uk2000 That was not the point. The point was that Llama 3.1 was not able to summarize a long content input (but it was benchmarked). So how can they benchmark it if the model produces gibberish?
@xspydazx · 2 months ago
Em, I don't think you guys understand! The code base did not change, hence it's only a fine-tuned model, or different settings. (They are playing with you and creating brainwashed AI models with bad settings and giving them to the public, whilst they enjoy totally different settings and a different training set!) Unless the code base changes, the model is the same! The most important thing is the correct settings. Hence 8B is actually incorrect: the internal layer count and hidden sizes are not divisible into binary values, hence training is a pain and often unstable. Remember, they used Common Crawl first; now they have used a structured synthetic dataset generated using guardrails and used it to DPO the Common Crawl model, hence after guardrailing the model is highly untuned. Once you get into the hidden code base and tokenizer you will find the sub-prompts (also adding another layer of guardrailing), making it tough to get your response no matter the prime prompt. Changing the prompt or query that was given is tantamount to intercepting the question and creating a new one: it's a prompt-in-the-middle attack! I'm surprised nobody has exposed the hidden prompts in the Llama tokenizer and the Hugging Face library, as well as the Unsloth library, the BetterTransformer library, and the pretrained model itself!
@linklovezelda · 2 months ago
Could you please stop the Discord sounds in your future videos? Thanks :)
@engineerprompt · 2 months ago
Will try :)
@Jeerleb · 2 months ago
I heard you need close to 800 GB of RAM to run the 405B.
@engineerprompt · 2 months ago
Yeah, just 810 GB :D I go over the details in the video.
@Jeerleb · 2 months ago
@engineerprompt Yeah, I commented before I got there, haha.
@engineerprompt · 2 months ago
@Jeerleb :)
@MeinDeutschkurs · 2 months ago
So, we have to wait for the Apple M8 Extreme chip. 😂
@PeteDunes · 2 months ago
The quality of the 405B model is horrendous; it failed so many tests I threw at it, while both ChatGPT and Copilot performed way better.
@engineerprompt · 2 months ago
Which provider are you using for it? It seems to really depend on the API you use. If you are running it locally, you need to check your settings. llama.cpp had some issues with this model.
@PeteDunes · 2 months ago
@engineerprompt API? LLM? No, I'm using it directly from their website. It needed three tries to get a simple Windows batch script working, and when I asked it what it knows about my hometown, it got monuments, location, events, tourist attractions, industry, history, etc. wrong. I knew it was talking about the right town because it confirmed a few things correctly, but it got a lot of things wrong. I asked OpenAI and Copilot the exact same questions; they both got the script right on their first attempt, and their descriptions of my town were almost flawless.
@o1ecypher · 2 months ago
It would be nice if they ported it to Android and made it work as an offline, standalone app.
@engineerprompt · 2 months ago
I think the 8B quantized version could be an option.
@Content_Supermarket · 2 months ago
Wouldn't it work the same in the Hugging Face chat interface?
@azaph328 · 2 months ago
Hello, roughly how much money would I need to fine-tune Llama 3 70B on GCP, please?
@o1ecypher · 2 months ago
@azaph328 1 gazillion dollars or 3 chickens
@shaonsikder556 · 2 months ago
How do you edit your videos? The zooming that follows your cursor is really smooth and interesting to watch.
@engineerprompt · 2 months ago
I am using Screen Studio. It's for Mac.
@ThomasConover · 2 months ago
What kind of hardware do I need to run a 405B model?
@unclecode · 2 months ago
Thanks for the delightful review! We now know LLMs weren't OpenAI's moat. I wonder whether GPT-3 was the pivotal moment for AI and "us", or whether it is now, with the release of Meta's 405B open-weights model?
@engineerprompt · 2 months ago
How about the new Mistral Large 2? :) I think good-quality data and longer training will do magic for smaller models.
@unclecode · 2 months ago
@engineerprompt It's so funny that just a few hours later a new model drops and you can ask such questions. What a time to be alive! I checked their blog and was surprised to see the license isn't open. What still makes me lean toward the 405B, besides its open-weight license, is the permission to use it to generate synthetic data; to me, that's the main reason to have such a large model these days. Also, I'm no longer surprised when a smaller model beats a bigger one on a benchmark, because that's exactly the goal of having a teacher model: to generate better data so that a 123B-parameter model can outperform the teacher. What do you think?
@garfield584 · 2 months ago
Hi
@engineerprompt · 2 months ago
Hi :)