Chris Hay
hi there, welcome to my channel, hope you enjoy it and are having as much fun as i am.

my name is chris hay and i love code, software architecture and technology in general.

as you may have figured, i work for IBM (who i love working for) but all opinions are most definitely my own and do not represent the views of my employer
i really want to say goodbye to copilot...
35:21
2 months ago
getting started with typespec
28:51
2 months ago
Fine-Tune Llama3 using Synthetic Data
37:03
3 months ago
superduperdb supercharges your database for AI
40:55
8 months ago
mistral 7b dominates llama-2 on node.js
18:27
10 months ago
functional programming with nim language
11:50
11 months ago
fine tuning llama-2 to code
27:18
1 year ago
Comments
@KhushalPS-n6b 13 hours ago
Excellent explanation of the full ReAct mechanism. Many thanks 🙏
@DePhpBug 3 days ago
does anyone encounter a strange issue whereby, if you want to do a publish, websocket.publish doesn't seem to work but server.publish does?
@raihanrafi3665 7 days ago
Please reverse engineer the ransomware in rust
@greghayes9118 7 days ago
Don't use your fingerprint. Swipe using a knuckle.
@Winter_Sand 11 days ago
With the final run of the code, the assembly hello world, everything still works if I don't link the SDK and libraries in the "ld" step (I can just do "ld hello.o -o hello -e _start"), and ./hello still runs. Does that mean the rest of the command, linking the libraries and defining the main function, is unnecessary? Genuine question, just trying to reduce the amount of complex code I'm not entirely sure I understand
@cagataydemirbas7259 12 days ago
Hi, how can I find the data template of the llama 3.1 base model? How can I prepare research papers and books for fine-tuning the base model in the right data format?
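(For what it's worth, a minimal sketch of one common approach, assuming the *base* model rather than the instruct model: base models are trained on plain text, so there is no chat template to match. The file name and field name below are illustrative, not an official llama format.)

```python
import json

# Hypothetical document list; in practice you'd extract the text from
# your papers and books first.
documents = ["First paper text...", "Second book chapter..."]

# Many trainers (e.g. Hugging Face TRL's SFTTrainer) accept JSONL with
# a single text field and add BOS/EOS tokens themselves.
with open("train.jsonl", "w") as f:
    for text in documents:
        f.write(json.dumps({"text": text}) + "\n")
```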
@foreignconta 14 days ago
Very nice test!! Subscribed!
@chrishayuk 14 days ago
turns out xAI created this model, which explains the issues i had with the math and reasoning parts.
@Redgta6 14 days ago
good job for fixing the title lol
@chrishayuk 14 days ago
wasn't a massive change looool
@danielhenderson7050 11 days ago
Tbh I didn't even consider Grok in the possible models!
@BlessBeing 15 days ago
just to let you know, you were wrong. it is confirmed to be by xAI. lol elon got you
@chrishayuk 14 days ago
Super interesting
@chrishayuk 14 days ago
I am actually super cool about being wrong
@shuntera 15 days ago
You need a wee edit at 0:53 :-)
@chrishayuk 15 days ago
hahaha, i missed this, was a quick edit last night
@chrishayuk 15 days ago
fixed, and updated, thanks for the heads up
@danielhenderson7050 15 days ago
I love your videos. You should be way more popular than someone I won't mention 😅
@chrishayuk 15 days ago
very kind, but honestly it's not about popularity, this channel is really just about getting thoughts out of my head
@LombardyKozack 15 days ago
LLaMA2-70b uses GQA (only its 7b version used MHA)
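(To make the GQA/MHA distinction concrete, here is a toy sketch of grouped-query attention, where several query heads share one key/value head. The dimensions are invented for illustration, not LLaMA's real config.)

```python
import torch

# Toy sizes: 8 query heads share 2 key/value heads (4 queries per group).
n_heads, n_kv_heads, head_dim, seq = 8, 2, 16, 4

q = torch.randn(seq, n_heads, head_dim)
k = torch.randn(seq, n_kv_heads, head_dim)
v = torch.randn(seq, n_kv_heads, head_dim)

# Expand each KV head to cover its group of query heads; the KV cache
# stays 4x smaller than full multi-head attention would need here.
group = n_heads // n_kv_heads
k = k.repeat_interleave(group, dim=1)  # (seq, n_heads, head_dim)
v = v.repeat_interleave(group, dim=1)

scores = torch.einsum("qhd,khd->hqk", q, k) / head_dim**0.5
out = torch.einsum("hqk,khd->qhd", scores.softmax(-1), v)
print(out.shape)  # torch.Size([4, 8, 16])
```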
@chrishayuk 15 days ago
fair point
@iliuqian 16 days ago
Thank you Chris. Could you show us how to create a ReAct agent using langgraph?
@everyhandletaken 16 days ago
Nice one Chris, interesting!
@xxxNERIxxx1994 15 days ago
evenphototaken
@chrishayuk 15 days ago
Glad you enjoyed it
@QrzejProductions 16 days ago
Great explanation, great content. Keep up the good work, your channel's worth much more reach and subs :)
@chrishayuk 15 days ago
very kind but I’m actually kinda cool with the reach
@PseudoProphet 17 days ago
Probably coming with the Pixel 9 pro.
@arindam1 18 days ago
This is epic. You got a sub!
@chrishayuk 15 days ago
very kind
@reza2kn 18 days ago
Nice find Chris! enjoyed the video!❤
@chrishayuk 15 days ago
glad you enjoyed my weird ai detective show
@Mercury1234 20 days ago
Someone please correct me if I'm wrong here. I think that neither of the examples you showed comes from reasoning. The order is flipped, they should first provide the reasoning and then the answer, not the other way around as in your examples. The models take all the tokens into account from the input and the output (generated up to that point). What is giving the right answer a better chance is if the previously generated tokens contain the reasoning steps. In your examples the previous tokens did not contain the reasoning steps as those were generated after the answer.
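(A tiny illustration of the ordering point this comment makes: decoding is strictly left to right, so only the first format lets the answer tokens condition on the reasoning tokens. The arithmetic and wording are invented for the example.)

```python
# Reasoning tokens precede the answer, so they can influence it:
reason_then_answer = (
    "Reasoning: 17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.\n"
    "Answer: 408"
)

# The answer is generated first, so the later "reasoning" is only a
# post-hoc justification and cannot have improved the answer:
answer_then_reason = (
    "Answer: 408\n"
    "Reasoning: 17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408."
)
```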
@BlunderMunchkin 22 days ago
When you said "math" I thought you meant symbolic math, not arithmetic. Using an LLM to do arithmetic is pointless; a calculator does a far better job.
@chrishayuk 21 days ago
@@BlunderMunchkin symbolic math is coming but you have to start with a foundation… in order to do symbolic math, the llm still needs to know how to count
@ErfanShayegani 22 days ago
Great content as always! Thank you!
@hosseinbred1061 23 days ago
Great explanation
@billfrug 23 days ago
does seem a bit verbose for ADTs: you need to define the tag type and use a case statement on it to get the different member variables (similar to a record case in pascal)
@chrishayuk 23 days ago
totally agree, it's heavily heavily pascal influenced
@novantha1 24 days ago
I don't have time to go through the whole video at this specific moment, but it seems to me that you came to a fairly similar answer to mine: LLMs are pretty strong at presenting data and handling noisy inputs, while traditional computer programs are pretty good at doing the math (numerical instability notwithstanding).

One obvious opportunity that I'm not seeing in the first ten minutes (though I'll certainly allow it's possible that I'll have egg on my face after finishing) is that this seems like an efficient way to embed agentic function calling into a model; if the steps to solve the problem contain a call to a remote system with the equation as an argument, and the remote system can solve the equation, that seems a lot like a function call to my eyes.

Beyond that, there's also probably some room to reverse engineer a problem with the synthetic generator LLM based on the equation and answer, in order to encourage semantic problem solving, as seen in certain benchmarks which have word problems encoding ideas best solved with mathematics.

Overall, this is a super cool project, and is probably going to be very beneficial for people doing continued pre-training or experimenting with certain ideas like grokking. I'm pretty excited to have a hack at it myself.
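(A rough sketch of the function-calling idea in this comment: a reasoning step emits the equation as a structured call and a deterministic solver evaluates it. The tag name and the tiny evaluator are hypothetical, not the video's actual recipe.)

```python
import re

model_output = "To get the total we need <calc>17 * 24 + 12</calc>."

def run_tools(text: str) -> str:
    # Replace each <calc>...</calc> span with its evaluated result.
    # eval() on model output is unsafe; a real system would use a
    # proper expression parser or a remote solver.
    return re.sub(r"<calc>(.*?)</calc>",
                  lambda m: str(eval(m.group(1))),
                  text)

print(run_tools(model_output))  # To get the total we need 420.
```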
@chrishayuk 23 days ago
Absolutely spot on, I cover later in the video that the same technique can be used for function calling with complex expressions, and can also be used for teaching code generation etc
@asimabusallam3147 24 days ago
@tiympc 25 days ago
Tremendous explanation. Thank you so much Chris!
@poochum4595 27 days ago
Great vid! Any chance we can get a link to the repo with this code?
@chrishayuk 27 days ago
github.com/chrishayuk/how-react-agents-work
@poochum4595 27 days ago
@@chrishayuk does this include the "recipe" code?
@brunodebouexis5341 20 days ago
@@poochum4595 I can't find it
@jimmyporter8941 27 days ago
Rust doesn't have "object orientation".
@ilkkalehto8507 28 days ago
Brilliant!
@gandalfgrey91 1 month ago
I honestly forgot that Nim has func
@rmschindler144 1 month ago
note that you don’t need the `-o` flag when calling `wat2wasm`, and it will simply use the filename
@rmschindler144 1 month ago
installing WABT: on macOS, with Homebrew: `brew install wabt`
@joseluisbeltramone599 1 month ago
Tremendous video. Thank you very much!
@_Spartan-107_ 1 month ago
These videos are insanely awesome. I LOVE the verbosity. Most of the internet videos are a high level abstraction of "what is programming". This breakdown of "What is happening when we program" is what's lacking in engineering these days! Well done :)
@omarei 1 month ago
Great content 👍😁
@ckpioo 1 month ago
so this is why gpt-4o is so much better at maths
@venim1103 1 month ago
You have to check out the Claude 3.5 sonnet system prompt leak and all the talk about “artifacts” and persisting data with LLMs.
@chrishayuk 1 month ago
Oooh persisting with llms sounds interesting, I’ll find out about that
@venim1103 1 month ago
@@chrishayuk It seemed to me they are using clever prompt engineering with their “artifact” system in a way that resembles memory management and tool usage, with the help of the massive context window. They must have also fine-tuned their models to support this syntax. Just crazy to think how the system message itself is able to help the AI with coherence and task management.

All this seems fascinating as I'm trying to figure out why Claude 3.5 sonnet is so good at code-related tasks, especially re-editing and updating code, compared to most other models. I can't wait to see some open source models reach this level! Maybe fine-tuning and clever prompt engineering is all that is needed for now 👍
@chrishayuk 1 month ago
@@venim1103 i'll check out their system prompt... but i'm convinced they're using STaR backed by a reinforcement learning policy. the new mistral nemo model has followed this approach also. not checked out how they implemented artifacts yet. but i'm convinced this is all now in the fine tune phase, hence these videos
@marilynlucas5128 1 month ago
If you put Gemma in your title, you'll get low views. Gemma is absolutely disgusting. One of the dumbest models out there
@theklue 1 month ago
Very good content, thanks! I was comparing models manually, and I'll integrate Nemotron into the eval. One off-topic question: is the overimposed screen on top of your video a post prod edit, or is there software that lets you record the video like this? Thanks!
@chrishayuk 1 month ago
awesome glad it was useful. the overimposed screen effect is a post prod edit that i do. the way i set the lights, screen backdrop, combined with lumetric settings and use of opacity, allows me to achieve the effect
@theklue 1 month ago
@@chrishayuk Thank you! it looks very good
@chrishayuk 1 month ago
Thank you, I like to think it’s one of the techniques that give a little uniqueness, glad you like it
@user-rs4sg2tz6k 1 month ago
I believe 4o's judges only 90%
@chrishayuk 1 month ago
interesting, where did you get that info from?
@kusanagi2501 1 month ago
I really liked the video. It was a mystery for me for a while.
@testales 1 month ago
I don't like that very much. Why? I absolutely hate getting walls of text and code thrown at me for simple yes/no questions all the time! Both ChatGPT and Claude have this issue. So in the end it's just that you hardcode a system prompt like "think step by step" into your model, and it's very hard then to make it give quick and short answers again.

A hidden scratchpad is a good compromise, but it still slows down responses and could be achieved with a system prompt too. The system-prompt method could also include multiple agents or personas with different strengths to provide input. The best would be to also train the model to estimate the complexity of a question and then decide whether to do additional thinking or not. Also, I've seen open-weight models answer harder questions correctly with just one or very few words where others generated a text wall and still came to the wrong result. So whether explicit step-by-step thinking is really required remains debatable. Obviously the chances of a correct answer increase the more relevant (!) information is in the context, and that's all CoT etc. actually does: pull more information into the context.

Another similar thing that I see Claude do quite often, and which I like, is that it does summarizations before responding. If the problem is complex and there was a lot of back and forth, the perceptions of it may diverge. Summarizations greatly help to create a synchronization point between the LLM and the user and then focus on the established and relevant intermediate results.
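(A minimal sketch of the hidden-scratchpad compromise mentioned above: the model reasons inside tags and the app strips them before display. The tag name is an assumption, not Claude's documented format.)

```python
import re

raw = ("<scratchpad>User asked a yes/no question; keep it short."
       "</scratchpad>Yes.")

# Drop everything inside the scratchpad tags before showing the reply.
visible = re.sub(r"<scratchpad>.*?</scratchpad>", "", raw, flags=re.S)
print(visible)  # Yes.
```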
@chrishayuk 1 month ago
I agree, it’s a balance and a trade off, and I think this is where RL can be used to bring this down to a more succinct response.
@raymond_luxury_yacht 1 month ago
That explains why Claude 200k context is more like 50k for me. So much taken up with the scratchpad
@mrpocock 1 month ago
The private scratchpad in Claude 3.5 explains why it seems to behave as if it had private state in addition to the text visible in the conversation.
@chrishayuk 1 month ago
Yeah really nice technique for giving readable answers but not losing chain of thought reasoning
@rodneyericjohnson 1 month ago
How can a full grown adult hide behind some decorations?
@chrishayuk 1 month ago
Merry Christmas
@tommy516 1 month ago
Claude is NOT as good as GPT, sorry it is not. When you ask it to update code, at least the way it works for me, it sends only a new block of the code that it changed, not the whole class, to keep the response size down; it is really limited in the length of its response compared to ChatGPT. With all these Claude videos, I have to believe YouTubers are getting paid to shill for it. Also, Artifacts only works on a very small sample of types of code, so selling it like it's a viable thing is disingenuous.
@chrishayuk 1 month ago
GPT is better at many things, especially narrative, q&a, summarization and generative content (see my video on multiheaded attention), but Claude is definitely better on code. I'm not being paid by anyone; you will notice that I switch off ads for all my vids and I never take sponsorships
@Leo-ph7ow 1 month ago
Great great content! Please, make a local finetune tutorial. Thanks again!
@chrishayuk 1 month ago
it's on the list, i promise
@bamh1re318 1 month ago
Can you please give a tutorial on how to load private data, train/RAG/evaluate and deploy an open-source model on WatsonX or another online platform (AWS, Azure or Huggingface)? Many thanks! BTW Nemotron4 broke down this noon (PST), maybe due to too many users. I was in line 771 with a simple question. It gave out some sort of communication problem after two minutes of waiting
@chrishayuk 1 month ago
Sure, will add to the backlog
@leeme179 1 month ago
I believe you are correct that both Claude and Llama 3 are fine-tuned using a STaR-generated dataset, but this method still needs a ranker or a human to mark the correct answers, whereas from what I have read online, OpenAI's Q* is a combination of the A* search algorithm and Q-learning from reinforcement learning to self-improve, where the model generates 100 different answers and picks the best answer to improve, similar to AlphaCode2 from DeepMind.
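(For reference, a minimal sketch of the STaR-style filtering loop being debated here; `generate` and `finetune` are hypothetical placeholders, not a real API. Note the correctness check is automatic whenever gold answers exist, which is exactly the point contested in the replies.)

```python
def star_round(model, problems):
    """One STaR round: sample rationales, keep only those whose final
    answer matches the known label, then fine-tune on the survivors."""
    keep = []
    for problem, gold_answer in problems:
        for _ in range(4):  # a few sampled rationales per problem
            rationale, answer = model.generate(problem)
            if answer == gold_answer:  # no human ranker needed here
                keep.append((problem, rationale, answer))
                break
    return model.finetune(keep)
```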
@spoonikle 1 month ago
It does not. There is no human marker needed. For example, you can use a series of prompts plus the dataset to judge aspects of the answers with really well trained fine-tuners. You can even train a model to predict the human evaluation; then you just need to human-eval in a given domain until an evaluator model is ready. In addition, this incentivizes further investment in synthetic datasets.

Finally, the best argument for this: the big model prunes the dataset to make a small model, which prunes the dataset for the next big model, repeat ad infinitum. The smaller model is cheaper and faster, which means you can prompt more data for the next big one, which will make the next improved small model.
@chrishayuk 1 month ago
Some folks use human feedback with RL and some folks use synthetic. At the end of the video I talk about how it could be done with a mixture of judges, and I show how you could use Nemotron for your reward model. I will do a video on RL for this soon to cover the Q part
@testales 1 month ago
I still don't get how a pathfinding algorithm like A* can be utilized to find the best answer. I mean, it's not like navigating some terrain with exactly known properties. Maybe it's a thing in latent space? So actually the explanation that this is a modified version of the STaR approach seems more plausible, but if so, then again it doesn't seem to be such a big thing.
@chrishayuk 1 month ago
I’m only covering the star part for now. I’ll cover the RL part in a later video
@GodbornNoven 1 month ago
@@testales Q* (Q-star) is a concept from reinforcement learning, a type of machine learning. In simple terms, it's a way to measure the best possible future rewards an agent can expect if it follows the optimal strategy from any given state. Think of it like a guide that tells you the best move to make in a game to maximize your chances of winning, based on all the possible future outcomes. Kinda like in chess.
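(To make the Q* description above concrete, here is a tiny tabular Q-learning update; the environment and numbers are invented for illustration, and the actual OpenAI Q* speculation involves far more than this.)

```python
Q = {}                    # state-action value table
alpha, gamma = 0.1, 0.9   # learning rate, discount factor

def update(state, action, reward, next_state, actions):
    # Nudge Q(s, a) toward reward + discounted best next-state value;
    # under the usual conditions this converges toward the optimal Q*.
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

update(0, "b", 1.0, 1, actions=["a", "b"])  # one hypothetical transition
print(Q)  # {(0, 'b'): 0.1}
```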
@msssouza2 1 month ago
Thanks for another great video, Chris. I've been through some LLM courses on Udemy, but your channel is helping me to clear up many doubts I have about the whole thing. I'm glad I found your channel. It's really the best on this subject. Congratulations. Marcelo.
@chrishayuk 1 month ago
Very kind, my rule is to try and always go one level below. It means that my vids are never short, glad the content is useful
@msssouza2 1 month ago
Hi. I looked through dozens of videos on how to make ReAct work on 7B models (to make a low-cost text-to-SQL solution) and the only video that answered my question so far is yours. Thank you. By the way, I'm from Rio and the current time is 09:42 AM
@chrishayuk 1 month ago
Lol, hello Rio, glad you like the example. Glad the video is useful; my rule is to always go one level below and unveil the magic, glad it helped in this case