Vivek Haldar
I'm a software engineer, tech lead and engineering manager. I have a PhD in Computer Science, and still try to keep up with the latest research.

Laypeople cannot prompt LLMs · 8:55 · 21 days ago
Studying GSM8K Leaderboard · 8:17 · 1 month ago
Think-and-execute prompting for LLMs · 7:37 · 2 months ago
AutoGen: Programming LLM Agents · 9:17 · 2 months ago
Fine-tuning LLMs encourages hallucinations · 5:01 · 2 months ago
Fine-tuning or RAG? · 8:05 · 3 months ago
Fixing RAG with GraphRAG · 15:04 · 3 months ago
LLMs improve writing-based knowledge work · 6:59 · 3 months ago
Co-intelligence: book review · 13:06 · 4 months ago
Winning prompt! $10k LLM reasoning challenge · 10:09 · 4 months ago
$10k for LLM reasoning · 6:24 · 4 months ago
LLM agents do software engineering · 7:40 · 4 months ago
LLM benchmarks · 11:02 · 4 months ago
LLMs eat entry-level SWEs · 9:06 · 5 months ago
LLMs can debug with prints · 8:54 · 5 months ago
Determinism ⇒ Fast LLMs (Groq) · 10:07 · 5 months ago
GPT-4 passed the Turing Test! · 6:58 · 6 months ago
I remember @AndrejKarpathy's deleted tweet · 4:18 · 6 months ago
LLMs for real world knowledge work · 9:28 · 7 months ago
Watch me build a GPT for journaling · 7:38 · 7 months ago
LLMs can "breed" their own prompts · 8:30 · 7 months ago
LLMs with infinite context? · 9:35 · 7 months ago
Can prompt engineering beat fine-tuning? · 9:44 · 7 months ago
Can LLMs discover new math and CS? · 7:48 · 8 months ago
Comments
@dawid_dahl
@dawid_dahl 1 day ago
Thanks so much, great content.
@andrew.derevo
@andrew.derevo 5 days ago
Surprised, saved me a few weeks of testing 🙌❤️
@andrew.derevo
@andrew.derevo 6 days ago
🙌 great stuff
@somanshukumar1344
@somanshukumar1344 10 days ago
Always wanted to see this type of content
@rtos
@rtos 21 days ago
Unfortunately, even the so-called power users of LLMs with their own YouTube channels always seem to have a small set of stock prompts, which get repeated with every new review. If LLMs were trained on these specific questions then they're going to start appearing super intelligent! Things like 'why is the sky blue' or 'write a snake game in Python' are hardly a test of machine intelligence, as all that is needed is to be trained on accurate code or factual data.
@matty-oz6yd
@matty-oz6yd 24 days ago
I really value what you do my dude <3
@RAHUDAS
@RAHUDAS 25 days ago
Bot designer ltd crashed, I'm not able to access it.
@ashwinnair5803
@ashwinnair5803 25 days ago
Why not just use RAPTOR instead?
@yiwensin5913
@yiwensin5913 26 days ago
Excellent! I didn't know you before and I just stumbled upon your video while searching for material on prompting LLMs (for a local LLM project). You now have a new sub :)
@VivekHaldar
@VivekHaldar 25 days ago
Welcome aboard!
@user-wr4yl7tx3w
@user-wr4yl7tx3w 26 days ago
Can you discuss DSPy and give your opinion on it, given how it is related to prompting?
@arthurdhonneur276
@arthurdhonneur276 26 days ago
Nice video, thank you very much!
@Starhopp3r
@Starhopp3r 1 month ago
Excellent review! Thank you. I really enjoyed this book and have been recommending it to people in order to help them set expectations about “AI” without excessive optimism or pessimism. Currently reading Deep Utopia; hope to see your review soon!
@user-bw6oi5mf9y
@user-bw6oi5mf9y 1 month ago
I think the main problem is how they traverse the graph by asking the LLM at each step. I'm not sure if this is feasible in production.
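For illustration only, a rough Python sketch of the pattern the comment describes: one LLM call per hop to decide where to go next in the graph. The names here (call_llm, pick_next, traverse, the toy graph dict) are placeholders I'm assuming, not GraphRAG's actual API.

def call_llm(prompt: str) -> str:
    # Placeholder: swap in a real client (OpenAI, Ollama, etc.).
    # Returning "NONE" just keeps the sketch runnable without a model.
    return "NONE"

def pick_next(question: str, node: str, neighbors: list) -> str:
    # One model round-trip per hop: ask which neighbor is most relevant.
    prompt = (
        f"Question: {question}\n"
        f"Current entity: {node}\n"
        f"Neighbors: {', '.join(neighbors)}\n"
        "Reply with the single most relevant neighbor, or NONE."
    )
    answer = call_llm(prompt).strip()
    return "" if answer == "NONE" else answer

def traverse(graph: dict, start: str, question: str, max_hops: int = 5) -> list:
    # Cost and latency grow linearly with the number of hops,
    # which is exactly the production concern raised above.
    path, node = [start], start
    for _ in range(max_hops):
        nxt = pick_next(question, node, graph.get(node, []))
        if not nxt:
            break
        path.append(nxt)
        node = nxt
    return path

Caching or batching hop decisions would be one way to cut the per-query call count.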
@sasha297603ha
@sasha297603ha 1 month ago
Great paper, thanks for covering!
@user-wr4yl7tx3w
@user-wr4yl7tx3w 1 month ago
Excellent content. Well explained.
@PreetiGuptaAvril
@PreetiGuptaAvril 1 month ago
Sir, do you explore the vision domain as well, or do you know of any such YouTuber whom I can follow for paper understanding?
@themax2go
@themax2go 1 month ago
very well "ragged"... both on the local domain (details) and global domain (overview of pros-cons) 😉😎
@wayneqwele8847
@wayneqwele8847 1 month ago
Thank you for the video, that was a great paper to go through. I find RAG research techniques offer so much insight into how we can develop and identify the cognitive impediments to our own judgement. The Comprehensiveness, Diversity of perspective, Empowerment and Directness criteria are such a good mental model to use in our own human judgement.
@fintech1378
@fintech1378 1 month ago
super excellent video
@goelnikhils
@goelnikhils 1 month ago
Amazing explanation Vivek
@awakenwithoutcoffee
@awakenwithoutcoffee 1 month ago
Great presentation Vivek. Some questions:
- Is GraphRAG production-ready? If not, would it be difficult to upgrade RAG methods once we are in production?
- Is there a RAG provider/stack that you prefer? (DataStax, Pinecone, Weaviate + a bunch of others who are all competing for attention)
- What are your thoughts on LangChain vs LangGraph?
@christopherconyers767
@christopherconyers767 1 month ago
Awesome review - thanks for the great work!
@brandonheaton6197
@brandonheaton6197 1 month ago
Can you pontificate on the combination of upcoming transformer inference ASICs with deep agentic workflows employing GraphRAG-style strategies? Seems like we will be close to our personal assistants writing a PhD thesis in the background whenever we ask a question. SOHU is reporting 500,000 tokens per second with Llama3 70B....
@RoulDukeGonzo
@RoulDukeGonzo 1 month ago
Seems clear that for 'current events' RAG is going to win, but for broader, domain-specific themes or logic, how does fine-tuning stack up? E.g. create code using our internal suite of APIs... If context is big enough, ICL should be fine, but RAG may miss some key docs based on semantic similarity alone... I guess... I should write a paper 😂
@sasha297603ha
@sasha297603ha 1 month ago
Very interesting papers, thanks for covering!
@stevenwatson2927
@stevenwatson2927 1 month ago
It's surprising to see ChatGPT achieving below 99% when Wolfram Alpha can basically answer anything just by having specific knowledge. It's also surprising that "playing" with the prompt wording does anything at all, let alone gives a better result. It makes no sense, especially when we can clearly see from the research that the information entropy is basically the same between prompts with and without extra steps.
@therobotocracy
@therobotocracy 1 month ago
Is it flattening out because it maxes out at 100%?
@VivekHaldar
@VivekHaldar 1 month ago
Yes, that too! People have started looking at harder benchmarks like GSM8k-Hard and MATH.
@karinlv890
@karinlv890 2 months ago
Thank you for saving my group meeting! Your video helps a lot!
@wanfuse
@wanfuse 2 months ago
Wouldn't it cut to the chase to train an LLM on your own data? There's your graph. Use one of these:
- OpenAI's GPT-3/4
- Hugging Face Transformers (e.g., GPT-2, GPT-3 via third-party providers)
- Google's T5 (Text-to-Text Transfer Transformer)
- Meta's BART and BlenderBot
- Anthropic's Claude
Every week, update the LLM. Summarization is the death of real data; better off with one level of summarization? Just a thought!
@mccleod6235
@mccleod6235 2 months ago
Maybe you don't want to send all your valuable business data to third party companies.
@wanfuse
@wanfuse 2 months ago
@mccleod6235 That's true, but it's not necessary; there are open-source models you can train air-gapped on a Jetson.
@bohnohboh676
@bohnohboh676 2 months ago
"Every week update the LLM"? Yeah, no way, unless you have tons of cash, compute, and time.
@wanfuse
@wanfuse 2 months ago
Maybe, maybe not, I'll let you know! You're probably right, we'll see if my idea pans out.
@rafikyahia7100
@rafikyahia7100 2 months ago
Excellent content summarizing cutting edge approaches, thank you!
@sasha297603ha
@sasha297603ha 2 months ago
Very interesting paper! Looks like a team-lead model and a bunch of juniors 😅 Thanks for covering!
@christopherd.winnan8701
@christopherd.winnan8701 2 months ago
Are there any models where we can try this think-and-execute method for ourselves?
@VivekHaldar
@VivekHaldar 2 months ago
As described in the paper, the authors tried it with GPT-3.5 and Llama. The prompts are in the paper; you could try it with any LLM of your choice.
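For anyone who wants to experiment, here is a bare-bones Python sketch of the two-stage idea. The prompt wording is paraphrased, not the paper's exact prompts, and call_llm is a placeholder for whichever client you use.

def call_llm(prompt: str) -> str:
    # Plug in your LLM client of choice here.
    raise NotImplementedError("wire up an LLM client")

def think_and_execute(task_description: str, instance: str) -> str:
    # Stage 1 ("think"): elicit task-level pseudocode, once per task.
    pseudocode = call_llm(
        f"Task: {task_description}\n"
        "Write concise pseudocode that solves any instance of this task."
    )
    # Stage 2 ("execute"): have the model simulate that pseudocode on the instance.
    return call_llm(
        f"Pseudocode:\n{pseudocode}\n\n"
        f"Input: {instance}\n"
        "Execute the pseudocode step by step on this input and state the final answer."
    )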
@vida91963
@vida91963 2 months ago
Nice presentation, thank you!
@jordycollingwood
@jordycollingwood 2 months ago
Really great explanation. I'm currently struggling to decide on my own KG structure for a corpus of 2,000 medical PDFs, so this was very helpful.
@awakenwithoutcoffee
@awakenwithoutcoffee 1 month ago
Same here brother. There are so many techniques, every day I learn something new, which is both good and terrifying ha. What stack are you thinking of using? We are researching DataStax, Pinecone, Weaviate and are learning to build agents with LangGraph.
@kaixiliu7469
@kaixiliu7469 2 months ago
Thanks for sharing the review Vivek! Would you mind sharing your book list as well?
@VivekHaldar
@VivekHaldar 2 months ago
Hey Kaixi! Don't have an explicit list, just pick up what looks interesting at the time... :-)
@btscheung
@btscheung 2 months ago
Really appreciate your in-depth review of the book! This will make for more thoughtful reading when I start the book.
@thankqwerty
@thankqwerty 2 months ago
Thanks for sharing the paper. In my experience using Llama3-8B on my benchmark dataset, I noticed that the LLM has learned an incorrect fact, or one in contradiction with my application. I tried to clarify that in the prompt, but noticed the LLM is actually quite stubborn, which leads to quite fragile responses, i.e. the LLM sometimes gets it right and sometimes gets it wrong with minimal changes in the prompt, which could be as small as adding spaces. I wonder if you have come across a similar situation or papers that discuss this behavior. Thanks.
@VivekHaldar
@VivekHaldar 2 months ago
Yes, that kind of brittleness is a common issue, unfortunately.
@harivarsha4016
@harivarsha4016 2 months ago
I love this kind of content, please never stop!!!
@atomwalk
@atomwalk 2 months ago
Awesome work! Thanks🤗
@user-wr4yl7tx3w
@user-wr4yl7tx3w 2 months ago
More agent papers please. Thanks 😊
@willtipton1698
@willtipton1698 2 months ago
Nice video ty
@colinwar
@colinwar 3 months ago
You ask vanilla questions if you can't un-cloak a machine response! The reasoning is not there with language models; how stupid are people to not be able to ask the right questions? I call lies on these claims. Show the test as proof; I doubt you can or will show the actual test. This is absurd.
@gilinachum
@gilinachum 3 months ago
But why is the paper's fine-tuning different from the original pre-training and alignment fine-tuning that came before it? All expose the model to a mix of existing and new data...
@VivekHaldar
@VivekHaldar 2 months ago
You are correct -- in principle fine-tuning works the same way as pre-training (updating weights), so FT can be thought of as continued PT. The difference is in the data used. One will fine-tune when one has a domain-specific set of data that's very different from the pre-training data.
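A toy PyTorch sketch of that point: the weight-update loop is the same for pre-training and fine-tuning; only the batches fed to it change. Names like web_scale_batches and domain_batches are purely illustrative.

import torch
import torch.nn.functional as F

def train(model: torch.nn.Module, batches, lr: float = 1e-4):
    # The same loop serves both phases; nothing here knows which one it is.
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for inputs, targets in batches:
        loss = F.cross_entropy(model(inputs), targets)
        loss.backward()
        opt.step()
        opt.zero_grad()

# Pre-training: broad, general data.
# train(model, web_scale_batches)
# Fine-tuning (continued pre-training): narrow, domain-specific data.
# train(model, domain_batches)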
@hosseinmohammadi4574
@hosseinmohammadi4574 3 months ago
Interesting! Tnx
@sasha297603ha
@sasha297603ha 3 months ago
Very interesting paper, thanks for covering!
@HampusAhlgren
@HampusAhlgren 3 months ago
Just wanted to say I really appreciate your videos. Everything is short and concise and I love that you’re always using papers as the foundation for the conclusions. Keep it up!
@VivekHaldar
@VivekHaldar 3 months ago
Thanks for the kind words. That's the idea!
@dennyoviedo4102
@dennyoviedo4102 3 months ago
Good brother 😊 thanks for an excellent explanation, peer-to-peer of the BTC formula. I'll eat this info into my brain 🧠 until my neurons start a new circuit. 😂