IndyDevDan

187
753 716

On this channel we think, plan, and build.

Right now on the channel, we're on the path to evolve into Agentic engineers.
Engineers that build software that works for them while they sleep.

Here are the principles of my engineering philosophy and the ideologies this channel holds as facts.

- Avoid hype, focus on real valuable tools, technology and products.
- Build real products. There are enough news, hype, trend channels. Let's make something real.
- Listen to feedback but always think for yourself.
- Use the best technology for the job. Period. Full stop.
- Keep learning, forever.
- Do what you share. I don't share anything on this channel I'm not betting on with my time, energy and money.
- Cancel out the noise and focus on the signal of value creation.
- Great things happen in the flow. Look for it every day.

I'm not the perfect programmer, designer, or creator but to succeed you don't have to be perfect, you just have to try, over and over in success or failure.

Engineer your Prompt Library: Marimo Notebooks with o1-mini, Claude, Gemini

26:43

1 day ago

"We were right" - How to use o1-preview and o1-mini REASONING models

33:02

14 days ago

SECRET SAUCE of AI Coding? AI Devlog with Aider, Cursor, Bun and Notion

36:36

21 days ago

Cursor is great BUT Aider is the OG AI Coding King (Mermaid Diagram AI Agent)

22:06

1 month ago

Why fine-tune LLMs? GPT-4o fine-tune for PERFECT FLUX Image Prompts

20:58

1 month ago

Cursor Composer: MULTI-FILE AI Coding for engineers that SHIP

24:11

1 month ago

Coding RELIABLE AI Agents: Legit Structured Outputs Use Cases (Strawberry Agent?)

11:14

1 month ago

CONTROL your Personal AI Assistant with GPT-4o mini & ElevenLabs (AI TTS & STT)

20:13

1 month ago

BEST Prompt Format: Markdown, XML, or Raw? CONFIRMED on Llama 3.1 & Promptfoo

22:21

2 months ago

GPT-4o mini Prompt Chain: Legit TRICK for DIRT CHEAP AI with SOTA Accuracy

18:54

2 months ago

Fusion Chain: NEED the BEST Prompt Results at ANY COST? Watch this…

19:35

2 months ago

AI Coding Devlog - Aider ON Sonnet 3.5 - CURATE your ELITE information DIET

26:43

2 months ago

USEFUL Agentic Workflow: AUTO-Updating Blog with Claude 3.5 Sonnet

18:27

3 months ago

When to use Prompt Chains. DITCHING LangChain. ALL HAIL Claude 3.5 Sonnet

22:29

3 months ago

MASTER the Prompt: TOP 5 Elements for Reusable Prompts, AI Agents, Agentic Workflows

15:47

3 months ago

AI Coding Tool Breakdown: AI Copilots vs AI Coding Assistants vs AI Software Engineers

16:01

3 months ago

ALL ROADS LEAD to AI CODING: Cursor, Aider in the browser, Multi file Prompting

20:07

4 months ago

From Prompts to Products: Four KEY Pillars to MAX your GenAI OUTPUT

13:28

4 months ago

No, ChatGPT SKY is NOT an AI Assistant: How to LEVERAGE GPT-4o, GenAI, and Gemini

18:15

4 months ago

Llama-3 70b OMNI-complete: AUTO Improving AUTOcomplete Prompt for EVERYTHING (Groq)

17:51

4 months ago

GPT-5 is coming: 3 ways to prepare for a 100x improvement in SOTA LLMs

16:36

4 months ago

ZERO Cost AI Agents: Are ELMs ready for your prompts? (Llama3, Ollama, Promptfoo, BUN)

21:37

5 months ago

Two-Way Prompts: SIMPLIFY your AI Agents, Agentic Workflows, Personal AI Assistant

12:57

5 months ago

MOST Important AGENTIC Application - Speech to Text to AI Agents (TTS, STT, LLM Router)

17:55

5 months ago

Agent OS: LLM OS Micro Architecture for Composable, Reusable AI Agents

20:25

5 months ago

7 Prompt Chains for Decision Making, Self Correcting, Reliable AI Agents

23:09

6 months ago

Top 3 BEST AI Coding Assistant ONE SHOT Prompts (Claude 3 Opus + Cursor)

16:31

6 months ago

Do companies NEED software engineers? Let's talk Devin, Layoffs, AI Coding Assistants.

17:01

6 months ago

Last LLM Standing WINS: Groq LPU - Anthropic OPUS - OpenAI - Gemini Pro - LLM Benchmarks

16:39

6 months ago

Comments

@pubfixture 22 hours ago

4-way gold medal of 7 contestants means you need harder questions at the top end to separate them out.

@shockwavemasta 1 day ago

Thanks for continuing this series - it's been super helpful

@davidpower3102 3 days ago

I found it hard to understand how you benched the models. Was this mostly down to personal opinion? Maybe you might explain your tests before discussing the results. Your test tooling looks really nice!

@indydevdan 2 days ago

100% personal opinion and vibes. I use promptfoo for more hands on assertion based testing. This notebook is more about understanding what the models can do at a high level.

@CheekoVids 4 days ago

I know you don't do much model training on this channel. But have you considered training some of the local models on your good test results and then seeing how the refined models perform?

@wedding_photography 4 days ago

12:55 you completely missed that llama3.2:1b failed at SQL. It's missing authed=TRUE.

@indydevdan 2 days ago

nice catch
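
The miss discussed in this thread (a generated query lacking `authed = TRUE`) is the kind of error an assertion-based check can catch automatically. Below is a minimal sketch in Python, not the notebook's or promptfoo's actual implementation. The helper names and the two example queries are hypothetical, chosen only for illustration.

```python
import re


def normalize_sql(sql: str) -> str:
    """Lowercase and collapse whitespace so the check ignores formatting."""
    return re.sub(r"\s+", " ", sql.strip().lower())


def has_condition(sql: str, column: str, value: str) -> bool:
    """Return True if the query contains `column = value`, ignoring spacing and case."""
    pattern = rf"\b{re.escape(column.lower())}\s*=\s*{re.escape(value.lower())}\b"
    return re.search(pattern, normalize_sql(sql)) is not None


# Hypothetical model outputs, for illustration only.
complete = "SELECT * FROM users WHERE age > 21 AND authed = TRUE;"
incomplete = "SELECT * FROM users WHERE age > 21;"

print(has_condition(complete, "authed", "TRUE"))    # True
print(has_condition(incomplete, "authed", "TRUE"))  # False
```

A string-level check like this only confirms that the condition appears in the text. It does not verify that the query is valid SQL or returns the right rows.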

@wedding_photography 4 days ago

"ping" is the dumbest test I have seen. Go tell random people "ping" and see what they respond with.

@DanielBowne 4 days ago

Hands down the best local model I have seen for function/tool calling.

@husanaaulia4717 4 days ago

Doesn't qwen2.5 have a 3B parameter model?

@Jason-ju7df 4 days ago

I wish that you put the model parameter sizes in the video description. Makes it easier to really give weight to your comparisons when you're comparing a 1B model to a 7B model

@johnkintree763 4 days ago

Thanks for including generation of SQL queries among the tested tasks. The ability of models to interface with databases is crucial.

@ariramkilowan8051 4 days ago

Would be cool to test image understanding. Basic OCR to start with, then counting objects and doing reasoning over the images. LLM providers often tell us what their models can't do, or can't do well. Using that info as a signal of improvement would be very useful IMHO. Better still, you can use code to check exactly how correct each model is. This can be harder when dealing with text, where you need a human judge or an LLM as a judge (which then needs to be aligned with a human anyway). Also thanks for the video, I check in every Monday. Keep on keeping on. 👍

@zkiyyeller3525 4 days ago

THANK YOU! I really appreciate your honest testing and taking us along with you on this journey!

@peciHilux 4 days ago

Wow. nice. What I am missing is technical metrics for comparison, like response time, memory used to run the model...

@NLPprompter 4 days ago

I'm curious, you're using the 5k context of the default Ollama model, right?

@aerotheory 5 days ago

Lots of subs to be had in the SLM area, so many edge cases. Try 70b_q4 compared to 8b models.

@matthewjfoster1 5 days ago

good video, thanks!

@Canna_Science_and_Technology 5 days ago

Llama 3.2 hallucinates really badly.

@zakkyang6476 5 days ago

Interesting project. Since I am a lazy person, I will use another LLM to score the output each time rather than doing it manually.

@DanInVirtualReality 5 days ago

Awesome! I'd love to see some FIM prompt tests on FIM-purposed models like deepseek coder - I had that as my 'auto complete copilot' in Twinny on VSCode before I moved to Cursor and I was impressed with how often it nailed it, for a really small model (no point in auto complete if it can't offer a completion quicker than my own brain and fingers! And I only have a lowly 1060 6GB 😅). It strikes me that FIM code completion could be a way to leverage those models' strengths in code reasoning, which could outperform natural language reasoning in instruct-tuned models of a similar size, e.g. a logical setup and a logical next step presented as code with a FIM request on the intermediate action... I'm thinking of tool choice in particular for my own use case. All the assistant demo scripts I've seen show picking between, like, five tools max, which is not a realistically sized toolbox for an assistant. Keyword-based context-stuffing would help of course, or RAG techniques on the tools and their descriptions, but timeliness may prevent that in practice. I can imagine code with concisely named and typed functions declared, and a comment describing the purpose of the next step - that should perform well with these, I suspect. I just haven't got to the experiment yet 😄 The main benefit is it would address your particular displeasure with extraneous explanation - if present at all it would be commented-out code or, at worst, print/log statements.

@samsara2024 5 days ago

Thanks for the video! Could you make a tutorial in which a local installation of Llama learns from the chats you have with the AI? I mean you just talk and somehow it stores this information internally and doesn't lose it when you shut down the computer.

@prozacsf84 5 days ago

Bro, it's useless to compare without o1-preview. It is many times better

@indydevdan 2 days ago

This was a local model focused test. o1-preview would score 100% on these tests, nothing to learn there.

@prozacsf84 2 days ago

@indydevdan gpt-4o is local?

@billybob9247 5 days ago

What quantization sizes were you using for the models? Love your channel! Keep it coming!!!

@DARKSXIDE 5 days ago

Maybe see how they perform with Anthropic's new contextual RAG. Then we can download devdocs and make even the SLMs smarter for coding.

@amitkot 5 days ago

Great comparison, thanks for making this! I'm off to compare qwen2.5:latest with qwen2.5-coder:latest.

@indydevdan 2 days ago

Thank you! Qwen2.5 was the real shocker here. When qwen 3 hits - it's prime time for on device models.

@billydoughty7243 5 days ago

@IndyDevDan - you da man, dan. experienced engineers can appreciate your methodology and the value of your content and the tools you create. inexperienced engineers can learn the value of a methodical, structured approach to software development, which includes analyzing, comparing, and building tools to maximize your productivity. great videos. keep 'em coming.

@techfren 5 days ago

Thank you for continuing to post great content

@DARKSXIDE 5 days ago

u too techfren you guys both rock! big fan of both channels !

@vitalis 5 days ago

Check out Molmo then

@acllhes 5 days ago

What happened to your ai personal assistant?

@indydevdan 2 days ago

We've been waiting for the realtime_api 🚀

@albertwang5974 5 days ago

What an inspiring video!

@amitkot 6 days ago

Thanks for sharing this video!

@adamviaja 7 days ago

I'm pretty new to coding and I'm def a little lost in this video but I'll have to come back as I learn more!

@BA-ve7xp 10 days ago

Is it possible to do this without Cursor? I think I saw a video about VS Code plugins for Aider, Continue, and Claude Dev. I'm new to all of this so I appreciate any pointers.

@JannisSchulze-xz7um 10 days ago

HOW HARD IS IT TO COPY PASTE A PROMPT INTO THE VIDEO DESCRIPTION :(

@saaaashaaaaa 10 days ago

Just pivot it so you can push it and host it on Vercel as a working frontend with styling ability

@---xu8kc 10 days ago

@IndyDevDan what is so special about duck database?

@user-pt1kj5uw3b 10 days ago

That Mr. Beast production document is literal gold. Tens of thousands of hours of expert advice perfectly distilled. He comes off like a dick but sometimes you need to be a dick to do something great.

@indydevdan 8 days ago

I 100% agree.

@blarvinius 10 days ago

I can't see, it's all too small. Please remember mobile viewing for your videos.

@RainerSt0ff 10 days ago

You're missing a point here, more lines doesn't equal better software. Performance of AI code tends to degrade quickly with length and complexity, especially if you want to build applications that are more advanced than a demo. Still, quite amazing how good LLMs have gotten, but the metric we measure shouldn't be the length of code produced

@BigFattyNat 10 days ago

Is this whole thing kinda not really but kinda really just built around what if your app was based on a slider?

@hamzaessahbaoui7053 11 days ago

It would be great if each response were saved with its prompt.

@indydevdan 8 days ago

Definitely coming in v2.

@ctcsys 11 days ago

But could it work as Aider front/IDE too? Yes I bet

@ctcsys 11 days ago

Sounds great, thanks

@paul310paul 11 days ago

Fantastic tool! Thank you so much!

@user-pt1kj5uw3b 12 days ago

This is pretty clean and really powerful, I could really see this catching on

@ginocote 12 days ago

This guy is so overexcited and goes so fast... how many times on average do you click pause in each video? > 10 > 25 > 50 > 100 > more 😂

@AgenticAI 11 days ago

You can adjust the speed to your liking.

@GiovanneAfonso 12 days ago

you made my day

@uhtexercises 12 days ago

Best stuff as always. Thank you so much for sharing

@egparker5 12 days ago

Mathematica does the same reactive style reference updates with Dynamic[] expressions. It also has lots of widgets etc., and the killer feature is true symbolic computation.

@dinoscheidt 12 days ago

Uff… another toy that pushes data scientists away from writing unit tests 😮‍💨 we made so much progress getting them into VSCode and its Notebooks feature with pytest right there

@akshayagrawal6755 12 days ago

marimo notebooks are stored as pure Python files. Cells can be named and used in other python files, and in this way they can be readily tested with pytest. More features to help with this are roadmapped.

@jindrichsirucek 12 days ago

Great content!! I was racking my brain over how to structure instructions, especially for meta prompting. At first I was thinking about JSON because of its unlimited nesting nature. Then I realized that XML might be better because of the problem with closing brackets. And then I realized the reason why XML is the best format is that LLMs are trained on websites - tudum tudum tudum tada - XML formatted content :D I kind of realized all those things on my own and I was thinking, why is nobody talking about it, and then 2 days later, boom - this video :D Thanks for the references - I'll study what others came up with, since I kinda reinvented the wheel on my own :D Thanks