How To Build Generative AI Models Like OpenAI's Sora

79K views

If you read articles about companies like OpenAI and Anthropic training foundation models, it would be natural to assume that if you don’t have a billion dollars or the resources of a large company, you can’t train your own foundational models. But the opposite is true.
In this episode of the Lightcone Podcast, we discuss the strategies to build a foundational model from scratch in less than 3 months with examples of YC companies doing just that. We also get an exclusive look at OpenAI's Sora!
Read more about the YC AI companies from this episode on our blog: www.ycombinator.com/blog/buil...
Chapters (Powered by bit.ly/chapterme-yc) -
00:00 - Coming Up
01:13 - Sora Videos
05:05 - How Sora works under the hood?
08:19 - How expensive is it to generate videos vs. texts?
10:01 - Infinity AI
11:23 - Sync Labs
13:41 - Sonauto
15:44 - Metalware
17:40 - Guide Labs
19:29 - Phind
24:21 - Diffuse Bio
25:36 - Piramidal
27:15 - K-Scale Labs
28:58 - DraftAid
30:38 - Playground
33:20 - Outro

Science

Published: Jun 18, 2024

Comments: 88

@chapterme 2 months ago

Chapters (Powered by ChapterMe) -
00:00 - Coming Up
00:49 - Intro: Generative AI for Video
01:13 - Sora Videos
05:05 - How Sora works under the hood?
08:19 - How expensive is it to generate videos vs. texts?
08:55 - How do YC companies build foundation models with just $500K?
10:01 - Demos: Infinity AI
11:23 - Sync Labs' hack to train a Lip Sync Model with a single A100 GPU
12:45 - YC deal with Azure
13:41 - How Sonauto Built a Text-to-Song Model
15:44 - Metalware: Hardware Co-Pilot
17:40 - Guide Labs: Explainable Foundation Model
18:20 - Building your own models vs. Using existing open source models
19:29 - Phind's Clever Hack: Synthetic Data
22:03 - Simulating real-world physics: Atmo (Foundational model for weather prediction)
24:21 - AI in Biology: Diffuse Bio
25:36 - Piramidal: Foundational model for the human brain
27:15 - AI in Robotics: K-Scale Labs
28:58 - DraftAid: AI Models for CAD Design
30:38 - Playground going against giants and Suhail Doshi Background
31:42 - Companies pivoting into AI
32:44 - Takeaway Message
33:20 - Outro

@theniii 2 months ago

All you're really saying here is that people can build any foundational models as long as openai doesn't also do it. That's not very reassuring to hear. We started with words, now pictures and videos, why would anyone not expect music, robotics, hardware etc down the line?

@BrianMPrime 2 months ago

The lipsynching on Tim Ferriss looked way off. There was a bit of an uncanny valley with the deepfake switchover as well.

@danielmarco7863 2 months ago

This is definitely a launched product that the founders are embarrassed by. In the sense that they understand this is not representative of the final product, which many will suggest is indicative of the proper time to launch. Definitely applying the "law of papers" to my understanding of the state of the art video generation.

@jks234 2 months ago

Interestingly... the podcast's lipsyncing is also a bit off already. So perhaps it's just an audio sync issue.

@joythought 2 months ago

@@jks234 yes, YT is terrible for lipsync at times so probably best to download the episode and then watch as a local copy to have some hope of seeing it the way they saw the demo.

@BrianMPrime 2 months ago

@@danielmarco7863 I appreciate that attitude towards building, kudos to the team for launching early!

@juanortega7509 2 months ago

I've been waiting for a new episode for weeks!! Thanks for the content guys!

@jks234 2 months ago

20:15 I personally find the concept of synthetic data to be a fascinating spur for more neuroscientific research. People dream about what they study and are constantly reviewing problems they are working on in their head. In other words, I feel that humans use simulations in their own mind to build out the models they use to understand their world. We might be able to think of this as "generating 1000x more data" than can be directly extracted from the real world. Another example of this that was done to awesome effect is AlphaGo's self-play training.

@andybrice2711 2 months ago

I would maybe argue Synthetic Data isn't inherently circular, it's just inverted. Whenever you've got a transformation which is easy in one direction, but difficult in the other. Synthetic Data is a sensible approach. Like it's easy to rasterize vector graphics, but it's more difficult to vectorize raster graphics.
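To illustrate the "easy in one direction" point above, here is a minimal Python sketch, not anything shown in the video. It generates (raster, vector) training pairs by running the easy direction, rasterizing random line segments; a model could then be trained on those pairs to learn the hard direction, vectorizing. The function names `rasterize_segment` and `make_pairs`, the 16x16 grid, and the restriction to single line segments are all assumptions made for illustration.

```python
import random


def rasterize_segment(x0, y0, x1, y1, size=16):
    """Easy direction: draw a line segment onto a size x size binary grid."""
    grid = [[0] * size for _ in range(size)]
    # Sample one point per pixel step along the longer axis.
    steps = max(abs(x1 - x0), abs(y1 - y0), 1)
    for i in range(steps + 1):
        t = i / steps
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        grid[y][x] = 1
    return grid


def make_pairs(n, size=16, seed=0):
    """Build n synthetic (raster, vector) pairs.

    The raster is the model input; the vector (x0, y0, x1, y1) is the label
    for the hard direction (hypothetical setup, for illustration only).
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        segment = tuple(rng.randrange(size) for _ in range(4))
        pairs.append((rasterize_segment(*segment, size=size), segment))
    return pairs


pairs = make_pairs(1000)
```

Because the label is known before the input is produced, every pair is correct by construction, which is why this setup is inverted and not circular.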

@avi2125 2 months ago

The text/prompt for the video was quite detailed and informational. Even as a bad programmer I was able to mentally construct an algorithm for a video on the fly...maybe I have to watch this podcast more than the first 5 mins to understand why Sora etc. is a big deal...

@alejandroVigano 2 months ago

Thanks for sharing these talks!

@atchutram9894 2 months ago

11:40 Hindi demo is perfect. My first language is not Hindi but I can definitely tell it is a great translation.

@GusKesaranond 2 months ago

Thank you so much!!!!!!!!!!!!

@alicapwn 2 months ago

They didn’t source robotics papers for Sora’s architecture. They combined Diffusion Transformers (developed by Peebles) with the video diffusion methods released by Stability/Google/Meta/Nvidia.

@DiasporaPay 2 months ago

This is awesome thanks!

@fil4dworldcomo623 2 months ago

I think Sora is better positioned on imagining a new world and totally a different world than to simulate our perception of what the world is and what the world was.

@drgoldenpants 2 months ago

Are there links to the sora videos they are showing?

@bahlechonco211 2 months ago

Great insight

@awesomeo4510 2 months ago

Yes but how do you find the datasets to train for new foundational models? Like their EEG example - how did she acquire this data to train the models?

@LuisPerez-uh9ik 2 months ago

Just take it!

@joythought 2 months ago

Isn't she an expert in the field with papers published in Nature? If so, she has the data. If you want similar data you need to partner with researchers.

@minc33 2 months ago

Where there’s a will, there’s a way!

@FunwithBlender 2 months ago

Alibaba is also doing some interesting things with AI video, we (open source community) have almost destructured the process.

@vikalpjain1098 2 months ago

At 4:17 to 4:20 in one of the column one ladder joint got added.

@kog0824 2 months ago

17:20 This seems an interesting approach… but sorry, I am new to this AI space: what does it mean to build its own foundation model but with gpt2.5? Does it mean it fine-tunes gpt2.5 with its own data?

@fortunefubara1244 2 months ago

Yes.

@jess-e 2 months ago

Who can share the papers which are necessary to get to a level of understanding that is actionable? As explained in the video :)

@AdityaVG10 2 months ago

I have been looking for those papers! Tell me if you get some.

@AfeezAbdulAziz 2 months ago

@@AdityaVG10 me too! I’m still finding out about this

@xilluminati 2 months ago

̶f̶ i̶r̶s̶t̶…. no… early adopter

@pandainvestingco 2 months ago

😂

@sgdfly8715 2 months ago

An idea that anyone can take (though it might already exist): Use AI to help recreate crime scenes and make recommendations on what data might help better understand and solve cases. The ideal solution would be able to use data from other cases in order to improve recommendations.

@pandainvestingco 2 months ago

I love this series

@sergismael 2 months ago

best episode so far.

@raymond_luxury_yacht 2 months ago

interesting that raytracing in games might be done and games will be diffused not rendered

@reza2kn 2 months ago

I appreciate the show and encouraging people to go for it, and I get hyping up the early YC-backed products, but the first couple weren't even super impressive by March 2024 standards, let alone being "the best thing" on the market. I'm not bashing any of the products and I hope they do awesome, I'm just saying these are not at all good examples of "the best we have right now", and is discouraging to hear from you guys. @ 11:42 The lip sync is completely off. This while perfecting lip sync motion was already accomplished last year. @15:40: Check out Suno AI v3. That's like GPT-4 compared to GPT-2 (what you showed here)

@LuisPerez-uh9ik 2 months ago

They also are young founders. Looks to me like they are pushing this to encourage ai in yc

@Alice8000 2 months ago

NICE VIDEO MY FRIENDS

@Authormatthewtaylordotcom 2 months ago

Thanks for sharing! Love the content. Any great repositories for the latest academic papers/journals to read up on as mentioned near the end?

@FunwithBlender 2 months ago

Respectfully, Stable Diffusion is way better than anything else. To act like Midjourney or Playground is better is to not understand the flexibility and creativity you have with Stable Diffusion. It can combine with ControlNet, there is a massive community on Civitai with LoRA, textual inversion etc., and there are a thousand things you can do, from Deforum to you name it. Stable Diffusion is the only model that can give you precision when needed if you know how to use it. Yes, it's more complex, but it is the best model.

@Affableluckyvlogs 1 month ago

Great ❤

@FunwithBlender 2 months ago

the lipsync has some better open source free solutions but still cool

@rodi4850 2 months ago

4:47 there's tons of videos of the golden gate in 360 - gaussian splatting can do it much better 😁

@george_davituri 26 days ago

It's mind-blowing how new graduates create AI-driven products without even 10 years of experience and research in ML, and without spending a decent amount of cash. Fascinating!

@-Jason-L 25 days ago

I don't think it is safe to assume the devs creating these are junior, fresh BS grads

@george_davituri 25 days ago

@@-Jason-L a bachelor’s degree takes about 4 years and the average age on completion is about 22 or 23, so by experience and age they count as juniors. At the same time, we should not call them juniors after they created such a stunning product

@JohnSmith-he5xg 2 months ago

12:40 Really burying the lede here to the question "How are YC companies able to create these models with only $500k?" We arranged for free compute with MSFT (she didn't say how much, but said hundreds of times more than they'd get otherwise)

@adiveena 2 months ago

How to work this type startup

@rcstann 2 months ago

¹1¹! It's "Sam" day in the Bay area.

@shallindurani 2 months ago

I wonder what the dog thinks about him lol

@fanaccount6600 2 months ago

why is that cup on the ground instead of being on the table?!

@vslaykovsky 2 months ago

this is an AI-generated video, that's why

@swaggitypigfig8413 2 months ago

So they can grab it with their toes and fling it towards each other as a conflict resolution technique.

@saravanashanmukham6108 2 months ago

Inspiring to know AI barrier can be overcome without a PhD in ML/AI. Thanks guys!

@AM-kx4ue 2 months ago

Hi everyone, I'm exploring how startups are balancing AI model training with customer data privacy, especially in competitive industries where data can make a difference against competitors. If you have insights or experiences to share on anonymization techniques, federated learning, differential privacy, or service models with privacy tiers, I'd love to hear from you. Let's discuss this further and exchange strategies for responsible AI development.

@FunwithBlender 2 months ago

Okay I am sold on YC lol, will submit my application. Access to GPUs for fine-tuning is valuable

@harshitgauravtiwari 2 months ago

What if this video also is ai generated

@harshitgauravtiwari 2 months ago

Omg I am the first to comment. I have a startup in semiconductors. Hope to meet with Y Combinator someday 😊

@john-kv7kl 2 months ago

bruh it is ai generated. 10:33

@joythought 2 months ago

This comment is AI generated.

@learn.ai99 2 months ago

No way u dont kno who that is 11:53

@GigaFro 2 months ago

Seeing one example of the generated spelling being correct or even a few does not mean there was any advancement in this area...

@shrawanthakur4168 2 months ago

It’s just the start of the AI and a lot of Sci-Fi things becoming real.

@jks234 2 months ago

15:04 memeworthy clip

@nischalnayak391 2 months ago

Great! I watched this video to realise I need millions in free credits to build a foundational model for free

@pauldannelachica2388 2 months ago

❤❤❤❤❤❤

@0x0michael 2 months ago

What sora imagined was a single-laned residential street, lots more space for trees, gardening, walking and for neighborhood activities. Cars move one-way in from one direction and out in the opposite.

@FunwithBlender 2 months ago

I hope playground wins though the more competition the better

@jeffsteyn7174 2 months ago

Looking down on synthetic data makes no sense. Models like Orca were built on synthetic data and outperform models 10x their size.

@gunaysoni6792 2 months ago

The models you showcased today aren't really "foundational models" (at least in the way the term is currently used), and a lot of what you show isn't super new. Saying that you don't need a lot of GPUs to compete is very misleading.

@Cygx 2 months ago

Feels like I’m sitting in listening to the four smartest kids in my class XD

@Alice8000 2 months ago

I hope you guys are very successful so you can buy some furniture! lol jk bro. just a prank bro.

@vincentwady 2 months ago

Let’s push 100% AI to the market. There should not be a single human needed for a corporation after that.

@Mooohbroadcast 2 months ago

Thanks for sharing one more useless hype. You jumped from blockchain to crypto, NFT, and finally to AI. You should change your brand name to Y Hype 💩

@kamal_pratap 2 months ago

the hell?

@rodi4850 2 months ago

A guy not speaking Hindi gives his opinion on a lip sync model speaking Hindi 😂

@alexanikiev 2 months ago

This comment alone is a “great” example of stereotypical thinking. The problem is that we are already living in the 21st century and people speaking 3-4 languages on a daily basis is pretty expected 8)

@GatherVerse 2 months ago

If you really want to add value to this podcast why not add a black person to the conversation? We recommend Christopher Lafayette. He's in the Valley and can contribute well to this conversation and draws an audience. Else, find someone else, but consider the upside to this. Thanks.

@tf_9047 2 months ago

AI, even at current levels of capability, is far too dangerous to our society to be released to startups or governments or businesses or the public. We need startups to tackle the safety of these models at a more aggressive rate than capabilities advances.

@ashleigh3021 2 months ago

People limiting AI are extremely dangerous. We need rule of law to tackle Luddism in the public and protect technology from ignorance.

@joythought 2 months ago

Seriously, how would a startup solve the alignment problem when that is out of their hands? Better for them to do new things building new models. The great thing about human agency is it's almost unstoppable. The great thing about AI agency is it can be switched off. Anyone fearing the rise of the machines has no idea how much power that is going to draw. Simple enough to switch off at the mains.