Thank you for this nice getting-started video; I learned a lot from it. One question: did you write the function's JSON schema by hand, or did you use a function to generate it?
That was an interesting video about steganographic techniques. The first thing I thought of was the movie Sneakers, where they used a cassette tape as the steganographic device. The movie is 20+ years old at this point.
Yeah, it's valuable. I've just added one to my reading list. I don't think anyone else goes over papers in a quick-fire way like you just did, where similar (and dissimilar) papers are compared at a high level, with reasons why you picked them and some intuition for how they work.
This is a really interesting idea. With regards to guidance, do you know if anyone has tried to train models that predict, in a forward direction, the noise deltas you would otherwise get from pushing the gradient backwards? For example, you could train a model that takes the features from the "up" layers in Stable Diffusion and predicts a secondary noise delta to nudge the regular Stable Diffusion noise prediction in the right direction, i.e. estimate what the delta from the backpropagated gradient would need to be. I'm not sure that would actually save compute at inference compared to doing a backwards pass and holding the whole graph while you generate images, because I assume that model would need to be reasonably sized. But it might also allow larger steps per prediction than you'd get from a single backwards gradient pass. And it would remove the need for an internal RGB image stage at all, because you would only need that model during training. Although it would likely break the awesome part of this method, that it requires very few samples to get good results, at the cost of shifting work to the inference stage.
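To make the amortization idea concrete, here is a toy sketch of the training step I'm imagining. Everything here is hypothetical: the random arrays stand in for real "up"-block features and for guidance deltas that would normally come from a backward pass, and the predictor is just a linear map instead of a small network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: "features" from the UNet's up-blocks (random vectors)
# and "target deltas" that would normally come from backpropagating a guidance
# loss through the sampler. Shapes and names are illustrative only.
feat_dim, delta_dim, n = 16, 8, 256
W_true = rng.normal(size=(feat_dim, delta_dim))
features = rng.normal(size=(n, feat_dim))
target_deltas = features @ W_true  # pretend these are guidance gradients

# Amortized guidance: fit a forward model delta = features @ W, so inference
# needs one extra forward pass instead of a backward pass through the graph.
W = np.zeros((feat_dim, delta_dim))
lr = 0.01
for _ in range(2000):
    pred = features @ W
    grad = features.T @ (pred - target_deltas) / n  # gradient of MSE loss
    W -= lr * grad

err = np.mean((features @ W - target_deltas) ** 2)
```

In a real setup the targets would be precomputed (or computed on the fly) from the actual backward guidance pass, and the predictor would presumably be a small conv/MLP head rather than a linear map.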
Hello, I have some confusion about one topic in diffusion models. Could I get your contact info, Jonathan, like your email or any other way to reach you? I'm working on my final-year project and anything would help.
However, I have one small question about the overfitting part at the end of this video. Is the concern that the test set, translated into Japanese, might have been learned (or fine-tuned on) by the math 7B LLM?
Oh my god, man, you don't understand how happy I am about your storytelling of how things went in the timeline of developing the idea of model merging up to this point: where it started, how it went, and how people were reasoning about why it works, etc. I want to get into this so that I understand the main ideas and can start working on them as well, but it's so hard to get to the root of things; it requires a huge amount of time to read and digest everything and slowly put the pieces together. So boy, do I mean it when I say thank you!
Hi Johno, at the beginning you said you're somewhat skeptical of model merging. If I understand correctly, your criticism is only about iteratively merging toward a given goal, which leads to overfitting. Or are you skeptical of the general concept of model merging? Thanks!
That was an excellent overview of not just Sakana's evolutionary methods to identify good merge candidates, but also the popular techniques TIES, DARE and Passthrough/Frankenmerge. Appreciate it as usual, Johno!
I really appreciate both the content of the papers, as well as your work to make papers more approachable. I wonder if DSPy(-ish) syntax could be a way for papers to share their LLM algorithms in a standard/comparable way. Having a way to quickly compare approaches could help contextualize/understand new approaches. I am looking forward to a video specifically on DSPy if you make one. Based on your presentation I am going to give it a try.
Guy with the worst image quality ever explains the technique that produces the best image quality ever. Just a geek joke; thank you for such a nice presentation!
🎯 Key Takeaways for quick navigation (Note: HARPA AI mis-transcribed several paper names, e.g. "Orca" as "Ora"; corrected below where the intended name is clear):
02:05 📚 *The session is a paperathon: the goal is to collaboratively read and discuss various papers related to AI and machine learning.*
03:14 🧠 *The speaker outlines a general pipeline for training AI models, covering stages like data generation, pre-training, fine-tuning, alignment with human preferences, and model deployment.*
06:33 🤖 *The discussion shifts to the Orca paper, emphasizing teaching smaller language models to reason by using intermediate steps generated by a larger model.*
11:53 🌐 *Orca 2 builds on the original Orca paper by exploring improved training signals to enhance smaller language models' reasoning abilities, focusing on determining the most effective strategy for each task.*
15:29 🎓 *Orca 2 introduces task-specific system instructions, optimizing the model for various reasoning strategies tailored to different subtasks, aiming for a more versatile and effective chat model.*
17:36 📊 *Orca 2 demonstrates improved performance, surpassing other comparable models on various benchmarks, showcasing the effectiveness of its approach to diverse reasoning tasks.*
24:06 🧠 *Researchers generate synthetic data for diverse image-editing tasks, customizing examples per task using Llama 2 and various techniques.*
25:54 🖼️ *Emu Edit, a diffusion model, is designed to multitask, providing conditioning for different tasks while intelligently guessing the user's desired edit.*
28:57 🔄 *The approach of generating synthetic data for training models, as seen in Emu Edit, yields powerful editing capabilities, surpassing models that rely on less principled synthetic data generation.*
49:57 🧠 *The paper discusses the challenge of storing and training large language models with billions of parameters due to high memory requirements.*
51:08 🎯 *The proposed approach aims to reduce memory usage for fine-tuning large models, making it feasible even on a single 48 GB GPU.*
51:49 🛠️ *Low-rank adapters are introduced to train fewer parameters by applying a delta to the base weights, further minimizing memory overhead.*
52:30 📉 *Quantization is used to shrink the base model even more, achieving efficient memory usage while maintaining model performance.*
57:25 🚀 *The paper demonstrates that the proposed approach allows for training models with performance comparable to full fine-tuning but with significantly fewer resources and increased efficiency.*
[01:15:00](ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-YNOIyvUCpAs.html) 📑 *The paper introduces a prompt for information removal, ensuring unbiased context for answering questions.*
[01:20:52](ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-YNOIyvUCpAs.html) 🤖 *Zephyr is a paper discussing a recipe for training a high-performing chat model, focusing on fine-tuning and preference alignment.*
[01:22:03](ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-YNOIyvUCpAs.html) 🏆 *Zephyr achieves state-of-the-art performance on chat benchmarks, outperforming other models in the 7B-parameter setting.*
[01:26:43](ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-YNOIyvUCpAs.html) 🔄 *The combination of distilled supervised fine-tuning and Direct Preference Optimization yields the best-performing chat model.*
[01:34:34](ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-YNOIyvUCpAs.html) 🚀 *Direct Preference Optimization (DPO) solves the reward-maximization problem in a single stage of policy training, making it computationally lightweight and efficient.*
01:38:00 🔄 *DPO allows measuring the likelihood or perplexity of sequences, providing a way to evaluate the model's performance in generating sentences.*
01:41:07 📊 *The DPO loss function focuses on increasing the likelihood of good completions, and its effectiveness lies in updating the model based on preference-ranked pairs.*
01:42:14 💻 *DPO outperforms fine-tuning on preferences or the base model, offering a straightforward and efficient way to leverage preference data for model improvement.*
02:03:20 📸 *Contrastive learning involves mapping images to an embedding space, making similar images close and dissimilar images distant.*
02:05:32 🔄 *Self-supervised techniques include generating variants of an image and ensuring embeddings of similar images are close in the embedding space.*
02:08:06 🌐 *In self-supervised learning, invariance-based methods aim for similar embeddings for different views, while generative methods involve filling in gaps or completing parts of an image.*
02:16:15 🚀 *The proposed method outperforms other techniques, requiring less pre-training and achieving high scores with minimal labeled data on ImageNet.*
02:24:49 🤔 *The conditioning variable z in the joint embedding-predictive architecture specifies the positional information for predictions during training but is not used during inference.*
02:27:37 📊 *The z variable helps produce multiple outputs during training, allowing the model to generate a distribution of possible predictions.*
02:32:45 🌐 *Lucid Dreamer paper discussed, focusing on the Gaussian splatting technique for fast and efficient novel-view synthesis in 3D scenes.*
02:38:02 📄 *Lucid Dreamer's "High Fidelity Text to 3D Generation via Interval Score Matching" introduced, generating impressive 3D content from text prompts.*
02:39:33 🔍 *Score distillation sampling explained as a method using 2D models to optimize 3D scene representations, addressing challenges like oversaturation.*
02:46:08 🔄 *Interval Score Matching introduced, a technique for consistency in score matching, providing a cleaner signal for updating base representations in 3D models.*
02:48:12 🌐 *Point initialization using existing models like Point-E for the 3D Gaussians discussed, providing a better starting point for optimization.*
02:50:12 🎨 *Lucid Dreamer's applications include avatar generation, 3D editing, and impressive results in generating 3D scenes from various input types.*
02:51:59 🌐 *Mention of the ongoing innovation in the text-to-3D space, with Lucid Dreamer being one of many papers pushing the boundaries in generating high-quality 3D content from textual inputs.*
02:58:32 📊 *Low-poly representations are beneficial for efficient rendering, and the paper explores a method to generate these representations using a graph convolutional encoder and a residual face-quantization module.*
03:02:24 🔄 *The paper introduces a mesh-generation approach using a Transformer, where embeddings are treated as a sequence and decoded to produce mesh representations, showcasing potential applications in various domains.*
03:04:34 🔍 *Ablation studies reveal that the proposed paper incorporates several crucial techniques, emphasizing their necessity for achieving sensible results in mesh generation.*
03:08:32 📈 *The discussion shifts to exploring recent diffusion-model papers, contemplating the advancements and improvements in the field beyond Stable Diffusion models.*
03:15:31 🚀 *"Virion" introduces an innovative approach to diffusion models, focusing on hierarchical stages with extreme compression for efficient training, achieving competitive results with fewer GPU hours.*
03:20:35 🎨 *The key insight from "Virion" lies in breaking down the image-generation task into different stages of difficulty, addressing compression and decompression efficiently at the various hierarchical levels.*
03:26:23 🧠 *The discussion covers aspects of competition and innovation, including the allocation of compute between low- and high-resolution parts and efficient training methods.*
03:27:35 📸 *Training without using billions of images is possible by creating a dataset from Creative Commons images, as shown by Mosaic ML, achieving competitive results with a smaller dataset.*
03:33:48 💡 *Pixart-Alpha introduces an efficient text-to-image Transformer, leveraging a pre-trained ImageNet model, cross-attention for text, and synthetic data for fast training with competitive results.*
04:01:24 🔄 *The generative-model pipeline involves steps like base training, fine-tuning with alignment/preferences, and use-case considerations like inference speed and deployment.*
04:01:51 🚀 *Use-case considerations include speeding up sampling, deployment strategies, and exploring different sampling techniques for self-improvement and enhanced usability.*
04:02:16 📑 *Recap of papers covered, highlighting their focus areas in the generative-model pipeline, including pre-training, fine-tuning, efficiency improvements, and data utilization.*
04:03:14 🌐 *Papers like I-JEPA focus on pre-training image models to learn useful representations, exploring generative approaches and joint-embedding predictions.*
04:03:39 🧠 *Lucid Dreamer explores using an existing base model for a different purpose, demonstrating the versatility of pretrained models in varied applications.*
04:04:07 ⚙️ *Matryoshka, Pixart-Alpha, and others emphasize efficiency improvements, both in data utilization and in training processes, contributing to the evolution of generative models.*
Made with HARPA AI
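The DPO loss mentioned around 01:41:07 fits in a few lines, so here is a minimal sketch I put together (not from the video; function and variable names are my own). It takes the summed log-probabilities of the chosen and rejected completions under the policy and under the frozen reference model:

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one preference pair.

    logp_w / logp_l: log-prob of the chosen (w) / rejected (l) completion
    under the policy being trained; ref_logp_* the same under the frozen
    reference model. beta scales the implicit reward.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # completion than the reference model does.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)): shrinks as the policy ranks the pair correctly.
    return math.log(1.0 + math.exp(-margin))
```

The loss falls when the policy raises the likelihood of the preferred completion relative to the reference, which is the "updating on preference-ranked pairs" behaviour the summary describes.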
Can splats emit light (instead of just reflecting)? If not, how difficult would that be to implement? I'd like to try modelling aurora, which would correspond to fully transparent splats emitting light.
Hi guys, some of you will probably run into an issue while trying to run the notebook. If the debugger complains about "from torch._six import string_classes" (or something related), the rest of the code will fail. To fix it: first let the cells git-clone the VQGAN source code, then comment out the lines that do the cloning so they don't re-run. Open the file that has the issue (you can see its path in the traceback) and, instead of the "from torch._six import string_classes" line, write string_classes = str. PyTorch used to ship a compatibility alias there; newer versions just use the Python built-in. Hope it helps :)
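To spell out the patch above (a minimal sketch; the exact file and line depend on which clone of the VQGAN / taming-transformers code fails for you):

```python
# In the cloned source file that raises the error, replace:
#     from torch._six import string_classes   # removed in newer PyTorch
# with this one-line shim; newer PyTorch just uses the built-in str:
string_classes = str

# Downstream code that used the old alias keeps working, e.g.:
assert isinstance("some_key", string_classes)
```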
Great! Thanks for your videos, @datasciencecastnet! I have a question: do you know if there is something like PickScore for image-editing models, specifically InstructPix2Pix? I would like to see what users prefer in terms of text CFG, image CFG, and so on.