YESSIR! We called it. Video is coming. And by this time next year, all of this will run on 4 year old laptops. THERE IS NO MOAT and consumers have no loyalty to the giants. Gonna be a wild decade :)
In the meantime I'm building out a community-led AI solutions development platform where developers and innovators can collaborate while having access to all the toolsets they may need. Stay tuned.
@@userwhosinterestedin well, even though we now have SOTA open source models for image, text, and very soon, video, I’m sure closed source advocates will always point to new frontiers. Last year, there were predictions that a public GPT-4-level model would be unattainable; all of a sudden, even an 8B model can compete. Essentially, even if all updates ceased, the tools we have now could be used to augment/automate any digital process.
The Flux Dev license allows commercial usage of images made with Flux Dev. What it doesn't allow is selling Flux Dev, or any fine-tunes of it, as a service, like an image generation service.
It's funny how people think that Flux is an AI. It's not. It's a trained model for SD. If a prompt asks for CFG and inference steps, it's an SD-prompt-based site.
As far as I know, only the Flux "schnell" and "dev" models are open source, not the best "pro" version. But they're still probably the best open source models we currently have.
@@KOSMIKFEADRECORDS Imho it is - for me. I'm using it with ComfyUI. I found a great workflow which generates a first image with flux-schnell, then redreams it with flux-dev and upscales it. The results, at least for the things I prompt, are comparable to SD1.5 with custom models in terms of quality (which is very positive) but with much better prompt understanding. I have never seen an open source image model which follows my prompts so well. Some results I've got so far are just astonishing. It's also much better at rendering interactions between subjects than other models. It makes no sense to generalize and say "this is the best!!", as it depends on what you want to generate and what you like.
@@newfrontiers5673 it really isn't for the right use cases and with the right prompts. I used it for a project of mine, and it works really well after some heavy tweaking.
I got Flux dev working on my home computer late in the day when it was released. The image quality is great so far, and I was shocked at how easy it was to make a tiled upscaling workflow -- it just works with barely any tile inconsistency, even without any controlnets. The downside is that it is slow -- on a computer where I could generate a 4k+ final image with SDXL in about 12 minutes, Flux takes about an hour. I haven't had a chance to really try to optimize the step count, though, so it's possible I could get it better. It also doesn't really use CFG but its own custom guidance scheme, which means it does not support negative prompts. Depending on what you need to do, this could make life difficult for you (e.g. the "fried rice with no peas" request). And it's not really something I care about, but I've seen people complaining about not being able to do styles well with Flux. It is definitely the case that Flux can make its own style choices unprompted, such as a photographic image with a cartoony element within it.
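The tiled-upscaling idea above boils down to splitting the image into overlapping tiles, denoising each tile, and blending the overlaps. A minimal sketch of the tiling step (the 1024px tile size and 128px overlap are illustrative assumptions, not values from the comment):

```python
from typing import Iterator, Tuple

def tile_boxes(width: int, height: int, tile: int = 1024,
               overlap: int = 128) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (left, top, right, bottom) boxes that cover the image.

    Adjacent tiles share `overlap` pixels so seams can be blended away
    after each tile is upscaled/denoised.
    """
    step = tile - overlap
    for top in range(0, max(height - overlap, 1), step):
        for left in range(0, max(width - overlap, 1), step):
            right = min(left + tile, width)
            bottom = min(top + tile, height)
            yield (left, top, right, bottom)
```

Each box would then be cropped, run through the model, and pasted back with a feathered mask over the overlap region.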
@@ghost-user559 I generate an initial image at 1MP, upscale it with an AI model (4xUltrasharp works the best of the ones I’ve tried, even though it is an old model), and then do a second denoising pass at 9MP with strength of 0.2. Yes, doing lower res and upscaling with another model would be faster, but I find those upscales to be very obvious, with the detail looking fake at 100% zoom. So I always want to do a second pass.
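The 1 MP to 9 MP math in the workflow above is just a 3x linear upscale. A small helper makes it explicit; the snap-to-multiple-of-64 step is an assumption about what the diffusion backend requires, not something stated in the comment:

```python
def second_pass_size(width: int, height: int, scale: int = 3,
                     multiple: int = 64) -> tuple:
    """Target resolution for the second low-strength denoising pass.

    A 3x linear upscale turns a ~1 MP image into ~9 MP; dimensions are
    snapped down to a multiple of `multiple`.
    """
    w = (width * scale) // multiple * multiple
    h = (height * scale) // multiple * multiple
    return w, h
```

For a 1024x1024 start, this gives 3072x3072, i.e. roughly 9.4 megapixels for the strength-0.2 second pass.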
The last name of Mario and Luigi is Mario (that's why they are called the Mario Brothers). So Luigi's full name is Luigi Mario, and Mario's full name is Mario Mario. Since Luigi is green in colour, Luigi is already the "green Mario".
You could also mention how diverse the output is for the same prompt! Many other models seem over-fine-tuned, outputting almost the same image for a constant prompt again and again.
Flux is _awesome._ No disputing that. But there are some caveats. While it may not be (strictly speaking) censored, my understanding is that it isn’t trained on anything nsfw and so it’s naturally not really capable of such generations out-of-the-box. It’s also _not_ going to be an easy model to work with if you want to fine-tune it.
Actually I think it will change the strategy for what “fine tuning” is. For example, one could create hundreds of LoRAs at a significantly reduced cost. Then massive LoRA merges, combining hundreds of smaller LoRAs, could be merged with Flux to achieve a similar goal. LoRA merges would be the way forward for models that are cost-prohibitive to train on consumer hardware.
Yeah, it's awesome at prompt understanding. It's a bummer it's lacking nsfw stuff, because this also means it's hard to get interactions between people at all, even sfw ones. But still, I really like the model. I'm using "Flux Schnell and Dev Workflow with Upscaling" from "Harmeet" for ComfyUI and, even as a non-expert, was able to get really nice results.
@@ghost-user559 I don't know what the minimum VRAM is. My AMD graphics card has 20 GB, but I think 12 GB is fine. I have 64 GB of RAM, and it uses 50. It takes 70 seconds for me to generate the final, upscaled image. It should work with 12 GB of VRAM if you use the lowvram parameter for ComfyUI and the MemoryMax parameter, which you could set to something like 28000M. There is a manual on Reddit on how to set it up if you have a 12 GB VRAM graphics card. The result will be the same, it just takes a little bit longer.
If it is uncensored, then it is missing some training data. While some image outputs are bad, some of them are so aesthetically nice that something as simple as colored bottles looks like a work of perfect art.
Hey Matt, I never get to see your videos this early. So I’m taking the time to say thank you for posting such excellent content. Keep up the awesome work bud!
Flux Schnell and Dev are currently NOT uncensored. Rather, they aren't "censored" because that's not how visual models work; they just weren't trained on any NSFW images so Flux has some real problems with some anatomy. It takes special workflows with refiners to fix it in ComfyUI right now.
It's not uncensored. A lot of misinformation in your video: you're confusing Flux Dev and Pro. Quite misleading, but I am used to your approximations by now. In addition, the Flux version that is actually useful is the Dev one, and it's far from fast. Please inform yourself a bit before wasting people's time. Information is out there.
I am realizing that I will need to find money for a whole new machine based around a big ole GPU... And I love the fact that it's now a problem, because the software is a real possibility.
Is it somehow possible to import a picture of yourself or a friend and have the same person in different styles or settings? Or do you know what's the best open source tool for that use?
First time I can generate beautiful logo patches with correct text around: enamel, photo-realistic see-through aged stained glass, embroidered, pastels.
Hi, the Black Forest is a German region in the south-west: Schwarzwald (and all Germans know the cake with cherries and cream, "Schwarzwälder Kirschtorte", and the ham, "Schwarzwälder Schinken"). Greetings from Frankfurt
The specialized tools offered by SmythOS’s integrated development environment (IDE) streamline AI coding and experimentation. This functionality boosts productivity and improves workflow management.
time for me to get a better computer 🖥 😅 Honest question: What computer setup would you (this audience + Matt) get if you were going to upgrade this month? (not Mac because I have all Android). Embarrassed to say I have been doing all of my AI stuff on my Chromebook and Chromebook Plus for 2 years (quite effectively, lol😂), but with Open Source taking off... it is time 🚀 🤖 Can't get away without running locally anymore. 🤗 Thanks all who respond😃👍🌴
@jasonhemphill8525 up to $4,000; less would be better. 😁 Thank you for caring 🤗 I will be needing to build some custom models for my business... and all of the normal video stuff. I'm hoping to make a custom animated avatar as a replacement for me in my videos (when we get there; tech makes it easier and easier, so it may be less complicated soon). And I am kind of thinking I will end up customizing a robot down the line for my kitchen methods. Not sure if the out-of-box 'casa' robots will know how to take the seeds and veins out of habaneros 😅🌶 (probably will if I just ask). Not getting too far ahead... I am not sure which environment I will use. Nvidia looks like they are partnering with this path, but I may go the open-sourced Meta route, a good setup to integrate with any. Thanks again, Jason! 😃
@@Emily-Broccoli_Sprouts I don’t know too much about model fine-tuning or video avatars, but with a budget of around $4000 you can do a lot. In the world of AI, VRAM is king. If you can find a good deal on used 3090s, you can run two for a total of 48GB, which should be enough for inference on medium-sized models at good quants. How familiar are you with what I am saying so far?
@jasonhemphill8525 Kind of familiar... I have never used a setup like this (but I want to, and every time I see a 'run locally' application, I am like 'dang it, man😁 I can't do that with this'). Do you think two 3090s? It will be a change from doing everything in the cloud and from Chromebook apps with the occasional Linux blunder, but I want to have options... I do know that there is a chance that pretty much we may have a whole new way to interact 🤖🔉🤖🌱, but I think computers will be around for the short-term future and perhaps longish term. I do wonder about power consumption as well. (I live in a rental, so there are not many options to add new dedicated circuits for the setup.) Currently, I only have a large monitor 🙃 (that I use to make my screen bigger). I don't have a computer to put the graphics card in. Basically, building one from the ground up. I considered doing a similar search to build a computer like MattVidPro did in one of his videos a while back, but rather than relying on an LLM that may not cool the computer properly, etc, I wanted to ask a human who actually knows, aka, you 😁 Don't feel obligated, just would like a bit of direction if possible 🙏😊 My familiarity with the AI world: I have been following it closely and using/trying everything I can that comes out since ChatGPT 3.5, maybe before that, because I started playing with Midjourney earlier. (But as far as playing on a good computer, no experience yet.) 🌱🚀🤖😃
@@Emily-Broccoli_Sprouts that’s alright. Well, the short version is: AI models are big. REALLY big. GPUs are the fastest at running them, but they have a major size constraint, which is VRAM. At the moment, 24 GB is the most VRAM on consumer-level cards. There are cards that have more, but they are nearly an order of magnitude more expensive. In order for any of these models to work, they need to “fit” in the VRAM you have available. A used 3090 is the best compromise between VRAM capacity and price (the 3090 Ti or 4090 are faster, but with the same amount of VRAM). Some AI applications can also run on the CPU and system memory. It tends to be wayyyyyyy slower, but RAM capacity is significantly higher and cheaper. Although great strides have been made to make CPU inference faster, GPUs are still king. If it was me, I’d get a relatively low-end Ryzen chip on AM5, like a 7600. Pair that with around 128 GB of DDR5 at decent speeds. Two 3090s if you can get a good deal on those. And a case and power supply that can fit all of that hardware. Try looking at completed builds on the PCPartPicker website with “AI” in the title. “Build a pc sales” on Reddit is a great resource to save some money too. Try some of the guys on “build a pc” on Reddit as well, because they can get into the nitty-gritty of EXACTLY what to buy.
Great video, and while it’s better than others at copyrighted material etc., it is still not completely coherent and doesn’t handle many subjects well. Holding objects still produces pure quackery generations. It feels like we’ve reached a bit of an AI generation wall. Even the best video gens still create fever dreams. While it’s all a tiny baby step in the right direction, it’s gonna be slower than we initially thought before we can reliably create a story/movie etc.
Well it’s not the output they care about. They say you can mostly do what you want with the images. The thing they will go after people for is Image Gen sites trying to host their model for profit. And you cannot use their models for training other models. This is a lot harder to enforce, but because of the watermarks, they could definitely tell if a model was overtrained on Flux images and the merge would be very obvious to them.
Anyone looking to buy a new video card for AI generation: please make the GB of VRAM the number-one priority. The amount of VRAM directly determines what AI you can run. The number of CUDA cores and the VRAM speed only affect generation time.
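As a rough back-of-envelope check for the point above: the weights alone take parameters x bytes-per-parameter, and that has to fit in VRAM with headroom for activations. A hedged sketch, where the 12B parameter count for Flux and the 80% headroom factor are illustrative assumptions:

```python
def weights_vram_gb(params_billion: float, bytes_per_param: float) -> float:
    """Rough VRAM needed just to hold the weights.

    Activations, the text encoder, and framework overhead come on top,
    so treat this as a lower bound.
    """
    return params_billion * 1e9 * bytes_per_param / 1024**3

def fits_in_vram(params_billion: float, bytes_per_param: float,
                 vram_gb: float, headroom: float = 0.8) -> bool:
    """Leave ~20% of VRAM free for activations and overhead."""
    return weights_vram_gb(params_billion, bytes_per_param) <= vram_gb * headroom
```

By this estimate, a 12B model at fp16 (2 bytes/param) needs ~22 GB for weights alone, which is why quantized (1 byte or less per param) versions matter so much on 24 GB cards.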
I don't know what the heck you guys are doing, but my results look pretty much the same over and over. Also, if I use too many prompts, the AI just ignores them.
That model is actually the first base one that knows a lot of cars, especially European ones, which were lacking. It's amazing. That's what I expect from a base model. Sadly, I think it won't be possible to train on it unless you have a NASA PC.
Using an M2 Max, and generation of just 1 image takes a while. Like, minutes. I know it would probably be faster with an Nvidia card, but I'm wondering if this is normal. (I thought M2 Max chips had a Neural Engine?)
So far, it makes images that look like "AI art." (ie, they look like bad CGI). If you want to actually create works of art, you're already using SD 1.5, not 3, not Midjourney, not Flux. It's slow, and it's a downgrade compared to SD 1.5 ecosystem. If you aren't in that world, it seems impressive until you realize how slow it is when you try to iterate. It's also huge, and with the licensing, I doubt we will have many options for finetunes. I'd love to be wrong, but I don't care about something that looks like Midjourney, because I personally think Midjourney images are burned and very samey. Source: I am a working professional AI artist.
The sheer variety in this realm of technology is nothing short of astounding. The breadth of possibilities is virtually limitless, from the innovative advancements in artificial intelligence and machine learning to the groundbreaking developments in virtual and augmented reality. New discoveries and improvements enhance how we interact with our digital world each day. Whether it's the latest smartphone equipped with cutting-edge features or sophisticated software applications that streamline our daily tasks, the variety and progress in this sector continually reshape our lives in remarkable ways. Integrating these diverse technologies into varied industries-from healthcare to entertainment-showcases an impressive spectrum of potential benefits and uses, all of which promise to revolutionize our future.
I have an M1 with 16GB and could not get it to run. In comfyui it gets all the way to the sampler and then it spits out an error. The new Aura model will work on my mac but it literally takes 30 minutes. Best to just wait a few weeks and I’m sure it will be sorted.
Eventually we will probably be able to run the smaller Schnell model. But it will likely take an M Max or Ultra with a lot of RAM and GPU cores if people are struggling in the PC world with top-of-the-line 4090s lol. We are already like 15x slower than a PC's dedicated GPU, but we make up for that with unified memory. I’m on 16GB as well, so Flux Dev will probably be out of range until an LCM version drops.
@@obscuremusictabs5927 Very true. We live in a wild tech boom. What it means for us in the Mac world is that we can safely rely on having these toys a year or two after they first release. Stable Diffusion was 6 months to hacky ways to run Auto 1111, and then about a year to native implementations like Draw Things and Mochi.
First image at 1024x took 15 minutes, second image 7.5 minutes, both using the fast Schnell model at only 4 steps (SDXL on Comfy takes only 20 seconds). I hope the online versions stay up, or else it's a no-go! This model will only be used locally by a very small percentage of people in the world.
The Dev version locally, with Comfy via Pinokio, takes 3 minutes for 1280x720 on a rather cheap RTX 4060 12GB; to me that's absolutely OK. The Schnell version takes around 27 sec max.
For anything serious, a 4080 is the minimum. Yes, you CAN run it on a 3060/4060 with 8GB of VRAM, even on lower cards, if you don't mind some waiting, and don't mind that the model takes all system resources during generation, making the computer unusable. So: no work while generating; even moving the mouse is not smooth. Also, forget any advanced options, post-processing, detailing, or upscaling in any reasonable time frame; it's simply too slow.
Not to start anything, but this looks identical to Stable Diffusion--which the same guys also created--and has the same flaws. Am I missing something here? Also, did you TRY something "uncensored"?
@@mich.duhamel Now, I'm not trying to upset you or insinuate anything, but have you considered that you're using the wrong "laying"? When you "lay" something, you are placing an object down with your hand. When you "lie" down, you are personally going horizontal. It's "lying". What makes this worse is that you seem to be telling me that Flux.1 gave you a woman "lying" in grass when you asked for "laying"--which is objectively terrifying. BTW, I just left you a gift on my Twitter wall. Feel free to pick it up at any time.
@@dirtydevotee "gift on my Twitter wall" Exactly. It gave you some twisted Down Syndrome looking mutant creation. And my reference was to SD3, not SD1.5. You used SD1.5. SD3 is ten times worse, it's literal nightmare fuel.