This is why I hope Midjourney gets into ControlNet. Just like with Stable Diffusion, the ability to change the aesthetic of an image while leaving everything in place is priceless.
This is by far one of the best comparison videos on AI art that I have watched. Your takes are spot on. This does make me wonder: what if you took the DALL-E output and used it as an image reference for Midjourney? Would it maybe keep some of the coherence from the DALL-E image, then render it with better aesthetics? Thank you for the content.
Very possible that DALL-E 4 will be the MJ killer... unless they open up nudity... after that MJ will rule the world. Literally ALL TECH is driven by porn... true story.
I think it would probably be fairer if half of the prompts were written the way you would use them in DALL-E and then copied to Midjourney. You wrote them all in Midjourney and copied the exact prompt to DALL-E. I wonder how it would go if you did it the other way around. Great video, thanks.
I don't see why it matters. The point is that both are supposed to interpret the prompt correctly. It shouldn't matter in which order I do it. The prompts were very clear.
@TokenizedAI I think you've somehow limited yourself to constructing prompts in a naturally "Midjourney-like" way. In DALL-E you can express what you want far more loosely, since the language model of ChatGPT gives you that possibility. I literally talk with DALL-E, very loosely, about what I'd like to change in the second, third, etc. iteration of the same picture. No need to input the whole prompt again. Midjourney is stricter in terms of the "syntax" itself. Perhaps you've checked how DALL-E deals with prompts from Midjourney, but not necessarily how Midjourney would deal with DALL-E prompts, since I think prompts in DALL-E can be much looser, while in Midjourney, not really. But it is just my observation. Overall I enjoy your videos and I am waiting for the next one!! Keep up the good work!!
I still think it doesn't matter and don't see how this is "limiting". Given a natural language prompt that we can all fully understand, which one does a better job at interpreting it? That's what was being tested. The prompt I used was completely different from how you would normally prompt for MJ v5 if you wanted decent coherence. The established frameworks don't include any real natural language, whereas in v6 you can do that. You just need to structure it properly. You and I understand what is being asked. The question is, do MJ and DALL-E understand it? That's what coherence is.
Keep in mind that Midjourney 6 is in alpha now. Beta soon. They are saying that the alpha can change dramatically as they tune. I was doing some work this morning with the update they did last night and coherence still has its problems. Similar to the video here.
True, but I can't wait until they eventually say it's no longer in Alpha. Otherwise they could just say that they're in a perpetual alpha phase 😅 It's just a snapshot anyway.
@TokenizedAI The alpha was updated this weekend and they have more updates coming. They have said they expect the alpha to be the default by the end of the month and we will get a v6.1 very soon after. Jan-Feb is going to be very busy for Midjourney with this and website updates, style tuner and more. :-)
Thanks for the comparison. And welcome back. This is not a criticism of you or your process; however, as you well know, the real power of prompting is in the iterative journey. Prompts that require a specific result may have to be worked up, sometimes over a number of successive prompts and techniques, to achieve the best result. Throwing a blob of clay onto the potter's wheel and critiquing the results may be good YT fodder, but if you're serious about getting the best result, it's necessary to jump into the jelly pit and start wrestling. For instance, incorporating image references combined with prompt weights in MJ would definitely deliver a superior and more cohesive result. And if DALL-E 3 doesn't have that functionality, then that forms part of the analysis of the comparison. I'd be interested to see a rematch of this test using the tools provided by each platform to achieve the best result. Why buy a Ferrari if you never get out of 1st gear in the shopping mall car park?
I appreciate the analogy, but it misses the point. We're not assessing the sandwich. We're assessing whether a sandwich machine creates the sandwich I asked for.....on the first attempt. Not on the second or third. There is no doubt that an artisanal sandwich tastes better. But that's not what we're trying to find out here.
@TokenizedAI For me, chasing AI coherence is about learning to speak the language that will provide the best results. If you want to flush out an elusive fox, you need to make like a wounded rabbit. That approach doesn't work so well if you're trying to court a young lady. 😉
Yes and no. Speaking the right language is important but we need reliability in order to use AI effectively. Without sufficient coherence, it's simply impossible to create certain images with intent.
I would give prompt adherence 40% weight of importance and 60% to quality of image. If the image is not as nice then you lose a lot of the wow factor. I will say that DALL-E is heading in the right direction because quality of images can always be improved on, but control is what is pushing this side of AI to the next level. I would like to see Midjourney really focus on character and control improvements by offering prompt templates for whatever the situation calls for. Style, aspect ratio, gender, hair color etc. Like a suggestion form based on the prompt.
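To make that 40/60 weighting concrete, here is a minimal sketch in Python. The function name, the 0-10 scale and the example scores are made up purely for illustration. They are not measurements from the video.

```python
def overall_score(adherence: float, quality: float,
                  adherence_weight: float = 0.4,
                  quality_weight: float = 0.6) -> float:
    """Combine two scores on the same scale into one weighted score.

    The 40/60 default mirrors the split suggested in the comment above.
    Everything else (names, scale) is a hypothetical illustration.
    """
    # The weights must add up to 1, otherwise the result is off the input scale.
    if abs(adherence_weight + quality_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return adherence_weight * adherence + quality_weight * quality


# Hypothetical scores out of 10, not real results:
coherent_but_plain = overall_score(adherence=9, quality=6)
pretty_but_off_prompt = overall_score(adherence=5, quality=9)
print(coherent_but_plain, pretty_but_off_prompt)
```

With those invented numbers the prettier image comes out ahead, which is exactly what a 60% weight on quality implies. Flip the weights and the ranking flips too.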
Well, what is the point bro? At the end 9 out of 10 times. Truthfully ask yourself how many times would you use Dall-e kind of standards for your client’s projects? Haha... Cheers!
Neither would I, but it helps to point Midjourney into the direction we want it to go (coherence-wise). Plus, viewers keep asking for these comparisons, even if you don't seem to see the point.
I'm with you Chris. Love the art style of MJ over DALL-E 3, but the coherence of MJ sucks. Worse, each time I ask MJ to correct an element that it was incoherent about, it digs in and just reproduces the same incoherent BS from the previous flubs. I know MJ is just a machine, but dang, it sure feels like a machine with attitude and belligerence.
@TokenizedAI I guess I can pitch 'em a softball on this one, but gee whiz, they certainly do frustrate. Keep us in the loop if you figure out how to get a better handle on the coherence. PS nice to see you back.
A fascinating contrast between DALL-E 3 and Midjourney v6! It's remarkable to observe how well each manages prompt coherence and which one is clearly superior.
Nice. I knew the results before watching the video, given what I saw in galleries of MJ6. DALL-E 3 does not even match the previous MJ version in photographic detail. DALL-E is still good for creating images either as pure digital illustrations or as a base for SD inpainting/reference, so that you get the interaction of objects correct in DALL-E along with the raw quality and aesthetics of the other models.
Yeah. 👏👏Glad you are back Chris. Missed you. Happy New Year! It’s gonna be amazing 🤩 You inspire me! Thank you. I’m a course member so looking forward to fresh ideas 💡 😁. I’m sure having trouble with generating hands in mj.
Thanks! It's good to be back. But I must admit that it feels like my overall time budget has reduced quite a bit. There's so much that needs to be done 😆
@TokenizedAI I get it! My head is on a swivel. It’s weird but good. Every day I remind myself that Rome wasn’t built in a day and to not expect too much from myself. I have a cupboard full of cliches to keep things in perspective. I appreciate whatever you can do. Just put one foot in front of the other. Feel everything. Speak what is true (stay true) and keep moving. 🙏✨
Dude, chill! And stop spamming the comments on every single video. 🤦🏻‍♂️ There is an announcement/warning on www.promptalot.com. If you had a registered account on the site then you would have received an email informing you that Discord made changes that resulted in a bug. A fix is coming!
Midjourney LOOKS way better but it's so annoying how it fails to make objects that interact with each other in a scene look integrated. It may take some rerolls with DALL-E but it's much, much better at that. And DALL-E seems to go above and beyond to make the scene you've requested "make sense". If you request a battle scene it feels like it makes an effort for each character it creates to have expressions or poses that make sense. When it comes to simple scenes MJ takes the cake, it is indeed aesthetically better, but I really wish it had DALL-E's prompt coherence and also that it had a better sense of scale. It's really a bummer that they censored DALL-E/Bing to oblivion, to the point that many have given up using it altogether. Microsoft apparently couldn't handle the resurgence in popularity and they went full "hey, how can we ruin this?"
I like your videos, but in this case I can't understand your opinion. DALL-E still looks cheap, with many wrong details. I would always decide against your opinion... Sorry... But I also have to say that MJ is always just one small step away from the ultimate best result. But from version to version this small step never comes 😞
Everyone is entitled to their opinion 🙂 But the fact that you're saying DALL-E produces cheap images tells me that you're taking into account the aesthetics. I'm only judging the "coherence", not the image quality or aesthetics. I thought I made that clear in the video 🤔
The positioning of the bunny was more correct on MJ than on DALL-E, as DALL-E didn't get it right once. And as far as coherence to your prompt goes, I would argue that the last one goes to MJ as well, as it does have more of what you asked for, even if it wasn't what you were thinking of in your head.
I wouldn't want to sacrifice the quality of the images. I'd rather re-roll or re-word prompts, and if that didn't work, I'd do what was needed in Photoshop/Affinity, which is what I already do now. Also, some of the types of images I work with actually come out better using V5, 5.1 or 5.2. Interesting to see though.
Interesting, you're the second person to say that their images come out better in v5. I can't really say that applies to my experience. But maybe it is also the different prompting style?
The issue with DALL-E 3 is censorship: 3 times out of 4, it refuses to create the requested image. It's nearly impossible to create comic-style images that go beyond an extremely strict standard of decency.
The level of prompt coherence I get from DALL-E is actually hugely variable. Sometimes it does a fairly good job, and sometimes I'm just scratching my head trying to figure out how it thought this output is related to my prompt. But overall, my theory is that they are using GPT-4 Vision as a discriminator. DALL-E is much slower than Midjourney, and the number of images it provides tends to vary. It should be four images, but sometimes it is two and frequently only one. So I think GPT-4 is examining the images that were created, evaluating how well each one matches the prompt, and deciding whether or not to try again. When it tries again, you get fewer images. When I only get a single image, that is when coherence tends to be the lowest for me. So that's the point where it is probably just giving up. That's my theory, at least.
That actually sounds like a really good theory to me. I think the number of images isn't a big issue. As I've said before, DALL-E is really good for casual image generation and for the most part, it nails most of the things I need with the first or second attempt. Working with MJ is usually different though because I use it for entirely different things and have to re-roll more anyway.
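Purely to illustrate the theory above, here is a small Python sketch of a generate-check-retry loop. It is speculation about speculation: nothing here reflects how DALL-E or ChatGPT actually work, and `generate`, `score`, the threshold and the halving batch size are all hypothetical stand-ins.

```python
from typing import Callable, List


def generate_with_check(prompt: str,
                        generate: Callable[[str, int], List[str]],
                        score: Callable[[str, str], float],
                        batch_size: int = 4,
                        threshold: float = 0.5,
                        max_rounds: int = 3) -> List[str]:
    """Generate images, keep those a checker accepts, retry with fewer if none pass.

    `generate` and `score` are hypothetical callables supplied by the caller.
    """
    n = batch_size
    images: List[str] = []
    for _ in range(max_rounds):
        images = generate(prompt, n)
        # Keep only the images the checker thinks match the prompt well enough.
        accepted = [img for img in images if score(prompt, img) >= threshold]
        if accepted:
            return accepted
        # Nothing passed: try again with a smaller batch (4 -> 2 -> 1).
        n = max(1, n // 2)
    # Out of retries: "give up" and return the last batch unfiltered.
    return images


# Stand-in generator and checkers so the sketch runs on its own.
def fake_generate(prompt: str, n: int) -> List[str]:
    return [f"{prompt} #{i}" for i in range(n)]

print(len(generate_with_check("bunny on a hill", fake_generate, lambda p, i: 1.0)))
print(len(generate_with_check("bunny on a hill", fake_generate, lambda p, i: 0.0)))
```

In this toy version a checker that accepts everything returns the full batch of four, while a checker that rejects everything ends with a single unfiltered image, which matches the pattern described: fewest images, lowest coherence.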
Aaaaaah now I see, you're from Hamburg! German! Finally an AI expert from Germany for once, there are far too few from here =D I wasn't sure the whole time because you speak such good English =D Greetings from Mittelhessen!
Great video as always. Midjourney just got a new upgrade this morning and I'm finding the coherence is much better than yesterday. I'm so looking forward to the beta version being released.
Don't see how the rewriting matters. What matters are the images it produces and they're more coherent, that's what I'd judge on. Also, Bing Image Creator clearly doesn't use the same model.
Well it obviously wasn't recorded today, but it's from Thursday. Latest drop or not doesn't really matter to me. I can't wait until Midjourney finally feels that it's perfect. If they release it and call it v6 then that's the current state it's in. If things change, I'm happy to revisit again once they remove the "alpha" label. But let's face it, they wanted to release something before Christmas, even if it meant that they'd release an unfinished product. That was their own decision. 🙂
For me, they are just really different. Dalle is easier so I tend to use it more. But as I get more experienced I go to mj when I need a really good realistic image. The masters of mid journey course is good… it always comes down to getting a good prompt crafted. Don’t underestimate that.
@CM-zl2jw Hi, yes and no. The problem with the argument you're posing is that getting a good prompt in MJ is generally radically more time consuming if you want very specific details in the render. With DALL-E it's universes less time consuming based on my testing. I almost never test with vague prompts as my work is about achieving very specific results... for which DALL-E via free portals such as BIC, Bing Chat, and Copilot is generally far superior... and FREE. I didn't renew my MJ Pro Annual subscription because the cost-to-benefit ratio is definitely no longer worth it for my needs. Besides, my guess is that in 2024 we may see upgrades to DALL-E, giving me even more reason to abandon my Pro Annual subscription to MJ. I'm not an MJ hater... I was for quite some time probably their most vocal advocate in Discord. But they've decided to focus on things not beneficial to me at all, and their abusive NSFW censorship is a serious problem as well.
I don't think there is an obvious "better" one. It all depends on your specific needs and what you need seems to be different from what many others need. DALL-E 3 is great for casual image creation and very specific representations. But if I'm working on a movie trailer or any other big project, then DALL-E is useless to me and Midjourney is the only one that works for me.
I'm too much of a simp for all the SD-related stuff. I like to use stuff that just works out of the box without tons of dials and sliders. But I do keep an eye on them because I see the need to integrate them into Promptalot at some point.
I believe it is kind of comparing apples with pears. You should be using DALL-E 3 with Bing, without ChatGPT. Then that would be a fair challenge and the score would be 4 to 0 =) There is a huge difference between DALL-E 3 in Bing and DALL-E 3 in ChatGPT. Also, the message limit on ChatGPT does not let me use DALL-E for any of my meaningful projects. I wish you could mention that as well. The prompt that you are giving is also using the power of the LLM on the backend. Anyway, thank you for the content and Promptalot as well, you saved around 10% of my time =)
Bing Image Creator doesn't use the same model as DALL-E 3 via ChatGPT. Why would I compare an inferior (free) product to Midjourney's latest model? Midjourney is paid. ChatGPT Plus is paid. That's the only fair comparison. 🙂 What happens under the hood is irrelevant. The only thing that matters is what sort of results users can achieve.
@TokenizedAI Also, Midjourney doesn't have an industry-leading LLM to work the prompts at the back of the house :) That's why I said comparing apples with pears 😇 It still did not feel totally fair to compare 😇
Would you compare an iPhone to a Samsung Galaxy, even if you knew that Apple had an industry-leading feature somewhere inside it? Of course you would. So the comparison is 100% fair. What counts is the output.