Midjourney v6 vs. DALL-E 3: Who's the Prompt Coherence King?

Tokenized AI by Christian Heidorn

Views: 10K

Published: 29 Sep 2024

Comments: 105

@DigitalAscensionArt 8 months ago

This is why I hope Midjourney gets into ControlNet. Just like with Stable Diffusion, the ability to change the aesthetic of an image while leaving everything in place is priceless.

@KolTregaskes 8 months ago

They have said something like that is coming.

@Revex08 8 months ago

This is by far one of the best comparison videos on AI art that I have watched; your takes are spot on. This does make me wonder: what if you took the DALL-E output and used the image as an image reference for Midjourney? Would it maybe keep some of the coherence from the DALL-E image, then render it with better aesthetics? Thank you for the content.

@TokenizedAI 8 months ago

Yeah, you could do that but ideally the first output should be coherent already.

@dex509 8 months ago

Did you change the speed of the video? It seems like 1.5x speed, not normal.

@TokenizedAI 8 months ago

Nope.

@thewavechef 8 months ago

"Prompt Coherence" 😮 I like that.

@TokenizedAI 8 months ago

It's crucial! 🙂

@ArtificialRhythms 8 months ago

I prefer Dall-E. Just me..... FYI, I just made an all-AI-made music video. AI-made lyrics, images, music score and video.

@TokenizedAI 8 months ago

Nice!

@Enu_Vibe 8 months ago

2024 Prediction: Dalle-4 will have MJ level aesthetic while dominating the coherence.

@TokenizedAI 8 months ago

We'll see... 😉

@kzrhthevultcha 8 months ago

very possible that Dalle-4 will be the MJ killer...unless they open up nudity...after that MJ will rule the world. literally ALL TECH is driven by porn...true story

@przemekmaj3576 8 months ago

I think it would probably be fairer if half of the prompts were written the way you would use them in DALL-E and then copied to Midjourney. You wrote them all in Midjourney and copied the exact prompts to DALL-E. I wonder how it would turn out the other way around. Great video, thanks.

@TokenizedAI 8 months ago

I don't see why it matters. The point is that both are supposed to interpret the prompt correctly. It shouldn't matter in which order I do it. The prompts were very clear.

@przemekmaj3576 8 months ago

@TokenizedAI I think you've somehow limited yourself to constructing prompts in a naturally "Midjourney-like" way, while in DALL-E you can express what you want far more loosely, since ChatGPT's language model gives you that possibility. I literally talk with DALL-E very loosely about what I'd like to change in the second, third, etc. iteration of the same picture. No need to input the whole prompt again. Midjourney is stricter in terms of "syntax" itself. Perhaps you've checked how DALL-E deals with prompts from Midjourney, but not necessarily how Midjourney would deal with DALL-E prompts, since I think prompts in DALL-E can be much looser, while in Midjourney, not really. But it is just my observation. Overall I enjoy your video and I am waiting for the next one!! Keep up the good work!!

@TokenizedAI 8 months ago

I still think it doesn't matter and don't see how this is "limiting". Given a natural language prompt that we can all fully understand, which one does a better job at interpreting it? That's what was being tested. The prompt I used was completely different from how you would normally prompt for MJ v5 if you wanted decent coherence. The established frameworks don't include any real natural language, whereas in v6 you can do that. You just need to structure it properly. You and I understand what is being asked. The question is, do MJ and DALL-E understand it? That's what coherence is.

@jimberry7865 8 months ago

Keep in mind that Midjourney 6 is in alpha now. Beta soon. They are saying that the alpha can change dramatically as they tune. I was doing some work this morning with the update they did last night and coherence still has its problems. Similar to the video here.

@TokenizedAI 8 months ago

True, but I can't wait until they eventually say it's no longer in Alpha. Otherwise they could just say that they're in a perpetual alpha phase 😅 It's just a snapshot anyway.

@KolTregaskes 8 months ago

@TokenizedAI The alpha was updated this weekend and they have more updates coming. They have said they expect the alpha to be the default by the end of the month and we will get a v6.1 very soon after. Jan-Feb is going to be very busy for Midjourney with this and website updates, style tuner and more. :-)

@IronCladMultimedia 8 months ago

Sooo... can you prompt in DALL-E 3 for coherence, then image link it in Midjourney for the aesthetics?

@TokenizedAI 8 months ago

Yeah, probably.

@DigitalAscensionArt 8 months ago

Midjourney is clearly more aesthetic. But I still like what Dalle attempts to accomplish. 😏

@TokenizedAI 8 months ago

I use DALL-E a lot for random graphics that I need. It does an excellent job of that. But yeah, for aesthetics I need to use MJ.

@morpheus2573 8 months ago

Thanks for the comparison. And welcome back. This is not a criticism of you or your process; however, as you well know, the real power of prompting is in the iterative journey. Prompts that require a specific result may have to be worked up, sometimes over a number of successive prompts and techniques, to achieve the best result. Throwing a blob of clay onto the potter's wheel and critiquing the results may be good YT fodder, but if you're serious about getting the best result, it's necessary to jump into the jelly pit and start wrestling. For instance, incorporating image references combined with prompt weights in MJ would definitely deliver a superior and more cohesive result. And if Dalle3 doesn't have that functionality, then that forms part of the analysis of the comparison. I'd be interested to see a rematch of this test using the tools provided by each platform to achieve the best result. Why buy a Ferrari if you never get out of 1st gear in the shopping mall car park?

@TokenizedAI 8 months ago

True....and yet, irrespective of what a typical workflow or creative process might look like, coherence is coherence. 😉

@morpheus2573 8 months ago

@TokenizedAI There are sandwiches and there are sandwiches. The difference is in the care it took to make it.

@TokenizedAI 8 months ago

I appreciate the analogy, but it misses the point. We're not assessing the sandwich. We're assessing whether a sandwich machine creates the sandwich I asked for.....on the first attempt. Not on the second or third. There is no doubt that an artisanal sandwich tastes better. But that's not what we're trying to find out here.

@morpheus2573 8 months ago

@TokenizedAI For me, chasing AI coherence is about learning to speak the language that will provide the best results. If you want to flush out an elusive fox, you need to make like a wounded rabbit. That approach doesn't work so well if you're trying to court a young lady. 😉

@TokenizedAI 8 months ago

Yes and no. Speaking the right language is important but we need reliability in order to use AI effectively. Without sufficient coherence, it's simply impossible to create certain images with intent.

@Wasted-GTA 8 months ago

I would give prompt adherence 40% weight of importance and 60% to quality of image. If the image is not as nice then you lose a lot of the wow factor. I will say that Dall-E is heading in the right direction because the quality of images can always be improved on, but control is what is pushing this side of AI to the next level. I would like to see Midjourney really focus on character and control improvements by offering prompt templates for whatever the situation calls for: style, aspect ratio, gender, hair color, etc. Like a suggestion form based on the prompt.

@TokenizedAI 8 months ago

There is no such thing as a prompt adherence weight in Midjourney. Remember, this isn't Stable Diffusion.

@EdvvinWang 8 months ago

Well, what is the point, bro? At the end of the day, 9 out of 10 times... Truthfully, ask yourself how many times you would use Dall-e kind of standards for your client's projects. Haha... Cheers!

@TokenizedAI 8 months ago

Neither would I, but it helps to point Midjourney into the direction we want it to go (coherence-wise). Plus, viewers keep asking for these comparisons, even if you don't seem to see the point.

@The-Spondy-School 8 months ago

I'm with you Chris. Love the art style of MJ over Dall E3, but the coherence of MJ sucks. Worse, each time I ask MJ to correct an element that they were incoherent about, they dig in and just reproduce the same non-coherent BS from the previous flubs. I know MJ is just a machine, but dang, it sure feels like a machine with attitude and belligerence.

@TokenizedAI 8 months ago

I wouldn't say it sucks. That's too harsh and not the message I'm trying to get across. I'm just saying that it needs further improvement.

@The-Spondy-School 8 months ago

@TokenizedAI I guess I can pitch 'em a soft ball on this one, but g-whiz, they certainly do frustrate. Keep us in the loop if you figure out how to get a better handle on the coherence. PS nice to see you back.

@Maartenalbers 8 months ago

Glad you are back! I hope your time off was good. Thank you for giving so much!

@molugusatyapriya2 20 days ago

A fascinating contrast between DALL-E 3 and Midjourney v6! It's remarkable to observe how well each manages prompt coherence and which one is clearly superior.

@waynelai354 8 months ago

Nice. I knew the results before watching the video, given what I saw in galleries of MJ6; DALLE-3 does not even match the previous MJ version in photographic detail. DALLE is still good for creating images either as pure digital illustrations or as a base for SD inpainting/reference so that you get the interaction of artifacts correct in DALLE with the raw quality and aesthetics of the other LLMs.

@CM-zl2jw 8 months ago

Yeah. 👏👏Glad you are back Chris. Missed you. Happy New Year! It’s gonna be amazing 🤩 You inspire me! Thank you. I’m a course member so looking forward to fresh ideas 💡 😁. I’m sure having trouble with generating hands in mj.

@TokenizedAI 8 months ago

Thanks! It's good to be back. But I must admit that it feels like my overall time budget has reduced quite a bit. There's so much that needs to be done 😆

@CM-zl2jw 8 months ago

@TokenizedAI I get it! My head is on a swivel. It’s weird but good. Everyday I remind myself that Rome wasn’t built in a day and to not expect too much from myself. I have a cupboard full of cliches to keep things in perspective. I appreciate whatever you can do. Just put one foot in front of the other. Feel everything. Speak what is true (stay true) and keep moving. 🙏✨

@asikarafat6891 6 months ago

Promptalot is not working, what's the problem brother

@TokenizedAI 6 months ago

Dude, chill! And stop spamming the comments on every single video. 🤦🏻‍♂️ There is an announcement/warning on www.promptalot.com. If you had a registered account on the site then you would have received an email informing you that Discord made changes that resulted in a bug. A fix is coming!

@frankrobert3471 8 months ago

Midjourney LOOKS way better but it's so annoying how it fails to make objects that interact with each other in a scene look integrated. It may get some rerolls with Dall-E but it's much, much better at that. And Dall-E seems to go above and beyond and make the scene you've requested "make sense". If you request a battle scene it feels like it makes an effort for each character it creates to have expressions or poses that make sense. When it comes to simple scenes MJ takes the cake, it is indeed aesthetically better, but I really wish it had Dalle's prompt coherence and also that it had a better sense of scale. It's really a bummer that they censored Dalle/Bing to oblivion to the point many have given up using it altogether. Microsoft apparently couldn't handle the resurgence in popularity and they went full "hey, how can we ruin this?"

@mikaelhamad2788 8 months ago

👌 P r o m o s m

@AnthonySell 8 months ago

Do you happen to know how many tokens we get with MJ V6 compared to V5? I have yet to see any definite answer on this.

@TokenizedAI 8 months ago

Good question. I haven't seen any info about this yet. It feels like it's a bit more than before but I'm not certain.

@AnthonySell 8 months ago

@TokenizedAI I wonder, given the improved prompt coherence, if there is not a way to test this...

@gravesbruce 8 months ago

Would adding a more coherent image created in Dalle to a Midjourney prompt for style be the better solution at this point in time, or doesn't that work?

@80salive 8 months ago

I like your videos, but in this case I can't understand your opinion. Dall E is still cheap with many wrong details. I would always decide against your opinion... Sorry... But I also have to say that MJ always just needs a small step to win the ultra best result. But from version to version this small step never comes 😞

@TokenizedAI 8 months ago

Everyone is entitled to their opinion 🙂 But the fact that you're saying DALL-E produces cheap images tells me that you're taking into account the aesthetics. I'm only judging the "coherence", not the image quality or aesthetics. I thought I made that clear in the video 🤔

@kadenickel 8 months ago

The positioning of the bunny was more correct on MJ vs. DALLE, as Dalle didn't get it right once. And as far as coherence to your prompt, I would argue that the last one goes to MJ as well, as it does have more of what you asked for even if it wasn't what you were thinking of in your head.

@TokenizedAI 8 months ago

Fair points.

@GillBrooks 8 months ago

I wouldn't want to sacrifice the quality of the images, I'd rather re-roll or re-word prompts and if that didn't work, I'd do what was needed in Photoshop/Affinity - which is what I already do now. Also, some of the type of images I work with actually come out better using V5, 5.1 or 5.2. Interesting to see though

@TokenizedAI 8 months ago

Interesting, you're the second person to say that their images come out better in v5. I can't really say that applies to my experience. But maybe it is also the different prompting style?

@GillBrooks 8 months ago

@TokenizedAI Not sure. I don't waffle :) I think it's more the actual results that I want for what I'm doing. Some V6 have just been way too literal

@TokenizedAI 8 months ago

Perhaps we should tell it not to take things literally 😆

@sbmicro1896 8 months ago

The issue with DALL-E-3 is censorship: 3 times out of 4, it refuses to create the requested image. It's nearly impossible to create comic-style images that surpass an extremely strict decency.

@TokenizedAI 8 months ago

True. So focus on comics that don't require that or work with open source solutions. No point in fighting it.

@ashlynnantrobus5029 8 months ago

The level of prompt coherence I get from DALL-E is actually hugely variable. Sometimes it does a fairly good job, and sometimes I'm just scratching my head trying to figure out how it thought this output is related to my prompt. But overall, my theory is that they are using GPT 4 Vision as a discriminator. DALL-E is much slower than MidJourney, and the number of images it provides tends to vary. It should be four images, but sometimes it is two and frequently only one. So I think GPT 4 is examining the images that were created, evaluating how well they match the prompt, and deciding whether or not to try again. When it tries again, you get fewer images. When I only get a single image, that is when prompt coherence tends to be the lowest for me. So that's the point where it is probably just giving up. That's my theory, at least

@TokenizedAI 8 months ago

That actually sounds like a really good theory to me. I think the number of images isn't a big issue. As I've said before, DALL-E is really good for casual image generation and for the most part, it nails most of the things I need with the first or second attempt. Working with MJ is usually different though because I use it for entirely different things and have to re-roll more anyway.

@rossobosso 8 months ago

Aaaaaah now I see, you're from Hamburg! German! Finally an AI expert from Germany, there are far too few from here =D I wasn't sure the whole time because you speak such good English =D Greetings from Mittelhessen!

@RetroAiUnleashed 8 months ago

Best wishes to you in 2024 my friend! sgtsy Thank you for returning to us! I have missed you! God bless you! from your friend in Canada 🤔😊🍁🙏🏻😎

@TokenizedAI 8 months ago

It's good to be back 😁

@thewebstylist 8 months ago

Great to have u back on YT, our AI brother! Very much looking forward to the Masters of MidJourney updates of yours. Big 2024 indeed!!

@TokenizedAI 8 months ago

Will hopefully release the first vid by the end of the week.

@igormil 8 months ago

Great video! I do think when it comes to the photorealism round Midjourney had a clear victory. But that's just my opinion 🤷

@TokenizedAI 8 months ago

The image quality or the prompt coherence? Cause I was just evaluating the coherence.

@trace1301t 8 months ago

Great video as always. Midjourney just got a new upgrade this morning and I'm finding the coherence is much better than yesterday. I'm so looking forward to the beta version being released.

@TokenizedAI 8 months ago

I think it also depends on what you're working on.

@ahmedkagabo 8 months ago

I believe that comparison is fair and balanced. Thank you, Christian.

@TokenizedAI 8 months ago

Thanks. Plus, this surely won't be the last comparison. 🙂

@HansJoachimNolte 8 months ago

Hey Chris, thanks for posting again. Wish you all the best for 2024!

@TokenizedAI 8 months ago

All the best to you too! 🙌🏻

@StrikerTVFang 8 months ago

What's the name of the tools that are used in Discord under all of the prompt results? That toolbar looks useful! But I don't see the name of it.

@TokenizedAI 8 months ago

I thought I mentioned it in the video? That's Promptalot.... it's also linked in the video description.

@StrikerTVFang 8 months ago

@TokenizedAI oh you're right. I did see that name mentioned. I didn't realize it was the toolbar lol. Thanks!

@pbb 8 months ago

Dalle just looks so basic and cheap

@DesignWho 8 months ago

I remember 6:17

@TokenizedAI 8 months ago

😁👍🏻

@Freakeater 8 months ago

How do you get 2 images per prompt from Dall-E every time? I only get one image per prompt

@TokenizedAI 8 months ago

It's kinda random to be honest. But maybe it's because I have a ChatGPT Plus subscription?

@Freakeater 8 months ago

@TokenizedAI Obviously, I do also have the ChatGPT Plus subscription 😂 but thanks anyway

@maxziebell4013 8 months ago

I would also go for Bing Image Creator for a direct access to Dall E without the LLM rewriting stuff

@maxziebell4013 8 months ago

Was this actually done with the latest Midjourney drop?

@TokenizedAI 8 months ago

Don't see how the rewriting matters. What matters are the images it produces and they're more coherent, that's what I'd judge on. Also, Bing Image Creator clearly doesn't use the same model.

@TokenizedAI 8 months ago

Well it obviously wasn't recorded today, but it's from Thursday. Latest drop or not doesn't really matter to me. I can't wait until Midjourney finally feels that it's perfect. If they release it and call it v6 then that's the current state it's in. If things change, I'm happy to revisit again once they remove the "alpha" label. But let's face it, they wanted to release something before Christmas, even if it meant that they'd release an unfinished product. That was their own decision. 🙂

@StyleYourCareer 8 months ago

Love these videos! This was super helpful. I agree that I have found coherence so much better in DALL-E

@MikeWalkerSBC 8 months ago

Thank you so much!

@TokenizedAI 8 months ago

You're welcome!

@LouisGedo 8 months ago

Dall-E 3 is still much better based on my testing

@CM-zl2jw 8 months ago

For me, they are just really different. Dalle is easier so I tend to use it more. But as I get more experienced I go to mj when I need a really good realistic image. The masters of mid journey course is good… it always comes down to getting a good prompt crafted. Don’t underestimate that.

@LouisGedo 8 months ago

@CM-zl2jw Hi, yes and no. The problem with the argument you're posing is that getting a good prompt in MJ is generally radically more time consuming if you want very specific details in the render. With Dall-E it's universes less time consuming based on my testing. I almost never test with vague prompts as my work is about achieving very specific results....... for which Dall-E via free portals such as BIC, Bing Chat, and CoPilot is generally far superior........and FREE. I didn't renew my MJ Pro Annual subscription because the cost to benefit ratio is definitely no longer worth it for my needs. Besides, my guess is that in 2024 we may see upgrades to Dall-E making it even more a reason to abandon my Pro Annual subscription to MJ. I'm not a MJ hater.......I was for quite some time probably their most vocal advocate in Discord. But they've decided to focus on things not beneficial to me at all and their abusive NSFW censorship is a serious problem as well.

@TokenizedAI 8 months ago

I don't think there is an obvious "better" one. It all depends on your specific needs and what you need seems to be different from what many others need. DALL-E 3 is great for casual image creation and very specific representations. But if I'm working on a movie trailer or any other big project, then DALL-E is useless to me and Midjourney is the only one that works for me.

@CM-zl2jw 8 months ago

@TokenizedAI Do you use Leonardo and Stable Diffusion? I love Leonardo… so much potential there. I don’t have any experience with SD but wish I did!💯

@TokenizedAI 8 months ago

I'm too much of a simp for all the SD-related stuff. I like to use stuff that just works out of the box without tons of dials and sliders. But I do keep an eye on them because I see the need to integrate them into Promptalot at some point.

@DigitalAscensionArt 8 months ago

“Cheap 3D animation” - so true. I get a lot of images that give me that look.

@botsandbytes 8 months ago

You can improve the aesthetics of Dall-e with Magnific Ai, but it is expensive.

@atahanacik365 8 months ago

I believe it is kind of like comparing apples with pears. You should be using Dalle3 with Bing, without ChatGPT; then that would be a fair challenge and the score would be 4 to 0 =) There is a huge difference between DALLE3 in Bing and DALLE3 in ChatGPT. Also, the message limit on ChatGPT is not letting me use DALLE for any of my meaningful projects; I wish you had mentioned that as well. The prompt that you are giving is also using the power of the LLM on the backend. Anyway, thank you for the content and Promptalot as well, you saved around 10% of my time =)

@TokenizedAI 8 months ago

Bing Image Creator doesn't use the same model as DALL-E 3 via ChatGPT. Why would I compare an inferior (free) product to Midjourney's latest model? Midjourney is paid. ChatGPT Plus is paid. That's the only fair comparison. 🙂 What happens under the hood is irrelevant. The only thing that matters is what sort of results users can achieve.

@atahanacik365 8 months ago

@TokenizedAI Also, Midjourney doesn't have an industry-leading LLM to work the prompts on the back of the house :) That's why I said comparing apples with pears 😇 It still did not feel totally fair to compare 😇

@TokenizedAI 8 months ago

Would you compare an iPhone to a Samsung Galaxy, even if you knew that Apple had an industry-leading feature somewhere inside it? Of course you would. So the comparison is 100% fair. What counts is the output.