Midjourney urgently needs to develop consistent character generation on their platform. Most users are asking for this. I love creating characters with it for my projects, but without consistency it's quite challenging. It's essential to have consistency in the full body, the face, and especially the clothes.
That's been something they've been teasing for a while. Apparently they have a "storytelling" team-- dedicated to much more than just consistent characters-- I'm thinking persistent locations, emotional tone, style, etc. I know it's fairly painful right now, but I honestly do feel that when they release this feature it'll be an "it just works" type thing. But in the meantime-- I'm right there with you.
I'm working my butt off learning actual art skills to make up for this. I have crippled hands, which led to my using AI in the first place, but this has inspired me to work my manual skills even harder. Such an exciting time.
It just needs to do what you ask for… Also, they still have a huge problem of insane levels of paranoid filtering based on words and word combinations that are, at times, the absolute best parts of the palette of the best artists. Somehow people got to thinking that generative AI is like a search engine… which we actually do need filters on, for our own sake, because other people make that stuff. BUT, why on God's green earth would I EVER need or want ANYONE to tell me that what I need or ask for or write or generate is somehow “unsafe” or wrong?!? I ASKED for it!! Filters and social mediation of individual creativity is a totally destructive oxymoron. Emphasis on “moron”. THAT IS THE BIGGEST ISSUE. I can't even use generative AI. Every complicated big project I spend days on gets ruined by some part of it, or all of it, being deemed “unsafe”?!?! Not to mention all the inspiration and randomness I don't get to see!! That stuff is GOLD even if I never use it!!! People who make AI are completely uncreative, have no vision. Art needs that, like life, like your own heartbeat. Everything. LOSE THE FILTERS!!!
It is downright insane! I'm currently going through some experiments with keyword ordering in prompts, and I'm blown away with how responsive it is! And this is still only the Alpha of v6?!
I was doing some Batman stuff earlier today and it seems to REALLY favor the Pattinson version. I’ll have to see if I can’t prompt out of that. I’ll take a look at Supes. Is it the Cavill version? That’s what it seemed to like before.
@@TheoreticallyMedia Yeah, every time I want a realistic-looking Superman with “DVD screengrab” or “cinematic film shot,” v6 keeps giving me the same Man of Steel costume 100% of the time, whereas v5 would give a variety of suits. I've tried Batman in v6 and I keep getting Ben Affleck; even when I type “Robert Pattinson Batman” it gives me Ben Affleck.
Fantastic job putting this out so soon! 💯 Beta or Alpha? They have paid SO much attention to lighting and textures, and thankfully given some love to styles other than photorealism. I love that! About photorealism - 🤯 I have generated images with accurate _arteries_ in the eyes, and cilia on the face! Ridiculously incredible. Also, I'm not sure --tuner works with v6, but I could be wrong. Thank you so much for this, Tim. Can't wait to see what else you find. Have a great holiday season. Have a great 2024, in fact. 👍 To think, this is going to get _better_. Good stuff.
The photorealism is INSANE. I know I mostly focused on Photo/Cine in this video (I think that's kind of what I'm known for?) but I'm dying to try out some illustrative styles soon! Luckily, things SHOULD slow down a bit next week for the holidays-- maybe get a chance to catch my breath and just play!!
So after using Version 6 and running a ton of my old prompts through it, I can say that Midjourney is more cohesive and a lot more accurate, but also way less creative and edges on boring. Not my favorite update for having fun with creativity... but if I want something specific like "Disney princess, full body, realistic," it does a way better job, especially with hands... but again, it's a little more boring in other aspects and even less random.
Yeah, I'm hearing that a lot-- but also the opposite, so it seems to be a very interesting update in terms of opinions of the aesthetics. I would say, it is still technically in the alpha stage, and they're going to be tuning it after the holiday season-- so, I wouldn't call it done yet!
It's their NSFW filters that killed the experience for me. It's so heavily censored it kinda kills a lot of ideas. Unless you just want pedestrian artwork like everyone else.
Couldn't wait for your video and to hear all your tips and tricks. Prompting is definitely different, more natural language, but photorealism is definitely not the same, and I can't get numbers in the image for anything, or words. Oh, I'm so grateful for you.
Oh, I didn’t even try for numbers- but yeah, I can see that being a mess. Maybe v7 we’ll get actual text and numbers! Still, it’s nice to see it’s getting there!
Every /imagine produces 4 images, right? Why doesn't MJ produce a different version per generated image? Say we could choose V3, V4, V5, V6 (4 images), so we could compare the versions side by side :D
So, it CAN-- kinda do that? But not within a single grid of 4. To do that, you'll want to use permutations. It'll be: (Prompt) --ar x:x --v {3,4,5,6} which will run the same prompt (each in its own grid of 4) across each of the versions.
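For instance, a made-up example (the prompt itself is just a placeholder, swap in your own): /imagine a lighthouse on a rocky coast at dusk --ar 16:9 --v {3,4,5,6} -- that should kick off four separate jobs, one grid per version, so you can compare them side by side.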
Messing with V6 all evening and I like it. I actually have your prompt formula set up in ChatGPT (well, slightly different, but based on one of your previous videos) and I use it for T-shirt designs. I noticed in 5.2 I was getting similar results quite often. Today, I got some TOTALLY unexpected and really cool results. V6 actually seems to interpret your formula better. I will be getting the t-shirts loaded up for sale tomorrow 😜
Oh that's awesome to hear! And yeah-- a lot of those original prompts kind of do unexpected and interesting things now. It's funny, I only messed around with Photo/cinematic for this video-- I can't wait to dive into some more styles!
Awesome video… keep up the great work!!❤ But how are the filters?!? That’s the MAIN issue all the generative AI needs to address BIG TIME! It just absolutely needs to do what you ask for… like Word or Photoshop. They still have a huge problem of insane levels of paranoid filtering based on words and word combinations that are, at times, the absolute best parts of the palette of the best artists. Somehow people got to thinking that generative AI is like a search engine… which we actually do need filters on, for our own sake, because other people make that stuff. BUT, why on God's green earth would I EVER need or want ANYONE to tell me that what I need or ask for or write or generate is somehow “unsafe” or wrong?!? I ASKED for it!! Filters and social mediation of individual creativity is a totally destructive oxymoron. Emphasis on “moron”. THAT IS THE BIGGEST ISSUE. I can't even use generative AI. Every complicated big project I spend days on gets ruined by some part of it, or all of it, being deemed “unsafe”?!?! Not to mention all the inspiration and randomness I don't get to see!! That stuff is GOLD even if I never use it!!! People who make AI are completely uncreative, have no vision. Art needs that, like life, like your own heartbeat. Everything. LOSE THE FILTERS!!! My paintbrush, my art, is not a search engine!!!!
Still working on that as prompting has changed with v6. Hoping to have a full rundown later this month. For now, I’d try: A) reference images in the style you are looking for B) get really descriptive in your prompt, maybe even try hitting the “anime/videogame” keywords in your prompt multiple times, just to let the bot know you really want that!
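For instance, something along these lines (a purely hypothetical prompt, just to show the idea of leaning hard on those keywords): anime style illustration, videogame concept art, a lone mech pilot on a rainy rooftop, anime key visual, cel shading --ar 16:9 --v 6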
Oh, I totally get that. The one that you just reminded me I need to check is some images of people playing guitar. It’s always super tough for AI to do, on account- well, fingers for one, and then fingers doing weird finger shapes.
After using ChatGPT-4 with DALL-E... I just demand more consistency and exact reproduction of commands... Render quality is better in Midjourney, but for storytelling for films we need more accurate actions from prompts. The level of exact reproduction of prompts is just on another level in DALL-E 3.
Yeah, I'm curious to see what Midjourney's "Storytelling" team is coming up with. I have the feeling that once they drop their feature, it'll be a real game changer. The way I view MJ is less about storyboards and video inputs, and rather as moodboards/inspiration and...well-- oddly, thinking of it as the "movie poster" version of a shot? Like, in comic books, there's the "cover artist" who renders this amazing image-- and then the workhorse "interior artist" who does a lot of the meat and potatoes work. MJ is a very good cover artist that can kind of do interiors. If that makes sense?
Not sure which video that did poorly you're referring to at the end of this video. You didn't say what it was, and at least on the mobile app there was no link.
Oh, it was the one on AI Girlfriends. It was a fun video to make, and really showcased the shortcomings of the idea. Here’s the link: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-isL4Ov3sMvY.htmlsi=9CC-_0rsiA2Ha_2N
Haha, I did just do a video on AI Girlfriends! (My AI girlfriend dumped me...) I'll check in with the UI thing. I know that the prompt language has changed, so I'll play around to see if I can manage to trick it back. It's there, it's gotta be...it's just a matter of figuring out how to ask for it.
I know I’m gonna be alone on this, but I am more excited for the website than V6. V6 is amazing when it comes to photorealism but kind of lacks the artistic creativity we saw with V4 and V5.1. Also, I would be super nervous if I were a photographer with V6 around (a lot of job losses within 3-6 months).
I mean, yes and no. I think for like, food photography? Or stock? For sure. But event and wedding photographers? They’re good. No one wants AI photos of their engagement!
@@TheoreticallyMedia Testing it out I noticed sunlight shadowing has advanced a lot this version, especially with portraits. However I have yet to figure out how to replicate this every time. Sunlit is a great tag. Any tips?
Thank you... I also hope the bot figures out how to interpret numbers... If I put in "twelve/12" dancers, I never get that number, but maybe six with crazy faces... LOL @@TheoreticallyMedia
Stable Diffusion XL is better. A) free B) better for realism C) better for any style D) better and easier UIs, like Fooocus E) the ability to inpaint/outpaint, which MJ will never have because they are stuck in Discord; it can also inpaint real photos, which makes paying for Photoshop generative fill pointless F) can create nudity and thirsty characters, which many are trying to do in MJ anyway by circumventing the censorship with new words
Invalid Traffic. I keep getting hit with that, and one of the proposed solutions is that YouTube/Google might be registering embedded videos as being watched by bots. I know-- it's super annoying. I just got hit with Invalid Traffic again, I think because of the GTA/AI video, since that one kind of blew up. It's really frustrating.
Wow, text imaging is huge, if still a bit wonky. Pseudo-text is cute, but I want my "Times Square, 70s Scorsese, sleazy" to show me plausible grindhouse marquees, not bizarre impressionism.
So in the Northwest, people don't look longingly at one another but rather at something else when prompted that they should look at one another. Also, they are all white, the women are blonde and wetter than the men, and all of them look like models from an outdoor fashion ad rather than real people - let alone typical fishermen. Which is fine, I guess, if the prompt had been "A young, white guy who wears outdoor fashion with a white, blonde, beautiful model who is a lot wetter than him - have them look somewhere - doesn't matter" - except it wasn't. Really cool hat, though, I agree. Also, the cover of The Gunslinger isn't the right way up. The astronaut's helmet is misaligned, the "subscribe" girl has broken fingernails (6 of them on one hand) and wears a pair of revolutionary new headphones that are held in place by her sunglasses and cover the neck rather than the ears. The bike would be a Yamaha SR500 (if Yamaha had ever built a bike with drum and disc brakes and an asymmetrically mounted engine). Considering how disconnected the output is from the prompt, one would imagine it could get some of the elements right, since there is nothing prompting it to draw something like weird headphones held to the neck by sunglasses. But that is just not how these things work.
That’s a feature that is supposed to be coming later. They basically launched the v6 model without all the bells and whistles. Pan, Zoom, and Inpainting are not available yet, but will be added pretty soon.
Another great video from you as usual. You make the best AI tutorial videos. They're very concise, easy to understand and follow, entertaining, and funny. Have you considered doing a series of tutorial videos for Stable Diffusion? I haven't found a YouTube channel dedicated to Stable Diffusion that is as good as yours.
May I ask for your help? The Remix feature on the Alpha site is confusing. When I used Remix in Discord, I turned it on in the settings so it was global. But on the Alpha site, Remix does not seem to be global; it's applied individually. I've asked this question several times and the usual answer I get is "you use it just like you do on Discord." In my mind, that isn't the case. Can you give a better explanation of how it is used on the Alpha site? I've searched high and low for an answer but no luck. Thanks... P.S. I must not be the only one confused by the Alpha Remix, because in my search I found a good dozen people saying they were just as confused as I am.
It'll be interesting to see how the new prompt understanding compares to previous models. A prompt we use in 5/5.2 might not work as well in 6, and definitely not in reverse.
For sure. It got cut for time, but I tried a few of my old v5 prompts and ended up with some really interesting outputs, but WAY different than the originals. V6 is for sure a new animal!
Great video, Tim. I just loaded v6. I need to generate 9,000 more images to get into the Alpha, but I do hope they drop that requirement. It reminds me a little bit of some of the parameters in Stable Diffusion. I'm really, really excited. Thanks for all your fantastic videos.
So glad that Vary Region is available in V6 now! I rely on that so much for when I get a face that I like, but want to change the outfit of the character
Great vid Theo, learned heaps as usual! I've noticed that the term 'abstract' really makes renders abstract now like faaaar more than before. Found that pretty interesting.
@@TheoreticallyMedia I use image blending a lot to generate ideas, most of the time without prompts, just two images and arguments like --ar and such, and I can say that quality in this aspect has dropped significantly; it's like being back in version 3-4.
@@TheoreticallyMedia I did after posting that, and will try it out. I love all the creative randomness you and several other YouTubers tend to do, but I use inpainting 95% of my time in MJ. It is so amazing to get so much control over the details, even if it is hugely aggravating how bad it is sometimes lol
Yup, exactly that! I think Magnific has a little more "spank" to it-- plus the ability to control the level of creativity, but for the most part it is kind of the same thing.
I'm gonna sorta miss that Midjourney alternate reality in which everything looks familiar to our timeline, until the wonky written language reminds you that you're not in Kansas anymore.
Haha. I know, even in that example where I kick back to v4. I mean, I know it’s ugly as all get out, but I’m also kinda nostalgic for that super weird era of AI art!
Oh really?! I’ll have to check it out! I played the OG Deus Ex, and I’m pretty sure I even have Mankind Divided, but never got around to it since CP2077 came along and I got lost in that janky mess. (That’s another one I still need to find time to replay now that it’s been fixed.)
Ah, so that dating video I kept seeing in my feed was yours. It didn't seem interesting. I feel like I knew what it was going to say and I don't care about people who are going to date a fake person. Plus I saw the movie.
Haha. Her? Yeah- I think the first part about meeting your AI self was pretty interesting, I tie it into a whole thing with VR Freud. If you have a chance, give it a watch and let me know what you think! Oh, and I play guitar in that one!
That seemed to be an issue early in v5.2 as well. I know you can prompt out of Caucasian, but I fully agree that when you prompt “Beautiful Woman” it's a little disheartening that it is just one type of woman. That said, MJ is probably the most responsive of all the AI generators when it comes to diversity and inclusion. Give the model a little time, I’m sure they’ll get it sorted.
Have to correct you for the first time, forgive me, master Tim, lol. But regarding "junk" commands like award winning or photorealistic - I did some prompts exactly like in 5.2 and the result was (in my case) just breathtaking. Maybe it depends on some other special keywords, and then MJ just ignores the junk commands?
In general, I think that might be the new model at work more than the keywords. No inside info here, but when MJ flagged those keywords, I think they were saying the model straight up ignores those words. At the end of the day, the important thing is that you ended up with awesome images!
Oh, and maybe an unimportant update you might have noticed - doing celebrities (or so-called ones) actually got worse (although it's very prompt-dependent of course; some work fine). It happened in the past too (version 3 or 4 or so), but it will maybe change again. Of course, face swapping helps. Thx for the quick response though! @@TheoreticallyMedia
Coming soon from everything I’ve heard. They are super baby steps, but I still say MJ has the best aesthetics of all the image generators. Subjective, I realize. They appear to be on the cusp of going 3d, and once that happens? It’s gonna be huge.
I really wish they would focus more on consistency with characters (specifically outfits; you can always face swap) and on using pose references or a face reference as an outline only, not a stylization or replication.
So, I know they have a team working on a "Storytelling" feature that is due to drop fairly soon. They've been a bit coy as to what exactly that means-- but Consistent Characters is for sure in there-- plus, in their MJ way, they are promising much more than just that. I'm thinking locations and emotional tone, but that's just me guessing. I'll say, as someone who checks in on the MJ office hours-- David (the CEO) has taken some flak for not jumping into character posing and the newer "real time" generation. Interestingly, what he comes back with is: yes, we could follow everyone and release a janky version-- but we'd rather take our time and develop something that will blow everyone away. Somewhere between v6 and v7, I think we're going to see something really earth-shattering coming out of MJ.
I saw some swords, machine guns and a Desert Eagle in other generations. All fairly realistic as well. I was surprised by that, I wonder if they’re lightening up on those aspects. I know what you mean though, trying to do a John Wick image and he’s holding Star Wars blasters in v5 was kind of lame.
@@TheoreticallyMedia I meant more the ergonomics of holding weapons. Historically MJ has been generally bad, often unusably so with that, weapons pointing in strange directions or not held in a physically possible way etc. DALLE3 is not perfect, but is often surprisingly good, especially with handguns, even if it does have a tendency to want to put one in each hand. I’ll just wait and see how the beta version is, hopefully when it’s also web based, not that I had huge issues with Discord tbh.
I know, I’m not a huge fan of that either. I think what they’re looking to do (actually all of the image generators) are prompts like: “A photograph of a busy city street at lunch taken on a film camera. The photo is slightly overexposed with film grain” Basically; getting rid of “technical” terms like lens focal length, film stocks, etc. The thing is, I think those terms and keywords are still in there. They’re just pushing for more natural language prompting.
I kind of get it from an engineering perspective, but yeah, say you wanted a painting or an image in an impressionist style or abstract oils - photorealism is a style too. Using --raw is not a big deal, though, and I do like the input for the stylize amount.
@@Justin-General I think (in general, not MJ-specific) it's an attempt to normalize AI prompting, so you don't have to be an expert in technical art terms to get the results you want. On the one hand, I'm for it- since it kills off those annoying "Prompt Engineers" who are trying to monetize prompts, but on the other: I think those terms should remain viable tokens, since- y'know, a lot of us study these fields and like to zero in on particular aspects of the generation using those terms. As the LLMs get better, it'll eventually become: "if GPT-4 knows it, Midjourney will as well."
Oh, the website Alpha? I did a whole video about that earlier this week. This one I just wanted to focus on the V6 model, which everyone can use. The website is still in limited release. That said, prompts in both versions yield the same results. I'm going to do a big nuts-and-bolts MJ video early next year when the website finally goes fully live.
All right, fair enough! :) @@TheoreticallyMedia Do you know when it will go fully live? Various sources seem to promise incremental batches of users being added, but they haven't given out a proper rollout schedule. I really despise the Discord stuff and will absolutely be buying a subscription when the alpha is available to all.
Why become a master artist when you can just become a master thief? Good luck getting a job and respect from society with your masterful Midjourney skills.
It's interesting to hear about the changes in prompting sensitivity. Do you have any tips for crafting effective prompts in version 6 for those who are used to the older version?
Use descriptive language to help the bot understand what you want to see. Experiment with parameters to change how the image generates. Use weights to adjust the importance of different elements in your prompt. Use negative weights or --no to exclude certain elements from your prompt. Use commas and double colons to separate different concepts in your prompt.
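A quick made-up example pulling those together (placeholder subject and weights, and assuming v6 honors multi-prompt weights the way 5.2 did): neon-lit alley at night::2 light rain::1 --no people --ar 16:9 --stylize 200 --v 6 -- the ::2/::1 weights the alley over the rain, and --no keeps people out of the frame.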
Hi Tim, HNY! This was the first video I watched, and I just recommended you to my painter friend Grace, who doesn't even know how to sign on to Discord. I do have v6 and stealth mode, but I have just under 2,000 generations. Wish I had 10,000 lol. Wanted to let you know I entered a painting that was a composite from Midjourney v5.2 and a friend's niece. I never said a word to anybody, and I won an award for it in my industry, which is really against any AI right now. None of us can really say we are using it or we get disqualified, believe it or not, even though it's painted from scratch and we're using composites from our own photos and blends. I find it to be a great reference tool, and it saves me a lot of money stylizing and setting up photo shoots and hiring models to then paint the stuff. So much easier, and I can't wait to have 10,000 generations lol
Happy New Year to you as well! Congrats on the award! And yeah, mum's the word there. I totally get it, but (obviously) to me it's such a myopic viewpoint. There's really no difference between generating a reference image and collecting a bunch of magazine or photo references and creating a collage. I know I'm preaching to the choir on this, but yeah- it's a bit silly and certainly an issue where the naysayers don't understand the technology. Well, again: your secret is safe with me!
Myopic is a great way to describe a lot of these realism fine-art organizations. They are run by baby boomers who are purists. What I mean by that is they think the only way to be a good artist is to do it the way Michelangelo, Rembrandt, and company did. I totally love the masters, have copied them, and I do paint from life. I find it hypocritical because before stock photos even came in, we went to the public library's Picture Collection in the 90s. On another note, I found MJ 5.2 in December amazing and generated some unbelievable reference that was consistent. When V6 came out, the first few days were great, and now there are a lot of glitches in it. It's very tricky prompting with just a paragraph separated by periods. When I use the describe command in version six, it generates prompts the way version 5.2 did, and it takes me a while to tweak them to get them to work. Are you finding the same thing? I guess it's all part of the fine-tuning process.
This new version looks "broken" to me for now. I mean, after testing many simple prompts, all the generated images look less creative and less artistic than version 5. It's as if the model training has been more focused on stock-photo content. Sad I cannot share some screenshot comparisons. I use the --weird parameter a lot; maybe it doesn't work like it did before. Going back to 5.2 for now!
All my old prompts are vibey and work fine. Composition is definitely better though. Describe is still way better than the competition (CLIP interrogator and the Fooocus one).
Tim, just stumbled on your channel today. Great stuff, and I wish I'd heard your formula sooner! Re: v4 - to date, I find it's been the best when it comes to images where you want a pixel-art aesthetic.
Awesome video! 🌟 I learned a lot from your video on Midjourney version 6. You explained the new features very well, and the image comparisons were amazing. Your tips and tricks are super helpful for using V6 effectively. Thanks for sharing your expertise and inspiring creativity! Keep it up! 👏🎨🚀
Fantastic to hear! Thank you so much! Really looking forward to all the MJ updates and new features in 2024! It’s gonna be a really exciting year of videos on the channel!
One thing I noticed right away was when I played around with random words and emojis or long strings of gobbledegook, a lot of the images that came up were of females, which was disappointing.
@@TheoreticallyMedia I tried a subscription to Runway ML last week; it wasn't worth it. Only about 1 out of 20 of my generations was even borderline useful for anything. That's okay when you can just keep clicking "try again," but when you're burning through limited credits it's no longer so appealing. A few cherry-picked ones were breathtaking, but most were like... wow, total disaster fail. I was using a starting image of myself, and in some cases it was turning me into a tentacle pron shapeshifting monster with eyeballs that moved in different directions with a mind of their own. I thought it would look at the picture, analyze it, turn what it thought it saw into extrapolated 3D objects inside 3D space, then pick whatever view it wanted to use and re-render a new picture... In some cases it seemed to do that; in other cases it just tried to slew and squish things around this way and that, or mutate you into a person who didn't even look anything like you. So... hmm. It's been fun, but I wouldn't turn the NORAD defense system over to SkyNet just yet.
@@TheoreticallyMedia That's why I always wait for you. Did you peep the images I got from Krea on the GD Discord? I was upscaling an old screen grab of the Nintendo 64 version of GoldenEye, and while I called out Pierce Brosnan, and it did look like him, TBH it looked more like you. Which begs the question: are you really a secret agent? Hah
Fascinating update on Midjourney version 6! Have you found any specific types of prompts or subjects that the new version handles particularly well compared to version 5.2?
Not yet. They launched the v6 model, but not all features are present yet. I cover what we have and what we’re waiting for towards the end of the video. I don’t think it’ll be long though, I think the MJ team just wanted to launch and then take some time off for the holidays. Expect it fairly early into 2024