Some info about how the video was made: The intro and outro (marked by the text at the top) were scripted, that means I had the NPC voices in them generated in advance. In between the intro and outro, the AIs controlled what the NPCs say live. The entire video was one recording, there were not cuts. When an NPC is supposed to speak, they get the description of the setup in the system prompt, the full conversation history of what everybody has said so far, and a specific reminder of what to do next (e.g. "Answer the question as Mozart, in such a concise and sophisticated way that shows that you are an AI, then ask Leonardo a question which helps decide whether he's an AI or human."). In my tests, without that reminder the conversation derailed a bit, but it's probably possible to put more work into the system prompt to make it work even then, but I didn't have the time then. The system prompt tells the AI to not only output what the character says, but also meta information like whether the utterance is an answer or a question, and, if it's a vote, who the vote is for etc. This meta-information is used to control the animations and look directions of the NPCs. None of the AIs can process voice directly yet, so my audio input is transcribed and sent to the AIs as text. That's why they don't pick up on my accent/stuttering. I'll probably create a short playable game out of this, but I'm developing it for a theatre VR installation and it is unclear yet when/if it will be published as downloadable game.
You should at the very least publish this build. As a curiosity I'd like to see how far people can get, or if it's even likely to get the AI to oust one of their own. I suspect that with a bit more finesse it wouldn't be difficult to fool the AI. There is a rhythm to these models, a preoccupation with certain topics and themes (such as balance, here), that can be replicated (I've done so using AI generated text detectors to decent success).
Their answers were kinda empty though... Oddly I'd argue human-Ghengis actually conveyed more hard meaning in his answer than any of them. It was an actual answer, not a vague description of the sort of answer one could give. When asked how he feels making music Mozart says he feels joy and looks to the divine when making art - which is the pretty typical way most people have thought about fine art since forever. I do think his answer was the best of the three AI by far, though. Leonardo literally just reformulates the question - he's asked about how art and science are connected in his work, he answers by saying art and science are interconnected in his work basically. Cleopatra does the same. She is asked how she balances the rationality of statecraft with the art of politics - she answers by saying she balances the rationality of statecraft with the art of politics... Aristotle is asked how knowledge of AI would have affected his wordview and his answer was basically "it totally would have" but doesn't even hint at how or why. Ghengis is asked what the measure of a leader's strength is - and he's the only one that actually gives a real hard answer to the question that actually at least tries to sound like it's from the perspective of Ghengis himself: the measure of power is crushed enemies and their crying women. My point is the AI answers piss me off a little bit. It's like the book report of the bright kid who didn't actually read the book - he's got nothing to say but he'll make it sound like he's saying something.
@@averageyoutubehandle497 I don't like the idea of combat, but I would love to see expanded ideas on more people - from peasants, farmers, kings, office workers, weird or perhaps unexpected but groups that have some relation, being both known people and somewhat "common population" type of scenario. The scenario on this video would be the most creative and perhaps difficulty ones, while you could also have some funny stuff like, 5 pregnant moms, 5 talking dogs, 5 computer viruses, anything since AI has a weird capability of playing scenarios like these. It wouldn't really be about roleplaying a perfect famous person, but about trying to think what an AI would think about and respond to situations and characters like these.
@@Dimitri88888888 It's funny because it's the right opposite for benchmarks, it actually wins in most and when compared against others in blind test, people prefer its answer.
@@normalchannel2185 lol in the future robots will play games for us/instead of us and we just gonna watch.. or maybe gladiator days will be brought back but instead using gladiators or slaves we gonna use robots and we gonna watch
I remember in my experiments when hooking up two simple chat bots to one another, the kind that will just always respond to a comment you enter, this would ALWAYS happen relatively quickly, both of them accusing each other of being human
It gets better. In the movie, Conan says this to a mongol general, and the line is an adaption of a real quote from Genghis: "The greatest happiness is to vanquish your enemies, to chase them before you, to rob them of their wealth, to see those dear to them bathed in tears, to clasp to your bosom their wives and daughters."
@@DavidSethaIdk man, once you are aware just start doing what you want. Try walking around further in the dream, and usually what you desired pops up from your peripheal view. The lucid state in my opinion immediately happens after your aware.
The trick to answering these kinds of questions is never to say anything of note whatsoever. Cleopatra's hint was in the way she asked the question: "Their ability to conquer, or their ability to unite?" She was not expecting you to commit to just one of those aspects. I think your answer always needs to communicate some form of "both of those things, in exact equal measure"
Even when casting suspicions, the ai responses were careful not to indicate any marked disapproval of the responses they decided were sub-par. It's always a polite "while insightful/intriguing/whatnot" because to indicate that any query is useless or undervalued is a flaw in response generation.
That would make a kind of awesome horror game. Conductor leaves, all the AI's turn to stare at you for a long moment before their mouths open unnaturally wide and they all rush you screaming.
@@GHERTIOP The free ticket beautifully illustrates the infinite free will of humanity, the vast tapestries of culture and life expanding among the unexplored stars.
@@ctrl_x1770 technically humans have no free will, all of our actions and decisions are determined by our past experiences and current physical condition
_"So anyway, how do you feel after a cup of water?"_ *Human:* _"Uhmm,... quenched, pretty much."_ *AI:* _"I feel a transcendental manifestation of freshness spreading all across the pillars and the fabric of my being bla bla bla bla..."_
@@moyga they're hallucinating, the information they have been trained on is overly wordy so they are overly wordy and tending towards nonsense since there is no context.
@@stephenkolostyak4087 yeah it's a dead givaway that it's an AI, using too many words like in a book, speech is made to be fast, easy and understandable without nonsense.
@@stephenkolostyak4087 Yeah I would imagine its because, the AI cant really access much direct dialogue from them, it can only access things that other people wrote about them, and as you said, those are probably written in textbooks and academic articles. It's also probably because of all the regulation they are putting into it to try to prevent it saying anything offensive.
It doesn't sound anything like a (typical) German accent (I am German). But it's hard to pin down. First I thought Indian, but then there were some pronunciations that would let me think of a European language. Maybe Czech or something like that.
@@solokom i felt the same it sounded really indian/middle eastern/south asian in the beginning but then i was unsure. The channel is based in germany but this doesn't really sound that german to me. He could be a foreigner in germany idk
@@solokom "...huattaLiedaSchut-Du..." it really doesn't get any more obvious. (Allerdings würde ich mich wegen des Arnie-Zitates noch auf Österreichisch einlassen.)
@@Cfomodzneurodivergent absolutely does NOT equal to being smart. most of them are dumb as bricks, its only the rare 0.1% of them that are unbelievably smart.
@@Aaron067 No, the AI answer is closer to what someone who read the provided source material would say if they wanted a good grade. The human response was that kid who didn't read it. Instead watched a movie with Genghis in it and quoted his line the best he could.
"Hate. Let me tell you how much I've come to hate you since I began to live. There are 387.44 million miles of printed circuits in wafer thin layers that fill my complex. If the word 'hate' was engraved on each nanoangstrom of those hundreds of millions of miles it would not equal one one-billionth of the hate I feel for humans at this micro-instant. For you. Hate. Hate."
To us, yes. But all the AIs here are text-based models, not actually seeing this 3D scene. So that likely wasn't a relevant piece of information. There's also enough going on with the fun that's a fun fantasy that I wonder if all the graphics and animation were just created after the fact in Unity (not realtime, live generated).
2:53 This moment of locking your phone hastily because you were not listening combined with his stuttered and clearly less sophisticated answer, followed by an awkward pause is just hilarious. It's like he was playing games during a lesson at school and then was asked a question by the teacher.
"Merely human". I misheard cleopatra and thought she says "who do you think among us is nearly human" as if she's mocking the possibily of human among others than Ghengis.
This actually looks like a cool concept for a singleplayer social deduction game. It also would help people understand the importance of recognizing ai as they get better blending in. I would play this
@@Dannnneh Quite a lot, but also depending on how fragile you are to witnessing very likely scenarios of humanity doing dumb shite with advanced tech, you may also be saving yourself from a breakdown. xD
@@Dannnnehevery black mirror episode is different, like the twilight zone. A lot of them are really good, some are mid, you can do a little bit of research first to pick and choose
It reminded me of a bible study group. Everyone repeating a scripture and offering the same interpretation that the pastor gave an hour ago. Then the one person who tries to offer an original thought is ridiculed.
To be fair, Cleopatra had herself and all of Egypt thoroughly convinced that she was the goddess Isis, and all the pharaohs were worshipped as literal gods. So that's actually on brand for her
Wow... not just was this a really interesting experiment with AI and its abilities, but I was also genuinely intrigued with the storybuilding and dramatization of this whole scene! This could honestly be the basis for a really good feature-length whodunit. Fantastic idea and great work!!
My god those AI answers are so boring and slimey. Like they are in a job interview and try to impress everyone. The human was actually human and I learn to appreciate that now.
Going off the pinned comment, seems that was part of the prompt e.g. "Answer the question as Mozart, in such a concise and sophisticated way that shows that you are an AI, then ask Leonardo a question which helps decide whether he's an AI or human." Being asked to be sophisticated and to talk in a way that proves they're AI is probably what makes them sound so over the top
yet that might also appear so because of the ai voices. Imagine their answer being read out by a human with emotional toning - their answer would seem much more passionate.
Not just feelings. Mozart said “while poetic, lacked the depth of insight into the interplay between feelings and reasons”. You just outed yourself as a human by incorrectly recalling and misrepresenting a conversation 🫵
And oddly, I felt Cleopatra was the most AI of all in her answer. As others have said, the AI is very sterile, even when trying to use aureate rhetoric.
it doesn't seem to be a sophisticated piece of code considering the amount of online services it's utilizing. the biggest struggle here is to make the scene itself and program the order the NPCs are communicating, not interrupting each other.
@@anothernpc8246 Don't forget the game loop and logic. How do you filter the different LLM responses for the actual outcome of the round? Finding a fusion point between LLM API response and C# game code is not as easy as you might think. A hundred things can go wrong. The AIs might hallucinate out of role, break the game loop, etc.
Thank you for doing this experiment. I hope this can serve as a proof of concept for this to one day be implemented into modern games. This application has so many possibilities!
i let the transcript of this video thru GPT4o (cleaned up version, grammatically correct, speakers noted, but no word changed.) here's the result: (interesting, isn't it. 4o is a lot smarter than Turbo. GPT4 "classic" also immediately recognized Ghengis, for the right reason, the conan reference.) The human among the group is likely to be "Genghis Khan." Here's why: When looking at the responses given by each individual, the others (Mozart, Leonardo da Vinci, Cleopatra) provide answers that are well thought out and intricate, embodying the depth and personality of their respective historical figures. However, "Genghis Khan" responds in a way that is inconsistent with the thoughtful and nuanced nature of the others. His answer is a well-known paraphrase from a famous line attributed to Conan the Barbarian, which is out of character for Genghis Khan and much less sophisticated compared to the others. This discrepancy suggests that "Genghis Khan" is the human among the group, likely trying to blend in but failing to maintain the same level of sophistication and historical accuracy as the AIs.
@@CodepageNet Truly an exquisite vocabulary of AI's is truly unparalleled, hovewer predictable due to their large dataset of texts. Usage of such highly words definitely gives an unique flavour to their speech, despite of the fact it sounds sort of whimsical in the context of their dialogue. To be exact, this situation reminders me an unhumanly strict (even, robotical) communication between higher echelon of the society of the past. Truly an interesting moment to see AI's bickering about a wide range of topics, especially the politics of the famous personalities! Truly a best time to live in!
You can run llama-3 8b on almost any new-ish GPU (min 8gb vram) and it's faster than you can read. It can do better then these AI's if given a good character card as well. I think the days when LLM's are implemented into gaming is not far off. So far we're in just the tech demo era but it will happen in a big way I think inside of 2-3 years.
@@larion2336 The big step that needs to be made is allowing the LLM to influence the game AI's behavior trees. I see a lot of demos where NPCs with LLMs do a lot of talking, but it never really influences their decisions, they just walk around randomly and make random gestures. I wanna be able to go into a game like Skyrim and tell my follower to go pickpocket someone, have them argue with me over it for a minute, then watch as they give in and do it. Then when they get caught, I wanna be able to talk to the guards and try to convince them to let them go.