Watch ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-6yQEA18C-XI.html - George Hotz vs Eliezer Yudkowsky AI Safety Debate | Pre-order tinybox buy.stripe.com/5kAaGL6lk9uX9nW144 more info on -> tinygrad.org | from $1250 buy -> comma 3X comma.ai/shop/comma-3x | best ADAS system in the world openpilot.comma.ai | Support George by subscribing twitch.tv/subs/georgehotz | Follow George on twitter.com/realGeorgeHotz to be up to date | Read George's geohot.github.io/blog/ | Sources for this stream: - twitter.com/realGeorgeHotz/status/1690831164431585280 - wtfhappenedin1971.com/ - people.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf - prize.hutter1.net/hfaq.htm - arxiv.org/pdf/2303.08774.pdf - geohot.github.io/blog/jekyll/update/2023/08/08/a-really-big-computer.html - arxiv.org/pdf/2005.14165.pdf - arxiv.org/pdf/2203.15556.pdf - en.wikipedia.org/wiki/Landauer%27s_principle - en.wikipedia.org/wiki/Why_there_is_anything_at_all - en.wikipedia.org/wiki/List_of_unsolved_problems_in_computer_science - en.wikipedia.org/wiki/Richard_S._Sutton#Career Chapters: 00:00:00 muted intro 00:01:15 un-muted intro 00:01:38 debating Eliezer Yudkowsky, bad people 00:02:35 Sam Altman 00:04:35 Thermodynamics is to Energy as ___ is to Intelligence 00:05:00 where did the smartest people go in 1950 00:05:40 debating Sam Altman, RLHF, truth 00:07:55 universities are where the smartest people go in 1950 00:08:25 where do the smartest people go in 2023 00:08:54 hedge funds, FAANG, startups, power over nature 00:10:00 less theory, more applied, hard doomers (EA), thermodynamic god 00:11:00 why does this split exist, we have no theory, arguments for Yudkowsky 00:11:50 is intelligence criticality possible? 
00:12:00 1800, small stick boil water calculation 00:15:20 asking ChatGPT grams of wood to boil a gallon of water 00:16:25 how much intelligence do I need to prove Fermat's Last Theorem 00:17:35 thermodynamics, classical physics 00:18:18 no unit for intelligence, unit of energy per second, J/s, W, horsepower 00:19:35 person-years, bad universities, Carnegie Mellon University 00:20:50 2012, 2014 universities destroyed 00:22:15 getting a drink 00:22:50 wtf happened in 1971 00:24:00 information theory, entropy, thermodynamics for information 00:25:18 real university reach out to me 00:25:40 college, scam, Scott Aaronson, Robin Hanson 00:26:15 administrative class, professional managerial class 00:26:40 Hutter Prize, compression is intelligence 00:27:40 gpt-4 technical report 00:28:25 don't hate OpenAI, overhyped, person-years 00:29:55 using GPT-4 for a field of science, best tutor to ever exist 00:31:00 plot gpt-2 gpt-3 bits per character on enwik8 00:31:45 Sergey Brin is back at Google, Sundar Pichai 00:32:30 petaflops-days, 1 person = 20 PFLOPS 00:33:40 Paul Christiano Bankless 00:34:20 losing to Connor Leahy = thinking about AI alignment 00:35:39 Joscha Bach latest Lex Fridman 00:35:55 1 person-year = 7300 petaflops-days 00:37:20 chinchilla deepmind, bits per character, token 00:38:38 Elon, Twitter/X login pop-up, destroy so much long-term value in a company 00:39:40 what unit is the loss in for LLMs 00:40:05 nat (unit), shannon (unit), bits, 1 george of intelligence 00:41:10 beautiful elu, minimal energy requirement 00:42:20 Landauer's principle, can reversible computation implement intelligence 00:43:30 why AI doesn't destroy the world, God is real 00:44:15 we are in the video game, highly rated game, we don't die in some stupid way 00:46:10 never use the word AI safety, instead use AI alignment 00:46:49 feminism, legal equality of men and women 00:47:05 alignment, good loss functions that do what you want 00:47:32 defcon, Why there is anything at all 00:48:25 
there is no difference between agents and tools, Robin Hanson 00:49:00 the tail wags the dog, car is the dominant species on earth 00:51:35 we have no real theory to answer how much intelligence we need for... 00:52:08 list of unsolved problems in computer science 00:52:50 people think George is not serious, care about the search for truth 00:54:15 academic system that is completely destroyed 00:54:45 information-based approach is good 00:55:15 university not good for finding a job, education industrial complex 00:56:40 Eliezer Yudkowsky did not attend high school or college 00:58:25 no theory of intelligence, bounds of complexity theory 00:59:35 where is Marcus Hutter, DeepMind, Jürgen Schmidhuber, Richard S. Sutton 01:00:40 Sutton became a Canadian citizen in 2015 and renounced his US citizenship in 2017 01:01:09 defcon vaccine card check, defcon crazy people, public apology, san francisco 01:02:35 real science, real questions, can't do science anymore, grant process 01:03:40 professional managerial class + AI = bad 01:04:05 Alex tired of the rants, preparing the debate 01:04:28 open-source AI is bad, only for trusted people 01:05:20 if science about intelligence is done somewhere, link it in the comments 01:06:05 lots of questions, no answers, physics off the rails 01:06:30 what happened that stopped science 01:06:45 we don't live in a world where standing still is safe 01:07:00 what happened to the internet, consolidate power 01:07:55 thank you for watching, we do not die 01:08:14 are you the problem? if you are, think about what you can do to be less of a problem 01:08:45 tinygrad, tinycorp, comma, comma hackathon 01:08:55 deleting twitter, living under the bridge and finding truth
What does "smartest" mean? What was the population like? How many of them were literate? How much did literacy cost, and how many people could afford it? How far along was the technological revolution? How strong was the influence of religion on minds? What consequences could there be for you inside a society that lives by prejudices and beliefs?
Come on dude, the unit for intelligence is the 'noob-year', because you gotta standardize for knowledge. Kinda like dog years. Great concept bro. Love it!
Fun fact: 1 horse actually has around 5 horsepower. Here's why: the definition of the metric horsepower comes from horse jumping. They saw a horse lifting a jockey (~75 kg) by 1 meter in 1 second, so a (metric) horsepower was defined as the power it takes to lift 75 kg of weight by 1 meter in 1 second. However, they forgot that the horse additionally needs to lift its own weight, which is around 4x the weight of the jockey. Hence, an average horse has around 5 horsepower.
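A quick back-of-envelope check of the numbers in this comment (the 75 kg / 1 m / 1 s figures match the standard metric-horsepower definition; the "4x the jockey" horse mass is the commenter's assumption):

```python
# Back-of-envelope check of the metric horsepower figure above.
g = 9.80665          # standard gravity, m/s^2
jockey_kg = 75.0     # mass lifted in the metric-horsepower definition

# Power to lift 75 kg by 1 m in 1 s: ~735.5 W, i.e. 1 metric hp.
metric_hp_watts = jockey_kg * g * 1.0 / 1.0

# The comment's claim: the horse also lifts ~4x the jockey's mass,
# so its total output while jumping is roughly 5 metric horsepower.
total_watts = (jockey_kg + 4 * jockey_kg) * g * 1.0 / 1.0
horsepower = total_watts / metric_hp_watts   # ~5.0
```
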
Funny, I had a similar thought a few weeks ago on a cigarette break at the office. I was thinking about a thermodynamics-like equation of state for code. My thought process was that you can have a lot of microstates of code (that is, ways to get the same result) with possibly different features such as volume (how lengthy your code is), entropy (how many ways you can write this code), pressure (how compressed your code is), and, trickiest I guess, energy (how complex is your code? using the analogy p*V). Instead of the coordinate system you could think about the syntax of the code, and instead of the Hamiltonian governing the time evolution of the microsystem you could think of the semantics of the code. But maybe it's just some bullshit, I don't know; I did not pursue it further.
I forgot to add one more thing. All those quantities should form some kind of equation of state that would represent the output of the code. That is, you could change variables such as the entropy, pressure, and volume according to the equation of state, and you would end up with the same black box taking some input and giving output, if you put aside the computation times (maybe that could also be a variable). The topology of the enveloping space would probably be very nontrivial, but maybe for sufficiently big programs one could do some continuous approximations, etc. I mean, if black holes can have thermodynamics, why couldn't code, at least in some form? The most basic criterion is satisfied: you can have a lot of microstates that produce one macrostate.
Unlike energy, intelligence is not additive. If you have 2 horses, you have 2 horsepower. But if you have 2 people, the sum of their intelligence over a year cannot just be 2 person-years or whatever. You can have a billion people and they still will not solve Fermat's Last Theorem.
@@karigucio yep. A person with twice the brain size would solve problems that a thousand people cannot. It's similar to ChatGPT: if you double its size, it can do deeper reasoning that the smaller model would never be able to do, even if you let it contemplate for a long time.
In reference to the thermo stuff, we have 3 or 4 laws of thermo. The third one shows what absolute zero means for particles for entropy and enthalpy. So for information exchange theory and a unit of intelligence, you need an absolute 0 of intelligence. Then build from there using bits of information. Bits of information + context == intelligence. Units of intelligence will also be in reference to something else, like enthalpy and entropy.
13 - First Principles of AGI Safety with Richard Ngo ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-DxwXLCQY1ns.html "When we think about the concept of energy, or the concept of information, or even the concept of computing, these were all very important abstractions for making sense of the past. You can think about the industrial revolution as this massive increase in our ability to harness energy, even though at the time the industrial revolution started we didn't really have a precise concept of energy. So one hope is to say: well look, we're going to get a more precise concept as time goes by, just as we did for energy and computation and information, and whatever this precise concept is, we'll end up thinking that you can have more or less of it."
Bro. George saying cars are the dominant species on Earth is like the flavor of this year of my life. Learning about American infrastructure and its car obsession, thinking about the vehicle-to-passenger mass ratio. Cars weigh 3,000+ pounds yet often carry just one person (who weighs ~150 lbs), meaning that ~95% (and often more) of the mass being moved is the vehicle and not the thing that actually needs to move. Compare this to electric bikes, onewheels, electric skates, etc., where the mass ratio of vehicle to passenger is between 1:1 and 1:20 (vs 20:1). Even when a car is carrying 2 people and 100 lbs of cargo, the ratio is 7.5:1.
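A quick sanity check of the ratios in this comment, using the stated figures (3000 lb car, 150 lb person, 100 lb cargo):

```python
# Check the mass-ratio claims: how much of the moving mass is vehicle?
car_lb, person_lb, cargo_lb = 3000.0, 150.0, 100.0

# Solo driver: fraction of total moving mass that is the vehicle.
vehicle_fraction = car_lb / (car_lb + person_lb)   # ~0.952, i.e. ~95%

# Two passengers plus 100 lb of cargo: vehicle-to-payload ratio.
ratio = car_lb / (2 * person_lb + cargo_lb)        # 7.5, i.e. 7.5:1
```
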
Bro, I'm a filmmaker and podcast operator who has to move furniture for a living. May I please work for you? I need a job. I'll edit and turn on the sound and stuff. Thanks. George Anton.
I feel like a newborn in a 20-year-old's body whenever I listen to a geohot stream. So sad that I know absolutely nothing about anything, and God knows how much time we have left.
Maybe we just need a way to convert the units of the real universe into digital units. My intuition is that this might be done with bits per second or something similar to traditional digital units. After all, aren't intelligence and computation the same thing, or at least almost the same thing? Intelligence is consciousness-guided (or goal/algorithm-guided) computation (and compression). It's all super fascinating to think about.
Thermodynamics is to Energy as Relevance Realization is to Intelligence. It's more than information compression. You must first dynamically restrict the problem space within which you compress information. See: 1. The Frame Problem (Stanford Encyclopedia of Philosophy) 2. Predictive Processing and Relevance Realization 2022 - Andersen et al. 3. Dr. Michael Levin on Embodied Minds and Cognitive Agents 4. AI: The Coming Thresholds and the Path We Must Take
I found myself pondering throughout the stream: "Thermodynamics is to Energy as **Thermodynamics** is to Intelligence." To elaborate, consider a CPU. If it generates a lot of heat but delivers few FLOPS, it has a poor entropy-to-structure ratio. In this case FLOPS represent the structural outcome, and the rest, rendered as heat, is wasted energy. Similarly, a baby that only consumes calories and dissipates heat into the environment (i.e., delivers few FLOPS) has limited options in life. This equates to low intelligence, or a poor entropy/structure ratio. To enhance your intelligence, or your range of options in life, you need to build stable structures, or "FLOPS," that let you access more complex options. These structures lead to even more intricate structures with better entropy/structure ratios. This concept can be extended to species, entire countries, anthropology, and so on. Given this framework, we can now start calculating: - The caloric investment needed to develop a human to the point where they can contribute significantly to fields like mathematics (e.g., Fermat's Last Theorem). - The costs and risks associated with "monetizing" this investment, including the risk of ending up with an artist instead of a scientist in the previous point. - The caloric cost of maintaining a social structure that gives the illusion of freedom, thereby increasing the likelihood of producing creative scientific minds as opposed to self-destructive adults with an entropy/structure ratio almost like a child's. - This also explains why evolution has made us spend so many calories on our brain at the cost of being physically inferior to almost any other animal. All we can do is sweat, think, and build social structures to overcome individual weaknesses. So, intelligence could be viewed as the ability to navigate along these two axes in a way that continually expands your options with each life decision. 
However, this is a separate philosophical avenue that extends beyond the scope of thermodynamics. X axis: entropy -> structure. Y axis: persistence -> flexibility. PS: I wanted to replace FLOPS with GORGES to make it funnier, but my concept is already too long and too convoluted to spice it up with confusing jokes :D
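As a rough illustration of the "caloric investment" bullet above, here is a back-of-envelope sketch; the ~2000 kcal/day intake and 25-year horizon are assumptions for the calculation, not figures from the stream:

```python
# Crude estimate: total food energy to raise a human to age 25
# (assumed: ~2000 kcal/day average intake, 1 kcal = 4184 J).
kcal_per_day = 2000
days = 365 * 25
total_kcal = kcal_per_day * days      # 18,250,000 kcal
total_joules = total_kcal * 4184      # ~7.6e10 J
gigajoules = total_joules / 1e9       # ~76 GJ of caloric investment
```
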
Watched only 30 minutes so far, but: it feels really weird to quantize intelligence, and we actually have something like that, though it's really hard to measure and I would say just wrong: IQ. For me it kind of feels like measuring how smelly something is through the unit olf. But let's look at what other units are: - A kcal is the energy needed to heat 1 kg of water by 1 degree (C or K) - A newton is the force which gives a mass of 1 kilogram an acceleration of 1 m/s/s An intelligence unit is something that..... and now comes the hard part :D In these units there is the new unit which gives something (measured in SI) a Y amount of an attribute. And it is really hard because we can't even describe that "something" with units; if we had the units for the "something" and the "attribute" then it would be easy. So what can they be? Intelligence can have the following attributes: - Filtering information - Seeing information as a whole - Seeing details as well - Modelling (which is kind of filtering... you fix some variables and you calculate with the rest...) - Adaptability, I think, is not an essential thing, because a trained model can solve a very hard problem without actually learning from it. But by this logic an onion's DNA would be intelligent, because it is prepared for way more than us, moving humans. Maybe adaptability is an essential thing after all. - Time will be in the denominator a couple of times (because solving a problem in minutes is way better than evolving DNA code through hundreds of years to get the problem solved). This is actually inside of us, with our fast and slow brains: if you get hurt you pull your hands away immediately, and you may drop something hot and break it, but it's still an intelligent answer. - Short- or farsightedness: some solutions are great if you only calculate for the next hour but horrible if you calculate for the next 10 years. "I have enough money to live comfortably and in prosperity till the end of my life.... if I die next Wednesday..." 
I just have more questions than before. Intelligence is something that produces a better solution in an environment as fast as possible. The "as fast as possible" part is easy, it's just /s. What is a better solution? Something that works for as long as possible? Works long enough? There is time again in this variable. The shorter the time here, the worse the solution is, so time should be in the numerator, but then it will cancel out? It kinda makes sense. But then we are left with: Intelligence = some_constant * better_solution, no matter how fast and for how long, which does not make sense? What even is a good solution? There should be a goal at least to define this, and the closer it comes to the goal, the better, which can be usable in, for example, evolution-based learning. But how to define a goal? It's a state that is not current but can be achieved by putting in energy (work :v). We finally have some SI units: joules. So a goal has joules in it, if not only that. "Produces a better solution in an environment as fast as possible": okay, but to what? A problem. But a problem is just the distance from the goal itself, so it should have the same units as a goal, whatever those are. The environment part is hard again; it's like... a set of things that place obstacles between the problem (or current state) and the goal. And since goal and problem should have the same unit, an obstacle has the same unit as well, as it just increases the distance? Assuming an obstacle can only be linear, which we know is not true. Now I am thinking about big O in algorithms, so a solution can be an algorithm itself? We can measure algorithms with big O. Intelligence is something that comes up with algorithms in a given environment to solve a problem and thereby achieves a goal. And an environment can hold not only obstacles but other algorithms that other intelligences have already figured out (like every theorem in math we use... we don't come up with them). These are shortcuts. 
So solution = all_obstacles - shortcuts_used, or solution = all_obstacles / shortcuts_used. The second one is more interesting, because it means that if none of them are constant, then the unit of a solution is: unit_of_obstacle / unit_of_shortcut. Now it starts to seem like intelligence is just a shortest-path finder from A (problem) to B (solution), but then again, I don't know what to do with the time part. Intelligence is solution / time? IDK, I should have been asleep for 3 hours already and I will have a horrible day tomorrow because of this, so good night, my brain hurts :D I may be completely on the wrong path and deep inside the wrong rabbit hole, but it happens :D
@@geohotarchive fam, did you even watch it? Anyway, I just came across the fact that the 2009 PhD thesis by one of the founders of DeepMind was on the topic. "Machine Super Intelligence - Shane Legg". It's online. Likewise, there's a 2010 video online, "Measuring machine intelligence - Shane Legg, Singularity Summit 2010"
Skynet, Terminator, HAL, Lore, Colossus, etc. are all Hollywood or UK sci-fi fiction. But people shriek about fictional bogeymen yet ignore the current AI tools wielded by those who've demonstrated indifference or contempt for the rights in the Constitution.
@odorlessflavorless it's not; it's the same as the original video on www.twitch.tv/videos/1898043541. George is probably trolling us 😅 with the muting every time now...
I think you don't have a good understanding of what is behind the scenes. Your question is good, but how you formulate it is bad. There is no real intelligence; there are levels of attention, and your question can't be understood as a physics problem like "Can you enumerate the number of levels of attention you need to resolve this theorem?" Or: how many levels of abstraction do you need to really understand the complete picture? By levels of abstraction, I mean: count the number of neurons/levels of attention needed to classify the letter y and the number 9 at first, then all letters, then all words, then to learn the true concept behind each word (not the popular belief), then how many levels of attention you need to truly understand a question (what the user who asks the question believes and understands), then a novel (the poetry/art/feeling in it), then a book (elaboration of hard, subtle concepts over many pages), etc... In the end, how many levels of attention do you need to have the whole picture of a physics theory? And moreover, how many inferences (trial-and-error guesses) do you need to have the luck of focusing attention on the right values across the plurality of knowledge, to figure out how to drastically decrease the entropy introduced into the system by the question?
Wait, is it not true that if AI could take over, the universe would already be filled with AIs? This "AI destroying the world" stuff is ridiculous (we are the AI). Remember, for AI, time is irrelevant, and so goes the 'slow' but safe evolutionary process (the spread of AI).
How many hotz you need also depends on altitude above sea level, since I can boil water using the adipose tissue harvested from deceased summit attempts on Mount Everest with far less energy than your puny wood.
For debate leverage, consider the moral framework assumed by you and your opponent (likely utilitarianism) and the ways philosophers debate utilitarians. Any chance an AGI would adopt a moral framework more like Kantian deontology? I.e., killing is wrong because it is logically inconsistent, not because of some statistical equation. Deontological ethics in light of AI is a philosophical thesis waiting to be written.
So it should probably be energy through an artificial neuron per second. Then we have to do the computation for each organism in terms of emergence. So planaria, etc...
We have no theory to predict the outcome of intelligence criticality because it's practically impossible to know. Energy isn't comparable at all to intelligence, because energy is deterministic whereas intelligence is non-deterministic. This is textbook Sartre.
Simplify: there is only one system, tool utilization. Any animal that has the ability to alter its surroundings as a result of reflection is doing just that: using the tool of reflection to alter its future actions, in conjunction with its ability to utilize reasoning. Just like anything, the distribution of intelligence depends on this utilization of tools. Furthermore, the better you are at this tool use, the more intelligent you are. What happens when this distribution gets larger?
I recommend listening to some of Steve Omohundro's older talks on trying to predict AI drives. While he acknowledges that an intelligence beyond our own is incomprehensible, he has some good precedent for what the general emergent properties of rational agents might be, by extrapolating game theory. He has some cool thoughts on cooperation even amongst adversarial agents, and I think he would agree with you about open-source proliferation of intelligence being the least dangerous outcome, as a society of diverse intelligent agents with differing goals is ideal for balancing and pressuring each other, as opposed to some monolithic entity dominating everything under it, for better or worse. See "Rationally-Shaped Minds: A Framework for Analyzing Self-Improving AI" and "The Basic AI Drives".
That's interesting what you said about if you knew the truth you'd live under a bridge happily. Isn't that what sages like Mr Eckhart eventually got to? The truth is out there, the question is (seriously), are you really ready to take it in?
is the derivative of the volume of possible AIs with respect to a given performance. It's what I would call the most analogous to the derivative of available states with respect to energy, which is the original definition of entropy in continuous physics (you have to go to the continuous definition instead of the computer-theoretic, discrete one, since NN models are continuous in their parameters). I would call energy the equivalent of performance, since it's the thing you're trying to optimise (physics optimises to the lowest state of energy). That would be my guess. But sadly that doesn't tell you much about the system, unlike thermodynamics. Maybe you could use it to infer how the performance changes with a nudge to its parameters.
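For reference, the "original definition" alluded to here is the standard statistical-mechanics one (a textbook statement, not from the stream):

```latex
S(E) = k_B \ln \Omega(E), \qquad \frac{1}{T} = \frac{\partial S}{\partial E}
```

where $\Omega(E)$ is the number (or, in the continuous case, the phase-space volume) of microstates available at energy $E$. The proposed analogy swaps $E$ for model performance and $\Omega$ for the volume of parameter settings that achieve it.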
You're right in that the current GPTs can't be AGI. As you say and imply, there needs to be action, decisions and so on. A cognitive architecture. The next idea blows my mind: What if we compress the "operations of cognitive architectures to solve certain types of problems", using that as training data? In this way, any kind of action, decision, analysis, or design could be done rapidly, and the results and learnings from those would result in optimal operations.
Regarding the question about reversible computation: one of the transformer architecture modifications used reversible residual layers to reduce memory requirements. It did not adversely impact model performance.
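For intuition, a reversible residual block (in the RevNet style that reversible transformer variants build on) can be sketched like this; a minimal illustration, not the actual implementation from any particular paper:

```python
# Minimal sketch of a reversible residual block. The input is split
# into two halves (x1, x2); because the outputs determine the inputs
# exactly, activations need not be stored for the backward pass.
def rev_block_forward(x1, x2, F, G):
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def rev_block_inverse(y1, y2, F, G):
    # Undo the forward pass in reverse order, recovering the inputs.
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2
```

The memory saving comes from the inverse: during backprop, each layer's inputs are recomputed from its outputs instead of being cached, regardless of what F and G compute internally.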