Just got a research internship because of you. The final question was "explain neural nets like I'm 5," and I used your explanation from the video with number recognition. The interviewer was super impressed with how concise my response was. Thanks for making me an expert now, Blue :)
Ok this is glorious
That's amazing ahah, well done
congrats! how did you answer tho
@@lilposs98 Imagine a neural network as a factory of workers (neurons, or nodes). Each line of workers performs a specific function (layers). Let's say the entire factory's job is to sort images of cats and dogs. To speed up the process, the first worker line gets the raw photos, and the following lines are responsible for identifying certain features in the photo (pointy ears, long nose, whiskers, etc.). The workers collaborate and pass their results on to the next line. The very last line of workers (the output layer) decides, based on all the results, whether the photo was of a cat or a dog. Now, at first the workers aren't very good at this and sometimes get confused, so the manager (the training algorithm) comes in and says "you called this a cat but it was a dog, let me adjust how much you care about pointy ears." The manager tweaks how much each worker cares about certain features and about the results coming from other worker lines (weights and biases), until they get really good at identifying stuff.
This explanation is almost analogous to "But what is a neural network?" by 3blue1brown, but I had to give it my own twist to show I knew it.
Of course they didn't just ask me that question, but I felt like I gave a great answer because of this YouTuber; I'm pretty young, so my exposure to advanced deep learning is minimal.
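In case it helps to see the factory analogy as actual code, here is a minimal sketch: each "worker line" is a layer of weights, and the "manager" is gradient descent nudging those weights after each wrong guess. The data, labels, and sizes below are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))                # 100 "photos", 4 detected features each
y = (X[:, 0] + X[:, 2] > 0).astype(float)    # made-up rule: 1 = cat, 0 = dog

W1, b1 = rng.normal(size=(4, 8)) * 0.5, np.zeros(8)   # first worker line
W2, b2 = rng.normal(size=(8, 1)) * 0.5, np.zeros(1)   # output worker line

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(500):
    h = np.tanh(X @ W1 + b1)                 # workers pass results forward
    p = sigmoid(h @ W2 + b2).ravel()         # final line votes cat vs dog
    # The "manager": compare guesses to answers, then nudge every weight.
    grad_out = (p - y)[:, None] / len(y)
    grad_h = (grad_out @ W2.T) * (1 - h**2)
    W2 -= 0.5 * h.T @ grad_out
    b2 -= 0.5 * grad_out.sum(axis=0)
    W1 -= 0.5 * X.T @ grad_h
    b1 -= 0.5 * grad_h.sum(axis=0)

accuracy = ((p > 0.5) == y.astype(bool)).mean()
print(f"after training, the factory sorts {accuracy:.0%} correctly")
```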
Can you please tell us what exactly the concise answer you gave was? 😮
"A Large Language Model is a sophisticated mathematical function that predicts what word comes next for any piece of text" - you just couldn't describe it better, amazing explanation.
It's how I came to understand NOT to equate LLMs with "understanding language". I doubt it knows grammar; it only guesses, from the sheer volume of what it has "read", how a sentence should be structured. (Same with the "visual AIs" and how they have no depth perception, unless they've been shown varying pictures of the same objects at different distances, angles, and perspectives.)
I wonder….
It's curve fitting on steroids based on some bias.
@@binsarm9026 However, the philosophical question you can ask is: how different is that, really, from what we're doing (with our native language)?
@@xway2 I just said what the difference is: it doesn't understand grammar.
How it learns what objects are is probably similar: what an apple, ball, cat, dog, etc. is, plus that emergent thing where it can extrapolate relations shown by those vectors, like man:woman and king:queen, i.e. analogies.
But does it really know what verbs are?
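As a toy illustration of that man:woman / king:queen point, here is the vector arithmetic with 3-dimensional vectors invented purely for the example (real word vectors have hundreds or thousands of learned dimensions):

```python
import numpy as np

vec = {
    "man":   np.array([0.9, 0.1, 0.2]),
    "woman": np.array([0.9, 0.9, 0.2]),
    "king":  np.array([0.8, 0.1, 0.9]),
    "queen": np.array([0.8, 0.9, 0.9]),
}

# "man is to woman as king is to ?" becomes vector arithmetic:
target = vec["king"] - vec["man"] + vec["woman"]

def closest(v):
    # Nearest stored word vector by Euclidean distance.
    return min(vec, key=lambda w: np.linalg.norm(vec[w] - v))

print(closest(target))  # -> "queen", with these made-up numbers
```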
3B1B explains LLM with RLHF for the CHM!
I'll go ahead and update the title to "3LRC"
@@3blue1brown wow really love your content
As a biologist, I can confirm that I have developed PTSD from people talking in barcodes.
This is the first time I've heard of RLHF - it is clearly something anybody who wants to talk about regulating AI technology must know about.
@@3blue1brown Do I have to give this a thumbs down? LOL
This should be the first video someone watches before watching all your other explainers. It's a great foundation.
I agree! I'm a fairly technical person (working towards a PhD in physics), but despite that, having this sort of very-big-picture explainer before diving deeper really helps contextualise everything and remember it later. Would recommend making this 'Episode 0'.
An old friend of mine works in the field of curating data for training AI models. These videos have made me understand a lot more of what his company does than anything he's ever said to me.
I know someone who used to curate cars at night.
@@robertthallium6883 Beautiful
I am currently studying AI, and it really excites me that Blue is talking about these interesting things on Transformers.
You mean a meticulously scripted explainer video by an expert, featuring sophisticated animations, was more clear than casual chitchat you have with your friends? You don't say.
@@aquajosei5759 no, you're LARPing and are trying to fit in
Just the right amount of abstraction for a short video.
Also nice that it contains a bit of the final training part where humans rate answers of the model to tweak the model. I believe you were still going to make a video about that in your series.
There's a fantastic video about that part (and about what happens when it goes wrong), it's "the true story of how GPT-2 became maximally lewd" by rational animations.
@@DiThi just watched the video and it is indeed a great story 😀
Hopefully something like this doesn’t happen when programming for robots
I've always appreciated your videos, and you really made math look simple to me, as I now tend to imagine things visually, because of your animations.
I am just starting at Machine Learning and AI, and seeing this video about the matter was really a gift I didn't expect to get. Thanks for the great content, as always.
7:36 "The words that it generates are uncannily fluent, fascinating and even useful."
Beautifully put.
Probably the best summary of why LLMs don't think, but are valuable, I have ever heard.
This emergent property of LLMs makes me think that our minds are nothing more than complex pattern-recognition functions running on biological machines. Maybe we don't "think" either.
Humans don't think; it's just neurons connected together with electricity. Human brains are trained by repeated exposure to words and ideas from sources such as the human's parents and community, along with biologically coded shaping that is somewhat locked in at birth. Eventually they go from saying random gibberish such as "goo goo ga ga" to saying things that are fluent, fascinating and even useful.
His summary is accurate, but he elides the significant problems that can occur when LLMs output authoritative, true-sounding, yet untrue words. Even in as straightforward and concise an explainer as this, these caveats should always be included.
@@gmcanally I agree, this technology is incredibly useful, but we'd be dumb not to also hammer home its constraints. If we do, we can get even more people thinking about and discussing ways to overcome those constraints. Then eventually we can use either auxiliary systems, or a new approach entirely, to create ways to actually program in logic and reasoning, fact-checking, abstract thought, etc.
But we're definitely not there yet and the Sam Altman types should stop pretending like we are.
this video didn't contain any argument about thought or theory of mind
2 vids in 12 days, we are truly blessed
Yes!
Truly
3 now
indeed
WATTBA
The computer history museum is one of my all time favorites just from the memories of going there with my grandma as a kid. Loved the simple explanation too!
Thank you
I was considered one of those kids who just don't get maths; 40-50 years ago, forget about it.
As it turns out, my way of learning is simply different (I ended up with a master's in engineering with top grades, but not without incredible effort and a lot of luck, having met those very few professors and study buddies who made all the difference)
Thank you for your channel - with resources such as this (and countless others on YT and online courses etc.) in my time, who knows what I could have achieved (this is not just about this specific video alone, it's about your channel and others like it in general)
Now, as my career is over, I just take joy learning for free :-)
Pour copious amounts of energy and data into a black box, then shake repeatedly until the output mostly resembles what you want. Repeat until you consume the energy of a small industrial nation.
ecoterrorists, where are you?
Or, you could just be more patient. Like, seriously, people don't realize the power usage is to make it faster, not more functional. And what basically everybody everywhere constantly skips when talking about these systems - the power usage is only during training. Once you have a trained model, and all of the weights are set, actually USING the model takes a trivial amount of power. It's the shaking of the black box that takes all the power of a small industrial nation. After that, using the black box is radically more efficient overall than even asking a human a question. By many orders of magnitude (humans are catastrophically energy inefficient).
@@DustinRodriguez1_0 yea, but most companies want to shake their own box. You don't really know what will come out of it, and you will have to shake it at least every few years to keep it up to date.
“Every machine is a smoke machine if you operate it incorrectly enough”
input: all life on earth
algorithm: soul grinder
output: chatbot
This is entirely useful for beginners! I have many friends and family members who would genuinely benefit from watching this video.
That's one of my big aims. After putting out the videos on Transformers earlier this year, some viewers gave the feedback that when they shared the videos with friends in their lives curious to know more about the topic, sometimes those friends found the videos a bit heavy or confusing. Hopefully, this offers something more helpful to share.
@@3blue1brown can we still have technical videos made for engineers pls
7:48 Never thought in a million years that I would ever see Kendrick Lamar in a 3B1B video
I bet it's up there as a demonstration of deepfake tech
Makes sense though, that's an AI music video
“a large language model is a sophisticated mathematical function that predicts what word comes next for any piece of text” - well phrased sir, that’s a very satisfying definition for a novice like me.
If someone were to dumb it down even further: "it's a glorified autocomplete".
It is a good simple definition, be careful though, it only works if you're aware it is a simplification. For example, there are a lot of people who read that and think "it's just like the autocomplete on my phone keyboard, but a bit better" when that's very wrong.
It is so complex we don't understand how it works, but this mathematical function presumably computes the meaning and context of what was said so far as well as the direction of where the conversation is going, refining this information through its learned knowledge of the world into a representation from which we can extract the most appropriate next characters for the response. There is nothing simple about this and we don't know what the limits are for what different versions of it can do.
Soo yeah, pretty much just glorified autocomplete.
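To make the "function that predicts the next word" definition concrete, here is a minimal sketch of the loop it implies: map text to a probability for each candidate next word, sample one, append, repeat. The `toy_next_word_probs` stand-in is invented; in a real LLM it would be the trained network with its billions of parameters.

```python
import random

def toy_next_word_probs(text):
    # Stand-in for the model: a probability for each candidate next word.
    if text.endswith("the"):
        return {"cat": 0.5, "dog": 0.3, "<end>": 0.2}
    return {"the": 0.6, "a": 0.3, "<end>": 0.1}

def generate(prompt, max_words=10):
    text = prompt
    for _ in range(max_words):
        probs = toy_next_word_probs(text)
        words, weights = zip(*probs.items())
        word = random.choices(words, weights=weights)[0]  # sample, don't just take the max
        if word == "<end>":
            break
        text += " " + word
    return text

print(generate("the"))
```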
This video was very useful. I loved your deep learning series, but this was a really nice summary, and I definitely think it should be at the start of the deep learning playlist as an intro.
3B1B, I just want you to know how approachable your videos are. They give us a nice, intuitive understanding without taking too long. Thank you!
One suggestion I have for the Computer History Museum is that they really should bring more computers back to operating condition now that the Living Computer Museum is closed. I totally understand that some of the older ones are prohibitively expensive to bring back, but there are so many more recent computers that they could make interactive. Without being able to interface with those computers, what difference is there from a museum of hardware/industrial design?
Awesome stuff! One piece of feedback: At 6:53, you use the word "vector", but up until now you've only been saying "lists of numbers". If this is for a general audience, I think throwing in new terminology right at the end without explaining it could be confusing.
This was the one thing that stood out for me, and I think Grant must have thought a lot about it.
I have watched a few of his videos and am familiar with 'vector', so I could be biased too, but it really is an essential term that must be acknowledged. I was actually thinking it should have been mentioned earlier in the video (instead of just "lists of numbers"), when he started animating the propagation process, etc.
(I would trust his judgement as an educator to know when to slip in something new, just to pique curiosity for the viewer/student to pick up on and find out more.)
There’s a bright yellow rectangle around the last LIST OF NUMBERS when he says the last vector. Anyone too stupid to make that connection but also allegedly want to know how LLMs work would also be asking “sir sir I have a doubt please do the needful and revert” regardless of how comprehensive the explanation was
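For anyone who did miss the connection, "vector" here is just the math word for those lists of numbers; a toy example with invented values:

```python
# A "vector" is nothing more than the list of numbers attached to a
# token. Real models use thousands of entries; these four are made up.
vector = [0.12, -0.83, 1.04, 0.07]
```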
I truly love these ML/DL videos. I'm new to studying the field so these are so helpful to my understanding. Please continue making em!
I've been studying this topic since 2023, and this is the best short video covering the most important aspects of LLMs without unnecessary jargon or misleading statements. Well done!
Fun fact: if you keep asking the transformer to continue past the AI assistant's finished answer, it will then invent text from the user, impersonating them. At least that's the case with most models. It will be trying to predict what the user is going to say, the same way the assistant's answer is what the LLM "thinks" the assistant would say.
It's as if the machine is playing the "character" of an assistant, rather than being an assistant.
It's also why it's so hard for it to admit when it doesn't know something. It's playing improv, trying to perform the character of someone who _does_ know the answer.
I think it's quite interesting that ultimately something like ChatGPT appears to be a helpful assistant that can answer questions. But in reality it's just a piece of software that is really good at completing incomplete conversations. We then just happen to give it a conversation where the user asks something and the response has to be filled in. But we present it as though the response is from the assistant, even though it's not really.
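A sketch of what these comments describe: to the model, the whole chat is one text document that it keeps extending, so nothing stops it from writing the user's lines too. The chat format and the `complete` stand-in below are invented for illustration; real models use their own special role tokens.

```python
def complete(transcript, n_lines):
    """Stand-in for the model: all it ever does is extend the text."""
    continuation = [
        "Assistant: The capital of France is Paris.",
        "User: Thanks! And the capital of Germany?",  # an invented user turn
        "Assistant: That would be Berlin.",
    ]
    return transcript + "\n".join(continuation[:n_lines])

chat = "User: What is the capital of France?\n"
# Sample past the end of the assistant's answer, and the model, having
# no notion of whose turn it is, happily writes the user's next line:
print(complete(chat, 2))
```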
Honestly, I don't even see AGI coming any time soon; how can it, when LLMs don't understand shit?
@@columbus8myhw Ironically enough, that's how humans often work too; we all play the roles we think we're supposed to be playing.
@@pixelforg
It seems unlikely that a real AGI is going to be a large language model specifically, though it's possible (likely, even) that advancements in tech from that field would inform the design and function of such a hypothetical AGI.
Hadn't tuned into a 3B1B video in a while. This man is the gold standard for explaining complicated mathematical concepts. Period. I hope he never stops.
This is easily the most informative and easiest-to-understand video about LLMs. The lightbulb went off as soon as you said, "A Large Language Model is a sophisticated mathematical function that predicts what word comes next for any piece of text"!
I'm only about halfway through, and this must be the clearest explanation of LLMs I've ever seen. Impeccable work, dude!
This guy has that Riemann Hypothesis video that I keep watching often, and it still fascinates me!!
I work directly with a client in this field. This is one of the best explanations I've heard.
I would like to express my sincere appreciation for the impressive work you continue to do. The quality and dedication you demonstrate in each publication are admirable.
However, I've noticed that, unlike with older posts, I currently can't find the videos in Spanish. While I understand this change may have several reasons, I must admit that, although I would have liked to have those resources in my language, this also represents an opportunity to improve my level of English, which is a personal advantage.
Thank you for sharing your knowledge, I value it a lot. Cordial greetings from Spain.
I wouldn't have enough lifetime to give thanks for all the effort and insightfulness of your content.
Nobody has instructed the AI to give specific answers; it massages itself to fit the data. This is the most important thing people need to realize.
I just wanted to take the time to give you a huge thank you. I'm a finance guy, but I am focusing my portfolio towards AI, so improving my understanding of how it actually works, after having pretty much no experience in it, is a must. I have no formal experience, but I am an enthusiast in the field. I'm glad I found your channel and this video, because it's really helped me get some foundational knowledge of how this works. I've gone back and watched your more in-depth videos on LLMs too, and they have also been great. Thank you for the informative videos. The illustrations are incredibly helpful too.
I am so proud of you! I was lucky enough to visit the museum just before covid, and it is still one of the brightest memories from my visit that year.
Can we take a moment to appreciate the carefully selected background music 3B1B uses in his videos? It makes his videos even more enjoyable to listen to.
Superb! Will share with my 11th-grader son; honestly, this is the best way to introduce LLMs to young people and channel their curiosity.
Absolutely love how you're sharing knowledge on large language models. Your passion for the topic is infectious! Great video, keep it up!
Great video as usual! I'm a math PhD, and I get disappointed by how little math folks talk about the labor behind large language models. It would be nice to see folks explicitly recognize how these models are built. It's not all math under the hood!
The training provided here highlights why watching your video is far more effective than attempting to transcribe it and listen to it being read back.
It turned out to be very informative; we demand a more detailed version of the video!
This is great! I often find it hard to understand your videos because I'm so bad at math (I still watch them though...). But this video is perfect!
Just stick with it - he is an absolute star at educating people using simple language. I'm not too advanced in math either, but just let his explanation wash over you and you WILL get the gist of it.
Best explainer of LLMs I’ve found so far!
Ace content. Have you ever considered doing a few on image generation? Would love to see those explainers from you
What an honor, to be a part of the Computer History Museum!
Thanks!
Great video.
4:47 AFAIK, RLHF does not use human annotation directly for reinforcement learning on the base model. Instead, the human feedback is used to train a reward model, which then drives the RL process on the base model.
I think he knows that, but he explained it like that to make it easier to understand for beginners.
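For the technically curious, here is a minimal sketch of the reward-model step described above, using toy numeric stand-ins for response representations (a real system scores actual text with a large network; every size and number here is illustrative):

```python
import torch
import torch.nn as nn

# Toy reward model: maps a response representation to a scalar score.
reward_model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Human feedback arrives as preference pairs: for a given prompt,
# annotators marked one response as better than another.
preferred = torch.randn(64, 8)   # representations of chosen responses
rejected = torch.randn(64, 8)    # representations of rejected responses

for step in range(200):
    r_good = reward_model(preferred)
    r_bad = reward_model(rejected)
    # Bradley-Terry style loss: push the preferred response's score
    # above the rejected one's.
    loss = -torch.nn.functional.logsigmoid(r_good - r_bad).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# The trained reward model, not the raw human labels, then scores the
# base model's outputs during the RL (e.g. PPO) phase.
```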
Your best one on this topic. Excellent. I teach AI, and this one I can show my friends. And wife.
And, I’ve visited the Computer History Museum. Loved it. Looking forward to my next trip to the Valley where I will visit it again. It is awesome.
Impressive explanation. I could listen all day to you, explaining abstract things.
The fact that the final embedding vector is able to somehow encode the meaning of all of the previous text in the context window well enough to predict the next token is the most mind-blowing part.
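Here is that final step sketched with toy sizes; the vector and unembedding matrix below are random placeholders, whereas in a real model both are learned and far larger:

```python
import numpy as np

d_model, vocab_size = 4, 6
final_vector = np.array([0.5, -1.2, 0.3, 2.0])        # last position's vector
W_unembed = np.random.default_rng(1).normal(size=(d_model, vocab_size))

logits = final_vector @ W_unembed                      # one score per word
probs = np.exp(logits - logits.max())
probs /= probs.sum()                                   # softmax
print(probs)  # probability assigned to each candidate next token
```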
You might just be the best educator in the interwebs. This was superb.
For anyone who hasn't checked out the Computer History Museum youtube channel, I would highly recommend for valuable primary sources/interviews with pioneers in the field! I love Gary Starkweather's lecture on the 'Birth of Laser Printing'!
I see a very old friend here at 0:10, the first computer terminal I ever used and the computer was a PDP-11. Good times.
I remember that one too. I also used a teletype-style terminal at one point.
It's often difficult to understand how we survived such primitive technology. Lol
Thanks for this brief video, which beautifully explains concepts like LLMs, the attention mechanism, and reinforcement learning to a general audience. I remember reading the 2017 "Attention is all you need" paper for my UG project in 2019 and was deeply fascinated when GPT-2 released in early 2020. Much time has passed since then, and AI has now moved far beyond the simple Transformer architecture, so this video works as a great refresher and a solid foundation for curious beginners.
Absolutely fantastic explanations here! This seems like a great introduction to your other videos on LLMs!
Already shared with my colleagues, many of whom are LLM-curious. Thank you for this accessible on-ramp!
What I find most useful as a beginner in anything is to understand what the thing I'm learning is buying me; for example, if someone had told me that learning the forms in calculus would save me literal days of algebraic nonsense and yield certain results in moments, I would have been MUCH more enthusiastic ;-)
So you're doing pretty well - I think people who basically just heard about it, wonder what it is, and have thinking minds got enough to determine its usefulness, not just in general but to them directly. That will get you more new subs than the ones you might lose who aren't really aligned with your content; beginners will always exist.
4:59 This staggering amount of computation is also only made possible by an equally staggering amount of power and water consumption. AI training at this scale is exacerbating climate change by rapidly increasing the amount of power big tech companies like Google are using.
You're the best!!! Your videos have taught me so much in the last couple of years. Don't stop doing what you do!!!
Beautifully written and concisely summarized!
6:01 "talk tuah one another". My brain is rotting, but great video
I recommend your videos to every maths and ML enthusiast whenever we talk about maths or ML ❤
The idea of attention is brilliant. It basically invents a language with little semantic ambiguity, so things "aren't lost in translation".
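For the curious, a minimal sketch of one attention step with toy sizes (real models use many heads and much larger dimensions; all the matrices here are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 5, 8
x = rng.normal(size=(seq_len, d))        # one vector per word

Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv

scores = Q @ K.T / np.sqrt(d)            # how relevant each word is to each other word
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the context
attended = weights @ V                   # context-refined vectors, same shape as x
```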
Always a good day when 3blue1brown uploads
Awesome! I've been trying to figure out how to help a friend watch your videos on machine learning and the like without getting lost in the math. You do a great job of explaining both the concepts and the math, and I would have said something like: ignore the heavy math, look at the animations, and listen to what Grant is saying about them. Then you made this video. Thank you very much.
Excellent work, as always. It would be fantastic if you could start incorporating audio tracks in various languages into your videos.
Thank you SO much for making a more concise video. I'm actually going to show this to my grandma next Sunday.
Hey 3B1B, I've recently become so engrossed by everything AI/machine learning that I've been devouring content like yours and Google's machine learning crash course. Thanks for all the amazing content! 😄
Bro just summarized weeks' worth of initial research as a graduate student in an 8-min video, love to see it
I literally started researching machine learning a week ago and tried to find LLM explanations, but found nothing animated or concrete, and then today you drop a banger answering this.
You know it's gonna be a beautiful day when Grant drops us a video out of nowhere
Absolutely - I own an AI company, talk about the tech, and talk about AI law. I often talk with folks who are quite smart but not trained in math. I had assembled a collection of your last videos as an intro, but this is the single best intro I've seen! Thank you!
That is extremely impressive for 7 minutes. Well done!
So relaxing to listen to whilst learning something fascinating ❤️
These deep learning videos are awesome; although they're less popular, please cover the ever-mystical Graph Neural Networks next.
Very useful, excellent visualizations, even better than the usual high standard. More beginner-friendly videos, please. Thank you!
This video, in my opinion, should be placed in the neural networks playlist right before the videos that start talking about the transformer architecture, with this video serving as an introduction and the others following with more detail.
OMG, I love that museum and I took my kid there earlier this year! Their favorite section was the old videogames that you could play lol
Absolutely stunning explanation 🔥
Really LOVE this series. Waiting for the Training and Inference topics to come.
😀
This is very well explained and extremely high quality content.
Okay, so you carried me through my master's degree exams; I'm writing my thesis rn. If I get this degree, I'll be a Patreon supporter 🤝
Thanks a lot.
This is a VERY good simple introduction to how LLMs work.
I would love to see more.
This is a great museum. It was so much fun to see computers I grew up with and the origins from many years before I was born.
Very close to Google HQ in Mountain View, CA, so if you are going to be in the area, check it out.
This deserves a thumbs up!
Honestly such an incredible explanation! 🔥
I once asked ChatGPT to write an essay describing the reception history of Mozart’s Symphony no. 25 in G minor, and it did a breathtakingly good job.
I started college last September (2023), and since that time, it is crazy how much AI has changed. I'm in electronics, and for complex questions, the earlier models could sometimes explain the solution after a few back-and-forths, but the math with actual numbers was horrendous.
With GPT-4o or whatever it's called, as an example: I can take a screenshot of a question that supplies a circuit diagram of an FM transmitter and is a 5-part question requiring lots of knowledge of FM circuitry as well as pretty complex formulas. It'll answer all 5 parts of the question with decent explanations, and I can easily get a deeper breakdown with another prompt.
I could probably pass my program without actually learning anything. The teachers are generally okay with us using it for assignments and whatnot, but still have strict exam policies. I don't think they fully understand how much it's improving every month. They tried it a while ago and went, "that's cool, but it still has issues."
Hey 3Blue1Brown -- Great video as always
What an amazing video! Fantastic explanation: clear, concise, and will stick in my memory!
Thanks for this explanation it solves one of my fundamental questions about AI and their way to operate.
Finally a good explanation of this. Thank you!
Super nice video for introducing non-technical people to LLMs!
I sent some of your more technical videos to some friends, but they were a bit put off by all the maths (which I really liked, by the way) and the length, so now I can send them this ;)
This was great! Really good intro
4:20
it's incredible they've been working on large language models for over a hundred million years
That line is a bit misleading. It's not that people have been working on them for that long, but rather that the total amount of computation used to train the models would take that long at the stated speed.
Blaze it
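As a back-of-envelope check on that line, with an assumed total-compute figure that is illustrative rather than the exact number for any particular model:

```python
# If one machine did a billion additions/multiplications per second,
# how long would the training computations take end to end?
total_ops = 3e24             # assumed total training operations (illustrative)
ops_per_second = 1e9         # one billion operations per second
seconds_per_year = 60 * 60 * 24 * 365
years = total_ops / ops_per_second / seconds_per_year
print(f"{years:.1e} years")  # ~9.5e7, i.e. on the order of 100 million years
```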
Great job. This is a simplified and effective way to introduce the idea. However, while LLMs predict the next word based on patterns they've learned, they do so by considering a broader context than just the immediate word before them. This means they use the surrounding text, and sometimes even long-range dependencies, to make better predictions. This is particularly important for tasks like maintaining coherence over longer paragraphs or handling more complex questions. (This explanation is from ChatGPT.)
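A toy contrast showing the difference this comment points at; both "models" below are hand-written stand-ins invented for the example, not real predictors:

```python
def predict_from_last_word(word):
    # A bigram-style guess sees only "the" and has no idea what follows.
    return "cat"

def predict_from_context(context):
    # Conditioning on the whole window picks up the long-range clue.
    return "bank" if "river" in context else "cat"

ctx = "They paddled down the river and pulled the canoe up onto the"
print(predict_from_last_word("the"))   # -> "cat" (generic, context-blind)
print(predict_from_context(ctx))       # -> "bank"
```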
bro is on the grind 🙏
Awesome!!!! Never thought I could visualize it.
Thank you so much for this video! I've always wanted a good introduction to show to other people that are curious about how it all works! (and finally I have it)
Fantastic! You RULE! Excellent explanation of the topic.
There are probably many reasons why this video is a good resource, but one that springs to mind for me is that it makes it quite clear why LLMs are now consuming so much energy in the world: brute force has a high cost.
Superb work. Another really impressive video.
Simply incredible quality work 👌