MIT 6.S087: Foundation Models & Generative AI. ChatGPT & LLMs
FULL TRANSCRIPT
All right, welcome to the third lecture on Foundation Models & Generative AI. Today we're going to cover ChatGPT. For a lot of people, ChatGPT was the tool, the AI, that really made them understand that this is different now, that we're able to do things we weren't able to do before, and it definitely created some kind of hype. So hopefully after this lecture you'll understand the basic idea, and also understand the bet: the bet that OpenAI and its head researcher Ilya made about what would actually lead to ChatGPT. In hindsight it might seem easy, but it was a really daring bet, and it was not obvious at all at the time that it would actually work out.
It should be a lot of fun. Just to quickly go through our course schedule a little bit as well: today is January 16. Next time we'll talk about Stable Diffusion and image generation, then about emerging foundation models, basically foundation models and generative AI in the commercial space. We'll have two guest speakers, and then we'll end with a lecture on AI ethics and regulation as well as a
panel.

Okay, so what have we talked about before? We started off with an introduction: a short, high-level, intuitive answer to what foundation models and generative AI are. We went on a little philosophical digression and asked how the world is structured, because that allows us to think about how we should learn in it. Then, in the second lecture, we went through all the different algorithms. Today we'll dive more specifically into ChatGPT and pull everything together.
To reiterate: what do we do in foundation models and generative AI? We apply self-supervised learning, where we learn without labeled data. That means we can get as much data as we want, because there's no human being in the loop, so there's no limit to how much we can scale this up. And what we get from this, by learning from observation and from the data directly, is a very contextual and relational understanding of meaning.

We gave this example before: from a supervised learning perspective, you learn what a dog is from seeing labeled examples of dogs. In reinforcement learning, you focus on optimizing certain goals, and you understand a dog in relation to how it makes you happy or fulfilled in some sense, or how it helps you optimize your goals. But in self-supervised learning, which is the foundational technology behind foundation models, you learn from observing dogs in different contexts, and you get a very relational definition of a dog: it's something that's walked by an owner on a leash, it has an antagonistic relationship with cats, it chases frisbees. That is your definition of what a dog is.

Today we'll talk about something that's extremely engineering-heavy. ChatGPT relies on a lot of tricks, engineering insights, and breakthroughs that we're not going to cover. Still, it's like talking about a car: you can understand a car at a high level and get some insight into how it works and how it will be useful to you without getting into all the engineering details. Of course, in real life those engineering details really matter and are very hard to get right, and that's something we won't really dive into in this lecture. When you bring something up to a certain scale, you have to parallelize across a lot of machines, think about hyperparameters, and so on. It's a whole science, so it's not trivial at all, but it's hard to teach in a course like this; you have to learn it by actually building this stuff.
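This relational, contextual notion of meaning can be illustrated with a toy sketch. Here a word's "meaning" is just a co-occurrence count vector over a tiny made-up corpus; this is only an illustration of the idea, since real foundation models learn dense embeddings with neural networks rather than raw counts, and all the sentences below are invented for the example.

```python
# Toy sketch of contextual, relational meaning: words that appear in
# similar contexts end up with similar vectors. Illustrative only; real
# systems learn dense neural embeddings, not co-occurrence counts.
from collections import Counter

corpus = [
    "the dog walks on a leash with its owner",
    "the dog chases the frisbee in the park",
    "the cat hisses at the dog",
    "the puppy walks on a leash with its owner",
    "the puppy chases the frisbee in the park",
]

def context_vector(word, window=2):
    """Count which words co-occur with `word` within a small window."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok == word:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(t for t in tokens[lo:hi] if t != word)
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = lambda v: sum(x * x for x in v.values()) ** 0.5
    return dot / (norm(a) * norm(b))

dog, puppy, cat = (context_vector(w) for w in ("dog", "puppy", "cat"))
# "dog" and "puppy" occur in near-identical contexts, so their vectors
# come out more similar to each other than either is to "cat".
print(cosine(dog, puppy) > cosine(dog, cat))  # → True
```

Because "dog" and "puppy" show up in near-identical contexts, their vectors end up far more similar to each other than to "cat": that is the relational definition at work, learned purely from observation with no labels.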
Okay, so also a little bit of philosophizing in this class as well. Again, there's a theme here: the reason this new AI is so powerful is that it doesn't force things to comply with simple rules. It abandons our attempt to understand and compress what we're seeing and deals with that chaos directly. That's why this AI is so powerful and so human-like.

When I talk about ChatGPT I try to make very high-level statements, but of course the nuances matter. And I think this is quite interesting: I took a quote from a general from the 1700s and 1800s, who says, roughly, "pity the theory which sets itself in opposition to the mind." What he meant was this: he's a general, so he fights in battles and wars, and at the time people loved to come up with theories around war, rules for how soldiers should behave in a fight, and so on. But he's saying: well, I've been in war, and first off, wars don't comply with rules. Everybody has a plan until they get hit in the face, basically. Once people start shooting at you and you're in the fog of war, not knowing what's going on, there are no simple rules to help you. And regarding the mind, he says that by working with soldiers he realized that soldiers, and human beings generally, are not good at acting according to rules we try to memorize. We're very intuitive and very quick to react to things by intuition; that's what really matters and that's what we're strong at. So if you force a soldier to memorize a lot of rules and act by them in battle, you're kind of screwed and very limited in what you can do, which is also something that I think this new type of AI
leverages.

Okay, so ChatGPT. This is a really amazing breakthrough with a very human-like mastery of language. We can communicate with it, and it can solve a really wide array of tasks for us: anything that can be phrased in terms of text and language, it can basically solve. And now, with GPT-4 and so on, it's able to handle multiple modalities as well. It's extremely powerful. So let's try to break this apart.

First off, what does the name actually stand for? The "chat" part is obvious: it stands for chat. GPT stands for Generative Pre-trained Transformer, and this is actually a good description of what it is. And if you look at the three different concepts here, they roughly correspond to how important and influential each is in making ChatGPT work. The chat part we'll cover last; it's the least important one in some sense. The generative pre-training is the self-supervised step by which you train this and arrive at the model. And the Transformer is basically the engine behind it, in some sense.
So let's start with the generative pre-training. What does it mean? How do we pre-train this model? This is basically where OpenAI spent 99% of the compute, on this pre-training step, so it's very, very important.

Okay, so what we're going to do is just take some random text from the internet, so that we have a sequence of words, and then we're going to try to predict the next word based on the previous words. So let's say we start with "I" here as
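To make the pre-training objective concrete, here is a minimal sketch of next-word prediction. A simple bigram count model stands in for the actual Transformer, and the training text is made up for the example; the real model predicts from the whole preceding context with a neural network, not from just one previous word.

```python
# Toy sketch of the "generative pre-training" objective: take raw text
# and learn to predict the next word from what came before. A bigram
# count model stands in for the Transformer here; illustrative only.
from collections import Counter, defaultdict

text = "i love dogs . i love cats . i love walking my dog ."
tokens = text.split()

# "Training": for every position, record which word followed which.
next_word = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    next_word[prev][nxt] += 1

def predict(prev):
    """Predict the most frequently observed next word."""
    return next_word[prev].most_common(1)[0][0]

# After "i", the most frequently observed next word is "love".
print(predict("i"))  # → love
```

No human labels are needed anywhere in this loop: the text itself supplies both the input (the previous words) and the target (the next word), which is exactly why this step scales with as much raw internet text as you can collect.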