Andrej Karpathy: From Vibe Coding to Agentic Engineering
FULL TRANSCRIPT
We're so excited for our very first
special guest. He has helped build
modern AI, then explain modern AI, and
then occasionally rename modern AI. He
actually helped co-found OpenAI right
inside of this office. He was the one who
actually got Autopilot working at Tesla
back in the day, and he has a rare gift
of making the most complex technical
shifts feel both accessible and
inevitable.
You all know him for having coined the
term vibe coding last year, but just in
the last few months, he said something
even more startling. That he's never
felt more behind as a programmer. That's
where we're starting today. Thank you,
Andrej, for joining us.
>> Yeah. Hello. Excited to be here and to
kick us off.
>> Okay. So, just a couple months ago, you
said that you've never felt more behind
as a programmer. That's startling to
hear from you of all people. Can you
help us unpack that? Was that feeling
exhilarating or unsettling?
>> Yeah, a mixture of both for sure.
Well, first of all, I guess like many
of you, I've been using agentic tools
like Claude Code and adjacent things for
a while, maybe over the last year as
they came out. They were very good at
chunks of code, and sometimes they would
mess up and you'd have to edit them, and
it was kind of helpful. Then I would say
December was this clear point where, for
me, I was on a break, so I had a bit
more time. I think many other people
were similar. I just started to notice
that with the latest models the chunks
just came out fine, and then I kept
asking for more and it just came out
fine, and then I can't remember the last
time I corrected it. I just trusted the
system more and more, and then I was
vibe coding.
[laughter]
So I do think that it was a very stark
transition. I tried to stress this on
Twitter, or X, because I think a lot of
people experienced AI last year as a
ChatGPT-adjacent thing. But you really
had to look again, and you had to look
as of December, because things have
changed fundamentally, especially on
this agentic, coherent workflow that
really started to actually work. So I
would say it was just that realization
that really had me go down the whole
rabbit hole of infinite side projects.
My side projects folder is extremely
full with lots of random things, and
I'm just vibe coding all the time. So
yeah, that kind of happened in December,
I would say, and I've been looking at
the repercussions of that since.
>> You've talked a lot about this idea
of LLMs as a new computer: that it
isn't just better software, it's a whole
new computing paradigm. Software 1.0 was
explicit rules, Software 2.0 was learned
weights, Software 3.0 is this. If that's
actually true, what does a team build
differently the day they actually
believe this?
>> Right. So, yeah, exactly. Software
1.0, I'm writing code. Software 2.0, I'm
actually programming by creating data
sets and training neural networks, so
the programming is kind of like
arranging data sets and maybe some
objectives and neural network
architectures. And then what happened is
that if you train one of these GPT
models or LLMs on a sufficiently large
set of tasks, implicitly, because by
training on the internet you have to
multitask all the things that are in the
data set, these actually become kind of
like a programmable computer in a
certain sense. So Software 3.0 now is
about how your programming turns to
prompting, and what's in the context
window is your lever over the
interpreter that is the LLM, which is
interpreting your context and performing
computation in the digital information
space. So I guess that's kind of the
transition, and I think there are a few
examples that really drove it home for
me, and maybe they might be instructive.
So, for
example, when OpenClaw came out: when
you want to install OpenClaw, you would
normally expect a bash script, a shell
script, so you run the shell script to
install OpenClaw. But in order to target
lots of different platforms and lots of
different types of computers you might
run OpenClaw on, these shell scripts
usually balloon up and become extremely
complex. And the thing is, you're still
stuck in a Software 1.0 universe of
wanting to write the code. Actually, the
OpenClaw installation is a copy-paste of
a bunch of text that you're supposed to
give to your agent. So basically it's a
little skill: copy-paste this, give it
to your agent, and it will install
OpenClaw. The reason this is a lot more
powerful is that you're now working in
the Software 3.0 paradigm, where you
don't have to precisely spell out all
the individual details of that setup.
The agent has its own intelligence that
it packages up, and it follows the
instructions, looks at your environment,
your computer, and performs intelligent
actions to make things work, and it
debugs things in the loop. It's just so
much more powerful, right? So I think
that's a very different way of thinking
about it: what is the piece of text to
copy-paste to your agent? That's the
programming paradigm. Now, I think one
more example that comes to mind, one
that is even more extreme, is when I was
building MenuGen. MenuGen is this idea
where you come to a restaurant and they
give you a menu. There are usually no
pictures, so I don't know what any of
these things are. Usually, like, 30% of
the things I have no idea what they are,
50%. So I wanted to take a photo of the
restaurant menu and get pictures of what
those things might look like in a
generic sense. So I vibe coded this app
that basically lets you upload a photo
and it does all this stuff. It runs on
Vercel, and it basically re-renders the
menu and gives you all the items with a
picture for each: it OCRs all the
different titles, uses an image
generator to get pictures of them, and
then shows it to you. And then I saw
the Software 3.0
version of this, which blew my mind,
which is literally just: take your
photo, give it to Gemini, and say, use
Nano Banana to overlay the things onto
the menu. And Nano Banana basically
returned an image that is exactly the
picture of the menu that I took, but it
actually rendered the different things
into the pixels of the menu. This blew
my mind, because actually all of my
MenuGen is spurious. It's working in the
old paradigm; that app shouldn't exist.
And, yeah, the Software 3.0 paradigm is
a lot more raw. Your neural network is
doing more and more of the work: your
prompt or context is just the image, the
output is an image, and there's no need
to have any of the app in between. So I
think people have to reframe: not to
work in the existing paradigm of what
things existed and just think about it
as a speed-up of what exists. New things
are actually available now. And going
back to your programming question, I
think that's also an example of working
in the old mindset, because it's not
just about programming and programming
becoming
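[Illustrative sketch, not from the talk: the Software 1.0 / 2.0 / 3.0 distinction described in the answer above could be pictured on one toy task, deciding whether a short review is positive. Every name here (classify_v1, train_v2, classify_v3, fake_llm, the word lists) is hypothetical, and fake_llm is an offline stand-in for a real model call, assumed only so the sketch runs.]

```python
from typing import Callable

# Hypothetical word lists, chosen only for this illustration.
POSITIVE_WORDS = {"good", "great", "love"}
NEGATIVE_WORDS = {"bad", "awful", "hate"}


def classify_v1(text: str) -> bool:
    """Software 1.0: a human writes the rules explicitly."""
    words = text.lower().split()
    score = sum(w in POSITIVE_WORDS for w in words)
    score -= sum(w in NEGATIVE_WORDS for w in words)
    return score > 0


def train_v2(examples: list[tuple[str, bool]], epochs: int = 10) -> dict[str, float]:
    """Software 2.0: the 'program' is weights learned from a data set.

    A tiny perceptron over word counts; the human only arranges the data.
    """
    weights: dict[str, float] = {}
    for _ in range(epochs):
        for text, label in examples:
            words = text.lower().split()
            predicted = sum(weights.get(w, 0.0) for w in words) > 0
            if predicted != label:
                step = 1.0 if label else -1.0
                for w in words:
                    weights[w] = weights.get(w, 0.0) + step
    return weights


def classify_v2(weights: dict[str, float], text: str) -> bool:
    """Run the learned weights on new text."""
    return sum(weights.get(w, 0.0) for w in text.lower().split()) > 0


def classify_v3(llm: Callable[[str], str], text: str) -> bool:
    """Software 3.0: the 'program' is the text placed in the context window."""
    prompt = f"Answer yes or no. Is this review positive?\n\n{text}"
    return llm(prompt).strip().lower().startswith("yes")


def fake_llm(prompt: str) -> str:
    """Offline stand-in for a real model call (an assumption of this sketch)."""
    return "yes" if "love" in prompt.lower() else "no"


if __name__ == "__main__":
    weights = train_v2([("love it", True), ("hate it", False)])
    print(classify_v1("I love this great phone"))
    print(classify_v2(weights, "love this"))
    print(classify_v3(fake_llm, "I love it"))
```

[In this sketch the lever moves from the rule code in classify_v1, to the training examples passed to train_v2, to the prompt string in classify_v3, which mirrors the shift from writing code, to arranging data sets, to prompting.]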