State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490

4h 25m 4s · 47,136 words · 4,088 segments · English

FULL TRANSCRIPT

0:00

- The following is a conversation all about the state-of-the-art in artificial

0:04

intelligence, including some of the exciting technical breakthroughs and

0:07

developments in AI that happened over the past year, and

0:11

some of the interesting things we think might happen this upcoming

0:15

year. At times, it does get super technical,

0:19

but we do try to make sure that it remains accessible to folks

0:23

outside the field without ever dumbing it down. It

0:27

is a great honor and pleasure to be able to do this kind of

0:31

episode with two of my favorite people in the AI

0:34

community, Sebastian Raschka and Nathan

0:38

Lambert. They are both widely respected machine

0:41

learning researchers and engineers who also happen to be great

0:45

communicators, educators, writers, and X posters.

0:49

Sebastian is the author of two books

0:53

I highly recommend for beginners and experts alike. First is

0:57

Build a Large Language Model from Scratch

1:01

and Build a Reasoning Model from Scratch. I

1:05

truly believe in the machine learning world, the

1:09

best way to learn and understand something is to build it

1:13

yourself from scratch. Nathan is

1:17

the post-training lead at the Allen Institute for AI,

1:21

author of the definitive book on Reinforcement Learning from Human Feedback.

1:26

Both of them have great X accounts, great Substacks.

1:30

Sebastian has courses on YouTube, Nathan has a podcast.

1:34

And everyone should absolutely follow all of those.

1:37

This is the Lex Fridman podcast. To support it, please

1:41

check out our sponsors in the description, where you can also find

1:45

links to contact me, ask questions, get feedback, and so

1:49

on. And now, dear friends, here's Sebastian Raschka and Nathan Lambert.

1:57

So I think one useful lens to look at all this through is

2:01

the so-called DeepSeek moment. This happened about

2:05

a year ago in January 2025, when the open-weight Chinese

2:09

company DeepSeek released DeepSeek R1, that I

2:13

think it's fair to say surprised everyone with near-state-of-the-art

2:16

performance, with allegedly much less compute for much cheaper. And from then

2:24

to today, the AI competition has gotten insane,

2:28

both on the research and product level. It's just been accelerating.

2:32

We'll discuss all of this today, and maybe let's start with some spicy

2:36

questions if we can.

2:38

Who's winning at the international level? Would you say it's the set

2:42

of companies in China or the set of companies in the United States?

2:46

And Sebastian, Nathan, it's good to see you guys.

2:50

So Sebastian, who do you think is winning?

2:53

- Winning is a very broad term.

2:57

I would say you mentioned the DeepSeek moment, and I think DeepSeek is winning

3:01

the hearts of the people who work on open-weight models because they share

3:05

these as open models. Winning, I think, has multiple

3:09

timescales to it. We have today, we have next year, we have in 10

3:12

years. One thing I know for sure is that I don't

3:16

think nowadays, in 2026, that there will be any

3:20

company that has access to technology that no other

3:24

company has access to. That is mainly because researchers

3:28

are frequently changing jobs and labs.

3:32

They rotate. I don't think there will be a clear winner in terms of

3:36

technology access. However, I do think

3:39

the differentiating factor will be budget and hardware constraints.

3:43

I don't think the ideas will be proprietary,

3:46

but rather the resources needed to implement them. I don't see

3:53

currently a winner-take-all scenario. I can't see that at the moment.

3:59

- Nathan, what do you think?

4:00

- You see the labs put different energy into what they're trying to do, and

4:04

I think to demarcate the point in time when we're recording this, the hype

4:08

over Anthropic's Claude Opus 4.5 model has been

4:12

absolutely insane, which is just... I mean, I've used it and built stuff

4:16

in the last few weeks, and it's... it's almost gotten to the point where it feels like a bit of

4:20

a meme in terms of the hype. And it's

4:22

kind of funny because this is very organic, and then if we go back a few months

4:26

ago, we can see the release date and the notes, as Gemini 3 from Google got

4:30

released, and it seemed like the

4:33

marketing and just, like, wow factor of that release was super

4:36

high. But then at the end of November, Claude Opus 4.5 was released and

4:40

the hype has been growing, but Gemini 3 was before this. And it kind of feels

4:44

like people don't really talk about it as much, even though when it came out, everybody was like, this

4:48

is Gemini's moment to retake Google's

4:52

structural advantages in AI. And Gemini 3 is a fantastic model, and I still use it.

4:56

It's just kind of differentiation is lower. And I

5:00

agree with Sebastian; what you're saying with all these, the idea space is

5:04

very fluid, but culturally Anthropic is known for betting very

5:08

hard on code, which is the Claude Code thing, is working out for them right now. So I

5:12

think that even if the ideas flow pretty freely, so much of this is

5:16

bottlenecked by human effort and the culture of organizations, where Anthropic

5:20

seems to at least be presenting as the least chaotic. It's a

5:24

bit of an advantage, if they can keep doing that for a while. But on the other

5:28

side of things, there's a lot of ominous technology from China where

5:32

there's way more labs than DeepSeek. So DeepSeek kicked off

5:35

a movement within China, I say kind of similar to how

5:39

ChatGPT kicked off a movement in the US where everything had a chatbot. There's now

5:44

tons of tech companies in China that are releasing very strong frontier open-weight

5:48

models, to the point where I would say that DeepSeek is kind of losing its crown as the

5:52

preeminent open model maker in China, and the likes of

5:56

Z.ai with their GLM models, MiniMax's models,

6:00

Kimi Moonshot, especially in the last few months, have shone more

6:04

brightly. The new DeepSeek models are still very strong, but that's kind of

6:08

a... it could look back as a big narrative point where in 2025

6:11

DeepSeek came and it provided this platform for way more Chinese

6:15

companies that are releasing these fantastic models to kind of have this new

6:19

type of operation. So these models from these Chinese companies are open-weights, and

6:24

depending on this trajectory of business models that these American companies are

6:27

doing, they could be at risk. But currently, a lot of people are paying

6:31

for AI software in the US, and historically in China and other

6:35

parts of the world, people don't pay a lot for software.

6:37

- So some of these models like DeepSeek have the love of the people because

6:41

they are open-weight. How long do you think the Chinese companies keep

6:45

releasing open-weight models?

6:47

- I would say for a few years. I think that, like in the US, there's not a

6:51

clear business model for it. I have been writing about open models for a while,

6:55

and these Chinese companies have realized it. So I get inbound from some of them.

6:59

And they're smart and realize the same constraints: a lot of top US tech

7:02

companies and other IT companies won't pay for an API subscription to

7:07

Chinese companies for security concerns. This has been a long-standing

7:10

habit in tech, and the people at these companies then see open

7:14

weight models as an ability to influence and take part of a huge growing

7:18

AI expenditure market in the US. And they're very realistic about this,

7:23

and it's working for them. I think that the government will see that that is

7:27

building a lot of influence internationally in terms of uptake of the technology,

7:31

so there's going to be a lot of incentives to keep it going. But building

7:35

these models and doing the research is very expensive, so at some point, I expect

7:39

consolidation. But I don't expect that to be a story of 2026, where there will be

7:44

more open model builders throughout 2026 than there were in 2025. And a

7:48

lot of the notable ones will be in China.

7:50

- You were going to say something?

7:51

- Yes. You mentioned DeepSeek losing its crown. I do think to some extent, yes, but

7:58

we also have to consider though, they are still, I would say, slightly ahead. And

8:02

the other ones—it's not that DeepSeek got worse, it's just that the other ones

8:06

are using the ideas from DeepSeek. For example, you mentioned Kimi—same

8:10

architecture, they're training it. And then again, we have this leapfrogging

8:13

where they might be at some point in time a bit better because they have the more recent

8:17

model. And I think this comes back to the fact that there won't be

8:21

a clear winner. It will just be like that: one person releases

8:25

something, the other one comes in, and the most recent model is probably always the

8:29

best model.

8:30

- Yeah. We'll also see the Chinese companies have different incentives. Like,

8:33

DeepSeek is very secretive, whereas some of these startups are

8:37

like the MiniMaxs and Z.ais of the world. Those two literally have filed

8:41

IPO paperwork, and they're trying to get Western

8:45

mindshare and do a lot of outreach there. So I don't know if these incentives will change the

8:49

model development, because DeepSeek famously is built by a hedge fund,

8:53

High-Flyer Capital, and we don't know exactly what they use the

8:57

models for or if they care about this.

8:59

- They're secretive in terms of communication; they're not secretive in terms of the technical reports that

9:03

describe how their models work. They're still open on that front. And we should also

9:06

say, on the Claude Opus 4.5 hype, there's the layer of something

9:13

being the darling of the X echo chamber, on the

9:18

Twitter echo chamber, and the actual amount of people that are using the

9:22

model. I think it's probably fair to say that ChatGPT and

9:25

Gemini are focused on the broad user base that just

9:29

want to solve problems in their daily lives, and that user base

9:33

is gigantic. So the hype about the coding may not be

9:37

representative of the actual use.

9:38

- I would say also a lot of the usage patterns are,

9:43

like you said, name recognition, brand and stuff, but also

9:47

muscle memory almost, where, you know, ChatGPT has been around

9:50

for a long time. People just got used to using it, and it's almost like a flywheel:

9:54

they recommend it to other users and that stuff. One interesting point is also

9:58

the customization of LLMs. For example, ChatGPT has a

10:02

memory feature, right? And so you may have a subscription and you

10:06

use it for personal stuff, but I don't know if you want to use that same thing at work.

10:10

Because it's a boundary between private and work. If you're working at a company, they might not

10:14

allow that or you may not want that. And I think that's also an interesting point

10:18

where you might have multiple subscriptions. One is just clean code.

10:22

It has nothing of your personal images or hobby

10:26

projects in there. It's just like the work thing. And then the other one is your personal thing.

10:30

So I think that's also something where there are two different use cases, and it doesn't mean

10:34

you only have to have one. I think the future is also multiple ones.

10:38

- What model do you think won 2025, and what model do you think is going to win '26?

10:43

- I think in the context of consumer chatbots, it's a question of: are you willing to

10:47

bet on Gemini over ChatGPT?

10:50

Which I would say, in my gut, feels like a bit of a risky bet

10:54

because OpenAI has been the incumbent, and there are so many benefits to that in tech.

10:58

I think the

11:01

momentum, if you look at 2025, was on Gemini's side, but they were starting from

11:05

such a low point. And RIP Bard and these earlier attempts at getting started.

11:13

Huge credit to them for powering through the organizational chaos to make that happen.

11:17

But also it's hard to bet against OpenAI because they always come off as

11:22

so chaotic, but they're very good at landing things. And I think,

11:26

personally, I have very mixed reviews of GPT-5, but it must have

11:30

saved them so much money with the headline feature being a router where

11:34

most users are no longer charging their GPU costs as much.

11:38

So I think it's very hard to dissociate

11:42

the things that I like out of models versus the things that are going to

11:45

actually be a general public differentiator.

11:50

- What do you think about 2026? Who's going to win?

11:52

- I'll say something, even though it's risky. I think Gemini will continue to make progress on ChatGPT.

11:56

I think Google's scale, when both of these are

11:59

operating at such extreme scales—and Google has the

12:03

ability to separate research and product a bit better, whereas you hear so much

12:07

about OpenAI being chaotic operationally and chasing the high-impact thing,

12:11

which is a very startup culture. And then on the software and enterprise side,

12:15

I think Anthropic will have continued success, as they've again and again been set up for that.

12:19

And obviously Google Cloud has a lot of offerings,

12:23

but I think this kind of Gemini name brand is important for them to build.

12:27

Google Cloud will continue to do well, but

12:31

that's a more complex thing to explain in the

12:34

ecosystem, because that's competing with the likes of Azure and AWS rather than

12:38

on the model provider side.

12:40

- So in infrastructure, you think TPU is giving an advantage?

12:45

- Largely because the margin on NVIDIA chips is insane, and

12:49

Google can develop everything from top to bottom to fit their stack and not have

12:53

to pay this margin. And they've had a head start in building data

12:57

centers. So all of these things that have both high lead times and very hard margins on
