
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490

4h 25m 4s | 47,136 words | 4,088 segments | English

Full transcript

0:00

- The following is a conversation all about the state-of-the-art in artificial

0:04

intelligence, including some of the exciting technical breakthroughs and

0:07

developments in AI that happened over the past year, and

0:11

some of the interesting things we think might happen this upcoming

0:15

year. At times, it does get super technical,

0:19

but we do try to make sure that it remains accessible to folks

0:23

outside the field without ever dumbing it down. It

0:27

is a great honor and pleasure to be able to do this kind of

0:31

episode with two of my favorite people in the AI

0:34

community, Sebastian Raschka and Nathan

0:38

Lambert. They are both widely respected machine

0:41

learning researchers and engineers who also happen to be great

0:45

communicators, educators, writers, and X posters.

0:49

Sebastian is the author of two books

0:53

I highly recommend for beginners and experts alike. First is

0:57

Build a Large Language Model from Scratch

1:01

and Build a Reasoning Model from Scratch. I

1:05

truly believe in the machine learning world, the

1:09

best way to learn and understand something is to build it

1:13

yourself from scratch. Nathan is

1:17

the post-training lead at the Allen Institute for AI,

1:21

author of the definitive book on Reinforcement Learning from Human Feedback.

1:26

Both of them have great X accounts, great Substacks.

1:30

Sebastian has courses on YouTube, Nathan has a podcast.

1:34

And everyone should absolutely follow all of those.

1:37

This is the Lex Fridman podcast. To support it, please

1:41

check out our sponsors in the description, where you can also find

1:45

links to contact me, ask questions, get feedback, and so

1:49

on. And now, dear friends, here's Sebastian Raschka and Nathan Lambert.

1:57

So I think one useful lens to look at all this through is

2:01

the so-called DeepSeek moment. This happened about

2:05

a year ago in January 2025, when the open-weight Chinese

2:09

company DeepSeek released DeepSeek R1, that I

2:13

think it's fair to say surprised everyone with near-state-of-the-art

2:16

performance, with allegedly much less compute for much cheaper. And from then

2:24

to today, the AI competition has gotten insane,

2:28

both on the research and product level. It's just been accelerating.

2:32

We'll discuss all of this today, and maybe let's start with some spicy

2:36

questions if we can.

2:38

Who's winning at the international level? Would you say it's the set

2:42

of companies in China or the set of companies in the United States?

2:46

And Sebastian, Nathan, it's good to see you guys.

2:50

So Sebastian, who do you think is winning?

2:53

- Winning is a very broad term.

2:57

I would say you mentioned the DeepSeek moment, and I think DeepSeek is winning

3:01

the hearts of the people who work on open-weight models because they share

3:05

these as open models. Winning, I think, has multiple

3:09

timescales to it. We have today, we have next year, we have in 10

3:12

years. One thing I know for sure is that I don't

3:16

think nowadays, in 2026, that there will be any

3:20

company that has access to technology that no other

3:24

company has access to. That is mainly because researchers

3:28

are frequently changing jobs and labs.

3:32

They rotate. I don't think there will be a clear winner in terms of

3:36

technology access. However, I do think

3:39

the differentiating factor will be budget and hardware constraints.

3:43

I don't think the ideas will be proprietary,

3:46

but rather the resources needed to implement them. I don't see

3:53

currently a winner-take-all scenario. I can't see that at the moment.

3:59

- Nathan, what do you think?

4:00

- You see the labs put different energy into what they're trying to do, and

4:04

I think to demarcate the point in time when we're recording this, the hype

4:08

over Anthropic's Claude Opus 4.5 model has been

4:12

absolutely insane, which is just... I mean, I've used it and built stuff

4:16

in the last few weeks, and it's... it's almost gotten to the point where it feels like a bit of

4:20

a meme in terms of the hype. And it's

4:22

kind of funny because this is very organic, and then if we go back a few months

4:26

ago, we can see the release date and the notes, as Gemini 3 from Google got

4:30

released, and it seemed like the

4:33

marketing and just, like, wow factor of that release was super

4:36

high. But then at the end of November, Claude Opus 4.5 was released and

4:40

the hype has been growing, but Gemini 3 was before this. And it kind of feels

4:44

like people don't really talk about it as much, even though when it came out, everybody was like, this

4:48

is Gemini's moment to retake Google's

4:52

structural advantages in AI. And Gemini 3 is a fantastic model, and I still use it.

4:56

It's just that the differentiation is lower. And I

5:00

agree with Sebastian; what you're saying with all these, the idea space is

5:04

very fluid, but culturally Anthropic is known for betting very

5:08

hard on code, which is the Claude Code thing, and it's working out for them right now. So I

5:12

think that even if the ideas flow pretty freely, so much of this is

5:16

bottlenecked by human effort and the culture of organizations, where Anthropic

5:20

seems to at least be presenting as the least chaotic. It's a

5:24

bit of an advantage, if they can keep doing that for a while. But on the other

5:28

side of things, there's a lot of ominous technology from China where

5:32

there's way more labs than DeepSeek. So DeepSeek kicked off

5:35

a movement within China, I'd say kind of similar to how

5:39

ChatGPT kicked off a movement in the US where everything had a chatbot. There's now

5:44

tons of tech companies in China that are releasing very strong frontier open-weight

5:48

models, to the point where I would say that DeepSeek is kind of losing its crown as the

5:52

preeminent open model maker in China, and the likes of

5:56

Z.ai with their GLM models, MiniMax's models,

6:00

Kimi Moonshot, especially in the last few months, have shone more

6:04

brightly. The new DeepSeek models are still very strong, but that's kind of

6:08

a... we could look back on it as a big narrative point where in 2025

6:11

DeepSeek came and it provided this platform for way more Chinese

6:15

companies that are releasing these fantastic models to kind of have this new

6:19

type of operation. So these models from these Chinese companies are open-weights, and

6:24

depending on this trajectory of business models that these American companies are

6:27

doing, they could be at risk. But currently, a lot of people are paying

6:31

for AI software in the US, and historically in China and other

6:35

parts of the world, people don't pay a lot for software.

6:37

- So some of these models like DeepSeek have the love of the people because

6:41

they are open-weight. How long do you think the Chinese companies keep

6:45

releasing open-weight models?

6:47

- I would say for a few years. I think that, like in the US, there's not a

6:51

clear business model for it. I have been writing about open models for a while,

6:55

and these Chinese companies have realized it. So I get inbound from some of them.

6:59

And they're smart and realize the same constraints: a lot of top US tech

7:02

companies and other IT companies won't pay for an API subscription to

7:07

Chinese companies for security concerns. This has been a long-standing

7:10

habit in tech, and the people at these companies then see

7:14

open-weight models as a way to gain influence and take part in a huge growing

7:18

AI expenditure market in the US. And they're very realistic about this,

7:23

and it's working for them. I think that the government will see that that is

7:27

building a lot of influence internationally in terms of uptake of the technology,

7:31

so there's going to be a lot of incentives to keep it going. But building

7:35

these models and doing the research is very expensive, so at some point, I expect

7:39

consolidation. But I don't expect that to be a story of 2026, where there will be

7:44

more open model builders throughout 2026 than there were in 2025. And a

7:48

lot of the notable ones will be in China.

7:50

- You were going to say something?

7:51

- Yes. You mentioned DeepSeek losing its crown. I do think to some extent, yes, but

7:58

we also have to consider though, they are still, I would say, slightly ahead. And

8:02

the other ones—it's not that DeepSeek got worse, it's just that the other ones

8:06

are using the ideas from DeepSeek. For example, you mentioned Kimi—same

8:10

architecture, they're training it. And then again, we have this leapfrogging

8:13

where they might be at some point in time a bit better because they have the more recent

8:17

model. And I think this comes back to the fact that there won't be

8:21

a clear winner. It will just be like that: one person releases

8:25

something, the other one comes in, and the most recent model is probably always the

8:29

best model.

8:30

- Yeah. We'll also see the Chinese companies have different incentives. Like,

8:33

DeepSeek is very secretive, whereas some of these startups are

8:37

like the MiniMaxes and Z.ais of the world. Those two literally have filed

8:41

IPO paperwork, and they're trying to get Western

8:45

mindshare and do a lot of outreach there. So I don't know if these incentives will change the

8:49

model development, because DeepSeek famously is built by a hedge fund,

8:53

High-Flyer Capital, and we don't know exactly what they use the

8:57

models for or if they care about this.

8:59

- They're secretive in terms of communication; they're not secretive in terms of the technical reports that

9:03

describe how their models work. They're still open on that front. And we should also

9:06

say, on the Claude Opus 4.5 hype, there's the layer of something

9:13

being the darling of the X echo chamber, on the

9:18

Twitter echo chamber, and the actual amount of people that are using the

9:22

model. I think it's probably fair to say that ChatGPT and

9:25

Gemini are focused on the broad user base that just

9:29

want to solve problems in their daily lives, and that user base

9:33

is gigantic. So the hype about the coding may not be

9:37

representative of the actual use.

9:38

- I would say also a lot of the usage patterns are,

9:43

like you said, name recognition, brand and stuff, but also

9:47

muscle memory almost, where, you know, ChatGPT has been around

9:50

for a long time. People just got used to using it, and it's almost like a flywheel:

9:54

they recommend it to other users and that stuff. One interesting point is also

9:58

the customization of LLMs. For example, ChatGPT has a

10:02

memory feature, right? And so you may have a subscription and you

10:06

use it for personal stuff, but I don't know if you want to use that same thing at work.

10:10

Because it's a boundary between private and work. If you're working at a company, they might not

10:14

allow that or you may not want that. And I think that's also an interesting point

10:18

where you might have multiple subscriptions. One is just clean code.

10:22

It has nothing of your personal images or hobby

10:26

projects in there. It's just like the work thing. And then the other one is your personal thing.

10:30

So I think that's also something where there are two different use cases, and it doesn't mean

10:34

you only have to have one. I think the future is also multiple ones.

10:38

- What model do you think won 2025, and what model do you think is going to win '26?

10:43

- I think in the context of consumer chatbots, it's a question of: are you willing to

10:47

bet on Gemini over ChatGPT?

10:50

Which I would say, in my gut, feels like a bit of a risky bet

10:54

because OpenAI has been the incumbent, and there are so many benefits to that in tech.

10:58

I think the

11:01

momentum, if you look at 2025, was on Gemini's side, but they were starting from

11:05

such a low point. And RIP Bard and these earlier attempts at getting started.

11:13

Huge credit to them for powering through the organizational chaos to make that happen.

11:17

But also it's hard to bet against OpenAI because they always come off as

11:22

so chaotic, but they're very good at landing things. And I think,

11:26

personally, I have very mixed reviews of GPT-5, but it must have

11:30

saved them so much money with the headline feature being a router where

11:34

most users are no longer charging their GPU costs as much.

11:38

So I think it's very hard to dissociate

11:42

the things that I like out of models versus the things that are going to

11:45

actually be a general public differentiator.

11:50

- What do you think about 2026? Who's going to win?

11:52

- I'll say something, even though it's risky. I think Gemini will continue to make progress on ChatGPT.

11:56

I think Google's scale, when both of these are

11:59

operating at such extreme scales—and Google has the

12:03

ability to separate research and product a bit better, whereas you hear so much

12:07

about OpenAI being chaotic operationally and chasing the high-impact thing,

12:11

which is a very startup culture. And then on the software and enterprise side,

12:15

I think Anthropic will have continued success, as they've again and again been set up for that.

12:19

And obviously Google Cloud has a lot of offerings,

12:23

but I think this kind of Gemini name brand is important for them to build.

12:27

Google Cloud will continue to do well, but

12:31

that's a more complex thing to explain in the

12:34

ecosystem, because that's competing with the likes of Azure and AWS rather than

12:38

on the model provider side.

12:40

- So in infrastructure, you think TPU is giving an advantage?

12:45

- Largely because the margin on NVIDIA chips is insane, and

12:49

Google can develop everything from top to bottom to fit their stack and not have

12:53

to pay this margin. And they've had a head start in building data

12:57

centers. So all of these things that have both high lead times and very hard margins on
