
MIT 6.S087: Foundation Models & Generative AI. AUTONOMY

27m 52s, 4,342 words, 639 segments, English

Full transcript

0:00

but uh tin flew in from Silicon Valley

0:02

he's going to be here in th as well for

0:04

B part but he's going to talk about

0:06

agents so please enjoy yep hi everyone

0:10

so I'm just going to quickly jump into it

0:13

right away imagine you want to research

0:15

a topic of the future of AI you want to

0:18

understand what's going to happen soon

0:20

in terms of AI space and so on and you

0:21

don't want to sit on a 1 hour and

0:23

30 minutes lecture um you don't want to

0:25

click on thousands of links you don't

0:27

want to research and spend long time

0:29

doing that you want to have a concise

0:31

report right away right there this is

0:34

possible now with autonomous agents it's

0:36

also possible now to basically send a

0:39

prompt to ChatGPT and get your favorite

0:42

pizza delivered from the favorite

0:44

restaurant um furthermore it is possible

0:47

now to execute on any almost any online

0:50

task like say for instance doing um a

0:54

California driving test online and you

0:57

can look at the sinister face of this

0:59

guy you can trust he did it himself so

1:02

yeah uh my name is Art I'm uh originally

1:06

I'm software engineer I was born and

1:08

raised in Ukraine I have been developing

1:11

AI products for the past more than a

1:13

decade I guess as every second Ukrainian

1:17

software engineer right now because of

1:18

the geopolitical situation I'm also an

1:20

ethical hacker and work a lot in cyber

1:23

security space um and I'm a serial

1:25

entrepreneur I came to MIT to do my MBA

1:28

degree but too many people asked me why

1:30

a software engineer would do an MBA degree and I

1:32

dropped out well not exactly because of

1:36

that I started my own company which is

1:38

called Kraken AGI and we are building

1:41

agents with the mission to drastically

1:43

elevate global digital reliability and

1:46

security we're basically building

1:48

autonomous AI agents for cyber security

1:50

and software development

1:52

space now today I want to quickly touch

1:55

on the topic of um terminology in this

1:58

field it's extremely confusion it's

2:01

extremely confusing and convoluted

2:02

unfortunately right now as in any new

2:05

field I also want to explore autonomous

2:07

AI agents from the perspective of

2:10

specifically AGI artificial general

2:12

intelligence and I want to expose you to

2:15

some um techniques and mechanics that

2:18

are being used in the industry to build

2:19

to build such agents right now so

2:21

that you maybe can go offline and

2:23

research more on these topics in terms

2:26

of

2:28

terminology GPT is a model GPT-1 GPT-2

2:32

GPT-3 GPT-4 ChatGPT is a SaaS

2:36

product GPTs are now agents or co-pilots

2:42

or assistants so the the terminology in

2:45

the industry is extremely extremely

2:46

confusing different things mean the same

2:50

um although named differently different

2:53

things that are named the same mean

2:56

different stuff so it's it's like super

2:59

confusing right right now and this is

3:00

fine you need to understand that this is

3:02

fine when you search something on the

3:03

topic of autonomous AI agents you will

3:05

see well in some paper neuro-symbolic

3:08

linking in another retrieval augmented

3:10

generation Google will tell you this is

3:12

called grounding some other folks would

3:14

use emotional grounding concept so this

3:16

is fine but let's touch on a couple of

3:20

key terms here so the agents have been

3:23

developing for quite some time um since

3:26

the advent of AI basically in during

3:28

this old um AI deep learning um hype um

3:33

this architecture emerged so it it it

3:36

was quite some time ago and basically

3:38

the main components of it is we have a

3:40

system that can autonomously perform it

3:43

has sensors it it has actuators it

3:46

interacts with an

3:48

environment post generative AI hype so

3:52

like a couple of years ago we got a more

3:55

simplified architecture and a more

3:57

simplified approach now actuators and

4:00

sensors are kind of called tools now in

4:03

this system we have either an LLM as a

4:06

core or a family of AI models basically

4:11

um providing reasoning capability for

4:13

the whole agent itself we give out a

4:16

task to an agent um that it needs to

4:19

achieve on a

4:20

goal another thing I want to define here

4:22

today as well is what is AGI what is

4:24

artificial general intelligence and I

4:27

think um Google has taken an approach to

4:30

that and and actually had very good

4:32

definition AGI is basically something

4:34

that can accomplish any task that human

4:37

can accomplish or can do or kind of can

4:40

interact and react in any environment

4:43

human can interact and

4:45

react now what is today's AI missing

4:50

to to to be AGI well except for this lame

4:54

joke about letter G um today's AI is

4:58

like a monk in a cave meditating so it

5:03

like the LLM or ChatGPT when you

5:06

interact with it it's a

5:08

snapshot of time and space and knowledge

5:12

it's very wise it has a lot of general

5:14

knowledge but it exists outside of time

5:17

outside of the

5:19

environment and basically you come up to

5:22

ChatGPT you bring a letter with the

5:24

task written on it ChatGPT gets this letter

5:27

reads it as a monk writes down the

5:29

result gives you back and keeps

5:33

meditating it also has quite a lot of

5:36

constraints specifically scalability

5:38

constraints one of the scalability

5:40

constraints is the fact that well if you

5:42

bring like a bunch of books to this monk

5:45

um it will just truncate half of the

5:48

information that has been provided and

5:49

will only work with information that

5:51

allows that its context window

5:55

allows another thing to look on the

5:59

current AI to kind of put in an

6:01

additional analogy what Rickard mentioned

6:04

on today's lecture about the foundation

6:06

models that the um current LLMs

6:09

foundation models are components of the

6:11

brain but not the whole brain our whole

6:14

brains are much more complicated it's a

6:16

family of AI intertwined together and

6:19

they're not the whole body the body has

6:22

the sensors interfaces to act to feel to

6:25

see to have vision and so on current llm

6:28

models on only have limited interfaces

6:31

to do

6:32

that now still considering this can we

6:35

achieve AGI-like capabilities

6:38

today and the answer is we kind of can

6:40

or we at least can push to AGI-like

6:43

capabilities and there are a couple of

6:45

approaches to that one of the approaches

6:47

is what Yann LeCun he's a chief scientist

6:49

at Meta proposes it's basically he

6:52

basically says hey LLMs are dumb they

6:55

can't execute on a bunch of tasks like

6:57

say planning they still hallucinate

7:00

so we need to build a completely new

7:02

architecture basically inherit our own

7:03

knowledge and build a completely new

7:06

approach completely new AI model

7:08

probably based on transformers or

7:10

something else and um or give the new AI

7:14

new architecture of this new

7:16

ability but I'm a software engineer I like

7:19

love to problem solve and in software

7:21

engineering industry we say

7:23

composability over inheritance so uh we

7:27

potentially still can try and use

7:29

something

7:30

that already exists in the field and

7:31

try to apply try to work around

7:34

limitation that exist there try to uh

7:36

create new techniques around it and

7:38

actually still try to push um the

7:41

existing LLMs towards AGI-like capabilities

7:44

only using what we have and that's

7:46

another approach that currently the

7:48

industry and most of the Silicon Valley

7:50

startups and companies are are taking

7:52

right now how to do that specifically

7:56

how can we build how can we push current

7:58

LLMs those monks sitting in the caves to

8:00

be more like AGI first we can give it we

8:03

can give the LLMs um ability to

8:07

self-reflect we as hum humans we are

8:10

thinking iteratively we don't have like

8:14

when we produce an idea our next thought

8:17

is being fed of the S thought that

8:19

happened a second ago so we iteratively

8:22

think about something until we come up

8:24

with the result we don't just go ahead

8:27

right away and produce the idea or or

8:29

suggestion or something this technique

8:32

of like thinking associatively in

8:34

iterations is called the chain of

8:37

thought and it's being applied in prompt

8:39

engineering you can apply it on your own
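
[Editor's sketch, not part of the talk] The iterative thinking described above, where each new thought is fed by the previous ones, might look roughly like the loop below. This is a minimal illustration under stated assumptions, not the speaker's implementation: `chain_of_thought_prompt`, `iterate_thoughts`, the `FINAL:` marker, and `toy_think` are all hypothetical names, and `toy_think` is a scripted stand-in for a real LLM call so the example runs offline.

```python
from typing import Callable, List

# Hypothetical signature: given the task and the thoughts so far, return the next thought.
ThinkFn = Callable[[str, List[str]], str]


def chain_of_thought_prompt(task: str, thoughts: List[str]) -> str:
    """Build a prompt that asks for step-by-step reasoning and replays earlier thoughts."""
    lines = [f"Task: {task}", "Think step by step."]
    lines += [f"Thought {i}: {t}" for i, t in enumerate(thoughts, start=1)]
    lines.append("Next thought (prefix the last one with FINAL:):")
    return "\n".join(lines)


def iterate_thoughts(task: str, think: ThinkFn, max_steps: int = 5) -> List[str]:
    """Feed every previous thought into the next step until a final answer or the step limit."""
    thoughts: List[str] = []
    for _ in range(max_steps):
        thought = think(task, thoughts)
        thoughts.append(thought)
        if thought.startswith("FINAL:"):  # assumed convention for "done"
            break
    return thoughts


def toy_think(task: str, thoughts: List[str]) -> str:
    """Scripted stand-in for a model; a real version would send
    chain_of_thought_prompt(task, thoughts) to an LLM and return its reply."""
    script = [
        "List what is already known about the task.",
        "Check the previous thought for gaps.",
        "FINAL: a concise report.",
    ]
    return script[min(len(thoughts), len(script) - 1)]


if __name__ == "__main__":
    for step in iterate_thoughts("research the future of AI", toy_think):
        print(step)
```

The step limit is a design choice in this sketch: without it, a model that never emits the final marker would loop forever.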
