MIT 6.S087: Foundation Models & Generative AI. BIOLOGY

37m 44s · 6,190 words · 938 segments · English

FULL TRANSCRIPT

0:01

all right well welcome to Manolis

0:05

he's a professor at MIT he's doing

0:08

amazing stuff in computational genomics

0:11

and AI and biology he's going to talk

0:13

about the AI frontiers here super

0:15

excited to have him so give him a warm

0:18

Applause

0:22

thank you awesome welcome everyone so

0:26

basically there's a lot going on in

0:28

biology and there's a lot going on in AI

0:30

and my goal is to tell you a little bit

0:32

about both and how the field is

0:35

dramatically changing so I'm going

0:38

to tell you primarily about health and

0:41

understanding biology and medicine who

0:42

wants to live forever here good good who

0:45

wants to live at least until

0:47

next year yeah so there's this joke

0:50

where it's like oh who wants to live

0:51

till 100 well the person who's 99 and

0:54

yeah we never actually want to die we

0:56

just don't necessarily want to live

0:57

forever but anyway so the goal is how

1:00

can we use AI to truly understand the

1:02

mechanism through which human biology

1:05

works and how we can use that to

1:08

basically develop new therapeutics that

1:10

put an end to disease as we know it

1:13

who's excited about that good so what's

1:17

our goal our goal is to understand

1:19

medicine and medicine has truly come

1:21

a long way so this is I was just giving

1:23

a talk in Athens last week that's when I

1:24

made this slide and this is how

1:27

medicine used to work so you would have

1:29

some type of it's unclear

1:32

whether here you're depicting a God or a

1:33

physician but you know even nowadays the

1:35

distinction is kind of blurred in many

1:37

ways and then they're doing this

1:39

magic and the patient is sort of

1:41

subjected to that and you know there's

1:42

some peer review committee apparently

1:44

but this is how it used to be done and

1:46

then eventually we started looking at

1:49

things closer and closer and this is

1:52

more than 100 years old the structure

1:55

of neurons inside our cortex and our

2:00

different regions of the brain we can

2:02

basically now see the fact that

2:05

we're based on smaller and smaller parts

2:08

and the first diagnosis of Alzheimer's

2:10

dates to 100 years ago from imaging

2:13

where we could actually see the plaques

2:15

and neurofibrillary tangles that are still

2:17

today the definition of Alzheimer's but

2:20

something dramatically changed in the

2:22

last few years and this something

2:25

is human genetics human genetics

2:28

basically tells us that there's

2:30

something playing a causal role here and

2:33

what that allows us to do is start going

2:34

beyond just correlation to causation and

2:38

the last part that changed is a lot of

2:40

our own work in being able to gather

2:42

massive massive amounts of data for

2:45

integration so you can basically think

2:47

of this as the next generation

2:48

microscope where instead of gathering

2:50

four cells at single cell resolution we

2:54

are gathering 2 million cells at single

2:57

cell resolution and instead of measuring

2:59

whatever we can stain and we can stain

3:01

about like 5 10 20 things at best we can

3:05

measure the expression of 20,000 genes

3:09

for every one of these dots okay so this

3:12

is a 20,000 dimensional space with 2

3:16

million cells projected down into

3:19

something that we humans can visualize

3:21

in 2D okay so what are the paradigm

3:24

shifts that are happening the first

3:25

paradigm shift is that we're going from

3:27

hypothesis driven to data driven instead

3:30

of just saying oh we have a very

3:31

specific hypothesis like gather a bunch

3:33

of data and then we're going to say yes

3:34

no answer we now have just massive data

3:37

we shoot first and we ask questions

3:39

later so we basically have systematic

3:41

data sets we're building resources

3:43

massive data sharing and really

3:45

comprehensive views of biology the second

3:48

as I mentioned we're going from

3:49

correlation to causation correlation means

3:51

the countries that eat more chocolate

3:53

also get more Nobel Prizes does the

3:55

chocolate lead to Nobel Prizes do they

3:57

just buy I don't know more chocolate

3:59

with the prizes is it correlation causation

4:02

or reverse causation so that's what

4:04

epidemiology has always been fuzzy about

4:07

whereas with genetics we actually now

4:09

understand mechanism we know that

4:10

if there's a genetic difference then

4:12

eventually you can establish causality
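
The contrast between correlation and genetics-based causality can be sketched with a small simulation. This is only an illustration under stated assumptions: the talk does not name a method, so the sketch uses the standard instrumental-variable idea (a Wald ratio, often called Mendelian randomization), and every variable name, effect size, and allele frequency below is hypothetical. A hidden confounder inflates the naive exposure-outcome slope, while a genetic variant that influences only the exposure gives an estimate close to the true effect.

```python
import random

def cov(xs, ys):
    """Sample covariance of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

random.seed(1)
TRUE_EFFECT = 0.5   # assumed causal effect of exposure on outcome
N_PEOPLE = 20000    # hypothetical cohort size

genotype, exposure, outcome = [], [], []
for _ in range(N_PEOPLE):
    # Number of copies (0, 1, or 2) of a hypothetical variant, allele frequency 0.3.
    g = int(random.random() < 0.3) + int(random.random() < 0.3)
    u = random.gauss(0, 1)                       # hidden confounder (e.g. lifestyle)
    x = 0.8 * g + 1.0 * u + random.gauss(0, 1)   # exposure depends on gene and confounder
    y = TRUE_EFFECT * x + 1.5 * u + random.gauss(0, 1)  # outcome depends on exposure and confounder
    genotype.append(g)
    exposure.append(x)
    outcome.append(y)

# Naive regression slope: mixes the causal effect with the confounder's influence.
naive_slope = cov(exposure, outcome) / cov(exposure, exposure)

# Wald ratio: the genotype is independent of the confounder in this simulation,
# so the ratio of its covariances isolates the causal effect.
iv_estimate = cov(genotype, outcome) / cov(genotype, exposure)

print(f"true effect:    {TRUE_EFFECT}")
print(f"naive slope:    {naive_slope:.2f}")
print(f"genetic (IV):   {iv_estimate:.2f}")
```

The estimate is only trustworthy here because the simulated variant affects the outcome solely through the exposure; real analyses have to argue for that assumption.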

4:16

and then the last step which is the most

4:18

relevant to this class is we're going

4:20

from classical data analysis where there

4:23

was a different methodology for every

4:25

problem we would just like come up with

4:27

a question develop a statistical test

4:28

and answer it and the humans did all

4:30

the thinking and where the goal

4:33

was to develop very few parameters and

4:35

very targeted models to understand so

4:37

that we're not overfitting the data

4:39

whereas now we're basically saying

4:40

billion parameters no problem bring it

4:42

on and we're building these foundation

4:45

models that are very often multimodal

4:48

where we're learning representations

4:50

we're learning hierarchical deep

4:52

representations and we're truly

4:54

understanding concepts and yielding

4:56

insights everybody with me so far so

4:58

these are the major shifts
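
As a rough illustration of learning a low-dimensional representation, such as the 2D projection of a cells-by-genes expression matrix mentioned earlier in the talk, here is a minimal sketch using principal component analysis on synthetic data. PCA is a stand-in chosen for simplicity: the talk does not say which projection method was used, and a linear projection is far simpler than a foundation model. The data sizes, the two cell types, and all function names are hypothetical.

```python
import random

def center(rows):
    """Subtract each gene's mean from every cell (row)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Gene-by-gene covariance matrix of centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def top_eigenvector(m, iters=300):
    """Power iteration: repeatedly apply the matrix and renormalize."""
    v = [1.0 / (i + 1) for i in range(len(m))]  # fixed start vector
    for _ in range(iters):
        w = matvec(m, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(m, v)))
    return v, eigenvalue

def project_2d(rows):
    """Project each cell onto the top two principal components."""
    x = center(rows)
    cov = covariance(x)
    pc1, lam1 = top_eigenvector(cov)
    d = len(cov)
    # Deflate: remove the first component so power iteration finds the second.
    deflated = [[cov[i][j] - lam1 * pc1[i] * pc1[j] for j in range(d)]
                for i in range(d)]
    pc2, _ = top_eigenvector(deflated)
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    return [(dot(r, pc1), dot(r, pc2)) for r in x]

random.seed(0)
N_GENES = 8  # stand-in for the ~20,000 genes mentioned in the talk

def fake_cell(high_genes):
    """Synthetic expression profile: a few genes high, the rest low."""
    return [random.gauss(5.0 if g in high_genes else 1.0, 0.5)
            for g in range(N_GENES)]

labels = ["A"] * 30 + ["B"] * 30
cells = [fake_cell({0, 1, 2, 3}) if lab == "A" else fake_cell({4, 5, 6, 7})
         for lab in labels]
coords = project_2d(cells)

for lab, (c1, c2) in list(zip(labels, coords))[:3] + list(zip(labels, coords))[-3:]:
    print(f"cell type {lab}: ({c1:.2f}, {c2:.2f})")
```

In this toy setup the two synthetic cell types fall on opposite sides of the first axis, which is the kind of structure a 2D view of single-cell data is meant to reveal.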

5:00

what does that mean that means that we

5:02

can now combine causality from

5:04

genetics and big data to truly

5:06

understand the mechanism of disease

5:09

genetics that means that we're starting

5:11

with causality because we know these

5:13

regions have something to do with

5:14

disease the problem is that we don't

5:16

understand the mechanism and that's

5:17

where massive data come in we can

5:19

basically say this correlates with

5:21

Alzheimer's let's go and find out what

5:24

changes in the brain of people that have

5:26

this genetic difference or the people

5:28

that have Alzheimer's or people that

5:29

have environmental exposure and then we

5:31

can figure out the specific genes and

5:33

proteins that are responsible and then

5:36

use those to understand mechanism so

5:39

we're gathering this massive data and

5:41

that's where the deep learning comes in

5:43

where we can now go from sequence

5:45

information to a model that understands

5:47

the language of biology understands how

5:51

mutations are acting understands how

5:53

proteins are folding how chemicals are

5:56

resulting in their functions and then

5:58

eventually make predictions that we can

6:01

validate experimentally and that's

6:03

another amazing thing about biology in

6:06

society it's very easy to say well maybe

6:08

this causes that but intervention would

6:10

cost like I don't know billions of

6:12

people changing the way that they do

6:14

stuff whereas with biology we can take a

6:16

cell and then change a gene and then see

6:18

what happens so they're much more

6:20

transparent those models everybody with

6:22

me awesome so what we need to do is

6:26

basically go from simply there's

6:28

something genetic going on here to

6:30

here's the circuit here's the genetic

6:32

variants the differences in the letters

6:35

here are the motifs the sequence

6:36

patterns that these letters perturb here

6:38

are the regulators that bind these

6:40

motifs here are the control regions or

6:43

enhancers and the cell types where they

6:45

become active and here are the target

6:47

genes that are controlled so effectively

6:49

a circuitry and then using that my lab

6:52

has basically worked on applying this

6:55

type of methodology to a dozen plus

6:57

disorders from cardiac disease obesity

7:00

cancer Alzheimer's addiction

7:02

neurodegeneration pathogenesis schizophrenia

7:05

psychosis bipolar Down syndrome autism

7:07

PTSD so every aspect of the human body

7:09

and the human brain we can now start

7:11

studying systematically across dozens of

7:14

cell types across hundreds of tissues

7:18

and millions of cells and hundreds of

7:20

individuals we can now start asking how

7:23

is the action of disease percolating

7:27

through to give you an example I like to

7:30

joke that we published a paper in the

7:31

New England Journal of Medicine about one

7:33

bit of information in the human genome

7:35

this is about changing a T into a C and

7:38

what we showed is the mechanism through

7:39

which we can translate a region of

7:41

genetic association the strongest

7:43

association with obesity to a mechanism

7:46

that basically tells us how is that
