The Internet Was Weeks Away From Disaster and No One Knew

53m 4s · 9,360 words · 1,325 segments · English

FULL TRANSCRIPT

0:00

(suspenseful music)

0:01

- [Derek] In 2021, a hacker uncovered a fatal weakness

0:04

in the world's most important operating system.

0:07

- What would you do with a key

0:08

that gets you into any server on the internet?

0:12

- Is this live to the public right now?

0:13

- Yeah, it's live on the server.

0:15

- Look, I'm not pleased.

0:16

I would like you to change it back.

0:19

- [Narrator] At the time, just about everyone believed

0:21

that hacking this system was impossible,

0:23

but they were wrong.

0:24

- Well, I can tell you how many systems

0:26

would have been compromised,

0:27

which would have been millions.

0:28

Actually, I'm still surprised the mainstream news outlets

0:31

haven't really covered this very much.

0:33

- How close did we come?

0:34

- We were weeks away from millions of internet servers

0:38

being accessible to whoever crafted the backdoor.

0:41

Anything from spying, to ransom,

0:44

to taking down entire countries,

0:47

you could have done it with this backdoor.

0:49

- This hacker had realized

0:51

the entire operating system rested on a single part,

0:54

maintained by a single person,

0:56

and that by compromising that one part,

0:58

they could infect almost any server on the internet.

1:02

So, how could we ever let ourselves get this vulnerable?

1:07

Well, the story begins with a jammed printer.

1:19

(suspenseful music)

1:19

(upbeat music)

1:19

- [Narrator] The AI lab was buzzing.

1:21

They had just installed the Xerox 9700.

1:24

It was one of the first ever commercial laser printers.

1:27

It was a pretty big deal.

1:29

The only problem was it kept jamming.

1:34

- [Stallman] You'd wait an hour figuring,

1:35

I know it's gonna be jammed,

1:36

I'll wait an hour and go collect my printout,

1:39

and then you'd see that it'd been jammed the whole time.

1:43

Frustration up the wazoo.

1:45

- Richard Stallman, a researcher at the lab,

1:48

thought that he had a solution.

1:50

Years earlier, he had solved a similar problem

1:52

by coding a simple program that sent an alert

1:54

whenever there was a jam.

1:55

Now, it didn't fix the problem mechanically,

1:57

but it did make sure that a jam wouldn't go unnoticed.

2:00

He thought he could do a similar thing now.

2:02

The only problem was that Xerox hadn't provided them

2:05

the source code for the printer,

2:06

and without it, Stallman couldn't write his code.

2:08

So he tracked down the original developer.

2:11

- [Stallman] And I said, "Hi, I'm from MIT.

2:12

Could I have a copy of the printer source code?"

2:16

And he said, "No, I promised not to give you a copy."

2:20

I was stunned.

2:21

I was angry.

2:23

All I could think of was to turn around on my heel

2:26

and walk out of his room.

2:30

Maybe I slammed the door.

2:32

And I thought about it later on

2:35

because I realized that I was seeing

2:37

not just an isolated jerk

2:39

but a social phenomenon that was important

2:42

and affected a lot of people.

2:45

- [Henry] This social phenomenon

2:47

had slowly invaded the world of computer research.

2:50

In the late 60s,

2:51

engineers at AT&T's Bell Labs

2:53

invented an operating system called Unix,

2:56

which they shared widely across universities

2:58

and research labs.

2:59

This was a time of freedom.

3:01

But by the 80s,

3:02

AT&T started going after Unix clone developers

3:05

for copyright infringement.

3:07

Later, they even sued

3:08

the University of California at Berkeley.

3:10

The tech landscape had shifted.

3:12

They wanted to close off software development.

3:15

Companies were now making their employees

3:16

sign non-disclosure agreements,

3:18

prohibiting them from ever sharing their code

3:20

with other programmers.

3:22

- [Stallman] See, this was my first encounter

3:25

with a non-disclosure agreement,

3:26

and I was the victim.

3:28

And the lesson it taught me was

3:30

that non-disclosure agreements have victims.

3:33

They're not innocent, they're not harmless.

3:36

- [Henry] Stallman wondered,

3:37

maybe he could adapt to this new world.

3:39

- [Stallman] But I realized

3:40

that that way I could have fun coding

3:42

and I could make money.

3:44

But at the end, I'd have to look back at my career and say,

3:48

"I have spent my life building walls to divide people,"

3:53

and I would've been ashamed of my life.

3:55

- So Stallman chose a different path.

3:58

He quit his job at MIT

3:59

and in 1985 established the Free Software Foundation,

4:03

and it worked to promote four basic freedoms.

4:06

You should be free to run software for any purpose,

4:08

free to study it, free to change it, and free to share it.

4:12

Now, to ensure those freedoms,

4:13

he created a legal license

4:14

that developers could attach to their code

4:17

called the General Public License.

4:19

And to stick it to AT&T,

4:20

he started to work on a project based on Unix

4:23

but built from the ground up,

4:24

so AT&T couldn't sue.

4:27

He called the project GNU,

4:28

a recursive acronym for GNU's Not Unix.

4:33

Now, to replicate a Unix system,

4:34

the GNU Project had to recreate

4:36

three layers of functionality.

4:38

They needed the utilities,

4:39

which were the everyday tools and commands,

4:42

the shell, which is the terminal

4:43

that people use to interact with the machine,

4:45

and finally, the kernel,

4:47

which is the core that talks to the hardware

4:48

and manages memory.

4:50

Now, over the next seven years,

4:51

the GNU Project made much of that from scratch.

4:54

They created the GCC code compiler, the Bash shell,

4:57

and a host of other core utilities.

4:59

But they were always missing one key component.

5:02

The kernel.

5:05

That changed in the fall of 1991

5:07

when Stallman visited the University of Helsinki

5:10

to give a talk promoting the project.

5:12

In the audience was a young computer science student

5:15

who just happened to be building

5:17

his own kernel from scratch.

5:18

His version wasn't free,

5:21

but after hearing Stallman speak,

5:23

the student changed his mind

5:24

and adopted the General Public License.

5:27

At first, he wanted to call it Free Unix, or Freax,

5:31

but his friend thought that sounded terrible,

5:34

so he renamed it after the student himself, Linus Torvalds.

5:37

Linus, Unix.

5:39

Well, that's how he got Linux.

5:43

That kernel, combined with the other components

5:44

from the GNU Project,

5:46

became a full operating system.

5:48

Now, technically, Linux only refers to that kernel,

5:51

but a lot of people use it

5:52

to refer to the whole operating system,

5:54

so GNU and Linux and whatever else.

5:57

Because the code was open and free

5:59

and the projects built on it were too,

6:01

a new model of software development took hold.

6:03

Anyone could inspect the code, improve it, fix flaws,

6:07

and generally just push development forward for everyone.

6:09

So, software split into two competing ideologies.

6:13

Proprietary closed source systems controlled by companies,

6:16

and open source projects where the code was free.

6:19

- It's free in two ways.

6:21

It's free as in you don't have to pay for it,

6:23

but it's also free to change it in any way you want,

6:26

and that seems to be the much more important aspect.

6:29

People are happy to pay for technology,

6:31

but so often do they run into some roadblocks

6:34

where you have to file a support ticket

6:36

with some large company,

6:38

they may or may not get the help they need,

6:40

and engineers are just itching to just fix it themselves.

6:44

- Developers could take that basic code

6:47

which was freely available

6:48

and then add on their own features

6:50

relevant to their specific device.

6:52

They didn't have to reinvent the wheel every time.

6:55

So that's why Linux spread

6:56

into all sorts of different applications.

6:58

- Hello, I'm a Mac.

6:59

- And I'm a PC.

7:01

No one else.

7:01

- No one. (woman clears throat)

7:03

- Hi, I'm Linux.

7:05

There are an estimated 30 million Linux users out there.

7:09

- How long you been standing there?

7:10

- A long time.

7:11

- And it's not even just limited to computers.

7:15

Your electronic vacuum is definitely Linux.

7:18

Your camera is definitely Linux.

7:20

Most TVs, most electronics are Linux.

7:23

- Linux even runs some of the most sensitive machines

7:26

on the planet.

7:27

- You can assume that Linux is pretty much used

7:29

in anything of high-security need,

7:32

not necessarily because Microsoft, for instance,

7:36

couldn't build something equally secure,

7:38

but because usually there's secrecy involved

7:41

in building, let's say, a new weapon system,

7:44

and you don't necessarily want to have to work

7:46

with some tech company.

7:47

You don't want to involve more people

7:49

than absolutely necessary.

7:51

- [Henry] Of the top 500 supercomputers in the world,

7:54

every single one runs Linux.

7:56

It's used in the Pentagon and on US nuclear submarines.

7:59

- Every bank you can think of really,

8:03

manufacturers, hospitals, governments,

8:07

defense organizations and things like that,

8:09

they're all running Linux servers.

8:10

- Today, Linux is everywhere,

8:12

and most people are familiar with Windows and macOS,

8:15

but they are not the most popular operating systems

8:17

in the world.

8:18

No, they are dwarfed by systems running a Linux kernel.

8:22

Android, with over 3 billion devices, is built on Linux.

8:26

And it also powers the majority of internet servers

8:29

in the world.

8:29

- There is no one company that could have imagined

8:32

all the different cases where computers are used these days,

8:36

and Linux, thanks to its adaptability

8:39

where everyone can just tweak it in little ways

8:41

to make it fit their use case,

8:43

now covers all the use cases.

8:47

- But all of this,

8:48

it all relies on one key assumption.

8:51

That the code is secure.

8:53

Now, there's a good reason to feel this way.

8:56

Because there are so many people looking at the code,

8:58

there's this idea that bugs,

9:00

either intentional or unintentional,

9:02

won't be too deep to catch.

9:04

It's known simply as Linus's Law.

9:06

That with enough eyeballs, all bugs are shallow.

9:09

But there's a big problem with this assumption.

9:12

The open source movement isn't one big project.

9:15

It's an ecosystem.

9:16

You need thousands of small tools and libraries

9:19

each doing a different job,

9:20

like networking, security, or compression.

9:23

Now, a lot of these projects start

9:24

because one person wants to fix a specific problem,

9:27

so they build it themselves.

9:29

They're often unpaid,

9:30

coding on nights and weekends just to make the tool work.

9:33

If it's useful, one open source project adopts it,

9:36

then another,

9:37

and suddenly you have millions of machines

9:39

all relying on one person's passion project.

9:42

That's how the entire ecosystem can end up quietly resting

9:45

on a project maintained by a single volunteer.

9:48

There's a famous XKCD comic

9:50

that captures this idea perfectly.

9:53

But what happens when that block is compromised?

9:58

In our story, our person isn't from Nebraska.

10:01

No, Lasse Collin is from Finland,

10:03

and he's been working on a small data compression tool

10:06

called XZ since 2005.

10:08

XZ is so good at compression

10:10

that it's now used in almost every major Linux distribution.

10:14

For the past 20 years,

10:15

almost all of the work of keeping the tool compatible

10:18

with ever-evolving hardware,

10:20

it's all fallen on Lasse.

10:22

He's never been paid for it,

10:23

but up till now, he's been okay with that.

10:26

Recently, though, he's been under more and more pressure.

10:30

"Over one month and no closer to being merged.

10:33

Not a surprise."

10:34

"Progress will not happen until there is a new maintainer.

10:38

Submitting patches here has no purpose these days.

10:40

The current maintainer lost interest

10:42

or doesn't care to maintain anymore."

10:44

Lasse responds, "I haven't lost interest,

10:47

but my ability to care has been fairly limited,

10:50

mostly due to long-term mental health issues,

10:52

but also due to some other things.

10:54

It's also good to keep in mind

10:56

that this is an unpaid hobby project."

10:59

But it's not enough.

11:00

"I'm sorry about your mental health issues,

11:02

but it's important to be aware of your own limits.

11:05

The community desires more.

11:07

You ignore the many patches bit

11:09

rotting away on this mailing list.

11:10

Right now, you choke your repo."

11:13

Lasse is burning out.

11:15

But just when he thinks he can't handle it anymore...

11:19

"Nice job to both of you for getting this feature

11:20

as far as it is already.

11:22

Just trying to do my part as a helper elf."

11:25

Signed, Jia Tan.

11:27

For months, Jia has been taking some of the load off Lasse.

11:30

He's been incredibly helpful.

11:32

Now he offers to step up

11:33

and take over as maintainer of the project.

11:36

To Lasse, it sounds almost too good to be true.

11:39

"As I've hinted in earlier emails,

11:41

Jia Tan may have a bigger role in the project

11:43

in the future."

11:44

Finally, Lasse can step back and breathe

11:46

after 20 years of hard work.

11:49

But Jia is not who he appears to be.

11:53

And he's identified Lasse Collin's XZ project

11:56

as a weak link in the Linux ecosystem,

11:58

one that could give him access

12:00

to almost every computer on the internet.

12:08

(suspenseful music)

12:08

Today we take secure remote logins for granted.

12:11

I mean, they've worked reliably for over 30 years.

12:14

But it all started in 1995

12:16

at the Helsinki University of Technology

12:18

when a hacker captured thousands of usernames and passwords

12:21

sent over the campus network

12:23

in a sniffing attack.

12:25

In hindsight, the problem's obvious.

12:27

These login requests were being sent totally in plain text,

12:30

so anyone who intercepted the data could just read it.

12:36

(suspenseful music)

12:36

When Tatu Ylonen, a computer researcher at the university,

12:39

learned of the attack,

12:40

he made it his mission to ensure

12:41

that it would never happen again.

12:44

- [Tatu] Password sniffing was perhaps

12:46

the most serious security issue on the internet back then.

12:50

- To do this, his solution needed to ensure two things.

12:55

First, machines had to establish a secure connection.

12:58

If both computers could agree on a shared secret code

13:00

that they would use to scramble their data,

13:02

then even if they were overheard,

13:04

anyone without that secret code would just get gibberish.

13:07

Now, you could agree on that shared secret

13:09

ahead of time in person.

13:10

- Password.

13:11

- But on the internet, that's rarely practical.

13:14

No, you have to agree on that shared secret

13:16

ahead of time without ever having met

13:18

and also with someone listening in the entire time.

13:21

It sounds really tricky, but there is a way to do it,

13:24

and I can show you how using this jar of paint.

13:27

Say I'm trying to send a message to Gregor over there.

13:30

First step is we agree on a shared public color.

13:33

Let's pick this red.

13:35

This is no secret, anyone can see this.

13:38

Now we each pick our own private color.

13:40

I'm gonna pick yellow, and he can pick whatever he wants.

13:44

So we take our private color,

13:47

and then I'm gonna mix that with the public color.

13:51

It's worth saying now

13:51

that these mixtures are assumed to be impossible to unmix,

13:54

so even if you know this orange and you know this red,

13:57

you can't exactly deduce

13:58

the exact shade of yellow we used to create it,

14:00

and this is important for the actual computer example later.

14:03

Okay, so I'm gonna send this over to Gregor.

14:05

- So, I mixed in my secret color with the public,

14:07

and I'm gonna pass this to Henry.

14:09

- So, Gregor sent me this,

14:11

which looks like a sort of dark green sort of color.

14:15

And what we're gonna do now is we're gonna mix it

14:17

with my original private color.

14:20

- Okay, now that I have Henry's secret color

14:23

mixed in with the public,

14:24

I'm gonna add some of my own.

14:27

- So we end up with this sort of distinct olive color.

14:33

There's my yellow in there, I can see,

14:35

and whatever Gregor had in his side.

14:36

And the thing is because each set of paints

14:38

went through the same process,

14:40

they both end up with this same olive green,

14:43

even though we never shared our secret colors.

14:46

So we end up with this shared secret color at the end

14:49

that no one else can get,

14:50

and that means that we can use it as our secret code

14:52

when sending information.

14:54

Now, in the real exchange,

14:55

we use big public numbers instead of colors,

14:57

but the idea is the exact same.

14:59

Each side mixes in their own private number

15:02

using some math that, when you try to reverse it,

15:04

leads to a discrete log problem,

15:06

which makes it practically impossible to unmix them.

15:08

That way, we solve the first problem.
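
The paint-mixing exchange described above corresponds to a Diffie-Hellman key exchange. Below is a minimal sketch in Python, assuming a toy public modulus `P` and base `G` chosen only for illustration; the function names are hypothetical, and real SSH implementations use vetted libraries with much larger groups or elliptic curves.

```python
import secrets

# Toy public parameters, assumed for illustration. P is the Mersenne prime
# 2**127 - 1; real exchanges use much larger groups or elliptic curves.
P = 2**127 - 1   # public modulus (the shared "public colour")
G = 3            # public base


def make_private_key() -> int:
    """Pick a random private number in [2, P - 2] (the secret colour)."""
    return secrets.randbelow(P - 3) + 2


def public_value(private_key: int) -> int:
    """Mix the private number into the public base: G**private mod P."""
    return pow(G, private_key, P)


def shared_secret(their_public: int, my_private: int) -> int:
    """Mix my private number into the value the other side sent."""
    return pow(their_public, my_private, P)


henry_private = make_private_key()
gregor_private = make_private_key()

# These two values travel over the network in the open.
henry_public = public_value(henry_private)
gregor_public = public_value(gregor_private)

# Each side combines what it received with its own private number.
henry_secret = shared_secret(gregor_public, henry_private)
gregor_secret = shared_secret(henry_public, gregor_private)

print(henry_secret == gregor_secret)  # both sides derive the same value
```

An eavesdropper sees `P`, `G`, and the two public values. Recovering either private number from them is the discrete log problem mentioned above.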

15:11

But there is another threat that's unaccounted for.

15:14

Say a hacker, like Casper here, tries to sit in between us.

15:18

Now we can create a legitimate connection,

15:21

so we end up with a shared secret code,

15:23

and Casper could do the exact same thing with Gregor.

15:27

Now, whenever I send a message, he can relay that to Gregor,

15:30

he can change and modify it and send his response back.

15:33

And to each of us, the connection looks legitimate,

15:35

but Casper's sitting between us the whole time.

15:38

He's a man in the middle.
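
The man-in-the-middle scenario can be sketched with the same kind of modular arithmetic. This is an illustrative toy, assuming a small public modulus `P` and base `G`, and using the names from the demonstration as variable names. It shows only the key agreement, not any message relaying.

```python
import secrets

# Toy public parameters, assumed for illustration only.
P = 2**127 - 1
G = 3


def keypair() -> tuple[int, int]:
    """Return (private number, public value) for one participant."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


henry_priv, henry_pub = keypair()
gregor_priv, gregor_pub = keypair()
casper_priv, casper_pub = keypair()

# Casper intercepts both public values and substitutes his own.
henry_receives = casper_pub    # Henry believes this came from Gregor
gregor_receives = casper_pub   # Gregor believes this came from Henry

# Each honest party derives a key from what it received.
henry_key = pow(henry_receives, henry_priv, P)
gregor_key = pow(gregor_receives, gregor_priv, P)

# Casper derives a matching key for each side of the conversation.
casper_key_with_henry = pow(henry_pub, casper_priv, P)
casper_key_with_gregor = pow(gregor_pub, casper_priv, P)

print(henry_key == casper_key_with_henry)
print(gregor_key == casper_key_with_gregor)
```

Both checks succeed, so Casper can decrypt, read or alter, and re-encrypt everything that passes through him. Nothing in the unauthenticated exchange tells either side that the public value it received came from the wrong person.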

15:40

So, I need a way of authenticating

15:41

that Gregor is really who he says he is.

15:44

Now, we could do this again

15:45

by agreeing on a password ahead of time in person,

15:48

but we need a practical way to do it over the internet.

15:50

This was the second problem that Tatu had to solve.

15:53

To make that happen,

15:54

Gregor can take two really big prime numbers,

15:57

which he keeps secret.

15:58

He then multiplies them together

16:00

to get an even bigger number,

16:01

which he then makes public.

16:03

Now, when I want to send Gregor a message,

16:05

I just take that big public number

16:07

and I scramble it in a way that only Gregor,

16:09

who knows the two prime factors

16:10

that make up that big public number,

16:12

can successfully unscramble.

16:14

For anyone else,

16:15

getting those two prime factors is practically impossible.

16:17

So, as long as I know that that big public number

16:20

actually belongs to Gregor,

16:21

I know that anything encrypted to that key

16:23

can only be read by him.

16:24

This is called RSA encryption,

16:26

and it means that if I know the certificate is valid,

16:29

then I accept the connection.

16:30

And by authenticating Gregor,

16:32

it foils our man in the middle, Casper Devious.
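
The prime-number scheme described above can be sketched as textbook RSA with a challenge and response. The primes here are tiny and chosen only so the numbers stay readable, and the function names are hypothetical. Production SSH uses large keys, padding, and digital signatures rather than this bare form.

```python
import secrets

# Gregor's side: two secret primes (tiny, for illustration only).
p, q = 61, 53
n = p * q                  # public modulus: 3233
phi = (p - 1) * (q - 1)    # 3120, easy to compute only if you know p and q
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (needs Python 3.8+): 2753


def encrypt(message: int, public_exponent: int, modulus: int) -> int:
    """Scramble a number with the public key."""
    return pow(message, public_exponent, modulus)


def decrypt(ciphertext: int, private_exponent: int, modulus: int) -> int:
    """Unscramble a number with the private key."""
    return pow(ciphertext, private_exponent, modulus)


# Henry's side: send a random challenge that only Gregor can unscramble.
challenge = secrets.randbelow(n - 2) + 2
ciphertext = encrypt(challenge, e, n)

# Gregor proves who he is by returning the decrypted challenge.
response = decrypt(ciphertext, d, n)
print(response == challenge)
```

The public values are `n` and `e`. Computing `d` requires `phi`, which in turn requires the two prime factors that Gregor keeps secret.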

16:35

All right.

16:38

Tatu Ylonen combined these two steps,

16:40

securing the channel and authenticating the user,

16:42

into a protocol for remote logins between machines.

16:46

It gave you the same simple text shell people were used to,

16:49

a plain terminal where you type commands,

16:51

but now the connection was encrypted.

16:53

He called it Secure Shell, or SSH.

16:56

And it was immediately useful.

16:59

Many Linux machines don't even have keyboards or monitors,

17:02

especially not servers,

17:03

so you wanna be able to log in and control them remotely.

17:06

So SSH was soon adopted

17:08

on almost every machine that ran Linux.

17:11

And as Linux spread, so too did SSH.

17:14

Today, when you control a machine remotely,

17:16

there's a good chance you're using SSH.

17:18

- SSH is literally the maintenance backbone

17:22

of the entire internet.

17:23

- And the most widely used open source SSH implementation

17:26

is called OpenSSH.

17:29

And because it's so popular, it's heavily protected.

17:32

- I mean, OpenSSH is probably

17:35

one of the most closely examined projects out there

17:39

because it's just so vitally important

17:42

to the security of servers everywhere.

17:44

Having a way to bypass the authentication in secure shell

17:49

is like having the master key to the hotel.

17:51

It lets you into every room.

17:55

(suspenseful music)

17:55

- [Henry] This is why Jia Tan wants a way into OpenSSH,

17:59

but trying to hack it directly would be almost impossible.

18:03

Lucky for Jia, the open source model doesn't just mean

18:05

that operating systems are stitched together

18:07

from many programs,

18:09

but that each of those programs is itself stitched together

18:12

from other programs.

18:14

Those are called dependencies.

18:15

- OpenSSH is one of the most scrutinized software packages,

18:19

but that doesn't extend to all of its dependencies.

18:23

- Jia believes that if he can compromise

18:25

a dependency of OpenSSH,

18:27

he can sneak an exploit into the main project.

18:30

And it just so happens

18:31

that Lasse Collin's compression tool XZ

18:34

is linked through a chain of these dependencies.
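
A chain of dependencies can be pictured as a graph walk. The sketch below uses a simplified, hypothetical graph; the package names and edges are assumptions for illustration and do not describe any particular distribution's build.

```python
# Hypothetical, simplified dependency graph (an assumption for illustration).
DEPENDS_ON = {
    "sshd": ["libcrypto", "libsystemd"],
    "libsystemd": ["liblzma"],
    "libcrypto": [],
    "liblzma": [],
}


def transitive_dependencies(package: str, graph: dict[str, list[str]]) -> set[str]:
    """Return every package reachable from `package`, directly or indirectly."""
    seen: set[str] = set()
    stack = list(graph.get(package, []))
    while stack:
        dependency = stack.pop()
        if dependency not in seen:
            seen.add(dependency)
            stack.extend(graph.get(dependency, []))
    return seen


direct = set(DEPENDS_ON["sshd"])
everything = transitive_dependencies("sshd", DEPENDS_ON)
print(sorted(everything - direct))  # dependencies pulled in only indirectly
```

In this toy graph the compression library never appears in the direct list, yet it still ends up loaded alongside the program at the top of the chain.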

18:43

(suspenseful music)

18:43

Now, Lasse's original goal with XZ

18:45

was to find a better way to compress data on Linux.

18:48

That data could be anything.

18:49

Code, an image, text.

18:51

But what was important to Lasse

18:53

was that once you compressed and decompressed it,

18:55

it had to come back exactly the same.

18:58

The method had to be lossless.

19:09

to Rick Astley's hit "Never Gonna Give You Up"

19:11

and we're gonna try to compress it.

19:13

Now, say we take this

19:14

and we represent it as a stream of characters,

19:17

and each one gets a fixed-width 8-bit code.

19:21

Now, that works, but it's inefficient.

19:23

If we go through this stream

19:25

and just count up how often each symbol appears,

19:27

you'll notice there's a pattern.

19:29

Some appear more frequently,

19:31

like N with 430 uses,

19:33

and some, barely at all,

19:34

like J with one use.
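
The counting step can be tried on a single line of the lyrics. This small sketch uses Python's `collections.Counter`; the one-line sample is an assumption, so its counts differ from the full-song figures quoted above.

```python
from collections import Counter

line = "never gonna give you up"

# With a fixed-width code, every character costs 8 bits.
fixed_width_bits = 8 * len(line)

# Count how often each symbol appears, most frequent first.
counts = Counter(line)
for symbol, occurrences in counts.most_common():
    print(repr(symbol), occurrences)

print(fixed_width_bits)
```

Even in this short sample the space and the letters `n` and `e` appear several times while `r`, `a`, `i`, `y`, and `p` appear once each. That imbalance is what a variable-length code exploits.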

19:36

To save space,

19:37

why don't we give the ones that appear more frequently

19:39

shorter codes,

19:40

and the rarer ones, well, they can afford to be long.

19:43

But how do we do that?

19:45

So, let's start by counting up how often each symbol appears

19:48

and sorting that from most frequent to least frequent.

19:51

We take the two least frequent symbols

19:52

and join them together into a pair.

19:55

We then treat that pair as a new combined symbol

19:57

whose frequency is the sum of the two it represents.

20:00

We can then reinsert that back into the list.

20:03

Then we do it again.

20:04

We take the two least frequent items, combine them,

20:07

and then reinsert them back into the list.

20:09

And we do that over and over again

20:11

until we get this massive structure called a Huffman tree.

20:15

Now, to get our codes, we just walk the tree.

20:18

A step right is a 1, a step left is a 0.

20:21

So, for example, to get R,

20:23

we just go right, left, left, right,

20:25

so the code is 1001.

20:27

So what you'll notice is the more commonly occurring symbols

20:30

naturally appear at the top of the tree,

20:32

so they get shorter codes,

20:33

while the ones that appear less frequently

20:35

are at the bottom of the tree.
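
The tree-building procedure described above can be sketched with a priority queue. This is a minimal illustration, not a production encoder. The function names are hypothetical, and ties between equal frequencies are broken arbitrarily, so the exact codes may differ from the ones shown in the video.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text: str) -> dict[str, str]:
    """Build a Huffman code table for the symbols in `text`."""
    frequencies = Counter(text)
    tiebreak = count()  # keeps heap entries comparable when counts tie
    # Heap entries are (frequency, tiebreak, tree); a tree is either a
    # single symbol (str) or a (left, right) pair of subtrees.
    heap = [(freq, next(tiebreak), sym) for sym, freq in frequencies.items()]
    heapq.heapify(heap)

    if not heap:
        return {}
    if len(heap) == 1:  # only one distinct symbol
        return {heap[0][2]: "0"}

    # Repeatedly join the two least frequent items into a combined node.
    while len(heap) > 1:
        freq_a, _, tree_a = heapq.heappop(heap)
        freq_b, _, tree_b = heapq.heappop(heap)
        heapq.heappush(heap, (freq_a + freq_b, next(tiebreak), (tree_a, tree_b)))

    codes: dict[str, str] = {}

    def walk(tree, prefix: str) -> None:
        if isinstance(tree, str):
            codes[tree] = prefix
            return
        left, right = tree
        walk(left, prefix + "0")   # a step left is a 0
        walk(right, prefix + "1")  # a step right is a 1

    walk(heap[0][2], "")
    return codes


def encode(text: str, codes: dict[str, str]) -> str:
    return "".join(codes[symbol] for symbol in text)


def decode(bits: str, codes: dict[str, str]) -> str:
    reverse = {code: symbol for symbol, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # no code is a prefix of another
            out.append(reverse[current])
            current = ""
    return "".join(out)


lyric = "never gonna give you up never gonna let you down"
codes = huffman_codes(lyric)
bits = encode(lyric, codes)
print(len(bits), "bits instead of", 8 * len(lyric))
```

Because no code is a prefix of another, the decoder can read the bit stream left to right without any separators between symbols.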

20:36

The system works well, but it also has a weakness.

20:39

In our "Never Gonna Give You Up" example,

20:41

it always encodes N-E-V-E-R space.

20:45

It doesn't realize that this whole chunk repeats.

20:48

So, what if instead of looking at symbols,

20:50

we looked at those chunks?

20:52

Now, they don't have to be words,

20:53

they can be parts of words or even longer.

20:55

They just have to be patterns that repeat.

20:57

So let's scan through the text

20:58

but keep a rolling dictionary of what we've just seen.

21:01

Then, as we move forward,

21:02

we can check whether the next chunk has already appeared.

21:05

And if it has, we don't need to write that chunk again.

21:08

We just write a code with two numbers, how far back to look,

21:11

and how many characters to copy.

21:13

Now, when we decompress,

21:15

we can just read along

21:16

and whenever we hit one of these codes,

21:18

we jump back, copy the matching chunk,

21:20

and paste it into place.

21:22

Two scientists, Lempel and Ziv,

21:24

published this algorithm in 1977,

21:26

so it became known as LZ77.
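
The back-reference idea can be sketched in a few lines. This is a simplified illustration of the LZ77 approach with a brute-force search; the token format, window size, and minimum match length are assumptions made for readability rather than the layout of any real file format.

```python
def lz77_compress(data: str, window: int = 255, min_match: int = 3) -> list:
    """Turn `data` into literals and (distance, length) back-references."""
    tokens = []
    i = 0
    while i < len(data):
        best_length, best_distance = 0, 0
        # Search the recent window for the longest match starting at i.
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_length:
                best_length, best_distance = length, i - j
        if best_length >= min_match:
            tokens.append((best_distance, best_length))  # how far back, how many
            i += best_length
        else:
            tokens.append(data[i])  # no useful match: emit the character itself
            i += 1
    return tokens


def lz77_decompress(tokens: list) -> str:
    """Rebuild the text by copying from what has already been written."""
    out: list[str] = []
    for token in tokens:
        if isinstance(token, tuple):
            distance, length = token
            for _ in range(length):
                out.append(out[-distance])  # copy one character at a time
        else:
            out.append(token)
    return "".join(out)


text = "never gonna give you up never gonna let you down"
tokens = lz77_compress(text)
print(tokens)
print(lz77_decompress(tokens) == text)
```

In this sample, the second "never gonna " becomes the single token `(24, 12)`: look 24 characters back and copy 12.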

21:30

But some of these symbols and pointers

21:32

show up more often than others.

21:34

They actually have their own frequencies.

21:36

So we can feed that whole stream into another Huffman tree

21:39

to get a second layer of compression.

21:41

And in our demo,

21:42

it actually gets the file down 85% smaller

21:45

than the original.

21:46

This might look new,

21:47

but you've almost certainly used it yourself.

21:50

It's called deflate,

21:51

but it's better known for the files it creates, .zip.

21:55

If you ever clicked Close on this before,

21:57

you've definitely used it.
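
Deflate is available in Python's standard library through the `zlib` module, so the effect can be tried directly. The sample text is an assumption, and the sizes it prints will not match the figure from the demo in the video.

```python
import zlib

lyrics = (
    "never gonna give you up\n"
    "never gonna let you down\n"
    "never gonna run around and desert you\n"
) * 20

raw = lyrics.encode("utf-8")
compressed = zlib.compress(raw, level=9)   # LZ77 matching plus Huffman coding
restored = zlib.decompress(compressed)

print(len(raw), "bytes before,", len(compressed), "bytes after")
print(restored == raw)  # lossless: the round trip returns the original bytes
```

Note that `zlib` wraps the deflate stream in a small header and checksum of its own, whereas a .zip archive stores deflate data inside per-file archive records.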

21:59

But Huffman only uses the overall frequency

22:02

of a chunk repeating.

22:04

Real data isn't just random chunks.

22:07

In our example, after "Never gonna",

22:09

you might get "give you up", "let you down",

22:13

or "run around and desert you".

22:15

You might get "make you cry",

22:16

you might get "say goodbye" or "tell a lie and hurt you".

22:19

Each one has its own probability,

22:21

and you can represent these probabilities

22:23

with a mathematical tool called a Markov chain.

22:26

The algorithm can then encode the stream of data

22:30

so that the more probable next chunks cost few bits

22:33

and the less probable ones cost more.
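
The link between probability and cost can be sketched with a word-level Markov model. This is only an analogy under stated assumptions: the sample lyric and function names are hypothetical, and LZMA itself models probabilities for individual bits inside a range coder, which this sketch does not attempt to reproduce.

```python
import math
from collections import Counter, defaultdict


def transition_probabilities(words: list[str]) -> dict[str, dict[str, float]]:
    """Estimate P(next word | current word) from adjacent word pairs."""
    pair_counts: dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        pair_counts[current][following] += 1
    return {
        current: {
            word: occurrences / sum(followers.values())
            for word, occurrences in followers.items()
        }
        for current, followers in pair_counts.items()
    }


def ideal_bits(probability: float) -> float:
    """Ideal code length, in bits, for an event with this probability."""
    return -math.log2(probability)


lyrics = (
    "never gonna give you up never gonna let you down "
    "never gonna run around never gonna give you up"
).split()

model = transition_probabilities(lyrics)
for word, probability in model["gonna"].items():
    print(word, probability, ideal_bits(probability))
```

In this sample, "give" follows "gonna" half the time and would ideally cost 1 bit, while "let" and "run" each follow a quarter of the time and would cost 2 bits.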

22:36

If you combine that with a much bigger search window

22:38

so it can point much further back in memory,

22:41

then you get the Lempel Ziv Markov chain algorithm, or LZMA.

22:45

LZMA was developed by Igor Pavlov around 1998,

22:49

and it often beats much more familiar methods.

22:52

In many cases,

22:52

it can shrink files to about 70% of the size

22:55

of a typical .zip.
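
Python's standard library includes both algorithms, so they can be compared side by side. This is a rough sketch on an assumed sample. The ratio depends heavily on the input, and on very small inputs the .xz container overhead can outweigh the gains, so no particular result is claimed here.

```python
import lzma
import zlib

chorus = (
    "never gonna give you up\n"
    "never gonna let you down\n"
    "never gonna run around and desert you\n"
)
data = (chorus * 200).encode("utf-8")

deflated = zlib.compress(data, level=9)    # deflate, as used by .zip
xz_bytes = lzma.compress(data, preset=9)   # LZMA2 in an .xz container

print("original:", len(data))
print("deflate: ", len(deflated))
print("xz:      ", len(xz_bytes))
print(lzma.decompress(xz_bytes) == data)   # lossless round trip
```

By default `lzma.compress` writes the .xz container format, the same format produced by the `xz` command-line tool.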

22:57

Lasse took this elegant compression algorithm

22:59

and made it work on Linux,

23:01

and he called it XZ not because it stood for anything,

23:03

but just because it sounded cool.

23:05

- I'm using XZ quite a lot.

23:07

I think XZ is a wonderful project.

23:10

There are lots of different ways of compressing data.

23:12

Some of them are fast but they don't compress very well,

23:15

and some of them are slow

23:18

but they get extremely good compression.

23:20

- But across Linux,

23:22

projects are constantly shipping the same files and updates

23:25

to millions of machines,

23:26

so XZ is perfect.

23:28

You compress something once,

23:29

then you get a smaller file to download forever.

23:32

Lasse released XZ in 2009,

23:34

and over the next decade and a half,

23:36

it went from a niche tool to the common choice

23:38

whenever a project needed effective lossless compression.

23:42

So, XZ quietly spread everywhere,

23:44

eventually becoming a dependency of OpenSSH.

23:46

(suspenseful music)

23:51

- So, it was at some point in about February 2024

23:56

and Jia Tan, he emails me.

23:59

He's got all these new features in the new version of XZ.

24:03

- [Henry] He wins Rich over almost immediately.

24:06

- So I get to talk to hundreds of contributors all the time,

24:09

and I do get a feel for them.

24:13

I feel, you know, are they good coders,

24:15

which is what I really care about.

24:17

Are they conscientious people, are they helpful?

24:20

Do they respond to bug reports quickly?

24:24

And in all of the dimensions,

24:27

Jia Tan would be a very good contributor

24:30

because he's obviously a good coder.

24:32

He's very responsive, he's very keen,

24:35

and I love all that.

24:36

- All indications are that Jia is a great contributor,

24:40

and this puts Rich at ease,

24:41

so he lets his guard down.

24:43

And that's often where the problems start on the internet.

24:45

You can't keep your guard up forever.

24:47

But lucky for us,

24:48

with today's sponsor, NordVPN,

24:50

you don't have to.

24:51

NordVPN's Threat Protection Pro

24:53

blocks dangerous websites before they load.

24:56

It stops malicious downloads

24:57

and it strips out trackers and intrusive ads automatically.

25:00

And it works even when you're not connected to the VPN,

25:04

so a lot of these attacks never get the chance to start

25:06

in the first place.

25:07

I use NordVPN whenever I'm traveling

25:09

or working on public wifi

25:10

because it means that I don't have to think

25:12

about who's running the network.

25:13

It's just one click

25:14

and it's so fast that I often forget that it's on.

25:16

Not just that,

25:17

if there's a show that's no longer available in my region

25:20

or a sports team that's blacked out,

25:22

like I'm often watching international football

25:24

and they don't quite have it where I'm going,

25:26

well, in that case,

25:27

I can just switch my server location with one click

25:30

to unlock the content.

25:31

Apparently you can even use it

25:32

to find better deals on plane tickets

25:34

by changing your IP address to another country.

25:37

I haven't tried it yet, but that sounds fascinating.

25:40

So, if you wanna try it, you can get the best deal

25:42

by going to nordvpn.com/veritasium.

25:45

When you use that link or this QR code,

25:48

you'll get a huge discount.

25:49

Also, you get a 30-day money back guarantee through Nord.

25:53

It's a no brainer.

25:54

So again, that's nordvpn.com/veritasium

25:58

or you can click the link in the description below.

26:00

Thanks so much to Nord,

26:02

and let's get back to Jia

26:04

and the prize he's got his eyes on.

26:06

- At this point, we were preparing RHEL 10.

26:10

- [Henry] See, Red Hat ships two major flavors of Linux.

26:13

Fedora, which is free and publicly available,

26:16

and Red Hat Enterprise Linux, or RHEL,

26:19

which is available through a paid subscription.

26:21

This one has to be stable and secure

26:23

because it's widely used on the most important machines,

26:26

like in governments and hospitals.

26:28

Jia wants his code in RHEL,

26:30

but RHEL only has a new major release

26:32

about once every three years.

26:34

- So, there's definitely a deadline,

26:36

and that deadline was around sort of March, April in 2024.

26:40

- Jia has to act fast.

26:42

He wants complete control of any compromised machine.

26:45

And to pull it off, he has three steps in his plan.

26:49

Step one, the Trojan horse.

26:53

The code for XZ lives on a website called GitHub,

26:56

which tracks all edits to XZ's code using a tool called Git,

27:00

which was also developed by Linus Torvalds.

27:02

So, Jia starts by making small changes.

27:04

He changes the primary contact for bug reports

27:07

to his own email.

27:08

He tweaks small tools that will help him later.

27:11

But he can't sneak in the payload this way.

27:13

I mean, it'd be too obvious.

27:15

So he needs a way to sneak it in

27:16

without it ever appearing as normal source code on GitHub.

27:19

- So, when you're writing compression software,

27:22

it's very often the case

27:23

that your software is full of these binary blobs,

27:27

as we call them,

27:28

so just lumps of binary which are used to test

27:31

the compression or the decompression is still working.
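
As a rough illustration of what such a test blob is for, here is a minimal Python sketch using the standard `lzma` module, which reads and writes the .xz format. The fixture bytes and the helper names (`make_blob`, `blob_round_trips`) are made up for this example and are not taken from the XZ project's own test suite.

```python
import lzma

# Hypothetical fixture: a stand-in for the data behind a binary test blob.
ORIGINAL = b"hello, xz! " * 100


def make_blob(data: bytes) -> bytes:
    """Compress data into the .xz container format."""
    return lzma.compress(data, format=lzma.FORMAT_XZ)


def blob_round_trips(blob: bytes, expected: bytes) -> bool:
    """Return True if the blob decompresses to exactly the expected bytes."""
    try:
        return lzma.decompress(blob) == expected
    except lzma.LZMAError:
        # Truncated or corrupt input is reported as an error.
        return False


good_blob = make_blob(ORIGINAL)
bad_blob = good_blob[:-8]  # simulate a damaged file by cutting off the end

print(blob_round_trips(good_blob, ORIGINAL))
print(blob_round_trips(bad_blob, ORIGINAL))
```

The compressed blob is an opaque run of bytes, so a reviewer skimming the repository sees only that a file exists, not what it decodes to.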

27:34

- Nobody reads these test blobs.

27:36

They're included without ever appearing

27:38

in the human readable source code.

27:40

They're assumed to be garbage data.

27:42

But for Jia, this is the perfect place to hide his payload,

27:46

inside something that at first glance looks harmless.

27:50

But in reality, it's a Trojan horse.

27:54

But with a Trojan horse inside of XZ,

27:57

it's still just a lump of data in a binary blob.

28:00

He has to unpack it.

28:02

So, in the code that builds the project,

28:04

he slips in a small easy-to-miss change.

28:07

It hides among all the automatically generated code

28:09

and quietly unpacks his payload,

28:12

inserting it into the XZ library.

28:15

But now that it's inside of XZ,

28:16

it still has to pick the right time to act.

28:19

On to step two, Goldilocks.

28:24

Jia's end goal is to compromise a very specific part

28:27

of the SSH connection process,

28:29

the RSA authentication step.

28:31

He realizes that if he can slip

28:33

a small malicious component in there,

28:35

let's call it the payload,

28:36

then every time SSH checks for a key,

28:39

his code will run first.

28:40

It will quietly look for a special master key

28:43

that only he knows,

28:44

and if it sees that key, it'll let him straight in.

28:46

If it doesn't,

28:47

it'll call the real code and no one's the wiser.

28:50

So, he will have his backdoor entrance to OpenSSH.

28:53

But he can't just go in and rewrite RSA Decrypt,

28:56

the function that verifies the client's identity

28:58

during the login.

29:00

It's not that easy.

29:02

See, when you build an application,

29:04

you could take all the code you need

29:05

from different libraries

29:06

and bundle it into your application.

29:08

But there's a big drawback to this approach.

29:11

If 10 different applications on a system

29:13

all bundle the same library,

29:15

you end up with 10 separate copies on your machine,

29:18

so it's redundant.

29:20

That's why modern systems mostly use shared libraries.

29:23

When an application starts,

29:24

the linker fills in a table of addresses.

29:27

These addresses point to the functions

29:28

and variables it needs

29:30

from the libraries it links to.

29:32

That table is called the Global Offset Table, or GOT.

29:36

Now, when it wants to use something from a shared library,

29:38

it just checks the GOT

29:40

and jumps to the right spot in memory.
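
The lookup-table idea can be sketched in a few lines of Python. This is only a toy model: a real GOT holds raw memory addresses filled in by the dynamic linker, and the `ToyProgram` class and `SHARED_LIBRARY` dictionary here are invented for illustration.

```python
import math

# Toy stand-in for a shared library: functions that live outside the program.
SHARED_LIBRARY = {"sqrt": math.sqrt, "floor": math.floor}


class ToyProgram:
    """A program that reaches library functions only through a table."""

    def __init__(self, needed_symbols):
        self.needed = list(needed_symbols)
        self.got = {}  # toy Global Offset Table: symbol name -> function

    def link(self, library):
        """Fill in one table entry per symbol the program needs."""
        for name in self.needed:
            self.got[name] = library[name]

    def call(self, name, *args):
        """Look the symbol up in the table, then jump to whatever is there."""
        return self.got[name](*args)


app = ToyProgram(["sqrt", "floor"])
app.link(SHARED_LIBRARY)
print(app.call("sqrt", 16.0))
print(app.call("floor", 2.7))
```

Because every call goes through the table, whoever controls a table entry controls which code runs for that name.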

29:42

RSA Decrypt doesn't belong to OpenSSH at all.

29:45

It comes from a shared crypto library.

29:48

So to hijack authentication,

29:49

Jia can overwrite the GOT entry

29:51

that tells SSH where it is.

29:54

And to do that, he can use a little known tool

29:56

called an IFUNC resolver.

29:59

- The IFUNC is used where

30:01

let's say you wanna optimize your code

30:03

to run on Intel's hardware and AMD hardware.

30:06

Now, you could write the software just for Intel,

30:08

and it would run very fast on Intel

30:10

and it probably would run very badly on AMD hardware.

30:13

- [Henry] Instead, you keep multiple versions

30:15

of the same function

30:16

and the IFUNC resolver picks the right one

30:18

for the hardware you're on.
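
Here is a loose Python analogy for that selection step. Real IFUNC resolvers are a compiler and linker feature for native code. The two checksum functions, the `resolve_checksum` name, and the `"fast_sum"` feature flag are all hypothetical.

```python
def checksum_generic(data: bytes) -> int:
    """Portable version: works everywhere, one byte at a time."""
    total = 0
    for byte in data:
        total = (total + byte) % 65536
    return total


def checksum_fast(data: bytes) -> int:
    """Stand-in for a hardware-tuned version with the same result."""
    return sum(data) % 65536


def resolve_checksum(cpu_features: set):
    """Run once, early, and return the implementation to use from then on."""
    if "fast_sum" in cpu_features:
        return checksum_fast
    return checksum_generic


# The caller only ever sees the name `checksum`, not which version was chosen.
checksum = resolve_checksum({"fast_sum"})
print(checksum(b"abc"))
```

The key property is that the resolver is ordinary code that runs before the rest of the program uses the function.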

30:20

At first glance,

30:20

that sounds like a way for Jia to trick the system

30:23

into thinking it's running hardware

30:24

that needs his own compromised version of RSA Decrypt.

30:27

But there is a catch.

30:29

A library can only define IFUNC resolvers

30:31

for its own functions.

30:33

And since RSA Decrypt doesn't belong to XZ,

30:35

it can't use an IFUNC resolver to override it.

30:38

But IFUNC can still help him.

30:40

- So it will,

30:41

very, very early on in the running of the program

30:44

it will do this sort of determination

30:46

of what hardware is available,

30:47

and crucially, it does let you run your own code

30:51

in the library very early on.

30:53

- Now, at this early stage,

30:56

from within an IFUNC resolver,

30:58

Jia could try to directly rewrite the GOT entry

31:01

for RSA Decrypt.

31:03

But at this point, the system is still filling in the GOT,

31:06

so even if Jia changes the RSA Decrypt slot,

31:08

the loader will come along later

31:10

and write the real address back in,

31:12

wiping out his change.

31:13

And there's a limit on the other side as well.

31:16

To make this sort of hijacking harder,

31:17

once every entry is filled on the GOT,

31:20

the system marks the table Read Only.

31:22

That means that if Jia waits too long,

31:24

the RSA Decrypt entry is frozen.

31:27

So he has to slip it in at a very precise moment.

31:30

After the RSA Decrypt entry is filled in legitimately,

31:33

but before the table gets marked Read Only.

31:36

And that tiny window is the Goldilocks zone.

31:41

And to hit it, he's gonna need another tool.

31:44

So, linking shared libraries in the GOT often leads to bugs,

31:47

so Linux has a special debugging feature

31:49

that tracks what the system's doing.

31:52

It lets you run code

31:53

whenever the linker writes a symbol's address into the GOT.

31:56

It's called a dynamic audit hook,

31:58

and normally you'd use it to profile performance.

32:01

But crucially for Jia, there are no real guardrails.

32:04

The hook can run any code he wants.

32:06

And this is where IFUNC finally pays off.

32:09

Jia uses an IFUNC resolver to set the audit hook early.

32:13

Then, when the linker writes in

32:14

the real RSA Decrypt address,

32:16

the hook fires and swaps in his payload.

32:20

Right in the middle of the Goldilocks zone.
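
The timing problem described above can be modeled with a toy loader in Python. This follows the narration's description rather than glibc's actual internals. `ToyLoader`, `real_check`, and `replacement_check` are invented names, and the "replacement" is a harmless stub.

```python
from types import MappingProxyType


def real_check(key):
    return key == "valid-user-key"


def replacement_check(key):
    return "replaced"  # harmless stand-in for a swapped-in function


class ToyLoader:
    def __init__(self, symbols, on_bind=None):
        self.symbols = symbols  # name -> real function
        self.on_bind = on_bind  # optional hook called after each entry is written
        self.table = {}

    def run(self):
        for name, func in self.symbols.items():
            self.table[name] = func  # the loader writes the real address
            if self.on_bind is not None:
                self.on_bind(self.table, name)
        self.table = MappingProxyType(self.table)  # table becomes read-only
        return self.table


# Too early: the loader later overwrites the change with the real function.
early = ToyLoader({"check": real_check})
early.table["check"] = replacement_check
early_result = early.run()["check"]

# Too late: the table is read-only, so the write is rejected.
frozen = ToyLoader({"check": real_check}).run()
try:
    frozen["check"] = replacement_check
    late_write_blocked = False
except TypeError:
    late_write_blocked = True


# Just right: a hook fires after the real write but before the freeze.
def swap_on_bind(table, name):
    if name == "check":
        table[name] = replacement_check


hooked_result = ToyLoader({"check": real_check}, on_bind=swap_on_bind).run()["check"]

print(early_result.__name__, late_write_blocked, hooked_result.__name__)
```

Only the hook-based write survives, which is why the narration treats the hook as the missing piece.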

32:23

There is one final complication, though.

32:25

Audit hooks are normally configured by the system,

32:27

not by libraries like XZ.

32:30

So when Jia is first looking for the audit hook variable

32:32

that he's supposed to rewrite,

32:33

it's actually hidden from him,

32:35

so he first has to find it.

32:38

Within the IFUNC,

32:39

he scans a small region of binary code,

32:41

hunting for signs of the hook.

32:43

But it's just raw bytes,

32:44

so he writes a tiny decoder

32:46

to turn them back into instructions that he can read.

32:49

Now Jia can find where the hook lives in memory

32:51

and finally plant his code.

32:53

Then, when RSA Decrypt gets called legitimately,

32:55

it triggers the payload and he's in.

32:58

But now that he's in, what does he do?

33:00

And how does he get out of there cleanly?

33:02

Step three, the cat burglar.

33:06

With Jia's exploit in place,

33:08

SSH isn't just checking for a legitimate login anymore.

33:11

It's also listening for a hidden master key.

33:14

And Jia is careful,

33:15

he doesn't want anyone else stumbling onto the backdoor,

33:18

so that master key isn't just a simple password.

33:21

It's actually a mini cryptographic exchange of its own.

33:24

First, the backdoor code checks for a shared secret,

33:27

and then, second, it authenticates the user.

33:29

And only if both checks pass does the payload run.

33:32

In effect, it's like the backdoor is running

33:34

a miniature version of the encryption from SSH

33:37

inside of SSH.

33:39

But in SSH,

33:40

it uses that encryption to keep the attackers out.

33:42

In this case, the backdoor is using that encryption

33:45

to make sure that it's only the attackers that can get in.

33:48

But he's still careful.

33:49

One of the main ways defenders catch intrusions

33:51

is through SSH logging.

33:53

So, to cover his tracks,

33:55

he wipes evidence of the backdoor ever firing.

33:57

And this is on top of the numerous safety checks

34:00

that he's inserted throughout the process

34:01

to make sure the system supports the backdoor

34:04

and doesn't crash and draw attention.

34:06

And this is the genius of Jia's trap.

34:09

It's cautious and meticulous,

34:11

designed to slip through only where it will run invisibly.

34:15

With all three of these steps complete,

34:16

he can finally control the machine undetected.

34:20

All he needs to do now is get his updated XZ

34:22

implemented in the next release.

34:25

But just as Jia is completing his backdoor,

34:27

an open source developer requests to remove the dependency

34:30

that links XZ to OpenSSH.

34:33

This would spell disaster for Jia Tan.

34:35

He becomes frantic,

34:37

pushing harder and harder to get his compromised XZ

34:39

into major Linux releases.

34:41

He gets it into an early experimental build of Debian.

34:44

He files a request to have it added to Ubuntu.

34:46

He's trying to land the backdoor everywhere he can

34:49

before anyone realizes what's going on.

34:52

And it's then that Rich gets his first message from Jia.

34:55

Over the next few weeks, he gets more and more insistent,

34:58

urging Rich to add the updated XZ

35:00

into the next release of Fedora.

35:01

- I'm always very keen

35:03

to talk to keen upstream contributors,

35:06

contributors who are really excited

35:09

about new things in their software,

35:11

who are really willing to help us get stuff into Fedora.

35:15

So, you know, that's great, love it.

35:16

That kind of makes my day, it's my happy place.

35:19

- Eventually, Jia gets what he wants.

35:22

Rich adds the updated XZ to a pre-release version of Fedora.

35:25

Jia has succeeded.

35:28

Except there's a bug.

35:31

In low-level code like the backdoor,

35:33

things you normally take for granted,

35:34

like memory management,

35:36

are not done automatically.

35:37

If a function grabs a bit of memory,

35:39

it also has to give that memory back when it's done.

35:42

And if it doesn't, then every time the function runs,

35:44

it grabs more and more memory and then never releases it.

35:47

Over time, the program just keeps growing.

35:50

That's called a memory leak.

35:52

And to catch problems like this,

35:53

developers use a tool called Valgrind.

35:56

It runs the program more slowly

35:57

but watches every memory operation for anything suspicious.

36:01

Valgrind is raising hell on Jia's code.

36:04

- We put XZ, this version, 5.6.0,

36:08

into Fedora 40.

36:10

We get a bug report initially.

36:13

- And the backdoor in XZ specifically is generating

36:16

invalid writes errors.

36:18

Well, the logic was written by hand,

36:20

bypassing the compiler's safety checks,

36:22

and so they accidentally wrote outside the memory stack.

36:25

Now, lucky for Jia, all this isn't immediately obvious.

36:28

Rich still hasn't noticed what's happening.

36:30

- New software has bugs, right?

36:32

It's the state of nature of software.

36:35

Software is absolutely full of bugs all the time.

36:37

- [Henry] Now, the real problem is inside the malicious code

36:40

in the test file.

36:41

But Jia can't just go and fix that,

36:43

that would completely expose the backdoor.

36:45

So he invents a cover story.

36:47

He claims that the random data he used

36:49

to generate the original test files,

36:51

well, it's not reproducible,

36:52

so he's replacing it.

36:54

And in this updated code, he fixes the memory error.

36:56

- It's a very convincing and plausible explanation

37:00

for why this test blob has to be updated.

37:03

But of course, it's not the real reason.

37:05

- All right, so now the real fix is in,

37:07

but if the bug just magically went away,

37:09

it would look a bit suspicious.

37:10

So he has to find a way to cover it up.

37:12

- So what he then does is he changes the IFUNC code

37:17

in a way where he adds like a whole bunch of comments

37:20

and changes to the code around it

37:24

that doesn't actually change the code

37:26

but is plausible enough

37:28

to look like he's changing how the IFUNC works

37:30

to fix the Valgrind bug.

37:32

- It does, listening to it and I'm like

37:33

I know that this is the evil hacker Jia Tan,

37:36

but I'm like, ooh, that's clever.

37:38

You know? - Yeah, I mean, look,

37:40

the guy is obviously not an idiot, right?

37:44

But none of this is suspicious.

37:47

This is what we expect from compression software.

37:50

And as a packager, it's not really my job

37:52

to fix every bug in upstream software.

37:57

As soon as it gets to a certain level of difficulty,

38:00

my thought here is, well,

38:03

Jia Tan has actually been writing this software, right?

38:05

So he's got it all in his head, he knows how it works.

38:09

It's easier for me to just give him the problem.

38:12

And I send the bug over to him

38:13

and like a day later he sends the fix back.

38:16

From my point of view, it's problem solved.

38:17

It worked, system worked, right?

38:19

I made the right call.

38:20

I don't see,

38:22

at that point, knowing what I know then,

38:25

I don't see that there's any problem.

38:27

- So we downloaded Jia Tan's version of XZ,

38:30

which was available on Fedora publicly,

38:33

but we made a slight modification.

38:35

Instead of using Jia's secret code, we're using our own,

38:38

and that means that we can take advantage of Jia's backdoor.

38:41

In this case, we're targeting the veritasium.com website.

38:45

And once we get control of it,

38:47

I got a little trick in store for Derek.

38:49

Now, to make sure I don't mess with any real traffic too bad

38:52

and lose my job,

38:53

we actually cloned the Veritasium website

38:55

and put it on a very similar URL,

38:57

but it will work the same.

38:59

Of course, Derek doesn't know that I've covered my bases.

39:01

- Oh no.

39:04

Man, when you guys do these things, I just,

39:07

I start to get more and more scared now.

39:09

I want it to work for the video,

39:10

but I also don't want it to work

39:12

'cause I don't wanna screw stuff up, so.

39:14

- Yeah, it's the risk you take, I guess,

39:16

letting us run rampant.

39:17

- It is a concern.

39:18

- I'm gonna execute a script here,

39:20

which is gonna open up.

39:22

It's opening up a port on the Veritasium server.

39:26

And then on this side I'm gonna execute a little script.

39:31

- Uh-oh. (Henry laughs)

39:36

Henrytasium.

39:37

Who is this goof?

39:38

On the main photo,

39:40

you spent time getting all suited up there.

39:42

- Of course.

39:44

- Looking sharp, sir. - Thank you, thank you.

39:47

- [Derek] "Videos Derek would never approve of."

39:49

Uh-oh. - The concept was

39:51

over the years that we've worked together,

39:53

you've said no to a bunch of my ideas,

39:55

and I figured now with control of the website

39:57

it's about time the world saw it.

39:59

- "Surviving 7 days living underwater.

40:02

How do saturation divers live at -1,000 feet?"

40:06

I mean, you wouldn't be outside, right?

40:08

So I don't know why you need goggles there

40:10

and like a respirator but you're not underwater.

40:14

"Why it's almost impossible to shoot 4,000 meters."

40:18

It's a sniper video.

40:21

Yeah.

40:22

"The CIA lied: exposing how the CIA lied about torture."

40:25

I feel like that still goes into a tough territory for us.

40:29

"How xenon gas replaced oxygen.

40:31

I attempted to climb Mount Everest on xenon gas."

40:34

That sounds like a terrible idea.

40:36

This is what this whole video is about,

40:38

this whole video is just about

40:40

trying to get me to green light your projects.

40:44

You know, if people like these video ideas,

40:46

they can feel free to let us know in the comments

40:47

and we can actually make them.

40:49

The top upvoted comment one, I will green light happily.

40:53

- Let's go!

40:54

- Is this live to the public right now?

40:57

- It is live, yeah, it's live on the server, yeah.

40:59

- If anyone's on the website right now,

41:01

that would be very strange for them.

41:03

Look, I'm not pleased, I would like you to change it back.

41:08

It doesn't seem like this should be possible

41:11

on a Linux server.

41:12

So the big question is, how did you do it?

41:15

- The address is the server,

41:17

the seed is our code to get in,

41:19

and then the command is what we're doing

41:20

to essentially open up, in this case nc,

41:24

which is like opening up a port on the machine

41:25

that we can then access from this second terminal.

41:28

Then what we're doing is on this side

41:30

we're running a script that's connecting

41:32

to that port that's just been opened up,

41:34

copying our files and then by the end we're gonna have

41:37

root access on the server.

41:39

That means that it thinks that we own the thing.

41:42

- That's so crazy.

41:44

This is a very scary hack.

41:47

I do not like it.

41:48

- Another thing is that this is a very obvious way

41:51

of demonstrating this attack.

41:53

Like I've changed everything on the website,

41:54

you immediately know that I've gone in

41:56

and hacked the server.

41:57

If we were doing this for real,

41:59

we would do it a lot sneakier.

42:00

- I mean, as you say, right?

42:02

The thing to do would not be

42:03

to totally rework someone's website so everyone notices,

42:06

but to change it subtly so nobody notices

42:09

so you can skim data or, yeah, like get credit card details

42:13

or take payments to a different location,

42:16

stuff like that.

42:17

- So you can copy anything you want,

42:19

you can change anything you want,

42:20

you can delete anything you want.

42:21

So if there's any interesting documents or crypto tokens,

42:25

any files you're interested in, those are yours now.

42:28

If there's secret communications going across these,

42:31

and let's keep in mind all of our communication networks

42:34

are also built around Linux,

42:36

those communication streams are yours now.

42:39

If you wanted to encrypt something and ask for ransom,

42:43

that's possible now.

42:44

- [Henry] The possibilities really are endless.

42:47

After two and a half years of hard work,

42:49

slowly infiltrating the XZ Project

42:50

and weaving in this ingenious backdoor,

42:53

Jia's done it.

42:55

He now has free rein on any machine

42:57

that installs the new Fedora pre-release.

42:59

And he also gets the same access on Debian testing

43:01

and Ubuntu's pre-release environments.

43:04

And with RHEL 10 coming up,

43:05

his code could infect some of the most important computers

43:07

in the world.

43:08

Now he should be able to relax, wait for the release,

43:11

and he's got his backdoor key.

43:13

But just when he thinks everything's going right...

43:23

(suspenseful music)

43:23

Andres Freund is a German programmer.

43:26

He's not a security researcher, he's not a hacker.

43:29

He's just an employee at Microsoft

43:31

working on an open source project called Postgres.

43:34

One day in March 2024,

43:36

he tries out the unstable release of Debian

43:38

to make sure that Postgres will run smoothly.

43:40

But while checking the server connection times,

43:42

he notices something odd.

43:44

A slowdown.

43:46

It's not much.

43:46

In the worst case, it's only half a second,

43:49

but it's enough to make Andres suspicious.

43:51

We tested the connection times ourselves

43:53

on our own version of the XZ hack

43:54

and we found the exact same thing.

43:56

Consistent slowdowns of about 400 to 500 milliseconds.
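
A comparison like this can be sketched with a small timing helper in Python. The two `*_login` functions are placeholders that only simulate a fast and a slow connection. The 50 ms delay and the 20 ms threshold are arbitrary values chosen for the example, not figures from the actual investigation.

```python
import statistics
import time


def median_latency(operation, runs: int = 5) -> float:
    """Time an operation several times and return the median in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def is_suspiciously_slow(baseline: float, candidate: float, threshold: float) -> bool:
    """Flag the candidate if it is slower than the baseline by more than threshold."""
    return (candidate - baseline) > threshold


def fast_login():
    pass  # placeholder for a connection attempt on a known-good build


def slow_login():
    time.sleep(0.05)  # placeholder for a connection attempt with extra work


baseline = median_latency(fast_login)
candidate = median_latency(slow_login)
print(is_suspiciously_slow(baseline, candidate, threshold=0.02))
```

Taking the median of several runs keeps a single noisy measurement from triggering or hiding the signal.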

44:01

Andres had already seen the problems

44:02

with XZ and Valgrind weeks earlier

44:04

and this only makes him more suspicious,

44:06

so he digs in deeper.

44:08

He looks at recent additions to OpenSSH

44:10

and traces the delay back to an update in XZ.

44:13

He sees the binary test files

44:14

but notices that they were never used in a test.

44:17

It's even stranger.

44:18

Andres tries to get back to work,

44:20

but he can't stop thinking about it.

44:22

- [Andres] I remember sitting in a bunch of meetings

44:23

and like not really being able to concentrate

44:26

because it feels like,

44:29

I need to continue looking into this.

44:31

- Eventually, Andres sees it.

44:32

This isn't some bug, this is a backdoor.

44:36

And this backdoor is meticulous.

44:39

It hunts through memory to find the audit hook,

44:41

it implements a decoder to read those raw bytes,

44:44

and then it wraps everything in custom encryption

44:46

and safety checks

44:47

so that it only triggers on the right kind of connection.

44:50

I mean, it even garbles its own strings

44:52

so that it won't be detected.

44:54

It's incredibly cautious.

44:56

But all of that takes time, and in the end,

44:59

that's what grabs Andres's attention.

45:01

- If they had done less obfuscation,

45:02

I probably would not have noticed that anything was wrong.

45:05

- [Henry] Now, XZ's security contact is Jia Tan,

45:08

so Andres can't exactly report it

45:10

through the usual channels.

45:12

Instead, he emails the Debian security team directly

45:15

and posts a detailed report

45:16

to a public security mailing list.

45:19

Then, all hell breaks loose.

45:21

- I'm called up

45:24

on I think it was a Friday evening,

45:28

in fact, I'm sure it was a Friday evening,

45:30

to join an internal Red Hat meeting.

45:34

It's immediately obvious that this is not a normal meeting

45:38

because like our head of security is there.

45:41

It's explained to me that it's been found

45:45

by somebody in the community

45:46

that XZ has a backdoor,

45:48

and immediately I'm like, WTF?

45:52

How did this happen?

45:53

- To cover their bases,

45:54

Red Hat quickly rolls Fedora back

45:56

and tells all their users to revert,

45:58

and the whole open source community

45:59

starts digging into the project

46:01

to understand what went wrong.

46:04

One thing is clear, though.

46:06

Andres is a hero.

46:07

- Now, the fact that this was discovered

46:10

in a different test at all,

46:11

that was lucky.

46:12

But then what are the chances

46:15

that someone who isn't looking for a security bug

46:17

spends days investigating this?

46:19

So, big kudos to the researcher,

46:23

and yeah, saved us all

46:25

from possibly a doomsday on the internet.

46:27

- I think that Andres did a brilliant job

46:31

because he did what I should have done, actually,

46:34

which is I should have looked at the, you know,

46:35

I should have looked at the bug when I saw it

46:39

and I should have gone there, you know,

46:42

like a crazy hound sort of sniffing around

46:45

trying to find out what's going on.

46:46

- [Henry] Andres even gets a shout out

46:48

from the CEO of Microsoft.

46:50

But when the story breaks,

46:51

the mainstream response is surprisingly muted.

46:53

- Actually, I'm still surprised now

46:56

that the mainstream news outlets

46:59

haven't really covered this very much.

47:01

Well, I can tell you how many systems

47:03

would have been compromised,

47:04

which would have been millions,

47:05

- Anything from spying, to ransom,

47:09

to just taking down entire countries,

47:13

you could have done it with this backdoor.

47:14

- [Henry] I guess the big question is, who is Jia Tan?

47:19

- That's the question, isn't it?

47:22

Okay, so my feeling is that Jia Tan,

47:24

the person that I talked to I believe is one person,

47:28

but I also believe

47:29

that behind him must be a group of people.

47:34

And they worked for quite a while.

47:36

I mean, they were at this for perhaps two and a half years

47:39

that we know about.

47:40

- If you look back at the accounts pressuring Lasse,

47:43

they share some similarities.

47:45

They use free email addresses

47:47

and they have almost no footprint outside of the XZ threads.

47:51

These were very likely sock puppet accounts,

47:54

identities manufactured to apply pressure

47:56

as part of a multi-stage social engineering campaign.

47:59

- Now, who spends a million dollars

48:02

and takes two and a half years

48:03

to attempt to break into every hotel room on the internet

48:07

with a master key?

48:09

(suspenseful music)

48:10

I think it's not a criminal organization

48:12

because I don't think a criminal organization

48:14

would have that patience

48:16

to spend that time without any real return.

48:20

So I think it has to be a nation state actor here.

48:26

- A lot of the aliases, like Jia Tan,

48:29

they sound like Asian names,

48:30

and the published changes are all timestamped in UTC+8,

48:34

Beijing time.

48:36

So the signs point to China.

48:38

And that's why it's probably not China.

48:41

I mean, why would they make it that obvious?

48:43

Every other part of the operation

48:45

has been so meticulous, so cautious.

48:48

And they also worked on Chinese New Year,

48:49

but not on Christmas.

48:51

And over the years, there were nine changes

48:53

that fall outside of the Beijing time into UTC+2,

48:58

which is a time zone that includes Israel

49:00

and parts of Western Russia.
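
This kind of time zone tally can be reproduced with Python's standard library. The timestamps below are invented sample data, not the real XZ commit history, and the helper names are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical commit timestamps in ISO 8601 form with a UTC offset.
COMMIT_TIMES = [
    "2023-01-10T21:15:00+08:00",
    "2023-02-03T09:40:00+08:00",
    "2023-03-18T23:05:00+08:00",
    "2023-06-27T17:30:00+02:00",
]


def utc_offset_hours(stamp: str) -> float:
    """Return the UTC offset recorded in a timestamp, in hours."""
    offset = datetime.fromisoformat(stamp).utcoffset()
    return offset.total_seconds() / 3600


def tally_offsets(stamps):
    """Count how many timestamps fall under each UTC offset."""
    return Counter(utc_offset_hours(stamp) for stamp in stamps)


counts = tally_offsets(COMMIT_TIMES)
print(counts.most_common())
```

The offset recorded with a commit comes from the committer's own machine settings, so it is a hint rather than proof of location.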

49:02

That's why some experts have speculated

49:04

that this could be the work of APT29,

49:07

a Russian-state-backed hacker group also known as Cozy Bear.

49:11

- But again, do we know?

49:13

No, of course we don't know who it is,

49:15

and we likely will never know.

49:16

Jia Tan himself just disappeared

49:18

as soon as this exploit became publicly known

49:21

and was never heard from again.

49:23

- In a sense it doesn't matter

49:25

whether this was Russian or Chinese or Iranian.

49:28

We need to protect from these types of backdoors

49:31

no matter where they're coming from.

49:32

- I see this as like, you know, the canary in the coal mine

49:36

of what's gonna be happening

49:38

as attackers get more sophisticated,

49:41

they make fewer mistakes.

49:43

You know, the gloves are off in a way.

49:46

I don't think that the Linux community is fully,

49:50

you know, is fully ready for this yet.

49:54

- In the aftermath of XZ,

49:56

the open source community

49:57

pored over countless small similar projects

50:00

looking for similar campaigns,

50:03

but they found almost nothing.

50:04

- I'm worried that we didn't find other backdoors.

50:07

The incentives are just too clear.

50:09

There are state-sponsored parts of either governments,

50:13

militaries or even private contractors working for states

50:17

that are all preparing for the next cyber escalation,

50:21

some kind of a war, some kind of a geopolitical conflict,

50:24

and where are all of those backdoors?

50:26

There's just too many people incentivized to put backdoors

50:29

for the few backdoors that we're actually discovering.

50:32

- Now, some experts have argued

50:34

this reveals a fundamental flaw in the open source model,

50:38

but not everyone agrees.

50:39

- Closed source software would be no better here.

50:42

In fact, who's to say that there aren't already state spies

50:46

working as paid software engineers

50:48

at some of the larger companies

50:50

putting in exactly backdoors like this?

50:53

But then there would be no community member

50:55

running free testing and detecting this by chance.

50:59

This backdoor, if anything,

51:00

underlines the ethos of open source.

51:03

- I mean, just think of what it took

51:05

to get this done in public.

51:07

There was a multiple-year social engineering campaign,

51:10

there were all these layers of misdirection,

51:12

and then there was code that was designed

51:13

to withstand constant scrutiny.

51:16

Compare that now with a closed source hack.

51:18

Sometimes all it takes to get a backdoor installed there

51:20

is a court order,

51:22

or you have a public company

51:24

that can just brush a breach under the rug.

51:27

I actually used to work as an open source researcher myself

51:30

at the Japanese telecom giant NTT,

51:33

and my perspective is

51:34

that it's only because this is an open source project

51:36

that it's been picked apart, analyzed,

51:39

and turned into a conversation about security at all.

51:42

One that focuses on the fundamental vulnerability.

51:45

It's not the code, it's the people.

51:48

And how the system has not supported them enough.

51:51

- I feel for Lasse that

51:55

he's given this beautiful gift

51:58

to the whole world

52:02

and, you know, what have we,

52:03

what has humanity done back to him, right?

52:06

We've poisoned his gift.

52:09

And then I think implicitly a little bit,

52:12

not everyone's saying this,

52:13

but implicitly we're blaming him

52:17

for not being there to maintain this stuff for free forever.

52:23

But why are we demanding

52:25

that Lasse do anything

52:28

when he's not being paid for this stuff?

52:30

And that's, in my opinion, quite unfair.

52:35

On this Saturday evening,

52:37

we were working together on a workaround

52:40

for this bug in RHEL 9

52:43

that he's added to XZ,

52:44

and he absolutely could have told us to get lost,

52:49

and didn't.

52:50

What a brilliant guy.

53:00

(electronic beeping) (music fades out)
