Cursor, Claude Code and Codex all have a BIG problem
FULL TRANSCRIPT
You've probably seen me shifting around
my dev tools over the last few years
from Copilot to Supermaven to Cursor to Claude Code to Codex to the Codex app
and many, many more things. But I've
noticed something about all of these
solutions. A core problem that seems to
be holding all of them back. You've
probably noticed it, too, but maybe not
in the same way that I have, or maybe
not understood the underlying problem
that's causing it. To put it frankly, they all suck. Like, they're just bad. I come from an era where the tools we used were so carefully, finely crafted to behave super well, and I long for the days of Sublime Text almost every day when I open up Cursor, watch it shift my UI around whenever I click anything, and then nothing does what it's supposed to do.
You'd think with all of these AI tools
making it easier than ever to write code
that we'd have things that work better,
right?
About that. That's actually the thing I
really want to talk about today. The
fact that all of these tools were
written with these models is not an
advantage. In many ways, it's a
disadvantage. The problem with Cursor, Claude Code, Codex, and all of these other tools is that they were built with those same tools. Historically, this has been a good thing. When you write the C compiler in C, you're able to do awesome things: it makes you a good target for dogfooding the language, and it makes the thing you're maintaining much easier for its users to maintain. Generally speaking, dogfooding your stuff is good. Generally speaking. And now I have to drop a controversial take: I don't think it was the right call for a lot of these
companies. In fact, I think their choice
to bet on these things as early as they
did might be causing them a lot of
problems. This video is going to ruffle
some feathers and sadly many of those
feathers belong to friends of mine,
companies in my portfolio, etc. I am an early investor in Cursor. I technically have something tied to Anthropic; I don't really know, cuz it was a scout check into Bun that's now part of Anthropic. With Codex, funny enough, I have no financial tie to anything OpenAI beyond using their stuff. So as always, account
for bias. I do my best to be transparent
but I'm also about to talk about these
companies in a way that is going to piss
all of them off and probably get me into
some trouble. So this is going to be
fun. If all of my early investments are
going to go to zero due to all the
things I'm talking about here, I need to
make sure we make some money. So we're
going to take a quick break for today's
sponsor before I start roasting. If
you've been around for a while, you
probably remember how much I loved
today's sponsor, Augment. They were my
favorite way to figure out what's going
on in really big code bases because
their indexing engine was incredible.
But then I started using other tools and
as such stopped using Augment as much.
Can't our agents figure out what they
need to know nowadays anyways? Well,
kind of. But you don't get that absurd
level of immediate responsiveness that I fell in love with in Augment. So, they decided to put this
out for everyone to use via a tool you
can add to whatever agents you're using
today. As you guys know, I like Codex quite a bit, but it has a bad habit of searching forever trying to find information. Here I have the huge T3 Chat codebase, and sometimes Codex can just search this forever trying to find things. Well, Augment has their own CLI
you can use, and it's really cool. I
just opened it and told it to index the
codebase. Where it gets way cooler is
when you use it in other tools. So, now
it is indexed. Check this out. We're
going to switch over to Codex. It just
booted the MCP. Let's ask it for
something a bit annoying: trace the
logic around how subscriptions work for
paying users and it immediately uses the
codebase retrieval tool that is provided
by augment. This tool is weirdly capable
of finding exactly what you need and
almost nothing else. See how quickly it
found all of this. This is real time.
We're under 20 seconds in. This would
have taken Codex like 5 to 10 minutes
before and possibly wouldn't have been
as accurate because it might have missed
things that were related. Indexing your
codebase makes it possible to find
related information that doesn't come up
via grep, and the results end up
consistently being way better than I
would have expected for the exact same
prompt. Working on a small codebase or a
small project, this isn't going to be a
groundbreaking thing. But for those of
us that rotate between small and big
code bases, it makes them both feel
almost identical to work in, which is a
magical unlock for those of us building
bigger and bigger things. Augment always
felt like they understood enterprise and
big business work better than the
competition and have never felt it more
than I have using the context engine in
other tools. If you feel like your
agents don't understand your codebase,
fix that now at soyv.link/augment. So
if we want to explain the problem with Cursor, Claude Code, and Codex really simply: they suck to use.
They're super inconsistent. Not just
because the models are
non-deterministic, but because the
actual code that you're relying on is
really bad. They change every day and in
ways that are obnoxious. One of the most
common comments I get on my videos is,
"How do you get the agent and editor
tabs on the navbar in Cursor?" If you're
not familiar, Cursor had this wonderful
agents editor tab toggle in the corner
here where you could switch between an
agent mode where you were just looking
at and managing the agents and the
editor was secondary and the traditional
editor mode that we all expect something
like Cursor to be. I loved this and I switched between them all the time. I would
spend most of my time in the agent mode
and then switch to the editor mode when
I really wanted to dig into the code.
Soon after this launch (this was the Cursor 2.0 launch), I went to the Cursor office to talk about a lot of the
problems I was having. I was like this close to making what was going to be the Cursor crash-out video. I ended up ranting a lot about my problems with Cursor. It
went kind of viral on Twitter. They had me over to the office. They sweet-talked me.
They bought me some chocolate. Told me
things that I'm still horrified about to
this day. And then I walked off. They
have made progress since. But one of the things they told me (the thing I told them was [ __ ] terrible and stupid and that they should not do) is that they decided to remove the agent editor toggle in favor of letting people customize it more. So instead of that,
we now have all these fun buttons, these
toggles, the change layout, which
underneath has the agent and editor
layout, but those aren't things you
toggle between. Those are default
template starting points for how you
might want to configure your editor. And
now everything's [ __ ] broken. I
switch to editor mode. I switch to agent
mode. The sidebars change where they
are. So here the sidebar for the agent
stuff is on the left. When I switch to
the agent mode, it ends up moving to the
right. There was a hotkey to toggle
between these things before. I think
they changed it. But now when I do
almost any of the things I did before... oh, [ __ ], of course I leaked my [ __ ] email. The fact there's a one-click leak
email button in the editor is enough of
a reason for me to curse them out. What the [ __ ], Cursor.
I don't think there's any product that
has caused me to leak my email more than
[ __ ] Cursor has. I'm very thankful I
use my GitHub email and not one that
actually matters because it's leaked in
half the times I've opened the [ __ ]
app. You see what's happening? I'm
trying to not crash out at Cursor both
because I'm an investor and because I'm
afraid of the company, and also just cuz crash-out content isn't my favorite
thing to do. But half the time I open this editor, it pisses me off. Just being real: it is falling the [ __ ] apart.
It's bad. Regardless, the thing I'm
trying to say is the feature that was in
my video that people liked enough that
they're asking for it in the comment
section just got deprecated for no
[ __ ] reason at all. There's no reason
to have removed the agent editor toggle.
I think it is dumb that they did that. I
told them in the office it was stupid
that they were doing that. They did it
anyways. And now I get questions all the
time like, "Where's that button? That
seems really useful." I agree. It was
really useful. It [ __ ] sucks. They
removed it.
Okay, calm down. Stop just raging at Cursor. Command-Q and go back to VS Code for now. Please, Cursor, let me switch back. I want to, you know, I want to.
You gave me a bunch of credit to
incentivize that I use it. And I still
don't because I don't like using your
editor right now. I use Cursor in the
browser more than I use it in the
[ __ ] IDE at this point.
Anyways, if you thought my Cursor crash-outs were bad, wait till we start talking about Claude Code. Oh man, the fact that a CLI has even more non-deterministic [ __ ] behavior than a fork of a giant app like VS Code. Like, just seriously, imagine this. Imagine five years ago somebody told you that they were moving away from IDEs and toward CLIs because they wanted something simpler and less buggy, and you had to sadly respond, "Sorry about that. The CLIs are actually more buggy." And they just stare at you like, "What the [ __ ] did you just say? The CLI is more buggy? What? How the [ __ ] did we get here?" Obviously, something like
pasting images is not going to be the
most consistent thing in the world in a
terminal UI, but the fact that it is so
non-deterministic and broken is absurd.
It's absurd enough that it caused me to
just [ __ ] lose my [ __ ] on stream last
week. It just took so long. There are so
many things with Claude Code that are driving me mad. What just happened there? Because you don't know what keys I was pressing. When you paste an image in Claude Code, it takes time, and the
images are often big enough that it has
to run local compression before
uploading it to their server. It doesn't
block the input when that happens and it
doesn't show you that anything's
happening. So, I just submitted a
message while it was waiting for the
image to attach. It submitted without
the image attached because it doesn't
block or wait until it's done. We even
figured this out with T3 chat in our
first month. And then when it was done,
it wasn't there. It didn't even show it
in the little section on top here. So I
repasted it and sent it. And when I sent
the follow-up, it... are you kidding? Literally, while I am explaining this, it just failed to compact. I do not get how anyone does serious work in Claude Code. It is not a serious application. It is
what the hell is going on? This model,
this harness, this ecosystem feels like
it is burning. What the [ __ ]? That's the
moment I decided I had to do this video.
I still feel rage from that one. The
amount of things that went wrong there.
First off, pasting an image didn't block
the input. So, I accidentally submitted
without the image. But, it turns out if
you submit a message while the image is
still processing, it will silently be
attached to the next thing you send. So,
I pasted a second image accidentally
because I wanted the model to know what
the UI looked like. So, I pasted the
screenshot again. It showed me I had one
attachment. I sent it now has two.
Obnoxious. I went to go demonstrate
this, pasted a third image, and before I
could even send it, I hit the context
limit because it failed to correctly
compress the other image because of the
weird race condition that existed. And
as a result, the thread was dead. There
was no reviving this thread. The work was gone. It just died. It's over.
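To make that failure shape concrete, here's a minimal Python sketch of the race as described. This is not Claude Code's actual implementation, just the pattern: pasting kicks off async compression, and submitting neither blocks nor waits for it.

```python
import asyncio

class BuggyComposer:
    """Sketch of the described race: paste starts async work, submit ignores it."""

    def __init__(self):
        self.ready = []  # attachments that finished compressing

    async def paste_image(self, name):
        await asyncio.sleep(0.05)   # stand-in for slow local compression
        self.ready.append(name)     # silently lands on whatever is sent NEXT

    def submit(self, text):
        msg = {"text": text, "attachments": list(self.ready)}
        self.ready.clear()          # still-compressing pastes are simply absent
        return msg

async def demo():
    c = BuggyComposer()
    paste = asyncio.create_task(c.paste_image("screenshot.png"))
    await asyncio.sleep(0)                            # paste has started, not finished
    first = c.submit("why does this UI look wrong?")  # sent mid-compression
    await paste
    second = c.submit("follow-up")                    # the image rides along here
    return first, second

first, second = asyncio.run(demo())
# first goes out with no attachments; the screenshot silently attaches to second
```

The fix is equally small: have submit() await any in-flight paste task, or at least block input while one is pending, before building the message.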
It's done. When was the last time you
used a dev tool that wasn't some crazy
AI crap that like one little bug in the
UI could just throw away that whole path
of work and force you to restart from
scratch? It's meme tier. It's [ __ ]
meme tier. I actually cannot fathom that
this is how we are writing code every
day now. It is absurd. Like, the anti-AI people should not be talking about how, I don't know, [ __ ] Copilot sucks or whatever the hell they're bitching about. Just come use these tools and then talk [ __ ]. And that's not even the performance. The performance is unacceptable; we'll get to that. But the state of the UX, the lack of consistency, the fact that it just feels like I'm the only person using
it because I don't see anyone else
bitching about these things. It's so
bad. I have yet to have an experience
with Claude code that didn't feel like
they were forcing an old broken UI into
the terminal with all sorts of
non-deterministic [ __ ]. The worst
part isn't that it fails all the time.
It's that it never fails the same way
twice. It's a slop fest. And that's the
theme I want to drive home here. All of
these [ __ ] tools committed to vibe coding way too hard, way too early. And
sure, vibe coding is a slur, whatever.
We all have our terms we like to use.
What I'm referring to here is letting go
a bit and letting the agent do its
thing. Steering less, coding more, and
being a little too willing to accept the
code that the agent made. You can say
I'm exaggerating about this here that
there's no way that they actually vibe
coded Claude Code. Want to [ __ ] bet? "Within Anthropic, Claude Code is quickly becoming another tool we can't do without. Engineers and researchers across the company use it for everything from major code refactors to squashing commits to generally handling the toil of coding." This was on February 24th. Wait, it's not February 24th. Oh, February 24th of last [ __ ] year, when the best model available was Sonnet 3.7. They were already using Claude Code for the majority of their work on Sonnet 3-fucking-7. That's the problem here.
Sonnet 3.7 was a very impressive model. It was able to do tool calls reliably. It was meaningfully smarter than 3.5. It could do some little UI fixes in ways that were better than before. Maybe, just maybe, if your merge conflict was simple enough, it could help you resolve it. Maybe. You cannot build a serious application with Sonnet 3 [ __ ] 7.
Let's be real here. This is the thing I
want to emphasize.
They committed to using AI to code too
early because they wanted to build
things using AI, using the tools they
were making to maximize the chance that
they would make something good. Both so
they could iterate faster, but also
because they wanted to commit to using
AI for things as they're building tools
to use AI for things. And what resulted
is a total [ __ ] slopfest. To better
explain this, cuz I know a lot of you
guys have not had traditional jobs. And
even for those who have, you probably
were brought into a giant codebase that
has already existed for years. The way
that code bases work over time is very
interesting. If we chart a very, very vague quality over time, the way that working in a codebase tends to go is this: initially it starts off okay, it quickly gets to a nice place, it dips a bit, you care more, you restore it, but eventually you hit a plateau and it stops improving. And when I say quality here, I mean a lot of different things. I mean the quality of the experience you're providing users. I
mean the quality of the patterns that
exist in the codebase that we're relying
on. I mean things like the packages
you're using and what versions of them you're on. I mean the likelihood that
you'll ever upgrade the package in the
first place. If I init a new project and
I'm on React 19 and a month later React
20 comes out, I'm probably going to
update. If I've been using that codebase
for 4 years and React 20 comes out, I'm
much less likely to update. Codebase
inertia is real. Every codebase has a
point in time where the quality of
working in it stops improving and it
stops being a focus of the team. The
amount of time it takes to get there can
vary a lot based on different things. I
would argue generally speaking 3 to 6
months of focused effort from the team
is what you get, and at that point the quality of the codebase is the bar that you're going to hit. Like, things
can get worse. Absolutely. Believe me,
things will always get worse. But the
quality of your codebase 6 months in is
the best it will ever be. If you don't
have the codebase exactly how you want
it and working in the way that you want
to work in it at the 6-month mark, it
will be a downhill ride and you will
never see the light again. And I think
that's what's happened to a shitload of
these projects. If the 6-month mark was
a pile of vibecoded slop that you wrote
with Sonnet 3.5 and 3.7, you're [ __ ]
It's not going to get better. And this
is the problem. A lot of these projects
are no longer months in. They're now
many years in. Claude Code is roughly a
year and three or so months in. And it
was really good for those first 6
months. It felt like it was improving
consistently up until, I don't know, August-ish, September maybe. And now I've
just felt the downward trend as it gets
buggier, as it gets slower. It's now at
the point where I'll just... I'll show you
my favorite one here. I'm just going to
open Claude Code and I'm going to
immediately start typing. This is me.
Okay, it wasn't that bad that time. Half
the time when I open Claude Code, it
doesn't actually start recognizing my
inputs until I'm a word or two words in.
How the [ __ ] is a CLI app locking the
input box? This happening on the web was
a meme that kind of destroyed the web's
reputation. The idea that your keys
would be sticky, it would take too long
to start showing you the changes. And
now we have it in our terminals. Like,
what? Ah, apparently opencode's bad
about this, too. Let's try typing.
Typing. Wow, I got to type a lot before
I started recognizing characters. Here's
what we'll do. I'm just going to run my
finger like 1 to 10.
I got all the way to the end
and it still wasn't recognizing inputs.
That is pretty hilarious. That's what
you get for getting out of that mode.
They just did the database migration, so
that's cool. So, let's try one last
time.
Okay, got to eight and nine. Yeah,
insanity. Uh, how did we get here? I do sympathize with the Cursor team a bit because they started with one of the
most complex giant code bases and none
of their engineers had worked on it
before because they forked VS Code. That
is a genuinely difficult challenge. Taking
something as big as VS Code and making
meaningful changes and having to
maintain it over time while also making
sure to bring in whatever is needed from
the actual OG codebase when things are
being upstreamed. That sucks. It's real
difficult maintenance work and they've
now diverged so far from the original
that bringing things in is basically
nonviable. And honestly, it might be
time to cut ties with that codebase. I think, as crazy as it sounds, considering how many hundreds of engineers have been working on it for years, it might be better to wipe their hands clean and start from scratch.
Maybe bring the harness over and nothing
else. I don't know. I'm not in the
codebase. I don't know how it works. All
I know is that it sucks to use. Claude Code has no [ __ ] excuse. Claude Code was the start from scratch. But now, because the team insists that 100% of Claude Code's code is written by Claude Code, the slop fest continues to the
point where it's easier to buy their
core dependency of Bun and hope it can
fix the hellish performance issues
they've caused for themselves than it is
to just unfuck what they've done. Can we take a moment to appreciate the level of absurdity there? Spending an absurd amount of money to buy one of the most talented native developers in the industry, and his team, building a JavaScript runtime in Zig so they can make things more performant. Acquiring that just to try and hopefully, maybe, make your CLI tool not use 2 gigs of RAM. What?
How are we here? Like the meme isn't,
oh, they used React and React's bad. The
meme is that they vibe coded the whole
[ __ ] thing and now they're in a pile
of slop.
So what do we have so far? Codebase
inertia is real. You will not top the
quality of your codebase 6 months in for
the rest of the time that codebase
exists. And everybody who bet early on models like Sonnet 3.5, the old GPT models, all of these early models used for code: all of them slopped up their [ __ ] so fast that there is no return. And to be clear, modern models are great. Opus 4.6 and Codex 5.3? They're miracle workers. They can do things we never would have imagined. They are much worse off trying
to clean up bad patterns than they are
trying to make new ones. And here is
where the harsh reality is. Let's say
your code base is decent. 90% of it is
good. We'll have green be good and we'll
say whatever is left here, 10, 20,
whatever you want to measure it as. This
other section here is bad. This is your
codebase. Now the code base needs to get
bigger. More people are coming in. More
changes are being made. Code base is
quickly growing. Generally speaking, it
doesn't matter if you're vibe coding, if
you're using AI tools, or if you're just
hiring traditional people, the codebase
gets copied around. The patterns used in
one place will be used in others. And
generally speaking, the pattern that is
being used is the one that is the
easiest to find and the easiest to copy.
Sadly, the ones that are the easiest to
find and copy are very rarely the good
patterns. So, what ends up happening is
the good parts of your codebase expand
linearly, and the bad ones tend to
expand exponentially. So, once you have the starting point, we'll say the point you're at at that 6-month mark, the way things go over time is that the bad parts exponentially increase and the good parts linearly increase. So, you very quickly end up in
a slop fest. The models accelerate this.
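That linear-versus-exponential claim can be sketched as a toy model. Every number here is invented purely for illustration; the point is the shape, not the figures:

```python
# Good patterns get added roughly linearly; bad (convenient) patterns get
# copied in proportion to how much of them already exists, so they compound.
good, bad = 9_000.0, 1_000.0  # lines at the 6-month mark: 90% good
GOOD_PER_MONTH = 500.0        # linear growth of good code
BAD_COPY_RATE = 0.25          # bad code gets cloned at 25% per month

for month in range(24):       # two more years of development
    good += GOOD_PER_MONTH
    bad *= 1 + BAD_COPY_RATE

bad_share = bad / (good + bad)
print(f"after two more years: {bad_share:.0%} slop")  # prints "after two more years: 91% slop"
```

A codebase that was 90% good at the 6-month mark ends up overwhelmingly slop, not because the good work stopped, but because the convenient-to-copy stuff compounds.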
Codex loves referencing the codebase for examples to use in the work that it does. Codex will very happily copy a
[ __ ] pattern from somewhere in your
codebase and apply it somewhere else
because it thinks it passed your bar.
It's in the codebase. And honestly,
that's fair. One of the best moments I
had in my time at Twitch is when I filed
my first PR to the new web repo that
became the Twitch site, the rewrite of the entire site in React. And I made some dumb changes in it. And when I was asked about those changes, because one of the people reviewing it was like, "Wait, why did you do that?", I showed them the code example I found and the page in the docs that led me there. They were like, "Oh,
that's really bad. This shouldn't happen
anywhere." So before I even got to fix
my PR, they updated the docs to no
longer steer in that way. And they filed
multiple PRs, removing any other place
with similar patterns. So it was less likely that a less experienced TypeScript dev
like myself would end up in that
position. The agents accelerate that. If
a junior Theo can come in and make a
dumb mistake because he copied code from
somewhere else, the agents can make that
10 times faster. And they do. So if you're not starting from a really good spot before the agents take over your codebase, you're probably a little bit [ __ ]
And I think that's what's happening. No
model can be better than the code it is
starting with. And the code these things start with is garbage, because a lot of
it was written by worse models in the
past. And I can tell you from experience
that this is the case across almost
everything I've worked in. That's why it
was nearly exactly on that six-month
mark that I made the move away from my
custom sync engine in T3 chat over to
Convex because I knew we were quickly
approaching the point where we couldn't
make things better in the codebase.
There are lots of subtle improvements
you can make, like changing your linter to something better. You can
apply a new lint rule and clean some
things up. You can upgrade a library
here and there, but after that six-month
mark, the majority of that code is going
to stay there. It just is. And any
patterns you've established already,
those aren't going anywhere. So, how do
we fix this? How do we prevent this? How
do we make our code bases pleasant at
the start and stay pleasant both for
humans and more importantly for agents?
This is actually something I feel somewhat qualified in, specifically because I've had to do this a lot. Not because, like, every other coder was
bad and I was good, but because I cared
a lot about velocity and developer
experience and those things tend to line
up really well with the quality of the
codebase. Not that going faster means the codebase is better; actually, kind of the opposite. The codebase being really well built and laid out makes it easier to make changes fast. So the first thing: optimizing for ease, clarity, and speed.
Really try to establish patterns in your codebase early that are easy to make contributions to, clear in what they do, and fast to change.
Ideally, a small change should only have
to touch a small number of files. And a
big change should probably have to touch
a big number of files. A common mistake I see is people architecting their codebase so big changes are simple, and as a result the small changes end up being really, really complex. If small changes and big changes take the same amount of effort, you [ __ ] up both sides, and it's just so common. This is also why things like Tailwind are so cool: it reduces the number of places that have to be touched to make a change. This is also why things like
GraphQL are bad because you have to
touch way more [ __ ] to just add a little
bit of information to your UI. So
optimizing your codebase so that it is
easier to understand what's going on,
clearer as well, and fast to make
contributions to is great. But there's a
bigger piece here that I really want to
emphasize. Tolerate nothing. If a bad
pattern makes it in, it will multiply.
Bad code multiplies way, way faster than
good does. Because the bad code wouldn't
have made it in if it wasn't convenient.
Bad code and convenient code have a lot
of overlapping characteristics. But bad
code multiplies too aggressively to ever
let it in. You have to be strict about
this. You can't make the exception of,
well, we need to hit this deadline, so
we know this is slow, but we'll fix it
later. Later is another word for never
in the software development world.
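Since "later" never comes, one way to make zero tolerance mechanical is a ratchet check in CI: count occurrences of a known-bad pattern and fail the build if the count ever grows. A sketch, where the banned pattern, file glob, and baseline are all placeholders for whatever your codebase's poison is:

```python
import pathlib
import re

BANNED = re.compile(r"\bas any\b")  # placeholder: a banned TypeScript escape hatch

def count_banned(root):
    """Count occurrences of the banned pattern in .ts files under `root`."""
    total = 0
    for path in pathlib.Path(root).rglob("*.ts"):
        total += len(BANNED.findall(path.read_text(errors="ignore")))
    return total

def ratchet_ok(root, baseline):
    """CI gate: the count may shrink, but it may never grow."""
    return count_banned(root) <= baseline
```

Wire ratchet_ok into CI with today's count as the baseline: any PR that adds another instance fails, and when instances get cleaned up, you lower the baseline so they can never come back.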
You're not going to fix the thing. So,
don't tolerate it. Don't let it in the
codebase. And along those lines, if you
do stumble upon something bad, if you do
find something in the codebase that
smells, don't hesitate. Don't look into
the history, don't question why it's
there. Murder it with intensity. There
is no room in our code bases for slop.
It spreads too fast. If you do happen to
stumble upon it, drop everything to kill
it. I don't care what deadline gets
missed. I don't care what manager is
mad. I don't care what agent is
insisting that it's totally fine. If it
smells, it is bad. And if it is bad, it
should be removed. No tolerance. This does mean you have to pay a bit more attention. Not that you have to read every
line of code, but you need to keep an
eye on the general patterns that your
codebase is evolving. How are these
things interacting with each other? What
are the methods they use to define
things? How concisely can they describe
things? A simple litmus test for this: ask an agent how a feature works. If it can give you a good answer in under three minutes, you're probably fine. If it takes more than five minutes to search things, or it doesn't give you a good answer, either of those things being the case, honestly, throw it out. You'll start to build an intuition for this, not just by, like, reading the code, but by seeing how long it takes for agents to complete work. If you ask an agent to do something that should be simple and it takes longer than it should, you know something is off and you need to go fix it. Something I was known for in my
time at Twitch was what I referred to as
sledgehammer style development. When I
came into something that wasn't working
how it was supposed to, it was often
significantly easier to just remove it
and start again than it was to try and
fix the thing that was broken. The reason that usually wouldn't work is that it was too expensive to reproduce it. If you wanted to throw out 5,000 lines of code, good luck: the average engineer writes between 10 and 100 lines a day, so it's going to take you 50 days to replace it all. What if it didn't? What
if you could write the thing correctly
in a few hours instead? Here's where we
get into how we can actually do this.
Right now, a lot of these problems are
because AI accelerated natural problems
in code bases. More devs are maintaining
bigger code bases and they're looking at
the code less. If I were to just not do my job as a manager, this could happen the exact same way without AI. If I write
this codebase, I get it to a decent
state, and then I let my team take it
over, and I don't code review too
closely, I let them do their thing, tech
debt starts to pile up. Bad patterns
start to clone and appear all around the
codebase, and then it all collapses.
I've had to learn patterns to address
this as a manager already, and those
same patterns apply here. But there is
one big difference. You don't have to
feel bad telling the agent it did things
wrong. It [ __ ] sucks and needs to fix
it. If my logs with cloud code were to
go public, I would probably be
institutionalized for some of the things
I've said to it. I am not rude to my
employees. I go very out of my way for
that. I don't even raise my voice to my
team. The thought makes me feel sick,
genuinely. But when [ __ ] [ __ ] sucks,
you need to yell about it sometimes. And
it feels kind of nice to yell at the
agent when it [ __ ] up. But more
importantly, you can give the agent [ __ ]
work. No one wants to be the person to
upgrade a 5 plus year old codebase to
modern patterns. The agent will do it.
Nobody wants to port this old internal
service that 15 people at the company
use, but the agent will do it. The cost
of sledgehammers has gone down
exponentially. Historically, it just
wasn't worth it to delete the 5,000
lines of code and replace it cuz it was
too expensive to replace it. Now, it's
not. You do need to make sure the new
solution is well aligned. But if you can
find the right compartmentalized pieces
of your codebase that are worth
sledgehammering and rebuilding, it's
absolutely worth it now. And I've been
doing this a bunch. There have been many
places in many of my code bases where I
was like, "This just [ __ ] sucks. Can
we rewrite this?" And it worked. It's
crazy. And I'm not saying that these new
models are magically able to never
produce slop. If your codebase has a lot
of slop in it, the models will reproduce
it faster than your engineers can. And
both can already do it quite fast. But
if you do a good job of planning with the model, speccing out exactly what you want, it's a different story. Because this is probably the thing the new models are significantly better at: planning and conversations. Spend a bunch of time in the back and forth with the model. Write a better plan.
Tell the model, "This sucks. I want to
delete this entire folder. Let's work
together to make a better version. What
would it look like?" Maybe you have some
ideas. Maybe you have some syntax or
patterns or an API definition that you
know would be better. Write it out with
the model. Go back and forth. At the
end, you have a markdown document that
you can read and determine, is this good
or is this bad? And once you've decided
if this makes sense or doesn't make
sense, you can tell it to go build. And
as long as you have a good enough plan
in place before you start that piece, it
will probably come out okay. Will it be
as good as handwritten code by a human?
Depends a lot on the human. I would
argue the average human is a worse
engineer than the average model at this
point. I have not hired many engineers
in my career that are better than Codex 5.3. I've hired a few. They're hard to find,
but they all make mistakes. They all
have the same problems. And we need to
build systems that prevent that. First point: spending way more time in plan mode helps a ton. Just work with the model to make sure there is a good plan and that everything in the plan sounds good to you. Just take the time. You'll be
amazed at what it can do. And actually
read the plan. I know a lot of people
aren't reading the plan anymore. It depends on the size of the change and how much
you care. But if you want to make sure
this codebase is maintainable over time,
treat it accordingly. Next, and I hope this was obvious by now, use the latest
stuff. If you're still using Sonnet 4
because your company hasn't approved
Opus 4.5 yet, make them approve it or
get a real job. Like the companies that
don't understand this are so [ __ ]. Similar to this, throw away way more
code. I find most engineers are still
too attached to code and are scared to
throw away the 5,000 or 10,000 lines.
Don't be. Aggressively toss it. If any
part of you is telling you, I should
probably delete this thing, you probably
should have deleted it a month ago. Most
engineers have this bar really poorly
configured where they try too hard to
fix the code instead of replacing the
code. And on that note, actually, don't
be afraid to branch off. What I mean by
this is that I've seen a lot of times
where a code base has too many things in
it that don't really belong in that
codebase. This often happens because of
things like it's too hard to deploy this
thing again somewhere else, so we're
just going to stuff all the features
into version one. Silly example of this: I'm a Twitch streamer. I'm mostly a YouTuber, but I do actually stream on
Twitch. I'm live right now with my
audience filming this video. I used to
work at Twitch. I was working on safety
at Twitch. I was building the internal
safety platform and I was also helping
build the core Twitch site. We had a
really bad incident where one particular
game was being spammed with horrible,
terrible, nasty things. The category had
a bunch of fake streamers spinning up
and streaming vile [ __ ]. At the time,
the internal safety tools did not have a
way to review things pre-report. The
point of the internal tools was to
review reports, not to review live
streams as they were live, which meant
that there was no way for our safety
moderators, admins, internal safety
tooling team to easily go through a
category and ban a bunch of different
content creators. Yes, this was the Artifact incident. God, the word Artifact still haunts me to this day from how [ __ ] up this incident was. We
needed a way to set up our admins to
quickly ban streamers in this category
if they were doing nasty stuff. You
could easily see just browsing the
category if they were actually playing
the game or not. The Twitch safety team
needed a way to deal with this. And the
proposal I was given was to add a
permaban button that only admins could
click in the main Twitch site. So if an
admin was signed in on Twitch, they
could just scroll through a category and
have a one-click instaban. I immediately
jumped on that and said, "Are you
[ __ ] kidding? We're never exposing
the internal permaban endpoint to the
public [ __ ] Twitch codebase. Are you
joking?" No. Never. And obviously a normal user can't hit it, cuz it required the elevated auth of the global mods and admins. But the
code even making it into the main site
was horrifying. And regardless of all of the potential security and safety implications here, think of the code smell: a custom code path that only applies to 10 to 15 people in a codebase that serves millions a day. No, we're not building a feature for 10 to 15 people in a codebase that serves millions. You're [ __ ] joking. Just basic tech debt math tells you to
not do that. But these types of
proposals to this day are very common.
Why don't we just add the feature really
quick? Well, it probably belongs in its
own codebase, right? Well, then we have
to go link it to all the right
dependencies. We have to get permission
from the team to deploy it. We have to
host it. We have to figure out all these
pieces. We have to maintain it. Making a new project to do this one-off thing just wasn't worth it a lot of the time.
Thankfully, for this particular
instance, I had enough ownership of the
internal codebase and enough familiarity
with the external one that I knew I
could quickly recreate the category
browser in the internal tool to make it
easier for the admins to ban people from
it. So, that's what I did. It ended up
working fine, but I had to go do that
and I had to fight my team the whole
time telling them, "Sorry, like I know
this is going to take slightly longer.
It's not going to take that much longer
and it's not going to be a huge risk. We
need to do it this way." And the only
reason they let me, by the way, it
wasn't even the internal versus external
thing or the security or any of that. It
was because the external core Twitch
site only deployed once a day at around
1 p.m. if I recall. So, we were going to
have to get exceptions for them to
redeploy it just for this one button to
be added for the admins. But our
internal tool I could deploy whenever I
wanted. So, it was a deployment
architectural thing where the internal
deployment was slightly easier because I
owned it. That was the only reason that
this button didn't end up in the public
codebase. So, why the [ __ ] did I just go
down this tangent? Hear me out. All of the reasons, 100% of the reasons, to put that in the existing codebase for the public Twitch site instead of making a new thing or putting it in the internal thing, 100% of those are gone.
It is easier than ever to build a new
codebase. It is easier than ever to port
features from codebase A to codebase B.
It is easier than ever to get things
deployed and shipped. There's no more
excuse. And I find a lot of these nasty
code bases are because of things making
it in that don't belong. I would be
blown away if the Claude Code codebase wasn't
full of deprecated features that were
hidden under feature flags and never
shipped. Things that are specific to old
models that no one's using anymore.
Integrations with systems that don't
exist. Internal tools that they use
themselves that aren't part of the
external code that they don't want other
users having. I would be floored if less than 50% of the codebase for Claude Code
was stuff like that. And you have to
fight that instinct. You need to push
back when things don't belong in the
codebase. The number of code bases we
all have in our lives should 10x over
the next year. I went from actively
working in two to three code bases to
like 40. And yes, there's a lot of
context to manage. Yes, that's a lot of
things going on. But the harsh reality
is that most of those code bases once
they do the one thing don't have to be
touched the same way that the shitty
folder in another codebase doesn't have
to be touched. The problem comes when
you have to do anything with that big
codebase. If you have the one major
codebase that matters, like in this case
the Twitch site, doing an upgrade to the
React version should not require an
internal tool that no one's used for 2
years to be bumped as well. If you build
a strong discipline here, if you get
good at keeping unrelated things out of
that codebase, just making new modules,
making new repos, making new projects
for the things that don't need to be in
your main codebase, suddenly maintenance
becomes way less miserable. And I know
for a fact that is not how any of these
things have worked. Cursor keeps adding
things into cursor. They keep building
on top of the mess and the result is an
unstable shitshow. Claude Code is a
slopfest, constantly building on top of
itself with things that no one uses and
they don't even need anymore. Fight that
urge. Push to keep things out. Make it
easier for your team to do this as well.
If it's too hard to deploy a new
service, fix that. It should be easy for
anyone at your company to deploy a new
internal service to a new internal URL
or subdomain. It should be trivial. If
it's hard for your company to spin up a new repo on your internal GitHub Enterprise or whatever shitty source control solution you're using instead, fix it.
It should be trivial for anyone in your
company to spin up and deploy 10 new
code bases in a day. The agents have
made everything else easy. If that part
is why you're blocked, fix it.
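As a rough illustration of what "trivial" could mean here: a minimal shell sketch of a one-command repo bootstrap, assuming the GitHub CLI (`gh`). Everything in it is a placeholder, not something from the video: `my-org`, the template directory, the `reporting-tool` name, and the idea that CI deploys on push are all hypothetical. The `DRY_RUN` flag prints each step instead of executing it, so the flow is visible without touching GitHub.

```shell
# Hypothetical sketch of a one-command "new internal service" bootstrap.
# All names (my-org, the template path, reporting-tool) are placeholders.
# DRY_RUN=1 prints each step instead of running it.
DRY_RUN=1
NAME="reporting-tool"

run() {
  # Print the command in dry-run mode; execute it otherwise.
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

run gh repo create "my-org/$NAME" --private --clone    # gh = GitHub CLI
run cp -r "$HOME/templates/internal-service/." "$NAME/"
run git -C "$NAME" add -A
run git -C "$NAME" commit -m "bootstrap $NAME from template"
run git -C "$NAME" push -u origin main                 # CI deploys on push
```

The point isn't this exact script. It's that if spinning up and deploying a new repo takes more than roughly one command at your company, that friction is exactly what pushes features into codebases where they don't belong.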
Incentivize new project creation instead
of old project sloppification. If it's
not essential to the features that the majority of your users rely on,
make it something else. And the most
important final piece here, ask
questions. Specifically, ask the AI
agent when it's doing work, what is it
doing? Why is it doing it? Where did it
get this idea from? Why does it think
this thing matters? Why did you choose
to go down this path? If you ever notice
the agent doing something wrong, there
is a reason for it. These are
non-determinism machines. Like, it's
never going to do the same thing twice,
but it's going to do them close enough
and the reasons are going to be
relatively consistent. If it did this
thing wrong, figure out where it got the
idea from. If it came from your
codebase, eliminate it. If it came from
your CLAUDE.md, fix it. Adjust it. The
problem with these tools is that they
didn't have this diligence because they
were too focused on how fast can we ship
and not how well can we ship. And I feel
this deeply. I feel it so deeply that I
bullied Cursor into doing a month of no
new features and just cleaning up the
slop. And honestly, some of these code
bases probably need to just be reset
from scratch. There's a project I think
about a lot and you guys are going to
think I'm insane for going down this
rabbit hole, but just hear me out, guys.
There's a game called Vampire Survivors.
Many of y'all may have heard of this
game before. Vampire Survivors was
originally based on a shitty mobile survival game where you would just walk
around in circles and a gun would
autoshoot and destroy monsters until you
eventually couldn't finish them all off
and you would die. Fun fact about Vampire Survivors: it was written in
Phaser.js in the browser, but you might
see to the right there, Vampire
Survivors for Nintendo Switch. The
Nintendo Switch is not a competent
console. When it came out, it came out
with a processor that was already over
two years old. The RAM speeds are slower than most hard drives. Not SSDs, hard drives. The Switch was a [ __ ] piece
of hardware. There was no world in which
they were going to get the Phaser.js
version along with a full JavaScript
engine running in even vaguely a
performant way for the Switch. So, they
rewrote it in C++. That became the
Vampire Survivors console version. That
is now also the version you get on
Steam, as far as I know. I might be
wrong on that, but I'm pretty sure the
Steam version is also in C++. The lead
dev does not know C++. He hired a couple
additional developers to join in and
port the game to other systems. But
here's where things get real fun. When
the creator of the game wants to build a
new feature for the game, test out new
ideas, play with things, and just
improve the game, he doesn't do it by
editing things in the C++ codebase. He
does it by slopping away in his giant
[ __ ] show of Phaser.js for the browser.
And once he has it in a good enough
state where he feels like the game plays
well, he hands that off to the team to
go build it in the real codebase.
They're maintaining two codebases in parallel. One version, the web Phaser.js version, is the one that the game designer and creator uses to figure out what does and doesn't work, to iterate on ideas, to improve the game.
And once that version has things figured
out, well, the team is assigned the task
of porting it to the polished, refined,
reliable C++ edition. I think we're
going to see more of this going forward.
I do legitimately believe the future's
going to look something like this where
you maintain two or maybe even more
versions of a given codebase. Maybe you
use the vibe code slopware tools to make
something that works and test out
theories. Maybe you even ship that
version to some users to see how they
react to that. I've already seen this
happen in the past doing things like
using Framer mocks to test ideas out for
users. What if we did this more
sincerely? What if we did this as part
of our actual design methodology? What
if we thought about building through
slop as a way to test ideas and then we
used more established engineering
practices to turn those ideas into good
reliable product? What if Cursor just
cut their losses, treated the existing
code base as the slopfest version, used
that to prototype ideas, play with
things, and then started from scratch a
new version where they would port over
the parts that matter and leave behind
all of the slop that doesn't? It's
suddenly way cheaper to do that.
Maintaining the same codebase twice,
once as an internal version for
prototyping and once as an external
version for actual usage sounds insane
and would not have made any sense at all
in the past. I think it's starting to. I
don't know if this is actually going to
be a thing. I don't know if this is
going anywhere, but I think it's a
viable path for the companies that are
in this place where they've slopped
themselves into a corner and they need
to get out of it, but they don't want to
lose the iteration speed and the
experimentation that they can do in the
slopfest edition. I think this might
make sense. And we're even considering
dog fooding it ourselves for some fun
things we're cooking over at T3. I see
Julius has been cooking with all things
T3 code since we went live. He already
seems to have added hotkeys since. If
all goes well, this will hopefully be
out by the time this video is. Might not
be the case, but we have been deep in
the creation of T3 code, which is meant
to be a much more stable way to interact
with these agents. And this might end up
being the first time we test out that
theory where we have a vibecoded slop
version and poor Julius is stuck with
the task of making it actually work
reliably. We effectively already work
this way where I'll put up a slop PR
that makes the feature work that shows
what my intentions are and what I
expected to do and then my team will go
and build it correctly. Half or more of my PRs on T3 Chat don't get merged, even pre-AI, because I built a lot of these things to show the UX feel, see if it works right, and if it doesn't, retry and do it again. Then my team would be stuck with the horrible task of making that an actual good, reliable feature. Happened all the time. The amount of times my [ __ ] PRs have been trumped by Julius is genuinely hilarious. AI is only accelerating it. But I think there might be something here: this idea of
maintaining two versions of the codebase
or at least throwing away a lot of the
code that you're using knowing that
you're writing a ton of code to test
ideas, not to ship. Instead of measure twice, cut once, one of my favorite phrases ever, maybe it's file twice, merge once. Maybe it's file ten PRs, merge once.
But the idea that like code is cheap, so
we should merge all of this cheap code.
Terrible, horrible, nasty. It's resulted
in the hell that we're in today. Maybe
these tools, as useful as they are, are
good for vetting ideas. And then the
final version we end up committing gets
a little bit more care in the approach.
Maybe or maybe the model just gets so
good that none of this matters and they
can rewrite everything from scratch
anyways. Who knows? All of this can go
anywhere. This is all speculation about patterns that may or may not work, drawn from my experience maintaining large codebases and all that and more. I don't get
anybody who says that engineering skills
don't matter. I've never felt like my
skills are being pushed to their limits
more than they are today. There's
opportunity for good engineers to do
great things now. And I hope that this
absurd rant helps you understand why and
also maybe helps you understand why all
of these slop fests are so miserable to
work in and maintain. I don't see a
future where the existing cursor and
cloud code bases become nice to work in
or nice to use the results of. I do see
one where they're treated as the slopfest they are. A new, better thing is built
from scratch and as a result they have a
better app. But I really don't see a
future where Claude Code the codebase gets better. I see a world where Claude Code the executable gets better because a new codebase is used to serve it. No
idea where this will all go. I just
wanted to complain a bunch. I appreciate
you all for listening. I hope that this
helps you maintain code bases a bit
better and maybe at the very least vent
your frustrations with these existing
tools a little better, too. It's never
been more important to maintain your
codebases well cuz the agents are more
than happy to tear them to shreds. Let
me know what y'all think about this or
if I'm just going mad. Until next time.