OpenAI just dropped their Cursor killer
FULL TRANSCRIPT
I have a confession to make. For the
past 2 weeks, I've entirely stopped
using Cursor and I've barely used Claude
Code at all. The reason why is a new app
that OpenAI just put out for developers.
It's called Codex. Yes, again. While
the name is annoying, the product is
actually really, really good. Way more
so than I ever would have expected. It's
a different way of managing your agents
across projects while you're doing real
work. It was built by and for developers
and it feels fundamentally different
from any other similar tool I've used
and everyone else I know with early
access feels the same. This honestly
feels like one of those moments where
just like the way we build changes on a
fundamental level. I know we're having a
lot of those lately, but like I can't
stop using this app. I was under NDA,
but I wanted to keep using it. So, I did
something I never thought I would do. I
bought a second laptop. Yeah, it's that
good. I still can't believe who I've
become that I haven't opened Cursor. I
didn't even install it on that laptop.
I've just been building with the new
Codex app and they put it out for free.
By the way, I have so much to say about
this. I can't wait to show you guys how
it works, why I like it so much, and all
the cool workflows that I've built
around it. As you all know, OpenAI has
never paid me a cent. In fact, they're
not even giving us free access to the
models during the early testing anymore.
I've paid every cent of everything I've
done here. And this app's also in direct
competition with a ton of my
investments. So yeah, I am the most
biased against this you can imagine, and
I'm still about to shill the absolute
hell out of it. I just want to make sure
you all know nobody is paying me today
to talk about anything other than
today's sponsor. As you can probably
tell, I've been building a lot more
lately. Not just like writing more code,
but also building that code on servers.
And that means I have to wait for the
builds to complete. And I used to have
to wait a long time, especially when I
started building with Rust. And then I
moved to Blacksmith and all of the pain
disappeared. I'm going to be real with
you guys. I thought this was a cool
product and I was really happy to
recommend it before, but then my team
moved over and that means my team set it
up. Not that it was hard to set up. You
just link it to your GitHub and then
change one line. But now that it is set
up, anytime I make a new project and I
want the CI to be less slow, I literally
go to the runs-on and change
ubuntu-latest to Blacksmith and then
suddenly it's like 8 to 10 times faster. Here are
the real builds I was doing for my new
CLI app that I'm writing in Rust. And
you can see the builds here are between
2 and 3 minutes. Occasionally, if
we're lucky, one will be like a little
under 2 minutes, but they're 2 to
3, sometimes 6 plus. And now you
can see 47 seconds, 46 seconds, 48
seconds. Those long builds are
publishes, so they're a bit different.
This looks way cooler from their
dashboard, which is, by the way,
significantly nicer for tracking your
actions than anything that I've seen
inside of the official GitHub Actions
stuff. Not only can I trivially check my
workflows and jobs and see what's going
on with a really nice UI, they have
search that works. Oh, it's so good. You
can even see success and failure rates
across different actions at different
steps to make it way easier to debug
what's going wrong. Here, we can even
see how the length of our jobs is
distributed. So, across T3 Chat, the
vast majority of our jobs are under 30
seconds now, and the worst ones are up
to 11 minutes. And we can look and see
why. What steps took how long. This is
so good. It's so good. We have fixed so
many things about our CI. Our developers
are significantly happier. I'm over the
moon with this. It's one of those things
where once you've tried it, you can't go
back, which is why it's crazy that they
let you start for free. If you're not using
Blacksmith, you're wasting time and
money. Fix both now at soy.
So, what even is Codex? Well, it's a
CLI. It's a series of models for coding.
It's a web app that is used for making
PRs to repos. And now, it is an
application you can install on your Mac
for managing all of the work being done.
To be very very clear, this is still
using the same agent base that the CLI
uses. In fact, it shares history between
the CLI runs and the app runs. So, if I
go do something in the CLI in one of
these directories, it'll carry over to
the app as well. And so does your
config, which is really nice. It does
just feel like a UI for the CLI, but
with a lot of niceties handled as well
that make it way better for day-to-day
use, especially if you're one of those
people that spins up a lot of different
projects or branches and works on
multiple different things at the same
time. I've never parallelized my work
quite as much as I am today with the
Codex app. I want to make something
very clear though. They are not the
first ones to do this. I've seen a
handful of these CLI wrappers pop up
that make it easier to manage what
projects you're working in and the
threads going on within them. Apps like
Conductor, which is built largely around
Claude Code, but does also work with
Codex to make it easy to take a given
git repo and work on multiple things in
it at the same time. Conductor is really
cool, but has had terrible performance
issues for me, so it's not a thing I've
used that much. Funny enough, there's
also Codex Monitor, which was made by a
user of Codex that wanted to do
something like this, and he was also
brought in as an early tester. And for
one last throwback, Antigravity. Not
that Antigravity does this directly, it
doesn't. But what it did do, and I even
called this out in my video, was the agent
manager, which let you keep track of
work going on across different projects
at the same time. When I first played
with Antigravity, this was like the one
thing in it that I thought was actually
quite cool. And a lot of people said I
was wrong there. In particular, Ben, my
channel manager, called me out hard for
this. And now he's admitting he's wrong,
too. So, this orchestration layer of a
UI for managing your work threads does
seem pretty damn good. And I've yet to
meet anybody who's used it heavily that
hasn't committed. Even my own team with
Ben and Julius has been making the move
to this workflow and this UI. And I
should just show you guys why. On this
laptop, I only have a handful of my
projects. But even then, you see I did a
lot of legitimate work here. Let's start
with Lawn, which is one of the like 10
new apps that I've been working on. This
is Lawn. It's meant to be a cheap,
simple alternative to Frame.io. I'm
pretty close to finished with this one. I
might actually release it in the next
week or two. We need it for my team and
I'm excited to get it out for others if
they end up wanting it as well because
Frame.io is a bit rough nowadays. It's
an easy way to upload videos and do
video review with your team, which is
essential for us as we do a lot of
videos. And as you can guess, I built
this whole app inside of Codex. I did
hop to Claude Code here and there, mostly
for UI work like the homepage. I did a
bunch of different additions on it
there. But all of the wiring pieces
together, all of the actual scaffolding
of the app and building it and creating
the functionality, all of it was done
across many threads inside of this app.
To be clear, not on this computer, on my
other laptop. I might plug it in in a
sec, but just want to show the type of
thing I'm building here. Right now, when
I click a timestamp, it doesn't move to
that point. I'll ask it to fix that.
When a user clicks a timestamp in a
comment, the video player should
automatically skip to that point in the
video. Hit send and now it's going. And
as we all know with the GPT-5 models,
this will take a bit. On one hand, it's
because they're actually slow. Like
their TPS is not super high. On the
other hand, it's because they're very
thorough. This model loves to just scan
the codebase and find every single thing
that might be related to what you're
trying to do and figure out a cohesive
and coherent plan to do all of that. But
here's where things get really cool. We
know it's going to take a while. So,
let's go do something else. Maybe I'll
go back to my blog and work on some
things here. I'm currently in the
process of rethinking how I manage blog
posts because I've been trying to write
more. This one ended up making changes I
didn't expect it to. I had asked it to
explore, but it didn't. It just wrote
code. But if I want to see what it
wrote, I just click the open button with
VS Code, which by the way, they have
call outs for basically every editor you
would possibly want to use. If it's
installed on your computer, it'll
probably appear here. So, I just open
that in VS Code. Now, we can see all the
blog posts I had open are dead because
it changed the structure of how my blog
is organized. I now have active and
archived folders for all of my blog
posts. This is significantly easier to
manage. I'm actually quite hyped for
this. This is a good change. So, let's
say I want to commit this. There's a
little button up here, commit. You also
have a code diff panel button that you
can look in here and see what actually
changed. It's fine. Not going to
complain too much about it, but
honestly, I don't find myself in there a
whole lot because I much prefer just
opening in my editor or somewhere else.
They also have a terminal on the bottom
if you want to have a terminal directly
here. Some people really like that. I
personally don't, but there are some use
cases where it's great that I'll show
you in a sec. But I want to commit this.
So I'm going to just click commit. And
here's a feature I bullied them into
adding. Leave blank to autogenerate a
commit message. Nice. So I can commit,
commit and push, or commit and create PR.
I think commit and create PR sounds like
a good idea. So we will do that. And as
we are here, I just noticed the other
work we were doing completed. And this
is part of the magic is I'm just hopping
between different threads, seeing where
tasks are. I feel more like a manager
than ever, but I also feel way more
productive than ever. Now I have the PR
for all these changes. At the same time,
I have Lawn's update here with the video
player changes. Pop back in. See if
clicking a timestamp works. Look at
that by OpenAI.
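As an aside, the timestamp fix above comes down to turning the text of a timestamp in a comment into a number of seconds the player can seek to. Here is a minimal sketch of that parsing step, in Python purely for illustration. Lawn's real code isn't shown in the video, and parse_timestamp is a hypothetical helper name.

```python
def parse_timestamp(text: str) -> int:
    """Convert "SS", "MM:SS" or "HH:MM:SS" into a total number of seconds."""
    parts = text.strip().split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported timestamp: {text!r}")
    seconds = 0
    for part in parts:
        # Each field must be plain ASCII digits, e.g. "01" or "23".
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"unsupported timestamp: {text!r}")
        # Shift what we have so far up by one unit (x60) and add this field.
        seconds = seconds * 60 + int(part)
    return seconds
```

A click handler would then call something like parse_timestamp("1:23") and hand the result to the player's seek function.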
>> And now in literally 5 minutes, I got
two major changes on two real projects
done. But what happens if you want to
work on two things at the same time in
the same project? This is where
worktrees come in, and I'll just say
they have an interesting way of doing
worktrees. I don't think anyone's
gotten this right because there are so
many layers to worktrees. If you're not
familiar, worktrees are a concept
within Git to take a Git repo, make a
new branch, and put it somewhere else on
your computer, so you can be working on
two things at the same time. This is a
feature that just didn't matter too much
for a long time because, like, who's
working on two things at once? Hi, I am.
I'm working on like 10 things at once
now and it sucks. So, worktrees are the
promised solution that fixes all of
this. And now that I've gone really deep
on them, I'm telling you, we've been
lied to. Worktrees are not a good way
to do anything other than make it harder
to figure out where your code is. I have
been very upset with them on a
fundamental level. They just feel like
the wrong primitive. Like, you can't
have the same branch checked
out in two places at the same time. So,
you always have to make a branch, but
then keeping them in sync sucks. So, the
way that they're doing this in Codex is
a bit different. Initially, when you
create a worktree, it makes a copy of
the project and puts it in another
directory somewhere else. It's not a
traditional Git worktree, though. It is an
actual copy, and then you can sync the
changes back in after. So let's try this
with Lawn really quick. We have the
actual redesign branch that we're on. I
click worktree. It switches us to "from
main", which is not what I want because that's
not the branch I'm on. I hate that. One
of the many things I've complained
about. Hopefully it'll get fixed. Switch
that to actual redesign. That's where we
want to start from. When a user uploads
a new video, a thumbnail needs to be
created for that video. I think the
easiest way to do this is going to be to
do it on client side when they upload.
We should grab a still, either the first
still or something from early in the
video and upload that alongside the
video so that we can use it as the
thumbnail. And now this is being worked
on in a worktree. And I have the option
to sync this with my local or somehow
create a PR with it. I've yet to figure
out how to actually do that part. I
honestly have found the worktree
implementation to somehow be one of the
most thought-out worktree
implementations I've seen, but also one
of the least useful. I have not been
enjoying worktrees a whole lot inside
of Codex. And I do hope in the future
they find a way to make them feel a bit
better because having to sync to local
then push just doesn't feel great. I'm
on a branch of a branch right now. I
should be able to make a PR that merges
it in or have some other way to put this
work up that isn't pulling it back in
first. As such, there's a different
workflow that I've been finding
surprisingly useful. There's one more
button to the right, cloud. Turns out
cloud environments are a lot less
painful when you just have them in the
same exact UI. So, I'll paste the same
prompt. I have to pick a cloud
environment because it needs to know how
to use it in the cloud. It's one of the
most annoying things. It can't just use
a random folder on your computer.
Everything inside of Codex has to be a
git repo. And if you're using the cloud,
it has to be on GitHub and already
synced with them. And as such, we must
do this. So now Lawn is a new
environment that I can create. I'm not
going to bother with all of those
things. I don't need it to actually have
internet access or do things. Actually,
I should give it internet access, now
that I think about it, so we can do
search and whatnot. But yeah, and as always,
we're switching that back on, so the
agent has unrestricted access. But the
fact I have to go configure all these
things and manage environment variables,
figure out what container we're in and
all that, this is a very unsolved space
in my opinion. I do dream of a future
where the cloud stuff's a little less
annoying to configure. But now that I've
made these changes, Lawn is available.
We're starting from main. Shouldn't
start from main. We start from the
redesign. And then I hit go. And now we
have two separate tasks going on my
computer in the same project, neither of
which is touching code in the actual
folder where the project lives. This one is
going to happen in the cloud and it will
eventually propose a pull request. This
one is happening on my machine but in a
work tree so that I can pull it back
into the main branch later. And again,
when we have all of this going, I can go
hop into other projects like the native
clone of the T3 Chat app that I've been
working on that's gone surprisingly far.
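For reference, the plain Git version of the worktree task mentioned above looks roughly like this. It is a sketch of the standard git worktree add command driven from Python, not of what the Codex app does internally. The function names, the branch names and the directory are made up for illustration.

```python
import subprocess
from pathlib import Path


def worktree_add_command(path: str, new_branch: str, start_point: str) -> list[str]:
    """Build the git command that creates a worktree at `path`
    on a new branch cut from `start_point`."""
    return ["git", "worktree", "add", "-b", new_branch, path, start_point]


def create_worktree(repo: Path, path: str, new_branch: str, start_point: str) -> None:
    """Run the command inside `repo`. Raises CalledProcessError if git refuses,
    for example because the branch name already exists."""
    subprocess.run(
        worktree_add_command(path, new_branch, start_point),
        cwd=repo,
        check=True,
    )
```

Calling create_worktree(Path("."), "../lawn-thumbnails", "thumbnails", "actual-redesign") would put a second checkout next to the repo on a new branch that starts from the (hypothetically named) redesign branch, which is the "branch of a branch" situation described earlier.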
Or the file system that I built. Yes.
Okay. Not a real file system. It's an
alternative way of syncing files between
machines that I also have built almost
entirely within Codex. Or T3 Chat
itself, which, wait, a month ago? Have I
had this app for a month? Nope. The way
that works is it's syncing with the
project because it knows where it is on
my computer and that I've used Codex for it.
So it has the history of different
things I've done in Codex on my machine
as well as the things I've done in Codex
Cloud on my machine. And here we can see
some of the like heavier things I asked
it to do like improving the performance
for long threads, overhauling
compaction, some of the bolder things I
was just curious if it could do. And
it's all here. And at the same time, we
have our main threads going up there.
It's just it's so easy to hop around
that I've become more frustrated about
the other things I have to hop around
in. Like between my browser, my code
editor, and other surfaces, it's
annoying, especially when you also have
to like spin up a dev environment or
manage environment variables. I still
haven't figured out how to make it so
that when I create a worktree, the env
is carried over. There's a surprising
amount of config available. They've been
pretty open to whatever suggestions we
have about things we want to do or do
differently. And here we can see that a
lot of the config is just put under the
config.toml. But there's a lot of other
little things we can do. We can give
some custom instructions. We can change
how detailed the info we get out in the
thread is. And by default, they don't
actually show the code output. They only
show steps with code commands because
again it's a very different way of
building. This is no longer us
commanding our code editor via AI. This
is now us orchestrating agents that
control our code for us with a better
UI. And it's fully caused me to lose
interest in terminal-based UIs for real
code work. But like I was saying before, I
don't really know how to set up
environment stuff properly. The
environment has a name. It has a setup
script and also platform overrides,
which is interesting because right now
it's a Mac-only app. Obviously, that'll
change in the future. Right now, it's
just Mac. So, nothing in here seems to
show me how to bring over my environment
variables. This will run in the project
root, but how do I know what path
things are going to? There is no clear
method anywhere here on how to manage
environment variables, which kind of
makes the whole thing feel useless
beyond just writing code somewhere else
for you in parallel. I do genuinely hope
they find ways to clarify these things
and make it easier in the future. At the
very least, they should just copy the
existing Cursor configs that describe
how to do a lot of the stuff in
worktrees. Yeah, we'll get there when we get
there, I'm sure. There's also MCP
servers cuz of course there is. They
also have a pretty cool skills browser.
You can see all of the ones I have,
which yeah, I have quite a few now. I
know I've become a skills guy. It is
what it is. They also have a bunch of
recommended ones like deploying on
Cloudflare, managing Atlas so that it
can see what's going on in the browser,
controlling Linear issues through the
app. So if you tell it about an issue
and then it addresses the issue, it can
update the status accordingly. Pretty
cool. Built-in image gen, which is
really cool. One of the demos they did
is making a full game with one prompt
where the agent could generate images
for the things that it needed and also
use Atlas to see it as it went. And my
personal favorite, yeet, which will
stage, commit, and open a PR. When you
think it's ready to go, you just tell it
to yeet, and it will use its skill and
figure that out. Silly, but useful. I
haven't set it up just yet myself. I
just have my usual skills, but I can see
myself using that one. Actually, good,
useful, tasteful things. And as you can
guess, the team building this has been
using it to build it. So, they have been
deep in here making tons of changes. The
speed at which things have been getting
fixed when I report them is absurd. This
isn't some random side project they're
going to half-heartedly invest in. It
does really feel like a thing they want,
they use, and they really want to see
succeed. I've been surprised at how much
effort they're putting in. Not like how
hard they're pushing it on me. They're
really not that much. I seem to be the
person in the test group that's gotten
the most excited about this, but we've
all been using it a ton and it's been
really good. But now the question left
is what do I use every day? I
mentioned before that I've still been
using Claude Code for a bunch of random
things, but what are those things? Most
of my Claude Code use has been general
computer things, things that don't need
to be so deeply tied to a Git repo. So I
was trying to set up a way to easily
create scripts that I could execute from
a directory that would just become
globals. I did that. I do a lot of
changes to like my zsh config via
Claude Code. I find files. I write
scripts to do one-off tasks. I scrape
websites. I get information. I use Claude
Code similar to like how I use a
computer. But Codex is how I actually
build. The way I think about it at this
point is if I have any intention to
commit and push this, I'm probably doing
it inside of Codex. If I'm just
changing things on my computer,
messing around, writing one-off scripts, or the
one real use case, doing passes on UI
and sometimes even full-on UI overhauls,
then I use Claude Code. Everything else I
have been doing since I got access to
this app, I have done inside of Codex.
It really feels like it fixes the
biggest issue I had with Codex, which
is that the model was slow and using it
was unpleasant as a result and I would
just lose track of which terminal tab
had which run going. Because the Codex
models have a generally higher hit rate
and success rate for these big heavy
tasks, I found myself parallelizing with
them already. I was at the point where I
would run something that I knew would
take a while in Codex and then go spin
up a worktree manually and play with it
in Claude Code to see how it would feel
to use while waiting for the Codex run
to complete. And now there's a UI tailor
made for this use case. And it's so
good. It's so good. It's so nice having
a UI that you can just do things in. And
like, crazy, I know: being able to paste a
screenshot and see it in the UI. It's
been a while since I could do that
because I've been spending too much time
in these terminals and now I'm spending
almost none. It's crazy how fast things
change. I went from using VS Code for
eight years to Cursor for a year and a
half to Claude Code for 2 months to this.
And I bet it's going to keep changing.
Sorry for the scenery change. I forgot
to talk about automations. They're
really cool and I needed to clean some
things up before I could show you
because the screen disappears if you
have any of them set up, which is really
annoying. Automations are kind of like
cron jobs with prompts that have access
to files, folders, and all the things
agents can do on your computer. I
haven't played with them too too much
yet, but the little bit I have was cool
enough that I wanted to add this in
quick. The examples are all pretty basic
things, but you can see how they would
be useful. Something like writing a
changelog automatically, summarizing CI
failures, scanning recent commits, and
looking for bugs. Let's just go with
this one. I'm going to change it to last
48 hours. Here, we'll pick a project.
Let's say T3 Chat. Make sure you pull
the latest. So you are checking recent
changes.
72 hours. This is the weekend and I
don't think we've shipped too much over
the week. You can choose when it runs.
So this would be 9:00 a.m. every day of
the week. It would run this and look for
things that might be bugs. You can also
add multiple different projects at once,
which is super super cool. So I'll do
these two projects. Set that up. Test
just so we can see what it looks like. I
definitely have my potential concerns,
but uh Oh, look. It does actually spawn
as a thread here. That's cool. So, if
you go to the different projects, you'll
see a thread that it made with a work
tree to do the thing that you requested.
That's really cool. It's like an
automatic thread creation. We could even set
this up to automatically add new models
when they're added on OpenRouter. Like,
oh boy. Again, when I play with
this app, I think of things. It's a
different way of building that lets you
like use your brain in a different way.
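To make the "cron job with a prompt" idea concrete, here is a rough sketch of what such an automation boils down to: a prompt, a list of projects, and a daily run time. The Automation class and its fields are hypothetical and are not the Codex app's actual format. The 9:00 a.m. default mirrors the schedule picked above.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass
class Automation:
    prompt: str                # what the agent is asked to do on each run
    projects: list[str]        # project names or paths the run applies to
    run_at: time = time(9, 0)  # daily run time, 9:00 a.m. by default

    def next_run(self, now: datetime) -> datetime:
        """Return the first daily run time strictly after `now`."""
        candidate = datetime.combine(now.date(), self.run_at)
        if candidate <= now:
            # Today's slot has already passed, so schedule for tomorrow.
            candidate += timedelta(days=1)
        return candidate
```

A scheduler would sleep until next_run(datetime.now()), start one agent thread per project with the prompt, and then repeat.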
Obviously, like, a terminal is kind of the
same, and you could have set this up
yourself by setting up a cron in your
terminal with Claude Code with a specific
path, but just having it in front of you
like this feels different, and I wanted
to capture that. And look, here we are: no
actionable bugs, but there are some things
that it scanned. Yeah, this is really
cool. By the way, switched to my other
laptop so you can see a lot more of what
I've been building to the point where we
got to a show more there. But I want to
show you guys a project I was having a
lot of fun with: Vibe Faster. I was so
pumped with the workflow I had in this
app that the performance issues it had
at the time got really really
frustrating. There was a brief moment,
like one build where it was eating
memory and my battery and it really
really frustrated me. But I had gotten
so hooked on this workflow that I
wouldn't take no for an answer. So I
plugged in a charger and started
building a native alternative. Yes,
really. I built a full desktop app from
scratch with Swift and AppKit, not
SwiftUI, because SwiftUI performs badly. And
this is my attempt to rebuild the things
I loved so much about the Codex app in a
native build. But like that's how much I
loved this. I loved it so much that I
went and tried to build my own native
alternative so that I could have better
battery life and I could not feel as bad
watching my battery drain hoping that my
task completed before it went down. And
then they fixed all those problems. And
now I'm happy as can be. I made this
post a few days ago obviously inspired
by the thing I was testing. All these
agent coding TUIs are a phase and it
will be short-lived. Most devs will be
back in GUIs and IDEs in a few
months. Notice that I said GUIs and IDEs.
That was intentional because while
Codex isn't an IDE, it has absolutely
killed my interest in opening them. I
just don't care anymore. I didn't think
this would happen as fast as it has, but
I don't want to be in there editing
code. I don't want to be in there
looking over every line. I want to make
sure I understand what the agent's doing
and why. and build something awesome
with it. And this has been an incredible
tool for doing exactly that. So much so
that I don't even regret the $4,000
gigantic monstrosity of a laptop
currently sitting in front of me that I
bought exclusively to be able to do this
work with separately from my computer
that I use on stream.
If you can't tell, it's been a rough two
weeks of me not being able to talk about
this. And I'm so pumped I can finally
share it with y'all. I know I get hyped
about a lot of different things on this
channel. It's kind of my thing. I love
trying new things, but none of them have
changed my day-to-day work quite this
much in a long time. And I'm so hyped
that you guys get to go play with it
yourselves, too. That's all I have on
this one. I'm going to go back to
building because I am fully addicted
again, like the way I was as a kid. It's
been so genuinely fun, and I can't wait
to go back and keep doing it. Until next
time, peace nerds.