Claude Just Solved Session Limits
So today, Claude announced that they
agreed to a partnership with SpaceX, so
Elon Musk's company, and it's going to
substantially increase their compute
capacity, which means that they've been
able to increase their usage limits for
Claude Code and for the Claude API. So
today in San Francisco was the first
Code with Claude 2026 event, which is
basically just a big developer
conference that they're doing in San
Francisco, London, and Tokyo over the
next month or so. And they actually got
so much demand for this that they
extended it. It looks like they did an
extra day in each of these locations. So
pretty cool. But as you guys are probably
aware if you've been using Claude
for a while, this past couple of
months, this past quarter, has been awful
with outages. There have been so many times
where Claude has just died while you're
trying to use it. And there's probably a
lot of different reasons for that. You
know, like they were doing a lot of
testing. They were shipping so many
features. They had Opus, they had
Mythos, but really the main reason is
that they just didn't have enough
compute to handle how much demand there
was. There were way more people
trying to use Claude than Claude
could actually support. So today I
wanted to actually break this down and
help you guys understand what this means
for you practically and what you might
want to start doing differently. So
higher usage limits for Claude and a
compute deal with SpaceX. Very very
interesting. So because of this
partnership, what that means effective
immediately is that first of all,
they're going to be able to double
Claude Code's 5-hour rate limits.
Double. Whether you're on Pro, Max, or
Team, your 5-hour limit is going to be
doubled. So, when you're here inside of
Claude, whatever plan you're on, this is
going to last two times as long. Now,
what they also did is they removed the
peak hours limit reduction on Claude Code
for Pro and Max accounts. So, if you
guys remember maybe a month ago or so,
they came out with this announcement and
said, "Hey, during peak hours, so like
weekday mornings-ish, you will hit your
session limit faster." Because once
again, they had way too many people
using Claude Code during peak hours and
they didn't have enough compute on their
side. And then they also did that thing
where they tested out not letting people
buy a Pro plan, so the $20-a-month plan,
and use Claude Code with it. You had to buy a
Max plan in order to use it unless you
were already a subscriber. So they were
doing all these things to figure out
like how can we actually compete or how
can we actually keep up with how much
demand we're getting for our platform
right now? Because you also have the
issue of if you go out and buy a bunch
of extra compute and it's just sitting
there and it's not being utilized, then
that's also money going down the drain
because the compute is just sitting
there not being utilized. So, if you
really think about like the math and the
projections you'd have to do, it's
really not a very simple problem to
solve. And I'm not sure if this was the
intention. Remember when people were
using the Claude subscription for like
OpenClaw and Hermes Agent and then they
said, "Hey, you can't do that anymore.
It's against our terms of service." Yes,
it is against their terms of service.
But I wonder if there was also some
motivation there to be like, "Whoa, we got
to like chill a little bit with how many
people are just abusing their
subscriptions right now. And we know if
we switch over to API keys, then a lot of
people might stop using our models as
much for OpenClaw or Hermes." So anyways,
just a quick thought but then the final
thing is that they are raising their API
rate limits considerably for Claude Opus
models. So it is a really decent chunk.
So you used to only be able
to send 30k input tokens per minute or
you'd be rate limited, and that has been
raised by like 16x. On the output side
it used to be 8,000 a minute and now
it's 80,000 a minute. So, every single
tier got a really significant jump
here. I mean, if you think about only
being able to output 8,000 tokens in a
minute, I have gone past that so many
times. Once again, this is not for Claude
Code. This is for the API. But besides
just the partnership with SpaceX, they
were also on kind of a buying spree. You
know, they have an agreement with
Amazon. They have an agreement with
Google and Broadcom. They also have a
partnership with Microsoft and Nvidia
and a big investment in American AI
infrastructure with Fluidstack. So,
this was obviously a big announcement
with SpaceX, but they were kind of
working towards this and they've been
moving in the direction of figuring out
how to get more compute either way. And
the day before this actual conference,
they made that announcement about
partnering with that Goldman Sachs JV,
as well as Blackstone. And you can just
tell that they're really going after
enterprise here. And they need to have
compute to be able to handle enterprise.
And they are expanding internationally.
So, anyways, let's just start to break
this down. I don't want to waste
students' time. I want to keep this video
pretty quick. So, the headline, we read
all this, right? Anthropic released
three big changes. They also released
something pretty interesting with
managed agents. They gave them, like, webhooks
and autodreaming and, you know,
multi-agent orchestration, and I'm not
going to cover that right now. I'm
definitely going to play around with it
and I'll bring a video if I find some
interesting stuff. But this one's more
about the actual usage limits. The
5-hour rate limits got doubled. Peak
hours throttling has been removed and
API rate limits have been improved
significantly. So why does this matter?
Well, for months, everyone who was
building with Claude Code has been
hitting walls. There's been so many
complaints about people hitting the
limits so fast. So many complaints about
people wanting to upgrade from Pro to
Max to a higher Max tier. And even on the
highest Max plan still getting shut down
and not being able to use 5 hours' worth
of Claude Code. And I don't know how many
of you guys are, but if you were using
the API for Opus and you were trying to
build production agents or, you know,
apps that have some AI on the back end
with Opus, you might be hitting rate
limits very frequently. So, these are
the three changes that we just talked
about that have happened today. And
let's look at this rate limit thing
again because the statistics are pretty
interesting. The lowest tiers obviously
got the biggest multiples. We see 16
here and we see 10 here. But all of
these other tiers still got a bump too, and, you
know, the input's getting more than the
output just because input tokens are much less
expensive than output tokens. But this
basically means if you have half a
million input tokens per minute, you
can pump roughly 370 pages of context
per minute. And that's just on tier one.
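As a quick sanity check on the numbers quoted so far, here is a minimal sketch of the arithmetic. The token figures are the ones read out in the video; the function names are made up for illustration, and the tokens-per-page ratio is only what the "half a million tokens is roughly 370 pages" claim implies, not an official conversion.

```python
# Per-minute figures as quoted in the video for Opus models on the API.
OLD_INPUT_TPM = 30_000
NEW_INPUT_TPM = 500_000    # "half a million input tokens per minute"
OLD_OUTPUT_TPM = 8_000
NEW_OUTPUT_TPM = 80_000
PAGES_AT_NEW_INPUT = 370   # "roughly 370 pages of context per minute"


def multiplier(old: int, new: int) -> float:
    """How many times larger the new limit is than the old one."""
    return new / old


def tokens_per_page() -> float:
    """Tokens per page implied by the 500k-tokens ~ 370-pages claim (an assumption)."""
    return NEW_INPUT_TPM / PAGES_AT_NEW_INPUT


def pages_per_minute(tokens_per_minute: int) -> float:
    """Pages of context that fit into one minute of input budget."""
    return tokens_per_minute / tokens_per_page()


if __name__ == "__main__":
    print(f"input multiplier:  {multiplier(OLD_INPUT_TPM, NEW_INPUT_TPM):.1f}x")
    print(f"output multiplier: {multiplier(OLD_OUTPUT_TPM, NEW_OUTPUT_TPM):.1f}x")
    print(f"implied tokens per page: {tokens_per_page():.0f}")
    print(f"pages per minute at old input limit: {pages_per_minute(OLD_INPUT_TPM):.1f}")
```

Under that ratio, the input limit grew a bit under 17x, the output limit grew 10x, and the old 30k input budget works out to about 22 pages per minute.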
But before today, you would have only
had 30k tokens, which might have been
like 20 to 22 pages. And on the output
side, we can now generate way more
content much quicker. So if you wanted
to have like a bunch of different agents
running in parallel, that just would
have been really, really hard to do
under the previous rate limits with
Opus. And so obviously how they paid for
this was the SpaceX deal, right? They
got 300 megawatts of capacity. They got
over 220,000 Nvidia GPUs. And they did
this all super super fast, which is
really impressive. And I'm not going to
get very technical here, right? Compute
is expensive and these AI models need
compute. If you think about like you