TRANSCRIPT (English)

Claude Just Solved Session Limits

10m 23s · 2,478 words · 356 segments · English

FULL TRANSCRIPT

0:00

So today, Claude announced that they

0:01

agreed to a partnership with SpaceX, so

0:03

Elon Musk's company, and it's going to

0:05

substantially increase their compute

0:07

capacity, which means that they've been

0:08

able to increase their usage limits for

0:10

Claude Code and for the Claude API. So

0:12

today in San Francisco was the first

0:14

Code with Claude event 2026, which is

0:16

basically just a big developer

0:18

conference that they're doing in San

0:19

Francisco, London, and Tokyo over the

0:21

next month or so. And they actually got

0:23

so much demand for this that they

0:24

extended it. It looks like they did an

0:26

extra day in each of these locations. So

0:28

pretty cool. But as you guys probably

0:30

are aware if you've been using Claude

0:31

for a while is that this past couple

0:33

months, this past quarter has been awful

0:35

with outages. There's been so many times

0:37

where Claude has just died while you're

0:40

trying to use it. And there's probably a

0:41

lot of different reasons for that. You

0:42

know, like they were doing a lot of

0:43

testing. They were shipping so many

0:44

features. They had Opus, they had

0:45

Mythos, but really the main reason is

0:47

that they just didn't have enough

0:48

compute to handle how much demand there

0:50

was. There were way more people

0:51

trying to use Claude than what Claude

0:53

could actually support. So today I

0:55

wanted to actually break this down and

0:56

help you guys understand what this means

0:58

for you practically and what you might

1:00

want to start doing differently. So

1:01

higher usage limits for Claude and a

1:03

compute deal with SpaceX. Very very

1:05

interesting. So because of this

1:07

partnership, what that means effective

1:10

immediately is that first of all,

1:11

they're going to be able to double

1:12

Claude Code's 5-hour rate limits.

1:14

Double. Whether you're on Pro, Max, or

1:16

Team, your 5-hour limit is going to be

1:19

doubled. So, when you're here inside of

1:20

Claude, whatever plan you're on, this is

1:22

going to last two times as long. Now,

1:25

what they also did is they removed the

1:26

peak hours limit reduction on Claude Code

1:29

for Pro and Max accounts. So, if you

1:31

guys remember maybe a month ago or so,

1:33

they came out with this announcement and

1:34

said, "Hey, during peak hours, so like

1:36

weekday mornings-ish, you will hit your

1:37

session limit faster." Because once

1:39

again, they had way too many people

1:41

using Claude Code during peak hours and

1:43

they didn't have enough compute on their

1:44

side. And then they also did that thing

1:46

where they tested out not letting people

1:48

buy a Pro plan, so the $20 a month plan,

1:51

and use Claude Code. You had to buy a

1:53

Max plan in order to use it unless you

1:55

were already a subscriber. So they were

1:57

doing all these things to figure out

1:58

like how can we actually compete or how

2:00

can we actually keep up with how much

2:01

demand we're getting for our platform

2:03

right now? Because you also have the

2:04

issue of if you go out and buy a bunch

2:07

of extra compute and it's just sitting

2:08

there and it's not being utilized, then

2:10

that's also money going down the drain

2:11

because the compute is just sitting

2:13

there not being utilized. So, if you

2:15

really think about like the math and the

2:16

projections you'd have to do, it's

2:18

really not a very simple problem to

2:20

solve. And I'm not sure if this was the

2:21

intention. Remember when people were

2:23

using the Claude subscription for like

2:24

OpenClaw and Hermes Agent and then they

2:26

said, "Hey, you can't do that anymore.

2:27

It's against our terms of service." Yes,

2:29

it is against their terms of service.

2:31

But I wonder if there was also some

2:32

motivation there to be like, "Whoa, we got

2:33

to like chill a little bit with how many

2:35

people are just abusing their

2:37

subscriptions right now." And we know if

2:38

we switch over to API keys then a lot of

2:40

people might stop using our models as

2:42

much for OpenClaw or Hermes. So anyways

2:44

just a quick thought but then the final

2:46

thing is that they are raising their API

2:48

rate limits considerably for Claude Opus

2:50

models. So it is a really decent chunk.

2:53

So per minute you used to only be able

2:55

to send 30k input tokens at a time or

2:57

you'd be rate limited and that has been

2:59

upgraded by like 16x. On the output side

3:02

it used to be 8,000 a minute and now

3:03

it's 80,000 a minute. So, every single

3:05

tier got a really significant jump

3:08

here. I mean, if you think about only

3:09

being able to output 8,000 tokens in a

3:11

minute, I have gone past that so many

3:14

times. Once again, this is not for Claude

3:15

Code. This is for the API. But besides

3:17

just the partnership with SpaceX, they

3:19

were also on kind of a buying spree. You

3:21

know, they have an agreement with

3:22

Amazon. They have an agreement with

3:23

Google and Broadcom. They also have a

3:25

partnership with Microsoft and Nvidia

3:26

and a big investment in American AI

3:28

infrastructure with Fluidstack. So,

3:30

this was obviously a big announcement

3:31

with SpaceX, but they were kind of

3:33

working towards this and they've been

3:34

moving in the direction of figuring out

3:35

how to get more compute either way. And

3:38

the day before this actual conference,

3:39

they made that announcement about

3:41

partnering with that Goldman Sachs JV,

3:43

as well as Blackstone. And you can just

3:46

tell that they're really going after

3:48

enterprise here. And they need to have

3:49

compute to be able to handle enterprise.

3:51

And they are expanding internationally.

3:53

So, anyways, let's just start to break

3:54

this down. I don't want to waste

3:55

students time. I want to keep this video

3:56

pretty quick. So, the headline, we read

3:58

all this, right? Anthropic released

3:59

three big changes. Um they also released

4:01

something pretty interesting with

4:02

managed agents. They gave them like

4:04

webhooks and autodreaming and um you know

4:07

multi-agent orchestration and I'm not

4:08

going to cover that right now. I'm

4:09

definitely going to play around with it

4:10

and I'll bring a video if I find some

4:12

interesting stuff. But this one's more

4:14

about the actual usage limits. The

4:16

5-hour rate limits got doubled. Peak

4:18

hours throttling has been removed and

4:19

API rate limits have been improved

4:22

significantly. So why does this matter?

4:25

Well, for months, everyone who was

4:26

building with Claude Code has been

4:28

hitting walls. There's been so many

4:29

complaints about people hitting the

4:30

limits so fast. So many complaints about

4:32

people wanting to upgrade from Pro to

4:34

Max to a higher Max. And even on the

4:37

highest Max plan still getting shut down

4:39

and not being able to use 5 hours worth

4:41

of Claude Code. And I don't know how many

4:43

of you guys are, but if you were using

4:44

the API for Opus and you were trying to

4:46

build production agents or, you know,

4:48

apps that have some AI on the back end

4:50

with Opus, you might be hitting rate

4:51

limits very frequently. So, these are

4:53

the three changes that we just talked

4:54

about that have happened today. And

4:56

let's look at this rate limit thing

4:58

again because the statistics are pretty

5:00

interesting. The lowest tiers obviously

5:01

got the biggest multiples. We see 16

5:03

here and we see 10 here. But still

5:05

having all of these other ones, you

5:06

know, the input's getting more than the

5:08

output just because input is much less

5:10

expensive than output tokens. But this

5:12

basically means if you have half a

5:13

million input tokens per minute, you

5:15

can pump roughly 370 pages of context

5:17

per minute. And that's just on tier one.
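
To make the arithmetic in this part concrete, here is a minimal Python sketch that uses only the figures quoted in the video: a 30k input-tokens-per-minute limit with a 16x multiple, and output going from 8,000 to 80,000 tokens per minute. The tokens-per-page constant is an assumption backed out of the "half a million tokens is roughly 370 pages" estimate, not an official number, and all names here are hypothetical.

```python
# Figures quoted in the video for the lowest API tier (Opus models).
OLD_INPUT_TPM = 30_000      # input tokens per minute before the change
INPUT_MULTIPLE = 16         # multiple mentioned for input
OLD_OUTPUT_TPM = 8_000      # output tokens per minute before the change
NEW_OUTPUT_TPM = 80_000     # output tokens per minute after the change

# ASSUMPTION: rough page size implied by "500k tokens ~ 370 pages".
TOKENS_PER_PAGE = 1_350

new_input_tpm = OLD_INPUT_TPM * INPUT_MULTIPLE        # 480,000, i.e. "about half a million"
output_multiple = NEW_OUTPUT_TPM // OLD_OUTPUT_TPM    # 10


def pages_per_minute(tokens_per_minute: int,
                     tokens_per_page: int = TOKENS_PER_PAGE) -> float:
    """Convert a tokens-per-minute limit into an approximate page count."""
    return tokens_per_minute / tokens_per_page


if __name__ == "__main__":
    print(f"new input limit: {new_input_tpm:,} tokens/min")
    print(f"output multiple: {output_multiple}x")
    print(f"old input limit: ~{pages_per_minute(OLD_INPUT_TPM):.0f} pages/min")
    print(f"new input limit: ~{pages_per_minute(new_input_tpm):.0f} pages/min")
```

Under that assumed page size, 30,000 tokens comes out to about 22 pages and 480,000 tokens to about 356 pages, which lines up with the rough numbers given in the video.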

5:19

But before today, you would have only

5:20

had 30k tokens, which might have been

5:22

like 20 to 22 pages. And on the output

5:24

side, we can now generate way more

5:26

content much quicker. So if you wanted

5:28

to have like a bunch of different agents

5:30

running in parallel, that just would

5:31

have been really, really hard to do

5:32

under the previous rate limits with

5:34

Opus. And so obviously how they paid for

5:36

this was the SpaceX deal, right? They

5:38

got 300 megawatts of capacity. They got

5:40

over 220,000 Nvidia GPUs. And they did

5:43

this all super super fast, which is

5:45

really impressive. And I'm not going to

5:46

get very technical here, right? Compute

5:48

is expensive and these AI models need

5:50

compute. If you think about like you
