TRANSCRIPT (English)

Claude Code + Ollama: FREE Local AI Coding FOREVER (Step-by-Step Tutorial)

15m 34s · 3,162 words · 419 segments · English

FULL TRANSCRIPT

0:00

Hi guys, welcome back. In today's video we'll be talking about how you can use Claude Code without paying a single dollar. We are going to use the power of Claude Code with a local large language model running on our own machine. You can see that the pricing of Claude Code at the moment is about $17 if you go with the annual subscription discount, and $1,200 is charged up front when you purchase this particular plan. And that is in US dollars; if I'm buying in New Zealand, it is $34 for this subscription, which is, to be honest, quite a lot. If I'm going to spend that much money, I know it's going to be very fast, because Claude Code is super fast and the models they have are super powerful.

0:44

But what if we could run everything on our local machine without spending a single dollar? We can achieve that with the power of Claude Code itself, because it now has access to all the Ollama models. If you go to Claude Code over here, this is the Claude Code we all know and have used before, and the installation is quite straightforward; this is how you install it. If you haven't installed it before, I'll quickly show you how in this video. Once you've done the installation, if you go to Ollama over here, they now have the ability to connect with Claude Code and use the models that are in Ollama itself. So you can basically use models from Ollama within Claude Code.

1:30

This time I'm going to be using my ASUS GX10 supercomputer, which is this one, as I have already demoed before. This particular computer comes with an NVIDIA Blackwell chip and 128 GB of GPU memory, and it is really, really powerful. So I'm going to use this machine for the demonstration. I could also use my Apple M1 machine, but it is a bit slower compared to the ASUS GX10. So I'm going to use the ASUS GX10 and show you how amazingly you can use everything from the ground up. As I said, let's get started; I'm quite excited to show you what I have installed today.

2:16

[music]

2:17

So, the first thing we need to do is install Claude Code. I'm going to copy this particular command, as you can see over here. I am actually in the Ollama docs; under Integrations there is a section called Coding, and within Coding there is Claude Code. In the documentation they mention that Claude Code, through Ollama's Anthropic-compatible API, can use models such as GLM 4.7, Qwen3 Coder, and the GPT-OSS models. These are the models recommended by the Ollama team for use with Claude Code, and I'm just going to go with what they suggest and show you how amazing they are.
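For reference, the recommended models need to be pulled into whichever Ollama instance you plan to use. A minimal sketch; the exact tags are assumptions based on Ollama's public model library and are not shown in the video:

    # Pull a recommended coder model; tags are assumptions, check the Ollama library.
    ollama pull qwen3-coder:30b
    ollama pull gpt-oss:20b
    ollama list   # confirm the models are available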

3:04

First, I'll do the installation of Claude Code. I copy the command, open my terminal over here, paste it, and hit Enter. There we go, the installation is done. It also tells you that the native installation exists, but the local bin directory is not on your PATH, so you need to add it, either in the .zshrc file or in the .bashrc file. For this demonstration, though, I'm just going to run the export command instead, so I paste that over here. Now I have Claude Code available. I do a source of the .zshrc, which just reloads the terminal session, and now if I run claude over here, it just works. See, it's in green, which means it's happy, and it's going to work for us.
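Put together, the steps I just ran look roughly like this. A minimal sketch, assuming Anthropic's documented native installer and that the binary lands in ~/.local/bin:

    # Install Claude Code with the native installer (assumed installer URL).
    curl -fsSL https://claude.ai/install.sh | bash

    # The installer warned that ~/.local/bin is not on PATH; fix it for this session.
    export PATH="$HOME/.local/bin:$PATH"

    # To make it permanent, add that export line to ~/.zshrc (or ~/.bashrc), then:
    #   source ~/.zshrc

    # Verify it runs.
    claude --version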

3:51

So I'm going to go with one of the projects I have, which is the EA app project, this one. I have opened this particular application in the Rider IDE, as you can see over here. If you want to use the Rider IDE and all the products from JetBrains free of cost for 3 months, you can use the link in the description below, which gives you that 3-month free offer. If you go to the website, jetbrains.com, you can see they have coding agents natively integrated in the IDE, so you can use that part free of cost as well. And if you go to all the products over here, every product you're seeing is available free of cost for 3 months; just use the coupon code below and it will give you the discount. Thank you, JetBrains, for making this happen. Well, as I said, this is the Rider IDE I'm going to be using for this particular demonstration over here.

4:50

If I run this particular application, I'll quickly show you what it looks like. Running it basically opens two pages: one is the web URL and the other one is the back end with an API over here. If I run the product app, this is what you see: the list of products I have got. I can create a product, delete a product, edit a product, or view a product; all of these things I can do from here. It's a very, very simple UI, and I have also got the APIs. If you have watched my Udemy courses before, you know how this application is built and how it is tested in Playwright and Selenium with AI; I have covered everything with this particular application, and I have done it many times, so this is nothing new if you have been following my Udemy courses.
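If you want to follow along, starting the two parts of a solution like this from the CLI looks roughly like the sketch below. The project paths are placeholders, not the real names from the repo:

    # Hypothetical project paths; substitute the real ones from the solution.
    dotnet run --project src/EAWebApp       # serves the web UI
    dotnet run --project src/EAWebApp.Api   # serves the back-end API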

5:40

So I'm going to go to the Hyper terminal over here one more time, and because we have already installed Claude Code, we now need to use Ollama to make it happen. The way you actually do it is with a command called ollama, and then a subcommand called launch. You need to make sure you have updated Ollama to the latest version; if not, these features may not work, so just update Ollama to the latest version. Then you'll see there is something called claude, and you use it with the --config flag. If you hit Enter, it shows you all the models that are running on my ASUS GX10 machine.
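Put together, the commands described on screen amount to this; as mentioned above, the launch integration assumes a recent Ollama build:

    # Update Ollama first if the launch integration is missing, then pick a
    # model and hand the session off to Claude Code.
    ollama launch claude --config   # flag spelling as described ("hyphen config")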

6:22

How did I connect to it? I have already talked about that in my other video, but I'll quickly show you how I did it: I am using the NVIDIA Sync link option for the Spark, which does the connectivity for me over here. This is the way I have connected to my ASUS GX10 machine, and this is the Ollama instance I'm connected with. Just to show you one more time, in a different window: if I do an ollama list, you can see that I have got all these models, and these are the models on my ASUS GX10. But the moment I disconnect from the ASUS GX10 and do an ollama list again, these are the models running on my local machine. See, the models are completely different: there is a Kimi K2.5 cloud model, a GPT-OSS 120-billion-parameter model, and a DeepSeek V3.1 671-billion-parameter cloud model. Those are the models running on my local machine, as you can see over here.

7:21

machine as you can see over here. uh and

7:24

the moment I'm going to connect with the

7:26

Asus GX10 over here using the Nvidia

7:29

sync and if I'm going to connect to the

7:32

Olama over there. Now what happens is

7:35

the models are going to be different as

7:37

you can see see that these are the

7:38

models are coming from the ASUS GX10

7:40

itself and you can see that the the

7:42

parameter that I'm running it over here

7:45

120 billion parameter which is 65GB of

7:48

storage uh it's taking and 20 billion

7:50

parameter and there is a Q uh three

7:54

quarter 30 billion parameter as well. So

7:56

I'm going to use this particular model

7:57

this time and I'm going to see how it

7:58

works. So I'm going to copy this

8:00

So I'm going to copy this particular model, and I'm going to close this window because it's not required. Because I have already connected, see that the moment I run ollama launch claude --config, it shows me all the models on my machine, all the models in the Ollama instance on my ASUS GX10. I'm going to choose the Qwen3 Coder 30-billion-parameter model, and the moment I select it, it asks: do you want me to launch Claude Code now? I say yes, and from this point on all the configuration is delegated to Claude Code. See: launching Claude Code with Qwen3 Coder 30 billion parameters. So now it is connected to Claude Code. If I type /model, which is this one, it shows me the models available, and the one I'm connected to is Qwen3 Coder 30 billion, listed as a custom model. Wow, which is cool. So we have connected to the local model over here.
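Under the hood, ollama launch is doing the endpoint wiring for you; conceptually it is equivalent to pointing Claude Code's Anthropic endpoint at the local Ollama server. A rough sketch, assuming Claude Code's documented ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, and ANTHROPIC_MODEL overrides and Ollama's default port; none of this wiring is shown in the video:

    # Assumed manual wiring; `ollama launch claude` sets this up automatically.
    export ANTHROPIC_BASE_URL="http://localhost:11434"   # default Ollama port (assumption)
    export ANTHROPIC_AUTH_TOKEN="ollama"                 # placeholder value, not a real credential
    export ANTHROPIC_MODEL="qwen3-coder:30b"             # assumed tag for the model picked above
    claude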

9:03

So I'm just going to press Escape, and now I'm going to ask it to write some code. You saw that the UI was quite plain, right? It's not really modern or anything like that, but I want this UI to be modernized, to have a more modern look and feel instead of what it has now. So I'm going to reference the EA web app and type: "Can you try to modernize the UI of my EA web app application, which is built using the C# .NET MVC framework." That's all, and now I just hit Enter.

9:50

From this point on, it starts doing the app building and everything within my local large language model over here, instead of using anything from the cloud. I don't even have a Claude subscription right now; I have completely gotten rid of it, and I'm actually using everything from my local large language model. This machine, I know, is a very powerful machine. It's also very costly, but at least you can see that this is how you can use these kinds of machines for development as well as testing with local models. There are many advantages to having these machines and having Claude Code here, because the token throughput is not that bad; it is doing quite well. At the end of this particular execution, I will show you how many tokens were generated and used to perform this operation, and we'll see how it looks. So I'm just going to wait for the entire execution to happen.

10:48

Ah, look at that. It has already worked out what the UI currently is and that we want to modernize things. So there is a UI/UX improvement plan: it says migrate to Bootstrap 5 for better responsiveness and modern components, and so on, plus component modernization, technology-stack updates, performance improvements, and accessibility. Wow, that's pretty cool; I didn't even know about some of these. So I'm going to say yes, allow all edits during the session, handing the entire task over to my local large language model. And because it is a local large language model, I of course have control over it; I can unplug it any time. Now I'm just going to see how things work, so let's wait for this to happen.

11:37

All right, you can see that the changes have been implemented now. We have got everything done over here, with all the changes this tool was making; I'm just scrolling up over here. You can see how many changes it made: it used around 23,000 tokens to make this happen and complete the entire task. So now let me stop this run and rebuild the solution, because this is .NET code, so I need to rebuild it. Oops, I can see there is an error coming up somewhere. If I go up a bit, there is a red line there: a possible conflict of assets with the same target path. There we go, I think we have an error here, so the build has failed. I'm going to put the same thing to Claude Code, and we'll see how that works.
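For reference, the rebuild step above is just the standard .NET CLI; the solution file name here is a placeholder, not the real one from the repo:

    # Rebuild the solution after Claude Code's edits (solution name is hypothetical).
    dotnet build EAWebApp.sln
    # A failing build prints the MSBuild errors, like the conflicting-assets one above.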

12:26

Okay, there we go. Now I have asked Claude Code to see if there is any way to fix this particular issue, so let's wait for it to be resolved. It's running again, and now I'll need to wait and see how much time it takes. Last time, building the entire application took around 15 minutes; it's not as fast as you might imagine, it is slower. If you do the same thing with the cloud models, for example Claude 4.5 or Opus 4.6, it's going to be super fast, but this is slower, to be honest, because it's all running on my local machine, and the machine is getting warmer as well. I'll just need to wait and see how long the entire fix is going to take. Yeah, it's doing something, so let's see how long it takes to complete. I'll wait for the fix to be fully done, and then I'll be back.

13:26

All right, finally the error is also gone. I just ran the check again to see whether it was gone, and it seems to be, and here is the UI, as you can see. It has finally built this particular UI: it creates a product, and the UI is completely amazing. There are also 24 products listed, which you can see over here, and there is a homepage; look at that, how amazing it is. And there is a product page over here, which also shows the type as "peripheral", something like that, and you can see the view and an edit button. Wow, this is pretty cool. All of this is happening just from our local large language model, as you can imagine, and it is working as expected.

14:07

This is the power of a local large language model running on the ASUS GX10, and of how you can work entirely offline, instead of going out to the internet, by using a powerful local model. So yeah, I can see that it's all working fine and as expected, and I can see there is potential in using local large language models, running on a local machine, with Claude Code. But if you ask me whether they are very fast, I would probably say no. They are not even close to the models running in the cloud; a cloud model does things way faster than anything running on the local machine, that's what I can see. They are also not as reliable as the cloud models; that's my honest opinion. But still, if you think you have use cases for doing this kind of development with your local large language model, you can still get these kinds of operations done; it's just going to take a long time. This entire recording took me more than 25 minutes to get to this point, but if I did the exact same thing with a cloud model, I could have done it in less than five or six minutes, max.

15:17

That's it, guys. Once again, thank you so much for watching this video. This is how you can use Claude Code with Ollama and use a local large language model to do all of these operations. Thank you so much, and catch you in the next one.
