Regression

44m 27s · 5,815 words · 858 segments · English

FULL TRANSCRIPT

0:00

start with the regression and then I

0:04

don't know how much time will it take if

0:06

time permits then we will do either sources of

0:08

capital or one of

0:11

that okay regression

0:14

uh so first the equation Y = c + mX

0:19

which is now written as Y = a + bX or

0:23

written as Y = b0 + b1X okay that c

0:27

or uh the a or that b0 is called as

0:31

starting point which is also called as

0:34

intercept okay uh so we call for today

0:38

as Y = a + bX but in the exam you could

0:41

say Y = b0 + b1X or Y = c +

0:45

mX for today say Y is equal to a + bX

0:49

so the a is called as starting point

0:51

which is also called as

0:53

intercept if that is positive then the

0:56

graph will start from above and if that

0:58

a is negative the graph will Start From

1:00

Below so that's the importance of the

1:02

intercept where does the line start from

1:05

a positive that means the line will

1:06

start from above above the origin and a

1:09

negative line will start below the

1:10

origin second is the B which is called

1:13

as slope coefficient the formula to calculate it

1:17

I'm taking after some time if the B

1:19

value is positive the line would be

1:21

upward sloping if the B value is zero

1:23

line would be flat and if the B value is

1:25

negative line would be downward sloping

1:28

if the B value is negative line be

1:30

downward sloping if B value is zero line

1:31

would be flat if the B value is positive

1:33

line would be upward sloping and if

1:35

value is more positive line would be

1:36

more upward sloping okay

1:39

yeah so a is the starting point if a is

1:42

positive line will start from above

1:44

origin if a is negative line will start

1:45

from below origin okay if B is negative

1:49

line will be going down if B is zero

1:51

line will be horizontal if B is positive

1:54

line would be upward sloping and this

1:55

more positive line would be more steeply

1:57

upward sloping

2:00

uh this x which is the horizontal axis

2:02

is called as independent variable and

2:04

this Y which is the vertical axis is

2:06

called as dependent variable so the

2:07

equation y equal to a plus BX Y is

2:10

dependent variable X is independent

2:12

variable a is intercept and B is your

2:15

slope all right so a starting point B

2:18

the steepness X independent variable y

2:21

dependent

2:22

variable this line is called as the

2:25

regression line okay this line is called

2:27

as regression line and this model is

2:29

called as linear linear model this model

2:31

is linear linear

2:36

model if instead of Y = a + bX you have ln Y = a

2:40

+ bX then that's called as log-linear

2:42

model instead of Y = a + bX if you have

2:46

Y = a + b into log of X that is ln X that

2:49

is called as linear-log model and again if

2:52

it's ln for Y also and ln for X also then it's

2:55

called as log-log model so Y = a + bX

2:58

with plain X and plain

3:00

Y is the linear-linear model

3:05

ln Y with plain X is the log-linear model

3:09

plain Y with ln X is the linear-log

3:12

model and ln Y with ln X is the log-log

3:14

model you should know the formula for

3:17

the B although a and the B can be

3:19

calculated directly from that 2nd

3:21

7 and 2nd 8 so if you press

3:23

2nd 7 put all the data then 2nd 8 and

3:27

select linear otherwise 2nd enter

3:30

and then scroll down down down down down

3:31

you'll get a b and then R so a that's

3:35

the a intercept and B slope coefficient

3:38

but obviously in the exam they might ask

3:39

you the formula what is the

3:42

formula for b

3:46

covariance between Y and

3:49

X divided by variance of X so cov of Y and X by variance of X

3:54

that's the formula for the b covariance between

3:57

Y and X that is dependent and

3:59

independent variable by variance of X

4:01

that is variance of independent variable

4:03

I think in the example we had S&P 500 and some ABC stock

4:07

so the ABC stock was the Y and S&P 500

4:11

was the X so covariance between the X and the

4:14

Y would be covariance between S&P 500 and ABC

4:17

stock and divided by variance of X and

4:19

that X was S&P 500 so variance of S&P 500

4:22

in the

4:23

denominator so general form is covariance

4:25

between Y and X by variance of X and once

4:28

you get the value of b then only you can

4:30

get the value of a so over there instead

4:32

of Y you put the Y bar that is average

4:35

of Y instead of X you put the X bar which

4:38

is average of X so Y is equal to a plus

4:40

bX instead of Y Y bar instead of X X bar b we

4:44

already calculated so only thing which

4:46

is unknown is a value okay so for Y Y bar

4:50

average of Y is known to you for X X bar is

4:53

already known to you B value calculated

4:55

so the only thing which is unknown is a

4:57

value so that's the way you're going to

4:58

get a value
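
The two formulas just described, b = Cov(Y, X) / Var(X) and then a = Y bar minus b times X bar, can be sketched in a few lines of Python. This is only an illustration: the function name `fit_line` and the five data points are hypothetical, not values from the lecture.

```python
from statistics import mean


def fit_line(x, y):
    """Return (a, b) for Y = a + bX using b = Cov(Y, X) / Var(X)."""
    x_bar, y_bar = mean(x), mean(y)
    n = len(x)
    # Sample covariance and sample variance both divide by n - 1,
    # so the divisor cancels in the ratio.
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    b = cov_xy / var_x
    # The fitted line passes through (X bar, Y bar), so a is the only unknown.
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data: Y is built as exactly 2 + 3X, so the fit should recover it.
market = [1.0, 2.0, 3.0, 4.0, 5.0]
stock = [5.0, 8.0, 11.0, 14.0, 17.0]
a, b = fit_line(market, stock)
print("intercept a =", a, "slope b =", b)
```

Because b has to be known before a can be solved, the function computes them in that order, the same order used on the board.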

5:06

then that coefficient B whether that is

5:09

statistically different from zero or it

5:12

is statistically zero find out we have

5:15

three ways but why it is important to

5:17

find out what is statistically zero

5:19

statistically different from zero

5:21

statistically zero zero useless then

5:24

that case that X will become useless so

5:27

X coefficient B if that b is

5:29

statistically zero then if it is zero

5:31

then it's useless if it is statistically

5:34

different from zero then it is useful so

5:36

once you calculate the value of B I

5:38

think it was the 0.64 value I think so b value

5:41

which was 0.64 if that is statistically

5:44

zero or statistically non zero

6:00

is important because if the B value is

6:01

statistically zero then that's useless

6:04

and if B value is statistically non zero

6:06

then that's useful so how do we go for

6:08

that testing so there is three ways one

6:11

is the easiest way which is called as P

6:12

value if p-value is given to be 0.3

6:16

percentage then up to a max of

6:20

99.7% confidence level the b would

6:25

be non zero that means H1 would be

6:26

selected and H1 would have B non zero so

6:29

B will be different from zero or B will

6:31

not be zero that would be your H1 and

6:34

that H1 would be selected Max up to

6:37

99.7% if the given p-value is 0.3

6:39

percentage so the p-value is 0.3 so 100 - 0.3 is

6:43

99.7% so up to 99.7% confidence we can

6:47

say that the H1 would be selected which

6:49

means we can say that the B is non zero
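
The p-value rule can be written as a small check. This is a sketch under the usual convention that H0 (b = 0) is rejected when the p-value is at or below alpha; the function name and the two alpha levels tried below are assumptions for illustration, only the 0.3 percentage p-value comes from the lecture.

```python
def p_value_decision(p_value, alpha):
    """Return (max_confidence, reject_h0) for the test H0: b = 0."""
    # Highest confidence level at which H1 (b != 0) is still selected.
    max_confidence = 1.0 - p_value
    # H0 is rejected only when the p-value does not exceed alpha.
    reject_h0 = p_value <= alpha
    return max_confidence, reject_h0


# p-value of 0.3 percentage, tested at two hypothetical alpha levels.
conf_level, reject_at_5pct = p_value_decision(0.003, 0.05)
_, reject_at_01pct = p_value_decision(0.003, 0.001)
print("max confidence:", round(conf_level, 4))
print("reject H0 at alpha 5%:", reject_at_5pct)
print("reject H0 at alpha 0.1%:", reject_at_01pct)
```

At alpha of 5 percentage the required confidence (95) is inside 99.7, so H1 is selected; at alpha of 0.1 percentage the required confidence (99.9) is beyond 99.7, so H0 stays.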

6:53

so take an example P value is 1

6:55

percentage and the confidence sorry

6:58

alpha is 0.1 percentage p-value is 1

7:02

percentage and alpha is 0.1 percentage

7:05

whether b is statistically zero or

7:07

statistically non

7:10

zero I repeat p-value 1 percentage

7:14

alpha

7:15

0.1% whether the b is statistically zero or

7:19

statistically non

7:22

zero what is P value 1 percentage so

7:26

what's Max confidence up to 99

7:29

percentage H1 would be selected up to 99

7:33

percentage B will be different from zero

7:35

H1 means not equal to that is different

7:38

from zero so up to 99 percentage the B

7:41

would be non zero but the question had

7:44

said alpha of

7:46

0.1 percentage that is confidence of

7:48

99.9 that means it was beyond the 99 so

7:51

up to 99 H1 would be selected Beyond 99

7:55

H1 would be rejected h0 would be

7:57

selected and H0 would state that b is

8:00

equal to zero H1 was b not equal to zero H0

8:05

was b equal to zero so up to 99 percentage

8:08

based on P value of 1 percentage up to

8:10

99 percentage H1 would be selected

8:13

Beyond 99 percentage H1 would be

8:15

rejected h0 would be selected and h0 was

8:19

b equal to zero b equal to zero getting selected

8:22

would mean that the coefficient is

8:24

statistically zero and hence useless yeah

8:27

useless useless because statistically

8:29

zero

8:33

useless second method is you do the

8:36

testing and for that you have to first

8:39

write the null and the H1 null would be

8:44

opposite of your belief so generally null

8:46

would be b equal to zero and H1 would be b

8:49

not equal to

8:52

zero now suppose they ask is the coefficient greater

8:56

than 0.20 so greater than 0.20 or less than or equal

9:02

to 0.20 the less than or

9:05

equal the equal to sign can never come in H1

9:09

so greater than 0.20 would be there in H1

9:13

and less than or equal to 0.20 will be in

9:15

h0 because equal to that will be h0 I

9:19

say greater than that means less than

9:21

equal to would be h0 and greater than

9:23

would be H1 so greater than would mean

9:26

right tail okay so you'll be focusing on

9:28

only the upper critical value okay so

9:32

now we have to look at the T Test

9:34

compulsorily over here as we don't have the population standard deviation

9:37

if the standard deviation of population is known use Z if the

9:41

standard deviation of population is not known

9:43

use the t-test so in this example when I

9:46

said the H1 would be b greater than 0.20

9:50

then greater than would mean upper tail

9:53

so there's only one tail so we'll be

9:54

looking at one tail Alpha would be given

9:56

to you let's say 5% alpha is given to

9:58

you so one tail 5 percentage we'll be looking

10:01

at that column and degree of

10:03

freedom would

10:04

be n minus 2 good for correlation and

10:08

regression the degree of freedom is

10:09

going to be n minus 2 so in that column

10:11

degree of Freedom n minus 2 one tail and

10:14

Alpha

10:15

5% in the previous example H0 was b equal to zero and H1

10:20

b not equal to zero so not equal to means left

10:23

tail and right tail both so 2.5 over

10:25

here 2.5 over here so you will look at one

10:27

tail 2.5 or two tail 5 so two tail

10:30

5 degree of freedom n minus 2 and you will be

10:32

looking at the lower and upper critical values in the

10:35

second example b is greater than 0.20 so

10:39

greater than means upper tail only so the

10:41

entire 5 percentage will be over here so

10:43

one tail 5 percentage degree of freedom

10:45

n

10:48

minus 2 so you'll get the area of H1 and

10:51

the h0 and then you have to apply the

10:53

formula remember write down that formula

11:15

given value okay I'll not speak out the

11:16

formula you write down the formula and

11:17

then check it on the

11:27

board how many have written like this

11:29

raise

11:33

hand this is wrong there is no root n in the denominator

11:36

it's only this formula okay the

11:39

value of b that you calculated minus the

11:41

value in the

11:44

H1

11:48

sorry so in the first example H0 was like

11:52

this b equal to zero and H1

11:55

was b not equal to zero so that zero we'll put over here

11:58

and in the second

12:06

example

12:08

H1 was b greater than 0.20 so there we'll be putting

12:12

0.20 all right so whatever is the value

12:15

in the hypothesis if it is the zero value put that zero

12:18

over here and if it is 0.20 you put

12:21

that 0.20 over here okay so the

12:23

calculated value of b which I believe

12:25

was 0.64 right

12:27

0.64 and then s of b1 will be given to you

12:30

that's called as standard

12:33

error of slope

12:42

coefficient how the table will look

12:45

like uh you'll be given some y value

12:48

over here which is like let's say stock

12:50

return

12:52

return then you will be given over here

12:56

intercept and then the x value which is

12:59

let's say Market

13:03

return return

13:05

coefficient and then standard

13:10

error so let's say these values are like

13:12

this coefficients two and three and standard errors let's say uh 2.5 and

13:17

4.5 so how to read this the equation

13:19

will be stock return that will be

13:22

your Y is equal to a +

13:26

bX that a is the intercept that intercept a is

13:30

this

13:32

two plus the slope is three and that X

13:37

is Market

13:39

return all right so the Y will be

13:43

written over here or it will be

13:45

given y value that is y meaning stock

13:50

return this is your X which is your

13:53

Market return intercept is a

13:57

value and this is your coefficient value

14:00

B

14:01

value okay so this is the B value three

14:04

which you'll be putting it over here in

14:05

the formula B value

14:07

three value in the H1 is over here or

14:10

over here whatever the H1 they will give

14:11

it to you and the standard error of B1

14:14

is this standard error of B1 is this

14:16

this

14:20

4.5 So based on that you'll get your

14:22

calculated value if the calculated value falls in the H0

14:25

area H0 would be selected if the calculated

14:28

value falls in the H1 area H1 would

14:36

be okay so
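
The test statistic from the board, the calculated b minus the hypothesized value, divided by the standard error of the slope with no root n, can be sketched as below. The slope 3 and standard error 4.5 are the table values from the lecture; the critical values 2.0 and 1.7 are hypothetical stand-ins for what the t table would give at n minus 2 degrees of freedom.

```python
def slope_t_stat(b_hat, b_hypothesized, se_b):
    """t = (estimated slope - hypothesized slope) / standard error of slope."""
    return (b_hat - b_hypothesized) / se_b


def reject_two_tail(t_calc, t_crit):
    # H1: b != value. Reject H0 when t falls in either tail.
    return abs(t_calc) > t_crit


def reject_right_tail(t_calc, t_crit):
    # H1: b > value. Reject H0 only when t falls in the upper tail.
    return t_calc > t_crit


# Two-tail test of H0: b = 0 (critical value 2.0 is an assumption).
t_zero = slope_t_stat(3.0, 0.0, 4.5)
# Right-tail test of H0: b <= 0.20 (critical value 1.7 is an assumption).
t_point20 = slope_t_stat(3.0, 0.20, 4.5)

print(round(t_zero, 4), reject_two_tail(t_zero, 2.0))
print(round(t_point20, 4), reject_right_tail(t_point20, 1.7))
```

In both cases the calculated value lands in the H0 area, so H0 is selected for these numbers.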

14:43

p-value hypothesis

14:47

testing and third confidence interval so what you

14:50

do over here you get the b value which

14:52

was 0.64 or in the table example

14:59

that was

15:02

three so you add as well as you

15:07

subtract something called as s into

15:10

t you get b plus s t and over here you

15:14

get b minus s t that s is standard error which was

15:23

given 4.5 standard error so the standard

15:26

error s is 4.5 and t value will have to

15:29

be found using the t table degree

15:32

of freedom n minus 2 and it's two tail

15:37

because there is a left limit and a right limit of the

15:43

interval okay so it's two tail and then

15:46

Alpha will be given to you yeah this

15:48

interval will be given to you so this is

15:50

95 percentage so

15:52

2.5

15:55

2.5 so 5 and then get the T value let's

16:00

say t value is 2 so 4.5 into 2 is 9 so

16:05

here it will be 3 +

16:08

9 and here it will be 3 - 9 so 3 - 9 is

16:12

-

16:15

6 and 3 + 9 is 12 and now the important

16:19

thing is in the interval minus 6 to 12 will zero be there zero

16:25

is there because lower limit is negative

16:29

-5 -4 -3 -2 -1 zero and then 1 2 3 4 5

16:35

6 7 8 9 10 11 12 zero has come zero has

16:38

come so H0 which we

16:42

say b equal to zero is going to be selected

16:46

and H1 b not equal to zero will be

16:49

rejected so that's the third method I

16:52

repeat you'll be given B values let's

16:54

say b value is two you have to do b plus

16:57

s t that's the right tail and then you

17:01

have to do b minus s t that's the left

17:04

tail s value will be given to you let's

17:07

say s value is

17:08

1.5 and to get the T value you have to

17:11

look at the T table degree of Freedom n

17:13

minus 2 that's one tail that second tail

17:15

so look for two tail and Alpha will be

17:17

given to you so based on that you get

17:19

the T value let's say t value is let's

17:22

say two or let's say t value is yeah

17:26

let's say two

17:29

then you do s into T So 1.5 into two

17:31

that would be three so two this B value

17:34

two 2 + 3 and over here B Val is two and

17:36

2 -

17:38

3 so s into T that will be 1.5 into 2

17:42

that will be 3 so 2 + 3 and 2 - 3 so

17:46

again 2 - 3 is

17:49

-1 and 2 + 3 is 5 so from -1 to 5 will zero

17:54

be there zero would be

17:56

there so H0 b equal to zero would be

18:00

selected and if it is zero it is

18:02

statistically

18:05

useless okay so these are the three methods

18:08

suppose interval is minus 5 to minus 2

18:12

suppose interval came like this then is

18:14

the zero there over here

18:17

no so zero is not there then H1 b not equal

18:22

to zero is going to be

18:23

selected so if it is not equal to zero

18:26

it's meaningful it's useful
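
The confidence interval method can be checked with the same numbers used on the board. This is a minimal sketch; the function names are hypothetical, and the t value of 2 is the round figure assumed in the lecture, not one read from a table.

```python
def slope_confidence_interval(b_hat, se_b, t_crit):
    """Return (lower, upper) = (b - s*t, b + s*t)."""
    half_width = se_b * t_crit
    return b_hat - half_width, b_hat + half_width


def contains_zero(lower, upper):
    # If zero lies inside the interval, H0: b = 0 is selected.
    return lower <= 0.0 <= upper


first = slope_confidence_interval(3.0, 4.5, 2.0)   # b = 3, s = 4.5
second = slope_confidence_interval(2.0, 1.5, 2.0)  # b = 2, s = 1.5

print(first, "zero inside:", contains_zero(*first))
print(second, "zero inside:", contains_zero(*second))
# An interval such as -5 to -2 does not contain zero, so H1 is selected.
print((-5.0, -2.0), "zero inside:", contains_zero(-5.0, -2.0))
```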

18:32

similarly

18:35

range 0.7 to

18:37

7.5 in this range is zero

18:41

there so zero is not in this range so H0 b

18:45

equal to zero is going to be

18:48

rejected so b is not I'm sorry H1 will be

18:50

selected sorry H1 b not equal to zero sorry

18:53

H1 is b not equal to zero so that not equal

18:56

to zero is going to be selected

19:01

so it's not equal to zero so it's

19:06

useful so three methods one is P value

19:08

which I believe is the easiest one

19:10

hypothesis testing and then the

19:11

confidence

19:15

interval next you calculate the value of Y a and

19:19

the b value will be given to you either

19:21

in the table

19:23

like intercept was given and market

19:26

return was given so what was the value of

19:28

intercept I gave you

19:34

check the coefficient and standard error which

19:39

was I think

19:41

4.5 so y equal to intercept is

19:45

two and uh the market return return the

19:48

coefficient is three so that's

19:50

three x is your Market return so that's

19:53

your X Market

19:57

return and Y is your stock

20:02

return so you'll be given Market return

20:04

put that market return so let's say

20:05

Market return is let's say 5 percentage

20:08

you put that over there and you get the

20:09

value of stock return okay 5 into 3 15

20:12

15 and the intercept would be 2 percentage so 2 percentage

20:16

so 15 percentage plus 2% will be 17

20:19

percentage that's called as estimated

20:21

value this is not the actual value this

20:24

is the estimated value actual value

20:26

could be higher or actual value could be

20:28

lower

20:30

and that will give us an error and that

20:33

error is called as actual value minus

20:35

predicted value this 17 percentage is

20:37

called as a predicted value actual value

20:39

could be higher or lower if actual value

20:42

is higher then the error would be

20:45

positive or

20:48

negative good the formula is actual

20:51

minus expected actual minus predicted

20:54

predicted is 17 if actual is higher than

20:57

that something like 18 so 18 - 17 error

21:00

would be plus one and if actual is

21:02

lesser than that something like 16 then

21:04

the error would be actual minus

21:06

estimated 16 minus 17 so that would be

21:09

negative 1 so if actual is higher error

21:11

would be positive if actual is lesser

21:13

error would be

21:15

negative so what we do we again

21:18

construct a confidence interval around

21:22

this so again confidence interval we put

21:25

the the calculated value in the middle

21:28

that is 17 percentage in the

21:30

middle what do we add and what do we

21:34

subtract s t 17 plus s t and 17 minus s t and

21:42

this will be s for the forecasted

21:48

value which you don't have to calculate that

21:51

formula I don't expect that formula to

21:52

be tested so this s for the forecast

21:55

will be given to you but there's a

21:57

formula given remember in the text

21:59

s e into bracket 1 + 1 upon n and so on so that

22:05

big formula that is the standard error

22:07

for forecast is not required to be

22:09

learned by heart so this would be that s standard

22:12

error of the forecast and t value again

22:15

you have to look up it's a two tail

22:17

left tail right tail so two tail and

22:20

degree of freedom would be n minus 2 so

22:22

from that you'll get the t value then

22:24

multiply the s and t add it and then subtract

22:28

it and you'll get your confidence

22:36

interval so once you calculate the

22:39

estimated value you could get two types

22:41

of variations over there one variation

22:43

is going to

22:44

be calculate the error so they give you

22:47

actual value and I have seen students

22:49

making a silly mistake they do estimated

22:51

minus actual as error which is wrong

22:54

actual minus estimated will be the error

22:57

so basically you calculate the y value

22:58

using a plus BX formula yal a plus BX

23:02

and that would be the estimated value of

23:04

y stock return 17 that was the estimated

23:08

value actual value will be given to you

23:10

and actual minus estimated would be the

23:12

error value not the other way around
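
Both question types can be sketched together: the predicted value from Y = a + bX, the error as actual minus predicted, and the interval around the prediction. The intercept 2, slope 3 and market return of 5 percentage are the lecture's numbers; the forecast standard error 1.5 and t value 2 are hypothetical, since the lecture only says they will be given or looked up.

```python
def predict(a, b, x):
    """Estimated Y from the regression line Y = a + bX."""
    return a + b * x


def error(actual, predicted):
    # Always actual minus predicted, never the other way around.
    return actual - predicted


def prediction_interval(predicted, s_forecast, t_crit):
    half_width = s_forecast * t_crit
    return predicted - half_width, predicted + half_width


stock_return = predict(2.0, 3.0, 5.0)        # 2 + 3 * 5
err_high = error(18.0, stock_return)         # actual above the line
err_low = error(16.0, stock_return)          # actual below the line
# s_forecast = 1.5 and t = 2.0 are assumed values for illustration only.
interval = prediction_interval(stock_return, 1.5, 2.0)

print("predicted:", stock_return)
print("errors:", err_high, err_low)
print("interval:", interval)
```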

23:14

that could be one type of question and

23:16

second would be the confidence interval

23:18

put the estimated value at the center 17

23:20

then add the s t subtract the s t 17 plus

23:23

St 17 minus St you'll get a confidence

23:26

interval s will be given to you T you

23:28

have to look for two tail degree of

23:31

Freedom n minus 2 and then you'll get

23:33

the T value St add St subtract from that

23:36

calculated value of

23:38

y all

23:41

right then ANOVA full form analysis of

23:46

variance so over here let's write that

23:48

table down and from that only we'll be

23:50

able to reconnect everything

24:02

so first would be over here regression

24:05

sum of

24:06

square and then error sum of square and

24:09

then that becomes total sum of

24:11

square sometimes it is a sum of square

24:13

of regression sum of square of error sum

24:15

of square of total sometimes called as

24:17

regression sum of square error sum of

24:19

square and total sum of

24:22

square so these values will be given to

24:25

you it's coming

24:29

and then coefficient of determination R

24:31

square is to be calculated and that

24:33

would be RSS upon

24:37

TSS regression sum of square upon total

24:40

sum of

24:42

square that is the coefficient of

24:47

determination

24:49

second the degree of freedom will be

24:52

required to be known the degree of freedom for regression is k

24:54

over here which in level one is one and for

24:58

error sum of square it's n - k - 1 so n

25:01

- 1 - 1 so in level one it's n -

25:10

2 actual formula is k and n minus k minus

25:14

one in level one the k is

25:17

one so that becomes n minus 1 - 1 so that

25:21

becomes n minus

25:22

2 yes silence please then mean sum of square of

25:28

regression so that will be RSS upon k that

25:31

is RSS upon one or SSR upon one and this

25:35

is MSE which is

25:38

SSE upon n minus 2 or actually n minus k

25:41

minus

25:44

one all right so this is SSR or RSS

25:47

divided by one that would be MSR and

25:50

then SSE upon n minus 2 would be

25:53

MSE and then finally the ratio of MSR

25:56

upon MSE would be your F value MSR upon

26:01

MSE would be F calculated

26:05

value F critical value mostly will be

26:08

given to you f critical value will be

26:11

mostly given to you this will be given

26:12

to you if not you have to look at the F

26:14

table

26:15

and degree of freedom for the numerator

26:17

and degree of freedom for the

26:18

denominator so numerator degree of

26:20

freedom is one and denominator degree of

26:23

freedom is n minus 2 okay so degree of

26:26

freedom for the numerator and degree of

26:28

freedom for the denominator is 1 and n minus 2

26:30

respectively in the F

26:32

table degree of freedom

26:34

for the numerator and for the denominator their intersection

26:37

point will give you the critical

26:39

value and if the calculated value is

26:42

greater if the calculated value is coming

26:45

out to be greater than the critical value then the model is

26:48

better model or model is good model the

26:51

problem with R square is if R square value is higher the model is

26:54

good but what is the cut off we don't

26:57

know over here so if I have R square of

26:59

82 percentage we don't know whether it's

27:01

good or bad so if his R square is 85

27:04

percentage then my R square of 82

27:05

percentage is bad and if his R square is

27:08

65 percentage then my R square of 82

27:10

percentage is good so R square has no cut off

27:13

so look at the F test which is MSR upon

27:17

MSE that's the calculated value and

27:18

compare that with the critical value if

27:20

the calculated value is higher then the

27:22

model is

27:26

good then there is s e

27:30

which is standard deviation of

27:32

error which is called as SEE standard

27:36

error of

27:39

estimate or also called as standard

27:42

deviation of

27:44

error which is

27:47

MSE square

27:54

root okay so the formulas are like this

27:59

RSS plus SSE is

28:03

SST that's one formula then RSS upon TSS

28:07

is R

28:13

square then RSS upon k which is one is

28:19

MSR SSE upon n minus k minus one that's

28:23

MSE and then the ratio is F calculated value

28:28

and then finally square root of MSE is

28:30

called as standard deviation of error

28:32

also called as SEE standard error of

28:45

estimate Y = a + bX and just formula for

28:49

the b I'll write it one more time b and

28:51

a formula b is equal to covariance between Y

28:55

and

28:56

x divided by

28:58

variance of

29:01

X all right once we get the value of y

29:04

sorry once we get the value of b then in place of Y

29:06

Y bar in place of X X bar b is known so only unknown

29:09

would be the a so you calculate b into

29:11

X bar you put it on the other side with a

29:13

negative

29:19

sign this Y = a + bX is called as linear-linear

29:26

model with ln Y on the left

29:28

that's called as log-linear

29:34

model with ln X on the right it's called as linear-log

29:40

model and finally with ln on both sides it's called as log-

29:44

log

29:50

model just recapping b test three

29:54

methods P

29:56

value which will give you max

29:58

confidence level up to which you can

30:01

say H1 would be

30:03

selected which says that b is not

30:06

equal to zero will be selected second

30:08

one is uh the confidence interval where

30:10

you have b minus s t

30:21

and b plus s t if in the b minus

30:24

s t to b plus s t zero is there that means b is

30:27

equal to zero if zero is not there then the b is

30:29

not equal to zero and the third one is

30:31

hypothesis testing where you write H0 b

30:34

equal to zero H1 b not equal to zero and

30:36

then you make the

30:39

distribution curve with both tails because

30:41

not equal to then get the t values using

30:44

degree of freedom which is n minus 2 and

30:48

the two tail 5 percentage and then apply

30:51

the formula which is value of b minus

30:54

the value in the hypothesis divided by standard

30:56

error not standard error upon root of n

30:58

only standard error only standard error not

31:01

divided by root n so that's the third

31:04

method then you'll be asked you'll be

31:06

asked to use this Y = a + bX so

31:09

values of A and B will be given to

31:16

you something like this and they will

31:18

not be given to you like this they will

31:19

given to you in the table

31:23

table so using that they will give you x

31:26

value and they will ask you to calculate

31:27

y value and then you'll be asked either

31:29

for the error or the confidence interval

31:32

error predicted value so actual value

31:35

minus the predicted value will be your

31:37

error so using this formula the value

31:39

that you get is predicted value and then

31:41

actual value will be given to you so

31:43

difference between them actual value and

31:44

predicted value is the error that's one

31:47

and second would be once you get the

31:49

predicted

31:50

value subtract

31:52

St and add St and you'll be asked to do

31:56

a confidence interval for that

31:58

s will be given to you that formula for

32:00

the s will not be asked to be learned by

32:02

heart and the t value would be like

32:04

degree of Freedom n minus two and it's

32:07

right tail and left tail so it's a two

32:09

tail Alpha will be given to

32:12

you and then

32:14

that analysis of

32:22

variance it's only there for the theory

32:25

it's not there that is no numerical

32:39

did we cover everything or miss out something any

32:41

theory that you missed

32:42

out please have a look at it I'm just

32:44

scrolling to see if I missed out any

32:51

theory that's linear-log log-log model

32:54

so no numericals on this

33:01

this we covered confidence interval for the predicted value

33:03

as I said this formula is not required

33:04

to be

33:11

learned by heart we did

33:20

fitting RSS upon

33:23

TSS this one RSS upon TSS

33:35

this one if they give you correlation

33:38

coefficient then R square will be

33:40

correlation coefficient

33:42

squared so if they give you correlation

33:45

between the Y and the X as let's say

33:49

0.80 then R square which is called as

33:52

coefficient of determination would be 0.80

33:55

Square
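
With one independent variable, the shortcut is a single line of arithmetic. A quick check of the 0.80 example follows; the function name is hypothetical.

```python
def r_squared_from_correlation(r):
    """Coefficient of determination for a one-variable regression."""
    return r * r


r2 = r_squared_from_correlation(0.80)
print(round(r2, 4))
# The sign of the correlation is lost on squaring: -0.80 gives the same value.
r2_negative = r_squared_from_correlation(-0.80)
print(round(r2_negative, 4))
```

A correlation of 0.80 therefore corresponds to an R square of about 64 percentage.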

33:57

this is the point I missed out

33:59

correlation between the dependent and

34:01

independent variable R square here so

34:04

you get the R square coefficient of

34:10

determination types

34:12

of hypothesis testing types of error if you

34:14

want me to revise I do that but types

34:23

of all right

34:26

so point let me highlight the

34:31

point because

34:33

act or

34:38

line is is

34:41

positive

34:45

or below the line so this will be called as

34:48

negative error the line will give you

34:50

predicted values so actual value is below the predicted

34:53

value so this will be a negative error

34:56

and over here actual value is over here

34:58

predicted value somewhere over here so

35:00

this will be a positive error and this

35:03

will be a negative error so where the actual

35:05

points are below so actual minus predicted

35:07

will be a negative error and where

35:10

actual points are above and predicted points

35:12

are below so that's

35:20

positive the formula let me write that

35:25

formula what we are going to do is we

35:27

have the Y over here and the Y Bar over

35:30

here that's Y minus Y bar which we call it

35:32

as a

35:34

deviation all right Y minus Y bar so we'll do

35:37

Y minus Y bar square and add them up which is called as TSS

35:41

total sum of

35:43

square then we introduce that Y predicted

35:46

value

35:47

YP okay and then we do YP minus Y bar square

35:51

which is called as the regression sum of

35:54

square and then Y minus YP is called as

35:57

error actual minus predicted is called as error so

35:59

that squared becomes error sum of square so one

36:01

more time this is Y and this is average

36:04

value of Y and the difference between

36:07

them is called as deviation and squared

36:10

it's total sum of

36:12

square and then predicted value in the

36:15

middle somewhere so YP minus Y bar

36:18

that's the regression sum of square and

36:21

then this is error actual minus predicted

36:23

that's called as error sum of

36:25

square so Y minus YP square and over

36:30

here YP minus Y bar

36:33

square and total sum of square is like Y minus Y

36:37

bar

36:40

square I

36:42

repeat Y minus Y bar square that's called as

36:46

total sum of

36:51

square then YP minus Y bar square is

36:55

called as regression sum of square

37:00

and lastly Y minus YP square that's called as

37:05

error sum of

37:18

square we didn't cover this errors
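
The three sums of squares and the ANOVA quantities built from them can be computed end to end on a small data set. This is a sketch, not the lecture's own example: the five data points and the function name are hypothetical, and RSS here means regression sum of square, following the lecture's naming.

```python
from math import sqrt


def anova_simple(x, y):
    """ANOVA quantities for a one-variable regression Y = a + bX."""
    n, k = len(x), 1
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    y_pred = [a + b * xi for xi in x]

    tss = sum((yi - y_bar) ** 2 for yi in y)                # total: (Y - Y bar)^2
    rss = sum((yp - y_bar) ** 2 for yp in y_pred)           # regression: (YP - Y bar)^2
    sse = sum((yi - yp) ** 2 for yi, yp in zip(y, y_pred))  # error: (Y - YP)^2

    msr = rss / k                # regression degree of freedom is k = 1
    mse = sse / (n - k - 1)      # error degree of freedom is n - 2
    return {
        "TSS": tss, "RSS": rss, "SSE": sse,
        "R2": rss / tss, "MSR": msr, "MSE": mse,
        "F": msr / mse, "SEE": sqrt(mse),
    }


# Hypothetical data.
result = anova_simple([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in result.items():
    print(name, round(value, 4))
```

The F value printed at the end is the calculated value; it would still have to be compared with the critical value from the F table at 1 and n minus 2 degrees of freedom.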

37:28

so I'll talk about this

37:29

errors okay apart from okay so let let's

37:32

cover this

37:34

errors or assumptions or errors or this

37:38

are assumptions and if this assumptions

37:39

are violated that becomes

37:43

error first one the name of the chapter

37:46

what's the name of the

37:47

chapter linear regression that means

37:50

there has to be a linear relationship

37:52

between Y and X the dependent variable Y

37:55

and the independent variable X should have a

37:58

linear relationship the error is when there is no

38:00

linear relationship or there's a

38:01

nonlinear relationship nonlinear like Y = X²

38:05

or Y = root of X or Y = X cube this is

38:09

called as nonlinear in Y = a + bX that X is

38:14

like X to the power one so when it is X to the power one not

38:18

X square or X cube X to the power one that's called

38:21

as linear relationship so assumption

38:22

over here is that Y and X have a linear

38:25

line relationship not a curve

38:27

relationship Square Cube square roots if

38:30

you plot them in the Excel sheet you'll

38:32

get a curve you will not get a line so

38:36

assumption is the Y and the X have a

38:37

linear relationship error would be when

38:40

they have a nonlinear

38:41

relationship second variance variance

38:46

should be constant if the variance is

38:48

not constant then that is going to have

38:52

error which is called as

38:53

heteroscedasticity it's mentioned over

38:55

there variance not being constant is

38:57

called as

38:58

can someone tell me the page number I

39:03

have here 20 so page number 2011 so variance

39:08

constant is homoscedastic variance not

39:11

constant is

39:12

heteroscedastic now let me explain what

39:14

do you mean by variance not constant and

39:16

variance

39:18

constant stock market returns okay stock

39:22

market is it equally volatile every day

39:26

or on some days is it more volatile some

39:29

days more volatile some days less

39:30

volatile that is stock market is heteroscedastic

39:34

if the volatility would have been

39:36

constant every day then it would have

39:38

been homoscedastic so if the volatility

39:40

that is Sigma is going to be constant

39:42

across the time then the stock market

39:44

would be homoscedastic but it is not some

39:47

days the stock market is stable some

39:48

days like very very volatile so the

39:50

volatility is not constant hence stock

39:53

market is

39:54

heteroscedastic if the standard deviation or the

39:58

volatility is constant homoscedastic if

40:01

the standard deviation or the uh the

40:04

volatility is not constant then it's

40:07

called as

40:09

heteroscedastic the first one is linear

40:11

relationship obviously because the name

40:12

of the chapter is linear regression so

40:15

assumption is there is a linear

40:16

relationship between the Y and the

40:18

X if there is not a linear relationship

40:21

that is there is a nonlinear

40:23

relationship then that would be an

40:25

error second one if if there is the

40:29

error sorry the if the volatility is

40:31

constant then that's called as

40:32

homoscedastic if the volatility is not

40:34

constant that's called as

40:45

heteroscedastic listen to me carefully the

40:48

error first error and

40:52

second

40:54

Rel what does that mean

41:04

act second value

41:07

be chances are higher that's called as

41:11

positive dependency between the errors

41:14

okay this should not happen the

41:16

assumption is the error of one period

41:19

and the error of Next Period should not

41:21

have any dependency which is in simple

41:24

English called as errors should be

41:25

random the error

41:51

not happen like

41:56

that in every time the chance of six

41:59

is 1 upon six all right every time the

42:02

chance of six is 1 upon six so every

42:04

time whether six will be there or not

42:06

there that will be randomly determined
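
The dice idea above can be sketched in Python. This is only an illustration under assumptions that are not from the lecture: the helper name `lag1_correlation`, the sample size of 5000, and the carry-over factor 0.8 are all made up for the example. It measures how strongly each error moves with the one just before it. Independent errors should give a value near zero, while errors that carry over from the previous period should give a clearly positive value.

```python
import random

def lag1_correlation(errors):
    """Correlation between each error and the error that follows it."""
    first, second = errors[:-1], errors[1:]
    mean_f = sum(first) / len(first)
    mean_s = sum(second) / len(second)
    cov = sum((f - mean_f) * (s - mean_s) for f, s in zip(first, second))
    var_f = sum((f - mean_f) ** 2 for f in first)
    var_s = sum((s - mean_s) ** 2 for s in second)
    return cov / (var_f * var_s) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Independent errors: each draw ignores the previous one, like a die roll.
independent = [rng.gauss(0, 1) for _ in range(5000)]

# Dependent errors: each error keeps 80% of the previous error
# (0.8 is an arbitrary illustrative choice, not a value from the lecture).
dependent = [0.0]
for _ in range(4999):
    dependent.append(0.8 * dependent[-1] + rng.gauss(0, 1))

print(round(lag1_correlation(independent), 2))  # close to zero
print(round(lag1_correlation(dependent), 2))    # clearly positive
```

A value far from zero on real regression errors would suggest the no-dependency assumption is violated.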

42:08

so basically similarly error positive

42:11

second error be

42:14

positive that means errors are

42:17

dependent that means errors are not

42:19

random and

42:22

violation does it have a name it is there in

42:24

level

42:26

two called as serial correlation in

42:28

level two anyways but I think so

42:31

basically they require you that the

42:33

errors should be independent of each other that

42:42

means chance of the second error coming

42:44

out to be positive or negative whether

42:46

first error is positive or negative

42:48

won't have any dependency on the second

42:50

error so three assumptions first Y and X

42:53

linear relationship not Square not Cube

42:56

not square root that is it's not a

42:59

nonlinear relationship second the

43:01

variance or the volatility is going to

43:04

be constant which is homoscedastic

43:06

variance not constant that error is

43:08

called as heteroscedastic and third

43:11

errors should be independent of each

43:13

other that means error should not have

43:15

any dependency

43:21

each so I repeat it one more time y = a

43:24

+ bx² y = a + b√x y = a + bx³

43:30

not allowed y = a + bx is linear

43:34

combination which is allowed all

43:37

right then the standard deviation or the

43:40

variance or the volatility should be

43:42

constant which is called as

43:43

homoscedastic not constant that's error heteroscedastic

43:47

and errors no dependency if there's a

43:49

dependency that's error which is called

43:51

as serial correlation in level

43:55

two and normal distribution

43:57

normal

44:21

distribution all right so that completes the

44:23

regression
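
The first two assumptions from the recap can be checked roughly in Python. This is a sketch on made-up data, not a method from the lecture: the helper names (`fit_line`, `residuals`, `variance_ratio`), the intercept 2, the slope 3 and the noise levels are all assumptions chosen for illustration. It fits y = a + bx by least squares, then looks at the errors. For y = x² the errors follow a pattern (positive at both ends, negative in the middle), which points to a nonlinear relationship. For the variance check it compares the error variance in the upper half of x against the lower half: a ratio near 1 suggests homoscedastic errors, and a ratio far from 1 suggests heteroscedastic errors.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def residuals(xs, ys):
    """Errors left over after fitting the straight line."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def variance_ratio(xs, ys):
    """Error variance for the larger half of x over the smaller half.

    Assumes xs is sorted in ascending order.
    """
    res = residuals(xs, ys)
    half = len(res) // 2
    return variance(res[half:]) / variance(res[:half])

rng = random.Random(1)  # fixed seed so the sketch is repeatable
xs = [i / 100 for i in range(1, 1001)]  # 0.01, 0.02, ..., 10.00

curved = [x ** 2 for x in xs]                             # nonlinear: y = x²
homo = [2 + 3 * x + rng.gauss(0, 1) for x in xs]          # constant spread
hetero = [2 + 3 * x + rng.gauss(0, 0.5 * x) for x in xs]  # spread grows with x

curve_res = residuals(xs, curved)
print(curve_res[0] > 0, curve_res[len(xs) // 2] < 0, curve_res[-1] > 0)
print(round(variance_ratio(xs, homo), 2))    # near 1
print(round(variance_ratio(xs, hetero), 2))  # well above 1
```

The split-in-half comparison is a deliberately simple stand-in for formal tests. It only shows what "variance not constant" looks like in the errors.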
