Regression
FULL TRANSCRIPT
We'll start with regression. I don't know how much time it will take;
if there is time, we will also do either sources of capital or one of those.
Okay, regression.
So first, the equation Y = C + MX, which is now written as Y = a + bX,
or as Y = b0 + b1X. That C, or the a, or that b0, is called the starting point,
which is also called the intercept. For today we'll write Y = a + bX,
but in the exam you could see Y = b0 + b1X or Y = C + MX.
So the a is called the starting point, which is also called the intercept.
If a is positive, the line starts above the origin, and if a is negative,
the line starts below the origin. That's the importance of the intercept:
where does the line start from?
Second is the b, which is called the slope coefficient. How to calculate it,
the formula, I'm taking after some time. If the b value is positive, the line
is upward sloping; if the b value is zero, the line is flat; and if the b value
is negative, the line is downward sloping. And if the value is more positive,
the line is more steeply upward sloping.
So once more: a is the starting point. If a is positive the line starts above
the origin, if a is negative the line starts below the origin. If b is negative
the line goes down, if b is zero the line is horizontal, if b is positive the
line is upward sloping, and the more positive it is, the more steeply upward
sloping the line.
This X, on the horizontal axis, is called the independent variable, and this Y,
on the vertical axis, is called the dependent variable. So in the equation
Y = a + bX, Y is the dependent variable, X is the independent variable, a is the
intercept and b is the slope. a is the starting point, b the steepness,
X the independent variable, Y the dependent variable.
This line is called the regression line, and this model, Y = a + bX, is called
the linear-linear model. If you have ln(Y) = a + bX, that's called the
log-linear model. If instead of Y = a + bX you have Y = a + b ln(X), that is
called the linear-log model. And if it's ln for Y and ln for X as well, then it
is called the log-log model. So:
Y = a + bX: linear-linear model
ln(Y) = a + bX: log-linear model
Y = a + b ln(X): linear-log model
ln(Y) = a + b ln(X): log-log
model. You should know the formula for the b, although the a and the b can be
calculated directly on the calculator from 2nd 7 and 2nd 8. You press 2nd 7,
put in all the data, then 2nd 8, and it should show linear, otherwise press
2nd ENTER, and then scroll down, down, down and you'll get a, b and then r.
That a is the intercept and b is the slope coefficient.
But obviously in the exam they might ask you the formula. What is the formula
for b? Covariance between Y and X divided by variance of X. That's the formula
for b: covariance between Y and X, that is, between the dependent and the
independent variable, divided by the variance of X, the variance of the
independent variable. I think the example was S&P 500 and some ABC stock.
ABC stock was the Y and S&P 500 was the X, so the covariance between the X and
the Y would be the covariance between S&P 500 and ABC stock, divided by the
variance of X, and that X was S&P 500, so variance of S&P 500 in the
denominator.
So the general form is covariance between Y and X upon variance of X. Once you
get the value of b, then only can you get the value of a. In Y = a + bX,
instead of Y you put Y bar, the average of Y, and instead of X you put X bar,
the average of X. Y bar is known to you, X bar is known to you, and b we have
already calculated, so the only thing which is unknown is the a value.
That's the way you're going to get the a value.
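The two formulas above can be sketched in a few lines of Python. The return figures below are made up purely for illustration (the lecture's S&P 500 and ABC stock data are not in the transcript), and the helper names are hypothetical.

```python
from statistics import mean

# Hypothetical returns in percent: x is the index (independent variable),
# y is the stock (dependent variable). These numbers are not from the lecture.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

def covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((p - x_bar) * (q - y_bar) for p, q in zip(xs, ys)) / (len(xs) - 1)

def variance(xs):
    """Sample variance, which is the covariance of xs with itself."""
    return covariance(xs, xs)

# Slope: covariance between Y and X divided by variance of X.
b = covariance(x, y) / variance(x)

# Intercept: put Y bar and X bar into Y = a + bX and solve for a.
a = mean(y) - b * mean(x)

print(f"b = {b:.4f}, a = {a:.4f}")
```

With these made-up numbers the slope comes out near 1.97 and the intercept near 0.09; on the calculator the same a and b would come from the data and stat worksheets.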
Then, that coefficient b: is it statistically different from zero, or is it
statistically zero? To find out we have three ways. But why is it important to
find out? If b is statistically zero then it's useless, and in that case that X
becomes useless. If it is statistically different from zero, then it is useful.
So once you calculate the value of b, I think it was 0.64, whether that is
statistically zero or statistically non-zero is important.
So how do we go for that testing? There are three ways. One is the easiest way,
which is called the p-value. If the p-value is given to be 0.3%, then up to a
maximum of 99.7% confidence level the b would be non-zero. That means H1 would
be selected, and H1 says b is not equal to zero. The p-value is 0.3, and
100 - 0.3 is 99.7%, so up to 99.7% confidence we can say that H1 is selected,
which means we can say that b is non-zero.
So take an example: the p-value is 1% and alpha is 0.1%. Is b statistically
zero or statistically non-zero? I repeat: p-value 1%, alpha 0.1%.
The p-value is 1%, so what's the max confidence? Up to 99% H1 would be
selected; up to 99% b is different from zero. But the question said alpha of
0.1%, which means 99.9% confidence, and that is beyond 99. Up to 99, H1 is
selected; beyond 99, H1 is rejected and H0 is selected. H1 was b not equal to
zero and H0 was b equal to zero. b equal to zero getting selected would mean
that the coefficient is statistically zero and hence useless. Useless, because
it is statistically zero.
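The p-value rule can be written as a tiny decision function. This is only a sketch of the comparison described above; the function name is hypothetical, and the rule "select H1 when the p-value is at or below alpha" is the usual textbook convention.

```python
def decide(p_value, alpha):
    """Return which hypothesis is selected for the slope coefficient b.

    H0: b = 0 (statistically zero, X is useless)
    H1: b != 0 (statistically non-zero, X is useful)
    H1 is selected only while the required confidence (1 - alpha)
    does not go beyond the max confidence (1 - p_value).
    """
    if p_value <= alpha:
        return "H1: b != 0 (useful)"
    return "H0: b = 0 (useless)"

# Lecture example: p-value 1%, alpha 0.1% (99.9% confidence is beyond 99%).
print(decide(0.01, 0.001))

# Earlier example: p-value 0.3%, tested at alpha 5% (95% is within 99.7%).
print(decide(0.003, 0.05))
```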
The second method is you do the hypothesis testing, and for that you first
have to write the null, H0, and the H1. The null would be the opposite of your
belief, so generally the null would be b equal to zero and H1 would be b not
equal to zero.
Suppose instead the question is: is the coefficient greater than 0.20?
The equal-to sign can never come in H1, so greater than 0.20 would be there in
H1 and less than or equal to 0.20 would be in H0. I say greater than, so less
than or equal to is H0 and greater than is H1. Greater than means right tail,
so you'll be focusing on only the upper critical value.
Now we have to look at the t-test compulsorily over here. If the standard
deviation of the population is known, you use the Z-test; if the standard
deviation of the population is not known, you use the t-test, and over here we
don't have it. So in this example, where H1 is b greater than 0.20, greater than
means upper tail, so there's only one tail. Alpha will be given to you, let's
say 5%, so we'll be looking in the one-tail 5% column. And the degrees of
freedom would be n - 2. For correlation and regression the degrees of freedom
are going to be n - 2. So: degrees of freedom n - 2, one tail, alpha 5%.
In the previous example, H0 was b equal to zero and H1 was b not equal to zero.
Not equal to means left tail and right tail both, so 2.5% over here and 2.5%
over here. One tail 2.5% is the same as two tail 5%, so with two tail 5% and
degrees of freedom n - 2 you'll be looking at the lower and upper critical
values.
In the second example, b is greater than 0.20, so greater than means upper tail
only, and the entire 5% will be over here: one tail 5%, degrees of freedom
n - 2. So you'll get the area of H1 and of H0, and then you have to apply the
formula. Write down that formula from memory. I'll not speak out the formula;
you write it down and then check it on the board.
How many have written it like this? Raise your hand. This is wrong: there is no
root n in the denominator. It's only this formula: the value of b that you
calculated, minus the value written in the hypotheses, divided by the standard
error of b. In the first example the hypotheses had zero, so you put zero over
here; in the second example they had 0.20, so you put 0.20 over here.
The calculated value of b, I believe, was 0.64. And the standard error will be
given to you; that's called the standard error of the slope coefficient.
How will the table look?
like uh you'll be given some y value
over here which is like let's say stock
return
return then you will be given over here
intercept and then the x value which is
let's say Market
return return
coefficient and then standard
error so let's say these values are like
this two three and let's say uh 2.5 and
4.5 so how to read this the equation
will be stock return return that will be
your Y is equal to a +
BX that a intercept that intercept a is
this
two plus the slope is three and that X
is Market
return all right so the Y will be
written over here or it will be
given y value that is y meaning stock
return this is your X which is your
Market return intercept is a
value and this is your coefficient value
B
value okay so this is the B value three
which you'll be putting it over here in
the formula B value
three value in the H1 is over here or
over here whatever the H1 they will give
it to you and the standard error of B1
is this standard error of B1 is this
this
4.5 So based on that you'll get your
calculated value calculated value h0
area h0 would be selected calculated
value H1 area H1 would
be okay so
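As a rough sketch, the test statistic with the table values above (slope 3, standard error 4.5) looks like this in Python. The critical values are assumptions standing in for a t-table lookup, because the transcript does not give n; the function name is hypothetical.

```python
def t_statistic(b_hat, b_hypothesized, std_error):
    """(estimated slope - hypothesized value) / standard error of the slope.

    Note there is no division by root n here.
    """
    return (b_hat - b_hypothesized) / std_error

b_hat, se_b = 3.0, 4.5

# Two-tailed test: H0 b = 0, H1 b != 0.
t_two_tail = t_statistic(b_hat, 0.0, se_b)
t_crit_two_tail = 2.0          # assumed table value, df = n - 2, two tail 5%
reject_two_tail = abs(t_two_tail) > t_crit_two_tail

# One-tailed test: H0 b <= 0.20, H1 b > 0.20 (upper tail only).
t_one_tail = t_statistic(b_hat, 0.20, se_b)
t_crit_one_tail = 1.7          # assumed table value, df = n - 2, one tail 5%
reject_one_tail = t_one_tail > t_crit_one_tail

print(f"two tail: t = {t_two_tail:.3f}, select H1: {reject_two_tail}")
print(f"one tail: t = {t_one_tail:.3f}, select H1: {reject_one_tail}")
```

With these numbers both calculated values land in the H0 area, so H0 is selected in both tests.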
So that was p-value and hypothesis testing; the third is the confidence
interval. What you do over here is take the b value, which was 0.64 in the
earlier example and 3 in this one, and you add as well as subtract something
called s into t. You get b + st and over here you get b - st. That s is the
standard error, which is given as 4.5, and the t value has to be looked up in
the t table with degrees of freedom n - 2. It's two tail, because the interval
has a left side and a right side. Then alpha will be given to you, or rather
the confidence level will be given to you. Say this is 95%, so 2.5% and 2.5%,
two tail 5%, and then get the t value. Let's say the t value is 2.
So 4.5 into 2 is 9; here it will be 3 + 9 and here it will be 3 - 9.
3 - 9 is -6 and 3 + 9 is 12. Now the important thing: in the interval -6 to 12,
is zero there? The lower limit is negative, so -5, -4, -3, -2, -1, zero, and
then 1, 2, 3 and so on up to 12. Zero has come. Because zero is in the
interval, H0, b equal to zero, is going to be selected and H1, b not equal to
zero, will be rejected. So that's the third method. I
repeat: you'll be given the b value, let's say the b value is 2. You have to do
b + st for the right side and b - st for the left side. The s value will be
given to you, let's say s is 1.5. To get the t value you have to look at the
t table: degrees of freedom n - 2, and there's one tail here and a second tail
there, so look for two tail, and alpha will be given to you. Based on that you
get the t value; let's say the t value is 2.
Then you do s into t: 1.5 into 2, that would be 3. So 2 + 3 and 2 - 3.
2 - 3 is -1 and 2 + 3 is 5. In -1 to 5, is zero there? Zero is there, so
b equal to zero would be selected, and if it is zero it is statistically
useless.
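The two interval examples above can be checked with a short sketch. The t value of 2 is the lecture's "let's say" figure rather than a real table lookup, and the function names are hypothetical.

```python
def slope_interval(b_hat, std_error, t_value):
    """Confidence interval b - s*t to b + s*t for the slope coefficient."""
    half_width = std_error * t_value
    return b_hat - half_width, b_hat + half_width

def contains_zero(interval):
    """If zero lies inside the interval, H0 (b = 0) is selected."""
    low, high = interval
    return low <= 0.0 <= high

first = slope_interval(3.0, 4.5, 2.0)    # b = 3, s = 4.5, t = 2
second = slope_interval(2.0, 1.5, 2.0)   # b = 2, s = 1.5, t = 2

for interval in (first, second):
    verdict = "b = 0 selected" if contains_zero(interval) else "b != 0 selected"
    print(interval, verdict)
```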
Suppose the interval came out as -5 to -2. Is zero there over here?
Zero is not there, so H1, b not equal to zero, is going to be selected.
If it is not equal to zero, it's meaningful, it's useful.
Similarly, take the range 7 to 7.5. Is zero there in this range?
Zero is not in this range, so H0, b equal to zero, is going to be rejected
and H1, b not equal to zero, is going to be selected. It's not equal to zero,
so it's useful.
So, three methods: one is the p-value, which I believe is the easiest one,
then hypothesis testing, and then the confidence
interval. Next, you calculate the value of Y. The a and the b values will be
given to you, usually in the table, like the one where the intercept and the
market return were given. What were the values I gave you? Intercept 2,
coefficient 3, and the standard error was, I think, 4.5. So Y = 2 + 3X,
where X is your market return and Y is your stock return.
You'll be given the market return; put that in. Let's say the market return is
5%. 5 into 3 is 15, the intercept is 2%, so 15% plus 2% will be 17%.
That's called the estimated value. This is not the actual value, this is the
estimated value. The actual value could be higher or the actual value could be
lower,
and that will give us an error. The error is the actual value minus the
predicted value. This 17% is called the predicted value, and the actual value
could be higher or lower. If the actual value is higher, would the error be
positive or negative? Positive, good. The formula is actual minus predicted.
Predicted is 17; if the actual is higher than that, something like 18, then
18 - 17, the error would be +1. If the actual is less than that, something like
16, then the error would be actual minus estimated, 16 - 17, so that would be
-1. So if actual is higher the error is positive, and if actual is lower the
error is
negative. So what we do is again construct a confidence interval around this.
Again we put the calculated value in the middle, that is, 17% in the middle.
What do we add and what do we subtract? st: 17 + st and 17 - st. This s will be
the standard error of the forecast. I don't expect its formula to be tested,
so this s for the forecast will be given to you. There is a big formula given
in the text, built from the standard error of estimate, but the formula for
the standard error of the forecast is not required to be memorized.
The t value you again have to look up: it's left tail and right tail, so two
tail, and the degrees of freedom would be n - 2. From that you'll get the
t value, then multiply s and t, add it and then subtract it, and you'll get
your confidence interval. So once you calculate the
estimated value, you could get two types of questions. One is to calculate the
error. They give you the actual value, and I have seen students making a silly
mistake: they do estimated minus actual as the error, which is wrong.
Actual minus estimated is the error. So basically you calculate the Y value
using Y = a + bX, and that is the estimated value of Y; the stock return of 17
was the estimated value. The actual value will be given to you, and actual
minus estimated is the error, not the other way around. That could be one type
of question.
The second would be the confidence interval: put the estimated value, 17, at
the center, then add st and subtract st, 17 + st and 17 - st, and you'll get a
confidence interval. s will be given to you; for t you have to look for two
tail, degrees of freedom n - 2. Then add st to and subtract st from that
calculated value of Y.
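Both question types can be sketched together. The intercept 2, slope 3 and market return of 5% are the lecture's numbers; the standard error of the forecast and the t value below are assumed placeholders, since the transcript says they would be given or looked up.

```python
a, b = 2.0, 3.0            # intercept and slope from the table
market_return = 5.0        # X, in percent

# Estimated (predicted) stock return from Y = a + bX.
predicted = a + b * market_return

def error(actual, predicted_value):
    """Error is actual minus predicted, not the other way around."""
    return actual - predicted_value

print(predicted)                 # 17.0
print(error(18.0, predicted))    # actual above the prediction: positive error
print(error(16.0, predicted))    # actual below the prediction: negative error

# Confidence interval around the forecast: predicted - s*t to predicted + s*t.
s_forecast = 1.2           # assumed standard error of the forecast
t_value = 2.0              # assumed two-tail t value with df = n - 2
lower = predicted - s_forecast * t_value
upper = predicted + s_forecast * t_value
print(f"{lower:.1f} to {upper:.1f}")
```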
Then ANOVA, full form analysis of variance. Let's write that table down, and
from that we'll be able to connect everything.
First over here is the regression sum of squares, then the error sum of
squares, and together that becomes the total sum of squares. Sometimes it is
written as sum of squares of regression (SSR), sum of squares of error (SSE)
and sum of squares total (SST); sometimes as regression sum of squares (RSS),
error sum of squares and total sum of squares (TSS). These values will be given
to you.
Then the coefficient of determination, R squared, is to be calculated, and that
would be RSS upon TSS: regression sum of squares upon total sum of squares.
Second, the degrees of freedom are required to be known. For regression the
degrees of freedom are k, which in level one is 1, and for the error sum of
squares it's n - k - 1, so n - 1 - 1, which in level one is n - 2.
Then the mean sum of squares of regression, MSR: that will be RSS upon k, that
is RSS upon 1. And MSE is SSE upon n - 2, or actually n - k - 1. Finally, the
ratio of MSR upon MSE is your F value; MSR upon MSE is the F calculated value.
The F critical value will mostly be given to you. If not, you have to look at
the F table with the degrees of freedom for the numerator and the degrees of
freedom for the denominator, which are 1 and n - 2 respectively. In the F table
the intersection point of the numerator and denominator degrees of freedom will
give you the critical value. If the calculated value comes out greater than the
critical value, then the model is a good model.
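The ANOVA arithmetic can be sketched as follows. The sums of squares, n and the F critical value are made-up inputs for illustration only; in the exam they would be given, and the function name is hypothetical.

```python
def anova_summary(rss, sse, n, k=1):
    """Build the ANOVA quantities for a regression with k independent variables.

    rss: regression sum of squares, sse: error sum of squares, n: observations.
    """
    tss = rss + sse                 # total sum of squares
    r_squared = rss / tss           # coefficient of determination
    msr = rss / k                   # mean square regression, df = k
    mse = sse / (n - k - 1)         # mean square error, df = n - k - 1
    f_calculated = msr / mse
    return {"TSS": tss, "R2": r_squared, "MSR": msr, "MSE": mse, "F": f_calculated}

# Hypothetical inputs: RSS = 80, SSE = 20, n = 52 observations, k = 1.
table = anova_summary(rss=80.0, sse=20.0, n=52)
f_critical = 4.03                   # assumed table value for df 1 and 50

for name, value in table.items():
    print(f"{name}: {value:.2f}")
print("model is good" if table["F"] > f_critical else "model is not good")
```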
The problem with R squared: the higher the R squared value, the better the
model, but what is the cut-off? We don't know. If I have an R squared of 82%,
we don't know whether it's good or bad. If his R squared is 85%, then my
R squared of 82% is bad, and if his R squared is 65%, then my R squared of 82%
is good. So R squared has no cut-off, and we look at the F-test instead:
MSR upon MSE is the calculated value, compare that with the critical value,
and if the calculated value is higher then the model is good.
Then there is the standard deviation of the error, which is called SEE, the
standard error of estimate. It is the square root of MSE.
Okay, so the formulas are like this. RSS plus SSE is SST; that's one formula.
Then RSS upon TSS is R squared. Then RSS upon k, which is 1, is MSR.
SSE upon n - k - 1, that's MSE. Their ratio, MSR upon MSE, is the F calculated
value. And finally the square root of MSE is called the standard deviation of
the error, also called SEE, the standard error of estimate.
Y = a + bX, and the formula for b and a I'll write one more time. b is equal to
the covariance between Y and X divided by the variance of X. Once we get the
value of b, then Y is Y bar, X is X bar, b is known, so the only unknown would
be the a. You calculate b into X bar and put it on the other side with a
negative sign: a = Y bar - b into X bar.
Y = a + bX is called the linear-linear model. ln(Y) = a + bX, that's called
the log-linear model. Y = a + b ln(X), it's called the linear-log model.
And finally ln(Y) = a + b ln(X), it's called the log-log model.
Just recapping the test on b, there are three
methods. The p-value, which will give you the max confidence level up to which
you can say H1 is selected, where H1 says that b is not equal to zero.
The second one is the confidence interval, where you have b - st to b + st:
if zero is there in that interval, that means b is equal to zero; if zero is
not there, then b is not equal to zero. And the third one is hypothesis
testing, where you write H0, b equal to zero, and H1, b not equal to zero,
then draw the distribution with both tails because it is not equal to, then get
the t values using degrees of freedom n - 2 and two tail 5%, and then apply the
formula, which is the value of b minus the value in the hypotheses, divided by
the standard error. Only the standard error, not divided by root n.
So that's the third
method. Then you'll be asked to use this Y = a + bX. The values of a and b will
be given to you, not written as an equation like this, but in the table.
Using that, they will give you the X value and ask you to calculate the Y
value, and then you'll be asked either for the error or for the confidence
interval. The value that you get from the formula is the predicted value, and
the actual value will be given to you, so the actual value minus the predicted
value is the error. That's one. The second would be: once you get the predicted
value, subtract st and add st, and you'll be asked for the confidence interval.
That s will be given to you; the formula for the s you will not be asked to
memorize. For the t value, the degrees of freedom are n - 2, and it's right
tail and left tail, so it's two tail. Alpha will be given to you. And then
that analysis of variance. Let me check whether we covered everything or
missed out something. Any theory that I missed out, please have a look;
I'm just scrolling to see if I left out any theory. The linear-log and log-log
models: no numericals on those. The confidence interval for the forecast we
covered, and as I said, that formula is not required to be memorized.
R squared is RSS upon TSS. And if they give you the correlation coefficient,
then R squared will be the correlation coefficient squared. So if they give you
the correlation between Y and X as, let's say, 0.80, then R squared, which is
called the coefficient of determination, would be 0.80 squared.
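A small sketch can confirm that the two routes to R squared agree for a one-variable regression. The data points are invented for illustration; only the formulas (RSS upon TSS, correlation squared, and RSS plus SSE equals TSS) come from the lecture.

```python
from math import sqrt
from statistics import mean

# Hypothetical data, not from the lecture.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
x_bar, y_bar = mean(x), mean(y)

# Sums of cross-products and squares (the n - 1 divisors cancel in the ratios).
s_xy = sum((p - x_bar) * (q - y_bar) for p, q in zip(x, y))
s_xx = sum((p - x_bar) ** 2 for p in x)
s_yy = sum((q - y_bar) ** 2 for q in y)

b = s_xy / s_xx                      # slope
a = y_bar - b * x_bar                # intercept
y_pred = [a + b * p for p in x]      # predicted values

tss = s_yy                                              # sum of (Y - Y bar)^2
rss = sum((yp - y_bar) ** 2 for yp in y_pred)           # sum of (YP - Y bar)^2
sse = sum((q - yp) ** 2 for q, yp in zip(y, y_pred))    # sum of (Y - YP)^2

r = s_xy / sqrt(s_xx * s_yy)         # correlation between Y and X
r_squared = rss / tss                # coefficient of determination

print(f"RSS/TSS = {r_squared:.5f}, r squared = {r ** 2:.5f}")
print(f"RSS + SSE = {rss + sse:.4f}, TSS = {tss:.4f}")
print(round(0.80 ** 2, 2))           # the lecture's example: correlation 0.80
```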
This is the point I missed out: the correlation between the dependent and the
independent variable, squared, gives you R squared, the coefficient of
determination. Types of hypothesis testing, types of error: if you want me to
revise those I'll do that.
All right, let me highlight one point. The line gives you the predicted values.
If the actual point is below the line, with the predicted value above it, then
actual minus predicted will be negative, so this will be called a negative
error. And over here the actual value is above and the predicted value is
somewhere below, so this will be a positive error. So actual points below the
line, negative error; actual points above and predicted points below, that's
positive. Let me write that
formula. What we are going to do is take the Y over here and the Y bar over
here. Y - Y bar we call a deviation, and the sum of (Y - Y bar) squared is
called TSS, the total sum of squares.
Then we introduce the predicted value of Y, YP, in the middle somewhere.
The sum of (YP - Y bar) squared is called the regression sum of squares.
And Y - YP, actual minus predicted, is called the error, so the sum of
(Y - YP) squared becomes the error sum of squares.
I repeat: (Y - Y bar) squared, summed, is the total sum of squares;
(YP - Y bar) squared, summed, is the regression sum of squares;
and lastly (Y - YP) squared, summed, is the error sum of
squares. We didn't cover the errors, so let's cover them now. These are
assumptions, and if these assumptions are violated, that becomes an error.
First one. What's the name of the chapter? Linear regression. That means there
has to be a linear relationship between Y and X, the dependent variable Y and
the independent variable X. Y = X squared, or Y = root of X, or Y = X cubed,
this is called nonlinear. In Y = a + bX that X is X to the power one, and when
it is X to the power one, not X squared or X cubed, that's called a linear
relationship. So the assumption over here is that Y and X have a straight-line
relationship, not a curve relationship. Squares, cubes, square roots: if you
plot them in an Excel sheet you'll get a curve, you will not get a line.
So the assumption is that Y and X have a linear relationship, and the error
would be when they have a nonlinear
relationship. Second, variance: the variance should be constant. If the
variance is not constant, then that error is called heteroskedasticity; it's
mentioned in the notes. Variance constant is homoskedastic, variance not
constant is heteroskedastic.
Now let me explain what variance constant and variance not constant mean.
Take stock market returns. Is the stock market equally volatile every day, or
is it more volatile on some days and less volatile on others? Some days more
volatile, some days less volatile, so the stock market is heteroskedastic.
If the volatility had been constant every day, it would have been
homoskedastic. So if the volatility, that is sigma, is constant across time,
then the stock market would be homoskedastic. But it is not: some days the
stock market is stable, some days it is very, very volatile. The volatility is
not constant, hence the stock market is heteroskedastic. If the standard
deviation, the volatility, is constant, it's homoskedastic; if it is not
constant, it's called heteroskedastic. The first one is linear
relationship, obviously, because the name of the chapter is linear regression.
The assumption is that there is a linear relationship between the Y and the X;
if there is a nonlinear relationship, that would be an error.
Second one: if the volatility is constant, that's called homoskedastic, and if
the volatility is not constant, that's called heteroskedastic.
Now listen to me carefully, the third one: the
error of the first period and the error of the second period should not be
related. What does that mean? If the first error is positive and the chances
are then higher that the second error is also positive, that's called positive
dependency between the errors. This should not happen. The assumption is that
the error of one period and the error of the next period should not have any
dependency, which in simple English means the errors should be random.
Think of rolling a die: every time, the chance of a six is 1 upon 6. Whether a
six will be there or not is randomly determined every time. Similarly, if the
first error is positive and so the second error tends to be positive, that
means the errors are dependent, the errors are not random, and that's a
violation. Does the violation have a name? It's there in level two, where it's
called serial correlation. Basically they require that the errors be
independent of each other: whether the second error comes out positive or
negative should not depend on whether the first error was positive or negative.
So, three assumptions: first, Y and X
have a linear relationship, not square, not cube, not square root; that is,
it's not a nonlinear relationship. Second, the variance, or the volatility, is
constant, which is homoskedastic; variance not constant, that error is called
heteroskedastic. And third, the errors should be independent of each other,
which means the errors should not have any dependency on each other.
So I repeat one more time: Y = a + b X squared, Y = a + b root of X,
Y = a + b X cubed are not allowed. Y = a + bX is a linear relationship, which
is allowed. Then the standard deviation, the variance, the volatility should be
constant, which is called homoskedastic; not constant, that error is
heteroskedastic. And the errors should have no dependency; if there's a
dependency, that's an error, which is called serial correlation in level two.
And the errors should be normally distributed.
All right, so that completes the regression.
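As an illustration of the "errors should be random" assumption, one simple check is the correlation between each error and the next one. This check is not part of the lecture; the residual series below are invented, and this is only a rough sketch, not a formal test for serial correlation.

```python
from statistics import mean

def lag1_correlation(residuals):
    """Correlation between each error and the error of the next period."""
    first, second = residuals[:-1], residuals[1:]
    m1, m2 = mean(first), mean(second)
    numerator = sum((p - m1) * (q - m2) for p, q in zip(first, second))
    denominator = (
        sum((p - m1) ** 2 for p in first) * sum((q - m2) ** 2 for q in second)
    ) ** 0.5
    return numerator / denominator

# Hypothetical residuals where a positive error tends to follow a positive one.
trending = [1.0, 0.8, 0.6, 0.4, -0.4, -0.6, -0.8, -1.0]

# Hypothetical residuals that flip sign every period (negative dependency).
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(f"trending:    {lag1_correlation(trending):.3f}")
print(f"alternating: {lag1_correlation(alternating):.3f}")
```

A value far from zero in either direction suggests the errors depend on each other; a value near zero is what the independence assumption would lead you to expect.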