Visual Guide to Gradient Boosted Trees (xgboost)
FULL TRANSCRIPT
Hi everyone, welcome back to another video in our machine learning series. In this video we'll learn yet another popular model ensembling method called gradient boosted trees. If you haven't already, check out our previous videos to learn about random forests, where we introduced the concept of model ensembling, as well as decision trees, where we talked about the building blocks of these models.
In this video we'll use gradient boosted trees to perform classification, specifically to identify the digit drawn in an image. We'll use MNIST, a large database of handwritten digits commonly used in image processing. It contains 60,000 training images and 10,000 test images; each pixel is a feature, and there are 10 possible classes.
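To make the setup concrete, here's a quick sketch of loading digit data and splitting it for training and testing. This uses scikit-learn's small built-in 8x8 digits set as an offline stand-in for MNIST (the real 28x28 MNIST can be fetched with `fetch_openml("mnist_784")` if you have a network connection); the split sizes are illustrative, not from the video.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

# Small offline stand-in for MNIST: 1797 images, 8x8 = 64 pixels, 10 classes.
digits = load_digits()
X, y = digits.data, digits.target

print(X.shape)  # each row is one image, each column (pixel) is a feature

# Hold out a test set, mirroring MNIST's train/test split on a smaller scale.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
```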
Let's first learn a bit more about the model. Gradient boosted trees and random forests are both ensembling methods that perform regression or classification by combining the outputs from individual trees. However, they differ in the way the individual trees are built and in the way the results are combined. As you already know, random forests build independent decision trees and combine them in parallel.
Gradient boosted trees, on the other hand, use a method called boosting. Boosting combines weak learners sequentially, so that each new tree corrects the errors of the previous ones. The weak learners are usually shallow decision trees, often with only one split, called decision stumps. So the first step is to fit a single decision tree.
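A decision stump is just a tree constrained to a single split. A toy sketch using scikit-learn (the toy data and use of scikit-learn here are illustrative; the video's final experiment uses XGBoost):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Four points on a line; class flips between x=2 and x=3.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([0, 0, 1, 1])

# max_depth=1 restricts the tree to one split: a decision stump.
stump = DecisionTreeClassifier(max_depth=1).fit(X, y)

print(stump.get_depth())    # 1
print(stump.predict(X))     # the single split at x=2.5 separates the classes
```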
We'll evaluate how well this tree does using a loss function. There are many different loss functions we can choose from; for multi-class classification, cross-entropy is a popular choice. Here's the equation for cross-entropy, where p is the label and q is the prediction: H(p, q) = −Σᵢ pᵢ log qᵢ. Basically, the loss is high when the label and prediction do not agree, and the loss is zero when they're in perfect agreement.
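That equation can be checked directly in a few lines of Python. This is a minimal hand-rolled version for illustration, not the implementation XGBoost uses internally; the small `eps` clamp is an assumption added to avoid log(0).

```python
import math

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_i p_i * log(q_i), where p is the one-hot label
    and q is the predicted probability distribution."""
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))

label = [0.0, 0.0, 1.0]  # the true class is index 2

print(cross_entropy(label, [0.1, 0.1, 0.8]))  # mostly right: low loss
print(cross_entropy(label, [0.8, 0.1, 0.1]))  # confidently wrong: high loss
print(cross_entropy(label, [0.0, 0.0, 1.0]))  # perfect agreement: zero loss
```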
Now that we have our first tree and the loss function we'll use to evaluate the model, let's add a second tree. We want the second tree to be such that, when added to the first, it lowers the loss compared to the first tree alone. Here's what that looks like, where η is the learning rate: F_2 = F_1 + η·h_2.
We want to find the direction in which the loss decreases the fastest. Mathematically, this is given by the negative derivative of the loss with respect to the previous model's output. Therefore, we fit the second weak learner on −∂L/∂F_1, which is nothing but the negative gradient of the loss function with respect to the output of the previous model. That's why this method is called gradient boosting.
For any step m, gradient boosting produces a model such that the ensemble at step m equals the ensemble at step m − 1 plus the learning rate times the weak learner fit at step m: F_m = F_{m−1} + η·h_m. We want to choose the learning rate such that we don't step too far in any one direction; but at the same time, if the learning rate is too low, the model might take too long to converge to a good answer.
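The whole loop can be sketched from scratch in a few lines. The video uses cross-entropy for classification; this sketch uses squared error on a toy regression problem instead, because its negative gradient is simply the residual y − F_{m−1}(x), which makes the "fit each new tree on the negative gradient" step easy to see. All names and settings here are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

eta = 0.1                   # learning rate
F = np.zeros_like(y)        # F_0: start from a constant (zero) model
losses = []
for m in range(100):
    residual = y - F                         # negative gradient of squared loss
    stump = DecisionTreeRegressor(max_depth=1).fit(X, residual)
    F = F + eta * stump.predict(X)           # F_m = F_{m-1} + eta * h_m
    losses.append(np.mean((y - F) ** 2))

print(losses[0], losses[-1])  # the loss shrinks as trees are added
```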
Compared to random forests, gradient boosted trees have a lot of model capacity, so they can model very complex relationships and decision boundaries. However, as with all high-capacity models, this can lead to overfitting very quickly, so be careful.
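One common guard against that overfitting is early stopping: hold out part of the training data and stop adding trees once the held-out loss stops improving. A sketch using scikit-learn's gradient boosting for illustration (XGBoost offers the same idea via its own early-stopping options); all settings below are illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_digits(return_X_y=True)

clf = GradientBoostingClassifier(
    n_estimators=150,          # upper bound on the number of boosting rounds
    learning_rate=0.1,
    n_iter_no_change=5,        # stop if 5 rounds bring no improvement
    validation_fraction=0.1,   # held-out slice used to monitor the loss
    random_state=0,
).fit(X, y)

print(clf.n_estimators_)       # boosting rounds actually fit
```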
We fit a gradient boosted trees model using the XGBoost library on MNIST with 330 weak learners and achieved 89% accuracy. Try it out for yourself using the link in the description and let us know your thoughts.
Don't forget to subscribe to Reconnaissance for videos on machine learning and more.