How to Visualize Gradient Boosting Decision Trees With XGBoost in Python


Plotting individual decision trees can provide insight into the gradient boosting process for a given dataset.

In this tutorial, you will discover how to plot individual decision trees from a trained gradient boosting model using XGBoost in Python.

Discover how to configure, fit, tune and evaluate gradient boosting models with XGBoost in my new book, with 15 step-by-step tutorial lessons, and full Python code.

Let’s get started.

  • Update March/2018: Added alternate link to download the dataset as the original appears to have been taken down.
How to Visualize Gradient Boosting Decision Trees With XGBoost in Python
Photo by Kaarina Dillabough, some rights reserved.


Plot a Single XGBoost Decision Tree

The XGBoost Python API provides a function for plotting decision trees within a trained XGBoost model.

This capability is provided in the plot_tree() function that takes a trained model as the first argument, for example:

This plots the first tree in the model (the tree at index 0). This plot can be saved to file or shown on the screen using matplotlib and pyplot.show().

This plotting capability requires that you have the graphviz library installed.

We can create an XGBoost model on the Pima Indians onset of diabetes dataset and plot the first tree in the model (update: download from here). The full code listing is provided below:

Running the code creates a plot of the first decision tree in the model (index 0), showing the features and feature values for each split as well as the output leaf nodes.

XGBoost Plot of Single Decision Tree

You can see that variables are automatically named, like f1 and f5, corresponding to the feature indices in the input array.

You can see the split decisions within each node and the different colors for left and right splits (blue and red).

The plot_tree() function takes a few parameters. You can plot a specific tree by passing its index to the num_trees argument. For example, you can plot the 5th boosted tree in the sequence as follows:

You can also change the layout of the graph to be left-to-right (easier to read) by setting the rankdir argument to ‘LR’ (left-to-right) rather than the default top-to-bottom (‘UT’). For example:

The result of plotting the tree in the left-to-right layout is shown below.

XGBoost Plot of Single Decision Tree Left-To-Right

Summary

In this post you learned how to plot individual decision trees from a trained XGBoost gradient boosted model in Python.

Do you have any questions about plotting decision trees in XGBoost or about this post? Ask your questions in the comments and I will do my best to answer.




69 Responses to How to Visualize Gradient Boosting Decision Trees With XGBoost in Python

  1. Ronen September 18, 2016 at 3:48 pm #

    Hi Jason,

    Nice one, an exact post using R would be much appreciated, since visualizing the tree is not straightforward.

    I have a conceptual question. Let’s say the model trained 100 boosted trees: how do I know which one is the best performing tree? Is it by definition the last tree that was trained, since the growth of the last tree takes into account the 99 trees that have already been grown?

    Thanks!

    Ronen

    • Jason Brownlee September 19, 2016 at 7:42 am #

      Thanks Ronen. Sorry, no R examples yet, but perhaps soon.

      Interesting question, but not really important, as the performance of the ensemble is defined by the contribution of all of the trees. The performance of a single tree in isolation is not meaningful, since each tree works only to correct the residuals of the trees before it in the sequence.

      Does that make sense?

      • ronen September 21, 2016 at 7:26 pm #

        Thanks Jason, your answer is perfect.

      • ronen September 21, 2016 at 8:05 pm #

        The importance, at least to me, stems from the ambition to communicate the model and its splits to my client using a chart like the one you showed.

      • Funing July 24, 2017 at 12:47 am #

        Hi Jason, just like you said, the performance of one tree doesn’t make sense, since the output is the ensemble of all trees. Then why do we bother to plot one tree? And the tree we plot may differ from the other trees, so if we simply want to give an idea of what a tree looks like, which tree should we plot?

        • Jason Brownlee July 24, 2017 at 6:55 am #

          Some developers are very interested in getting a feeling for what the individual trees are doing to help better understand the whole.

          Personally, I do not.

  2. Shir December 9, 2016 at 5:41 am #

    Hi Jason!
    Do you know how to change the fontsize of the features in the tree?
    Thanks!

  3. Yonatan January 20, 2017 at 7:08 pm #

    Hi Jason,
    Is there a way to extract the list of decision trees and their parameters in order, for example, to save them for usage outside of python?
    Thanks,
    Yonatan

    • Jason Brownlee January 21, 2017 at 10:25 am #

      Sorry Yonatan, I have not done this.

      Let me know how you go.

  4. Jimmy March 16, 2017 at 4:23 pm #

    Hi Jason!
    Thanks for your sharing.
    I have a question that what does the output value of ‘leaf’ means?
    As the example, what does the final leaf = 0.12873 means?
    Thank you~

    • Jason Brownlee March 17, 2017 at 8:24 am #

      Great question, I don’t recall off-hand, but I would guess it is the ratio of the training data accounted for by that leaf.

      I’m sure the doco would make this clearer.

      • Gilles March 29, 2017 at 10:13 pm #

        It is definitely not the ratio of training data, since it can have negative values. I’m currently struggling with it as well.

        For binary classification, it can be converted to probabilities by applying a logistic function (1/(1+exp(x))) (not -x but just x, which is already weird). But for multi-class, each tree is a one-vs-all classifier and you use 1/(1+exp(-x)).

        https://github.com/dmlc/xgboost/issues/1746

  5. Niranjan April 20, 2017 at 10:26 pm #

    Hi Jason,
    A very wonderful tutorial. In your case it renames the attributes to its own form, like ‘f1’, ‘f2’, etc. That is not happening in my case, so the tree is not clearly visible. Could you please help with this? Also, please do tell how to save a full-resolution image of the tree.

    • Jason Brownlee April 21, 2017 at 8:34 am #

      Perhaps the API has changed.

      Sorry, I’m not sure how to make the image larger off-hand.

      • Frank July 9, 2018 at 7:05 pm #

        The following code can set the resolution of the image and save it to a PDF file:

  6. Wenbo April 21, 2017 at 4:52 am #

    Hi Jason, thanks for the post. I have one question that I have max_depth = 6 for each tree and the resulting plot tends to be too small to read. Is there any way that we can kinda zoom-in zoom out the plot? Thank you.

    • Jason Brownlee April 21, 2017 at 8:42 am #

      Not that I’m aware of, sorry.

    • Sergii November 23, 2017 at 9:27 pm #

      — xgb version 0.6 —

      import xgboost as xgb
      xgb.to_graphviz(you_xgb_model, num_trees=0, rankdir='LR', **{'size': str(10)})

      Tuning size will change the size of the graphviz plot, though there is no zoom available (to the best of my knowledge).

  7. Dennis Gercke May 25, 2017 at 10:05 pm #

    As it was requested several times, a high-resolution image, that is, a rendered one, can be created with:

    xgb.to_graphviz(model)

    For me, this opens in the IPython console, I can then save the image with a right click.

  8. Kafka July 26, 2017 at 8:41 pm #

    Jason, how do we know red or blue belongs to which class?

    • Jason Brownlee July 27, 2017 at 8:01 am #

      Great question.

      Off-hand, I would guess that “no” is the 0 class, and “yes” is the 1 class.

    • Claude COULOMBE July 14, 2018 at 4:23 am #

      Is it possible to get the class / target name in the leaf node?
      Particularly for multi-class case…

  9. Anna August 3, 2017 at 1:19 pm #

    Can we output the tree model to a flat file, or is plotting it as a figure the only supported option? Thanks.

    • Jason Brownlee August 4, 2017 at 6:48 am #

      You may, Anna, I’m not sure off the cuff. I bet there is a way; consider looking through the xgboost docs.

  10. zeushera140 August 24, 2017 at 4:17 pm #

    Do you know how to change the fontsize of the features in the tree?
    Thanks!

  11. brian December 2, 2017 at 7:15 am #

    Have you found it possible to plot in python using the feature names? This seems like a bug

    • Jason Brownlee December 2, 2017 at 9:08 am #

      No, but perhaps the API has changed or someone has posted a workaround on stackoverflow?

  12. Michelle December 12, 2017 at 9:30 am #

    Thanks for the post, Jason.

    When I tried to plot_tree, I got a ValueError as below:
    ValueError: Unable to parse node: 0:[COLLAT_TYP_GOVT

    Any idea why this happened?

    Thanks

    • Jason Brownlee December 12, 2017 at 4:07 pm #

      Sorry to hear that, I have not seen this problem.

      Perhaps ensure xgboost is up to date and that you have all of the code from the post?

      • Michelle December 14, 2017 at 3:11 am #

        I figured it out. It turns out that the feature name cannot contain spaces.

        I have another question though. The tree will output leaf values in the end. But for a multi-class classification, what do leaf values mean?

        • Jason Brownlee December 14, 2017 at 5:41 am #

          Glad to hear it.

          Good question, I have not tried more than two classes. Try it and let me know what you see.

  13. Sylwia January 11, 2018 at 1:12 am #

    Jason, thank you for the post.
    I’ve tried to use plot_tree for HousePrices dataset from Kaggle (using XGBRegressor), but it plots only one leaf. Any idea what might be the reason?

  14. Ammar Abdilghanie May 9, 2018 at 10:15 pm #

    It would be nice to be able to use actual feature names instead of the generic f1,f2,..etc

    • Jason Brownlee May 10, 2018 at 6:31 am #

      I agree.

      • Lasitha Ishan Petthawadu October 9, 2018 at 7:34 am #

        Is there anyway to provide the feature names to the fit function?

    • Novius July 8, 2019 at 10:24 pm #

      You can use a pandas DataFrame instead of a NumPy array; fit will use the DataFrame column names in the graph instead of f1, f2, etc.

  15. AAron September 18, 2018 at 12:19 am #

    Hey Jason, thanks for you patient lessons!
    May I ask you a question? I use XGBoost to train on some data and then test, but when testing on unknown data, there are some label options for the testing data that I don’t expect. How could I eliminate them? For example, it will label A as 1, but I want to prevent it from labeling A as 1 (eliminate the choice of 1). Not sure if you understand. Thanks!!

    • Jason Brownlee September 18, 2018 at 6:18 am #

      Not sure I follow, sorry? Perhaps you can elaborate?

      • AAron September 18, 2018 at 10:29 am #

        Well, while predicting on one data set, I’d like to know the five closest possible labels for each prediction. Any suggestions?
        Thanks for your reply, Jason

  16. AAron September 18, 2018 at 3:01 pm #

    well…have no idea about that..It would be very nice if you could tell me more ..thanks still:)

  17. Banny March 6, 2019 at 11:49 pm #

    Hey Jason, you are an awesome teacher. Great content.

  18. Andre March 12, 2019 at 12:49 pm #

    Hi Jason,

    Awesome stuff, thanks for the detailed tutorial!

    Question: what do the coefficients represent (e.g., probabilities, positive vs. negative)? I guess the root node has a higher feature importance, but how do I interpret the nodes to the far right?

    Thanks,
    André

    • Jason Brownlee March 12, 2019 at 2:31 pm #

      Generally, we don’t interpret the trees in an ensemble.

      • Banny March 14, 2019 at 9:33 am #

        what about the values on the leaves, what do they mean?

        Thanks

        • Jason Brownlee March 14, 2019 at 2:39 pm #

          Good question.

          I suspect the support for the leaf in the training dataset.

  19. Matthew March 27, 2019 at 6:12 pm #

    Hi Jason,

    Thanks a lot for the awesome tutorial, and would be very much appreciate if you could help the issue I face when running the tutorial!

    When I ran the code, everything works fine until I try “plot_tree(model)”. I get the error as below.

    Is there a chance that you may know the issue I am facing here? I suspect it could be an issue of installing the graphviz package, for which I did the following:
    1. Install the Windows package from (https://graphviz.gitlab.io/_pages/Download/Download_windows.html)
    2. Install the Python graphviz package (using the Anaconda prompt: “pip install graphviz”)
    3. Add C:\Program Files (x86)\Graphviz2.38\bin to User path
    4. Add C:\Program Files (x86)\Graphviz2.38\bin\dot.exe to System Path

    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    Traceback (most recent call last):

    File “”, line 1, in
    plot_tree(model)

    File “C:\ProgramData\Anaconda3\lib\site-packages\xgboost\plotting.py”, line 278, in plot_tree
    rankdir=rankdir, **kwargs)

    File “C:\ProgramData\Anaconda3\lib\site-packages\xgboost\plotting.py”, line 227, in to_graphviz
    graph = Digraph(graph_attr=kwargs)

    File “C:\ProgramData\Anaconda3\lib\site-packages\graphviz\dot.py”, line 61, in __init__
    super(Dot, self).__init__(filename, directory, format, engine, encoding)

    TypeError: super(type, obj): obj must be an instance or subtype of type
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    • Jason Brownlee March 28, 2019 at 8:08 am #

      Sorry to hear that, I don’t have any good ideas.

      You could try posting the error to stackoverflow?

      • Matthew March 28, 2019 at 12:13 pm #

        Thanks Jason, that sounds like a way out!

  20. Karthik Mamudur May 31, 2019 at 1:03 pm #

    HI Jason,

    I am working on a regression problem and am using XGboost, I tried to plot a tree by modifying the code you presented slightly and it worked fine. But here is my question

    Say I am using a Gradient Boosting regressor with decision trees as base learners, and I print the first tree out. For a given instance, I can traverse down the tree and find a rough approximation of the dependent variable. I understand the XGBoost formulation is different from GBM, but is there a way to get a similar plot? The plot I extracted just has yes/no decisions and some leaf values, which for me isn’t useful (unlike for a developer). Please suggest if there is any other plot that helps me come up with a rough approximation of my dependent variable in the nth boosting round.

    • Jason Brownlee May 31, 2019 at 2:48 pm #

      Not really as you have hundreds or thousands of trees.

      • Karthik Mamudur May 31, 2019 at 10:29 pm #

        Thank you! I agree there are a number of trees, but I thought the first few trees would give me a rough-cut value of my dependent variable and the subsequent trees would only be useful to fine-tune the rough-cut value. Given an XGB model and its parameters, is there a way to find out a GBM equivalent of it? (Apologies if the question sounds silly, I am just months into ML concepts and not in a position to chew and digest all that I have read in these months.)

        • Jason Brownlee June 1, 2019 at 6:14 am #

          They are the same algorithm for the most part, “stochastic gradient boosting”, but the xgboost implementation is designed from the ground up for speed of execution during training and inference.

          • Karthik Mamudur June 2, 2019 at 11:40 am #

            Thank you!

  21. Srinivasa Murthy Gunturu June 24, 2019 at 7:55 pm #

    How to establish a relation between the predicted values by the model and the leaves or the terminal nodes in the graph for a regression problem in XGBoost?

    • Jason Brownlee June 25, 2019 at 6:18 am #

      Not sure I follow, why exactly?

      • Srinivasa Murthy Gunturu June 25, 2019 at 7:41 pm #

        Suppose I have a dataset and I train an xgboost Regression model with 80% of data as training and the rest 20% is used as a test for predictions.

        And also, the plot_tree() method is used on an XGBoost regressor, to get a graph similar to the one depicted at the beginning of this article.

        Now the question is how to establish a relationship between the leaf nodes in the tree-plot and the prediction values that were obtained from the model.

        (And what are exactly those values in the leaf nodes correspond to?)

        • Jason Brownlee June 26, 2019 at 6:39 am #

          I don’t know off hand sorry. Recall there are hundreds of trees in the model.

          Perhaps there is something in the xgboost API to allow you to discover the leaf of each tree used to make a prediction.

  22. Konrad Bachusz August 23, 2019 at 12:23 am #

    Is it possible to plot the last tree in the model?

    • Jason Brownlee August 23, 2019 at 6:31 am #

      I don’t see why not. I don’t have an example sorry.
