How to Visualize Gradient Boosting Decision Trees With XGBoost in Python

Last Updated on August 27, 2020

Plotting individual decision trees can provide insight into the gradient boosting process for a given dataset.

In this tutorial you will discover how you can plot individual decision trees from a trained gradient boosting model using XGBoost in Python.

Kick-start your project with my new book XGBoost With Python, including step-by-step tutorials and the Python source code files for all examples.

Let’s get started.

  • Update Mar/2018: Added alternate link to download the dataset as the original appears to have been taken down.
How to Visualize Gradient Boosting Decision Trees With XGBoost in Python
Photo by Kaarina Dillabough, some rights reserved.

Need help with XGBoost in Python?

Take my free 7-day email course and discover xgboost (with sample code).

Click to sign-up now and also get a free PDF Ebook version of the course.

Plot a Single XGBoost Decision Tree

The XGBoost Python API provides a function for plotting decision trees within a trained XGBoost model.

This capability is provided in the plot_tree() function that takes a trained model as the first argument, for example:

This plots the first tree in the model (the tree at index 0). The plot can be saved to file or shown on the screen using matplotlib and pyplot.show().

This plotting capability requires that you have the graphviz library installed.
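For example, one way to install the graphviz Python bindings (the system Graphviz binaries may also be required; exact package names vary by platform):

```shell
# install the graphviz Python package
pip install graphviz

# on some systems the Graphviz binaries are also needed, e.g.:
# sudo apt-get install graphviz   # Debian/Ubuntu
# brew install graphviz           # macOS
```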

We can create an XGBoost model on the Pima Indians onset of diabetes dataset and plot the first tree in the model.

Download the dataset and place it in your current working directory.

The full code listing is provided below:

Running the code creates a plot of the first decision tree in the model (index 0), showing the features and feature values for each split as well as the output leaf nodes.

XGBoost Plot of Single Decision Tree

You can see that the variables are automatically named f1, f5 and so on, corresponding to the feature indices in the input array.

You can see the split decisions within each node and the different colors for left and right splits (blue and red).

The plot_tree() function takes a few additional arguments. You can plot a specific tree by passing its index to the num_trees argument. For example, you can plot the 5th boosted tree in the sequence (index 4) as follows:

You can also change the layout of the graph to be left-to-right (easier to read) by setting the rankdir argument to ‘LR’ (left-to-right) rather than the default top-to-bottom (‘UT’). For example:

The result of plotting the tree in the left-to-right layout is shown below.

XGBoost Plot of Single Decision Tree Left-To-Right


Summary

In this post you learned how to plot individual decision trees from a trained XGBoost gradient boosted model in Python.

Do you have any questions about plotting decision trees in XGBoost or about this post? Ask your questions in the comments and I will do my best to answer.

Discover The Algorithm Winning Competitions!

XGBoost With Python

Develop Your Own XGBoost Models in Minutes

...with just a few lines of Python

Discover how in my new Ebook:
XGBoost With Python

It covers self-study tutorials like:
Algorithm Fundamentals, Scaling, Hyperparameters, and much more...

Bring The Power of XGBoost To Your Own Projects

Skip the Academics. Just Results.

See What's Inside

90 Responses to How to Visualize Gradient Boosting Decision Trees With XGBoost in Python

  1. Avatar
    Ronen September 18, 2016 at 3:48 pm #

    Hi Jason,

    Nice one, an exact post using R would be much appreciated, since visualizing the tree is not straightforward.

I have a conceptual question. Let’s say the model trained 100 boosted trees; how do I know which one is the best performing tree? Is it by definition the last tree that was trained, since the growth of the last tree takes into account the 99 trees that have already been grown?



    • Avatar
      Jason Brownlee September 19, 2016 at 7:42 am #

      Thanks Ronen. Sorry, no R examples yet, but perhaps soon.

Interesting question, but not really important, as the performance of the ensemble is defined by the contribution of all trees in the ensemble. The performance of one tree on its own does not make sense, as it works only to correct the residuals of the previous trees in the sequence.

      Does that make sense?

      • Avatar
        ronen September 21, 2016 at 7:26 pm #

        Thanks Jason, your answer is perfect.

      • Avatar
        ronen September 21, 2016 at 8:05 pm #

The importance, at least to me, stems from the ambition to communicate the model and its splits to my client using a chart like the one you showed.

      • Avatar
        Funing July 24, 2017 at 12:47 am #

Hi Jason, just like you said, the performance of one tree doesn’t make sense, since the output is the ensemble of all trees. Then why do we bother to plot one tree? And the tree we plot may differ from the other trees, so if we simply want to give an idea of what a tree looks like, which tree should we plot?

        • Avatar
          Jason Brownlee July 24, 2017 at 6:55 am #

          Some developers are very interested in getting a feeling for what the individual trees are doing to help better understand the whole.

          Personally, I do not.

  2. Avatar
    Shir December 9, 2016 at 5:41 am #

    Hi Jason!
    Do you know how to change the fontsize of the features in the tree?

  3. Avatar
    Yonatan January 20, 2017 at 7:08 pm #

    Hi Jason,
    Is there a way to extract the list of decision trees and their parameters in order, for example, to save them for usage outside of python?

    • Avatar
      Jason Brownlee January 21, 2017 at 10:25 am #

      Sorry Yonatan, I have not done this.

      Let me know how you go.

  4. Avatar
    Jimmy March 16, 2017 at 4:23 pm #

    Hi Jason!
    Thanks for your sharing.
I have a question: what does the output value of ‘leaf’ mean?
As in the example, what does the final leaf = 0.12873 mean?
    Thank you~

    • Avatar
      Jason Brownlee March 17, 2017 at 8:24 am #

      Great question, I don’t recall off-hand, but I would guess it is the ratio of the training data accounted for by that leaf.

      I’m sure the doco would make this clearer.

      • Avatar
        Gilles March 29, 2017 at 10:13 pm #

        It is definitely not the ratio of training data, since it can have negative values. I’m currently struggling with it as well.

        For binary classification, it can be converted to probabilities by applying a logistic function (1/(1+exp(x))) (not -x but just x, which is already weird). But for multi-class, each tree is a one-vs-all classifier and you use 1/(1+exp(-x)).

    • Avatar
      Kjell Jansson February 15, 2020 at 8:58 pm #

      “Value (for leafs): the margin value that the leaf may contribute to prediction” (xgb.plot.tree for R but could be valid here too?)

  5. Avatar
    Niranjan April 20, 2017 at 10:26 pm #

    Hi Jason,
A very wonderful tutorial. In your case it is renaming the attributes to its own form like ‘f1’, ‘f2’ etc. That is not happening in my case, due to which the tree is not clearly visible. Could you please help with it? Also, please do tell how to save a full resolution image of the tree.

    • Avatar
      Jason Brownlee April 21, 2017 at 8:34 am #

      Perhaps the API has changed.

      Sorry, I’m not sure how to make the image larger off-hand.

      • Avatar
        Frank July 9, 2018 at 7:05 pm #

        The following code is able to set the resolution of image and save it to a pdf file

        • Avatar
          Jason Brownlee July 10, 2018 at 6:45 am #


          • Avatar
            Jose Q December 22, 2020 at 7:04 am #

            Thank you Jason and Frank !

  6. Avatar
    Wenbo April 21, 2017 at 4:52 am #

    Hi Jason, thanks for the post. I have one question that I have max_depth = 6 for each tree and the resulting plot tends to be too small to read. Is there any way that we can kinda zoom-in zoom out the plot? Thank you.

    • Avatar
      Jason Brownlee April 21, 2017 at 8:42 am #

      Not that I’m aware of, sorry.

    • Avatar
      Sergii November 23, 2017 at 9:27 pm #

      — xgb version 0.6 —

import xgboost as xgb
xgb.to_graphviz(you_xgb_model, num_trees=0, rankdir='LR', **{'size': str(10)})

Tuning size will change the size of the graphviz plot, though there is no zoom available (to the best of my knowledge).

  7. Avatar
    Dennis Gercke May 25, 2017 at 10:05 pm #

As it was requested several times, a high resolution image, that is a rendered one, can be created with:


    For me, this opens in the IPython console, I can then save the image with a right click.

  8. Avatar
    Kafka July 26, 2017 at 8:41 pm #

    Jason, how do we know red or blue belongs to which class?

    • Avatar
      Jason Brownlee July 27, 2017 at 8:01 am #

      Great question.

      Off-hand, I would guess that “no” is the 0 class, and “yes” is the 1 class.

    • Avatar
      Claude COULOMBE July 14, 2018 at 4:23 am #

      Is it possible to get the class / target name in the leaf node?
      Particularly for multi-class case…

  9. Avatar
    Anna August 3, 2017 at 1:19 pm #

Can we output the tree model to a flat file, or is it only supported as a figure? Thanks.

    • Avatar
      Jason Brownlee August 4, 2017 at 6:48 am #

You may, Anna, I’m not sure off the cuff. I bet there is a way, consider looking through the xgboost docs.

  10. Avatar
    zeushera140 August 24, 2017 at 4:17 pm #

    Do you know how to change the fontsize of the features in the tree?

  11. Avatar
    brian December 2, 2017 at 7:15 am #

    Have you found it possible to plot in python using the feature names? This seems like a bug

    • Avatar
      Jason Brownlee December 2, 2017 at 9:08 am #

      No, but perhaps the API has changed or someone has posted a workaround on stackoverflow?

  12. Avatar
    Michelle December 12, 2017 at 9:30 am #

    Thanks for the post, Jason.

    When I tried to plot_tree, I got a ValueError as below:
    ValueError: Unable to parse node: 0:[COLLAT_TYP_GOVT

    Any idea why this happened?


    • Avatar
      Jason Brownlee December 12, 2017 at 4:07 pm #

      Sorry to hear that, I have not seen this problem.

      Perhaps ensure xgboost is up to date and that you have all of the code from the post?

      • Avatar
        Michelle December 14, 2017 at 3:11 am #

I figured it out. It turns out that the feature names cannot contain spaces.

        I have another question though. The tree will output leaf values in the end. But for a multi-class classification, what do leaf values mean?

        • Avatar
          Jason Brownlee December 14, 2017 at 5:41 am #

          Glad to hear it.

          Good question, I have not tried more than two classes. Try it and let me know what you see.

  13. Avatar
    Sylwia January 11, 2018 at 1:12 am #

    Jason, thank you for the post.
    I’ve tried to use plot_tree for HousePrices dataset from Kaggle (using XGBRegressor), but it plots only one leaf. Any idea what might be the reason?

  14. Avatar
    Ammar Abdilghanie May 9, 2018 at 10:15 pm #

    It would be nice to be able to use actual feature names instead of the generic f1,f2,..etc

    • Avatar
      Jason Brownlee May 10, 2018 at 6:31 am #

      I agree.

      • Avatar
        Lasitha Ishan Petthawadu October 9, 2018 at 7:34 am #

Is there any way to provide the feature names to the fit function?

    • Avatar
      Novius July 8, 2019 at 10:24 pm #

You can use a pandas DataFrame instead of a numpy array; fit will then use the DataFrame column names in the graph instead of f1, f2, … etc.

    • Avatar
      Adesh January 23, 2020 at 10:48 pm #

You can get feature names by setting model.feature_names to the column names:

model.feature_names = list_of_feature_in_order

I might be late to answer the question, but maybe it will be useful for someone else looking for a similar answer.

  15. Avatar
    AAron September 18, 2018 at 12:19 am #

    Hey Jason, thanks for you patient lessons!
    May I ask you a question? I use XGBoost to train some data then test, but a new issue is that if when testing unknown data, there are some other options of the testing data label, how could I eliminate some options which I don’t expect?
For example, it will label A as 1, but I want to make it not label A as 1 (eliminate the choice of 1). Not sure if you understand. THX!!

    • Avatar
      Jason Brownlee September 18, 2018 at 6:18 am #

      Not sure I follow, sorry? Perhaps you can elaborate?

      • Avatar
        AAron September 18, 2018 at 10:29 am #

Well, when predicting one data set, I’d like to know the five closest possible labels for it, so what do you suggest?
Thanks for your reply, Jason

  16. Avatar
    AAron September 18, 2018 at 3:01 pm #

Well… I have no idea about that. It would be very nice if you could tell me more. Thanks still :)

  17. Avatar
    Banny March 6, 2019 at 11:49 pm #

    Hey Jason, you are an awesome teacher. Great content.

  18. Avatar
    Andre March 12, 2019 at 12:49 pm #

    Hi Jason,

    Awesome stuff, thanks for the detailed tutorial!

Question, what do the coefficients represent (e.g., probabilities, positive vs. negative)? I guess the root node has a higher feature importance, but how do I interpret the nodes to the far right?


    • Avatar
      Jason Brownlee March 12, 2019 at 2:31 pm #

      Generally, we don’t interpret the trees in an ensemble.

      • Avatar
        Banny March 14, 2019 at 9:33 am #

        what about the values on the leaves, what do they mean?


        • Avatar
          Jason Brownlee March 14, 2019 at 2:39 pm #

          Good question.

          I suspect the support for the leaf in the training dataset.

  19. Avatar
    Matthew March 27, 2019 at 6:12 pm #

    Hi Jason,

Thanks a lot for the awesome tutorial, and I would very much appreciate it if you could help with the issue I face when running the tutorial!

    When I ran the code, everything works fine until I try “plot_tree(model)”. I get the error as below.

    Is there a chance that you may know the issue I am facing here? I suspect it could be an issue of installing the graphviz package, for which I did the following:
    1. Install windows package from (\
    2. Install python graphviz package (using anaconda prompt “pip install graphviz)
    3. Add C:\Program Files (x86)\Graphviz2.38\bin to User path
    4. Add C:\Program Files (x86)\Graphviz2.38\bin\dot.exe to System Path

    Traceback (most recent call last):

    File “”, line 1, in

    File “C:\ProgramData\Anaconda3\lib\site-packages\xgboost\”, line 278, in plot_tree
    rankdir=rankdir, **kwargs)

    File “C:\ProgramData\Anaconda3\lib\site-packages\xgboost\”, line 227, in to_graphviz
    graph = Digraph(graph_attr=kwargs)

    File “C:\ProgramData\Anaconda3\lib\site-packages\graphviz\”, line 61, in __init__
    super(Dot, self).__init__(filename, directory, format, engine, encoding)

    TypeError: super(type, obj): obj must be an instance or subtype of type

    • Avatar
      Jason Brownlee March 28, 2019 at 8:08 am #

      Sorry to hear that, I don’t have any good ideas.

      You could try posting the error to stackoverflow?

      • Avatar
        Matthew March 28, 2019 at 12:13 pm #

        Thanks Jason, that sounds like a way out!

  20. Avatar
    Karthik Mamudur May 31, 2019 at 1:03 pm #

    HI Jason,

    I am working on a regression problem and am using XGboost, I tried to plot a tree by modifying the code you presented slightly and it worked fine. But here is my question

Say I am using a Gradient Boosting regressor with decision trees as base learners, and I print the first tree out. For a given instance, I can traverse down the tree and find a rough approximation of the dependent variable. I understand the XGBoost formulation is different from GBM, but is there a way to get a similar plot? The plot I extracted just has yes and no decisions and some leaf values, which for me isn’t useful (unlike for a developer). Please suggest if there is any other plot that helps me come up with a rough approximation of my dependent variable in the nth boosting round.

    • Avatar
      Jason Brownlee May 31, 2019 at 2:48 pm #

      Not really as you have hundreds or thousands of trees.

      • Avatar
        Karthik Mamudur May 31, 2019 at 10:29 pm #

Thank you! I agree there are a number of trees, but I thought the first few trees would give me a rough cut value of my dependent variable and the subsequent trees would only be useful to fine-tune the rough cut value. Given an XGB model and its parameters, is there a way to find a GBM equivalent of it? (Apologies if the question sounds silly, I am just months old to ML concepts and not in a position to chew and digest all that I read in these months.)

        • Avatar
          Jason Brownlee June 1, 2019 at 6:14 am #

They are the same algorithm for the most part, “stochastic gradient boosting”, but the xgboost implementation is designed from the ground up for speed of execution during training and inference.

          • Avatar
            Karthik Mamudur June 2, 2019 at 11:40 am #

            Thank you!

  21. Avatar
    Srinivasa Murthy Gunturu June 24, 2019 at 7:55 pm #

    How to establish a relation between the predicted values by the model and the leaves or the terminal nodes in the graph for a regression problem in XGBoost?

    • Avatar
      Jason Brownlee June 25, 2019 at 6:18 am #

      Not sure I follow, why exactly?

      • Avatar
        Srinivasa Murthy Gunturu June 25, 2019 at 7:41 pm #

        Suppose I have a dataset and I train an xgboost Regression model with 80% of data as training and the rest 20% is used as a test for predictions.

And also, the plot_tree() method is used on an XGBoost regressor, to get a graph similar to the one that was depicted at the beginning of this article.

        Now the question is how to establish a relationship between the leaf nodes in the tree-plot and the prediction values that were obtained from the model.

        (And what are exactly those values in the leaf nodes correspond to?)

        • Avatar
          Jason Brownlee June 26, 2019 at 6:39 am #

          I don’t know off hand sorry. Recall there are hundreds of trees in the model.

          Perhaps there is something in the xgboost API to allow you to discover the leaf of each tree used to make a prediction.

    • Avatar
      Jarye February 27, 2023 at 1:37 am #

Hello, I have the same question. Did you figure it out?

  22. Avatar
    Konrad Bachusz August 23, 2019 at 12:23 am #

    Is it possible to plot the last tree in the model?

    • Avatar
      Jason Brownlee August 23, 2019 at 6:31 am #

      I don’t see why not. I don’t have an example sorry.

  23. Avatar
    Carlos December 27, 2019 at 4:54 am #

Jason, the image of the tree is very, very small.
So it is impossible to see the tree. How can I get a “big” image of the tree?

    • Avatar
      Jason Brownlee December 27, 2019 at 6:37 am #

      I’m sure there is. Sorry, I don’t have an example.

    • Avatar
      Kjell Jansson February 15, 2020 at 8:35 pm #

      There are more than one way. This one is simple:

# plot single tree
fig = pyplot.gcf()  # to solve low resolution problem
fig.set_size_inches(150, 100)  # to solve low resolution problem

  24. Avatar
    Tudor Lapusan September 5, 2020 at 9:13 pm #

    Hi Jason,

we have just added xgboost support to the dtreeviz library.
It contains a lot of useful visualizations, for tree structure, leaf node metadata and more.

    If you have time to take a look or to try it, I would like to hear any feedback from your side.


  25. Avatar
    Jason October 9, 2020 at 5:36 am #

Hi, is it possible to use plot_tree to plot the decision tree based on predictions on test data? If not, would you be able to recommend a library that can?

  26. Avatar
    Kevin November 10, 2020 at 1:02 pm #

    Hi Dr. Jason.

    Thank you for your good article again. I have two questions to you.

    First question: May I know how do we interpret the leaf nodes at the bottom most? What does it mean for the bottom nodes that come with floating values of ‘leaf’? For example, leaf = 0.15…, 0.16…. Can you elaborate more on this, please?

    Second question: Is it a must for us to do the feature importance first, and use its result to retrain the XGBoost algorithm with features that have higher weights based on the feature importance’s result? After these steps are done, only can we do the hyperparameter optimization, right?

    Thank you for your help.

    • Avatar
      Jason Brownlee November 10, 2020 at 1:35 pm #

      A single tree is probably not useful to interpret as part of an ensemble. The node is the output/prediction or split point for prediction – I don’t recall sorry – perhaps check the documentation.

      The model automatically performs feature selection/importance weighting as part of training. All trees do this.

  27. Avatar
    Reza January 22, 2022 at 10:35 am #

Is it possible that one feature appears twice or more in a single tree?

  28. Avatar
    Olivier December 13, 2022 at 4:25 am #

    Great work as usual. Thanks

  29. Avatar
    Chathurangi Shyalika February 16, 2023 at 1:28 pm #

Why is “yes, missing” on all of the nodes’ branches? What does this “missing” relate to?

  30. Avatar
    Berkay March 8, 2023 at 12:46 am #

    I’m wondering how we can use this to visualize Gradient Boosting Classifier in scikit-learn?

