XGBoost With Python

Discover The Algorithm That Is Winning Machine Learning Competitions

$37 USD

XGBoost is the dominant technique for predictive modeling on regular data.

The gradient boosting algorithm is the top technique on a wide range of predictive modeling problems, and XGBoost is the fastest implementation. When asked, the best machine learning competitors in the world recommend using XGBoost.

In this new Ebook written in the friendly Machine Learning Mastery style that you’re used to, learn exactly how to get started and bring XGBoost to your own machine learning projects. After purchasing you will get:

  • 115 Page PDF Ebook.
  • 30 Python Recipes.
  • 15 Step-by-Step Tutorial Lessons.

Apply XGBoost To Your Projects Today!

Click to jump straight to the packages.

Very comprehensive and practical coverage of XGBoost. I picked up the book because I wanted to learn about XGBoost in a quick structured way so I could start using it as quickly as possible, and the book worked out great. Many thanks to Jason Brownlee for doing the research into XGBoost for me. The convenience and time savings definitely paid for the book many times over!

Why Is XGBoost So Powerful?
… the secret is its “speed” and “model performance”

The Gradient Boosting algorithm has been around since 1999. So why is it so popular right now?

The reason is that we now have machines fast enough and enough data to really make this algorithm shine.

Academics and researchers knew it was a dominant algorithm, more powerful than random forest, but few people in industry knew about it.

This was due to two main reasons:

  1. The implementations of gradient boosting in R and Python were not really developed for performance and hence took a long time to train even modest-sized models.
  2. Because of the lack of attention on the algorithm, there were few good heuristics on which parameters to tune and how to tune them.

Naive implementations are slow, because the algorithm requires one tree to be created at a time to attempt to correct the errors of all previous trees in the model.

This sequential procedure results in models with really great predictive capability, but can be very slow to train when hundreds or thousands of trees need to be created from large datasets.
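To make the one-tree-at-a-time idea concrete, here is a minimal sketch of gradient boosting for squared error in plain Python. It assumes a single input feature and uses one-split "stumps" in place of full decision trees. The function names (`fit_stump`, `predict_stump`, `boost`) and the toy data are made up for illustration. This is not how XGBoost is implemented, only the naive sequential procedure described above.

```python
# Naive gradient boosting sketch (squared error, one feature, one-split stumps).
# All names here are hypothetical and chosen for illustration only.

def fit_stump(xs, residuals):
    """Pick the threshold that best splits the residuals in the least-squares sense."""
    best = None
    # Skip the largest value so the right-hand side is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return t, lmean, rmean

def predict_stump(stump, x):
    t, lmean, rmean = stump
    return lmean if x <= t else rmean

def boost(xs, ys, n_trees=20, learning_rate=0.5):
    """Add stumps one at a time, each fit to the errors of the model so far."""
    base = sum(ys) / len(ys)          # start from the mean prediction
    preds = [base] * len(xs)
    stumps, history = [], []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]   # errors of all previous stumps
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return base, stumps, history

# Toy data: y = x^2 on ten points.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
base, stumps, history = boost(xs, ys)
print("training MSE after 1 stump:  ", round(history[0], 3))
print("training MSE after 20 stumps:", round(history[-1], 3))
```

Each pass through the loop must wait for the previous one to finish, because the residuals depend on every stump added so far. That dependency is why a naive implementation gets slow with many trees and large datasets.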

XGBoost Changed Everything

XGBoost was developed by Tianqi Chen and collaborators for speed and performance.

Tianqi is a top machine learning researcher, so he knows deeply how the algorithm works. He is also a very good engineer, so he knows how to build high-quality software.

This combination allowed him to re-frame the internals of the gradient boosting algorithm in such a way that it can exploit the full potential of the memory and CPU cores of your hardware.

In XGBoost, individual trees are created using multiple cores and data is organized to minimize the lookup times, all good computer science tips and tricks.

The result is an implementation of gradient boosting in the XGBoost library that can be configured to squeeze the best performance from your machine, whilst offering all of the knobs and dials to tune the behavior of the algorithm to your specific problem.

This Power Did Not Go Unnoticed

Soon after the release of XGBoost, top machine learning competitors started using it.

More than that, they started winning competitions on sites like Kaggle. And they were not shy about sharing the news about XGBoost.

For example, here are some quotes from top Kaggle competitors:

As the winner of an increasing amount of Kaggle competitions, XGBoost showed us again to be a great all-round algorithm worth having in your toolbox.

Dato Winners’ Interview, Mad Professors

I only used XGBoost.

Liberty Mutual Property Inspection Winner’s Interview, Qingchen Wang

In fact, the formerly ranked #1 Kaggle competitor in the world, Owen Zhang, strongly encourages the use of XGBoost:

When in doubt, use xgboost.

— Avito Winner’s Interview, Owen Zhang

XGBoost is a powerhouse when it comes to developing predictive models.

So how do you get started using it?

How Do You Get Started Using XGBoost
…be systematic and develop a new core skill

The Slow Way

The way that most people get started with XGBoost is the slow way.

  1. They try to find and read all of the official documentation for the library.
  2. Next, they try to adapt demos and examples to their problem.

The problem is they don’t even know anything about the underlying algorithm that XGBoost implements. Therefore, they don’t know what parameters to tune to best adapt the algorithm to their problem.

They most definitely don’t know about the full capabilities of the library.

This is the slow and frustrating way to get started with XGBoost, and sadly it is the most common.

The Fast Way

Knowing that things can be different, you can see the faster path:

  1. Learn something about the underlying algorithm so you know how to configure it.
  2. Learn about the suite of key features supported by the library.
  3. Practice using features of the library on small well understood problems.
  4. Get started applying XGBoost to your own problem.

This can cut the time taken to go from beginner to proficient practitioner by a factor of two to four, if not more.

You also get the benefits of really knowing how to wield XGBoost in a range of different situations.

But you still have to find and gather all of the materials together yourself, and then study them.

The Best Way

There is an even faster way.

  1. Find an expert who has actually done all of the research and who has actually used XGBoost on real problems.
  2. Have them prepare the materials for you to study.

In addition to saving you a lot of wasted time researching algorithm and library details, this approach can speed up the learning process by giving you access to:

  • Tips and tricks to get past roadblocks and get the most from the algorithm.
  • Code examples that work, can be run immediately and can provide templates for your own problems.
  • An expert who can answer questions and point you to the best resources to learn more.

If you want to get started with XGBoost, then you are in the right place.

Introducing “XGBoost With Python”
…your ticket to developing and tuning XGBoost models

This book was designed for you as a developer to rapidly get up to speed with applying Gradient Boosting in Python using the best-of-breed library XGBoost.

The Ebook uses a step-by-step tutorial approach throughout to help you focus on getting results in your projects and delivering value.

The goal is to get you up to speed on gradient boosting and XGBoost so that you can create your first gradient boosting model as fast as possible, then guide you through the finer points of the library and of tuning your models.

This Ebook is your guide to developing and tuning XGBoost models on your own machine learning projects.

Let’s take a closer look at the breakdown of what you will discover inside this Ebook.

Everything You Need To Know to Develop XGBoost Models in Python

This Ebook is designed to get you up and running with XGBoost as fast as possible.

As such, a series of step-by-step, tutorial-based lessons was designed to lead you from XGBoost beginner to effective XGBoost practitioner.

Below is an overview of the step-by-step lessons on XGBoost you will complete divided into three parts:

Part 1: XGBoost Basics

  • Lesson 01: A Gentle Introduction to Gradient Boosting.
  • Lesson 02: A Gentle Introduction to XGBoost.
  • Lesson 03: How to Develop your First XGBoost Model in Python.
  • Lesson 04: How to Best Prepare Data For Use With XGBoost.
  • Lesson 05: How to Evaluate the Performance of Models.
  • Lesson 06: How to Visualize Individual Decision Trees in XGBoost.

Part 2: XGBoost Advanced

  • Lesson 07: How to Save And Load XGBoost Models.
  • Lesson 08: How to Review and Use Feature Importance.
  • Lesson 09: How to Monitor Performance and Use Early Stopping.
  • Lesson 10: How to Configure XGBoost for Multithreading.
  • Lesson 11: How to Develop Large XGBoost models in the Cloud.

Part 3: XGBoost Tuning

  • Lesson 12: Best Practices When Configuring XGBoost.
  • Lesson 13: How to Tune the Number and Size of Decision Trees.
  • Lesson 14: How to Tune Learning Rate and Number of Trees.
  • Lesson 15: How to Tune Sampling in Stochastic Gradient Boosting.

Each lesson was designed to be completed in about 30 minutes by the average developer.

XGBoost With Python Table of Contents

Here’s Everything You’ll Get…
in XGBoost With Python

Hands-On Tutorials

A digital download that contains everything you need, including:

  • Clear algorithm descriptions that help you to understand the principles that underlie the technique.
  • Step-by-step XGBoost tutorials to show you exactly how to apply each method.
  • Python source code recipes for every example in the book so that you can run the tutorial and project code in seconds.
  • Digital Ebook in PDF format so that you can have the book open side-by-side with the code and see exactly how each example works.

The XGBoost basics to get you started and build a foundation, including:

  • The gradient boosting algorithm description and the 4 extensions that improve performance.
  • The XGBoost implementation of gradient boosting and the key differences that make it so fast.
  • The application of XGBoost to a simple predictive modeling problem, step-by-step.
  • The 2 important steps in data preparation you must know when using XGBoost with scikit-learn.
  • The surprising automatic handling of missing values and how it compares to imputing values manually.
  • The 2 ways to estimate model performance of XGBoost models with scikit-learn.
  • The visualization of individual trees within a trained XGBoost model.

Advanced Usage and Tuning

The advanced XGBoost usage to speed-up your own projects, including:

  • The 2 techniques to save a trained XGBoost model and later load it to make predictions on new data.
  • The calculation of feature importance scores and the 2 ways to plot the results.
  • The diagnostics of plotting learning curves from XGBoost models and how to stop training early.
  • The multithreading support of XGBoost and how to best harness this feature when parallelizing models.
  • The use of Amazon cloud computing to speed up the training of very large XGBoost models using lots of CPU cores.
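As a rough illustration of the early stopping idea in the list above, here is a small library-free sketch. It assumes you already have one validation loss per boosting round. The function name `early_stopping_round`, the `patience` parameter and the loss values are made up for this example and are not part of the XGBoost API.

```python
# Early stopping sketch: stop once the validation loss has not improved
# for `patience` consecutive rounds. Names and data are hypothetical.

def early_stopping_round(val_losses, patience):
    """Return (best_round, best_loss) seen before training would be stopped."""
    best_loss = float("inf")
    best_round = -1
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_round = loss, i
        elif i - best_round >= patience:
            break  # no improvement for `patience` rounds: stop here
    return best_round, best_loss

# Made-up validation losses, one per boosting round (0-indexed).
val_losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
best_round, best_loss = early_stopping_round(val_losses, patience=3)
print(best_round, best_loss)  # → 2 0.6
```

Note that the final value of 0.50 is never reached: stopping early trades a possible later improvement for less training time, so the choice of patience matters.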

The important XGBoost model tuning steps needed to get the best results, including:

  • The expert best practices that you need to know when tuning gradient boosting models.
  • The balance between the size and number of decision trees when tuning XGBoost models.
  • The slowing down of learning during training with learning rate and the impact on the number of trees.
  • The careful use of random sampling of rows and columns in tree construction and how this affects the mean and variance of performance.

Resources you need to go deeper, when you need to, including:

  • Top machine learning textbooks and the specific chapters that discuss gradient boosting to deepen your understanding, if you crave more.
  • Seminal gradient boosting papers by the experts and links to download the PDF versions.
  • The best places online where you can find more details about the XGBoost library.

What More Do You Need?

Take a Sneak Peek Inside The Ebook

XGBoost With Python Sample 1

XGBoost With Python Sample 2

XGBoost With Python Sample 3

BONUS: XGBoost Python Code Recipes
…you also get 30 fully working XGBoost scripts

Each recipe presented in the book is standalone, meaning that you can copy and paste it into your project and use it immediately.

  1. You get one Python script (.py) for each example provided in the book.
  2. You get the datasets used throughout the book.

Your XGBoost Code Recipe Library covers the following topics:

  • Binary Classification
  • Multiclass Classification
  • One Hot Encoding
  • k-fold Cross Validation
  • Train-Test Splits
  • Tree Visualization
  • Model Serialization
  • Feature Importance Scoring
  • Feature Selection
  • Early Stopping
  • Multicore and Multithreaded Configuration
  • Grid Search Hyperparameter Tuning

This means that you can follow along and compare your answers to a known working implementation of each algorithm in the provided Python files.

This helps a lot to speed up your progress when working through the details of a specific task.

XGBoost With Python Recipes

Code Provided with XGBoost with Python

About The Author

Hi, I'm Jason Brownlee.

I live in Australia with my wife and son and love to write and code.

I have a computer science background as well as Master's and Ph.D. degrees in Artificial Intelligence.

I’ve written books on algorithms, won and ranked in the top 10% in machine learning competitions, consulted for startups and spent a long time working on systems for forecasting tropical cyclones. (Yes, I have written tons of code that runs operationally.)

I get a lot of satisfaction helping developers get started and get really good at machine learning.

I teach an unconventional top-down and results-first approach to machine learning where we start by working through tutorials and problems, then later wade into theory as we need it.

I'm here to help if you ever have any questions. I want you to be awesome at machine learning.

Get Your Sample Chapter

Want to take a look before you buy? Download a free sample chapter PDF.

Enter your email address and your sample chapter will be sent to your inbox.

Click Here to Get Your Sample Chapter



Check Out What Customers Are Saying:

This is another excellent book.  The explanations are concise, very well written.  Using real-world data like Otto from Kaggle is definitely much needed to learn ML. The codes are very well explained.  I don’t see this book as merely a how-to tutorial, it’s a very noble cause by disseminating your knowledge and skill to empower others to excel in Machine Learning.

I am happy I bought this book, and it allowed me to successfully kickstart a practical understanding of how to employ the XGBoost algorithm.

My needs may be a little different from others who look to becoming data scientists – I don’t. My objective here is to seamlessly integrate XGBoost – and possibly other algorithms – into a new product I am developing to provide real-time predictions. I am happy to report that this book was instrumental in helping me to run a successful pilot – within a short space of time.

I can recommend this book to anyone who wants to get down to the practical objective of implementing XGBoost.


You're Not Alone in Choosing Machine Learning Mastery
Trusted by Over 10,000 Practitioners

...including employees from companies like:


Cisco, Google, Oracle, Adobe, Apple, Microsoft, PayPal, Intel


...students and faculty from universities like:


Berkeley, Princeton, Yale, CMU, Stanford, Harvard, MIT, NYU


and many thousands more...

Absolutely No Risk with...
100% Money Back Guarantee

Plus, as you should expect of any great product on the market, every Machine Learning Mastery Ebook
comes with the surest sign of confidence: my gold-standard 100% money-back guarantee.

100% Money-Back Guarantee

If you're not happy with your purchase of any of the Machine Learning Mastery Ebooks,
just email me within 90 days of buying, and I'll give you your money back ASAP.

No waiting. No questions asked. No risk.


Get Results With The Algorithm That Is
Winning Machine Learning Competitions

Choose Your Package:

Basic Package

You will get:

  • XGBoost With Python

(including bonus source code)

Buy Now for $37

(great value!)

Python Pro Bundle


You get the 3-book set:

  • Machine Learning Mastery With Python
  • Deep Learning With Python
  • XGBoost With Python

(includes all bonus source code)

Buy Now for $84

(save $37, like getting a book for free!)

Super Bundle

You get the complete 11-Ebook set:

  • Linear Algebra for Machine Learning
  • Master Machine Learning Algorithms
  • ML Algorithms From Scratch
  • Machine Learning Mastery With Weka
  • Machine Learning Mastery With R
  • Machine Learning Mastery With Python
  • Time Series Forecasting With Python
  • Deep Learning With Python
  • Deep Learning for NLP
  • LSTM Networks With Python
  • XGBoost With Python

(includes all bonus source code)

Buy Now for $287

(save a massive $120)

All prices are in US Dollars (USD).

(1) Click the button.    (2) Enter your details.   (3) Download immediately.

Secure Payment Processing With SSL Encryption

Are you a Student, Teacher or Retiree?

Contact me about a discount.


Do you have any Questions?

See the FAQ.

What Are Skills in Machine Learning Worth?

Your boss asks you:

Hey, can you build a predictive model for this?

Imagine you had the skills and confidence to say:
...and follow through.

I have been there. It feels great!

How much is that worth to you?

The industry is demanding skills in machine learning.
The market wants people that can deliver results, not write academic papers.

Businesses know what these skills are worth and are paying sky-high starting salaries.

A Data Scientist's Salary Begins at:
$100,000 to $150,000.
A Machine Learning Engineer's Salary is Even Higher.

What Are Your Alternatives?

You made it this far.
You're ready to take action.

But, what are your alternatives? What options are there?

(1) A Theoretical Textbook for $100+ 
...it's boring, math-heavy and you'll probably never finish it.

(2) An On-site Boot Camp for $10,000+ 
...it's full of young kids, you must travel and it can take months.

(3) A Higher Degree for $100,000+ 
...it's expensive, takes years, and you'll be an academic.


For the Hands-On Skills You Get...
And the Speed of Results You See...
And the Low Price You Pay...

Machine Learning Mastery Ebooks are
Amazing Value!

And they work. That's why I offer the money-back guarantee.

You're A Professional

The field moves quickly,
...how long can you wait?

You think you have all the time in the world, but...

  • New methods are devised and algorithms change.
  • New books get released and prices increase.
  • New graduates come along and jobs get filled.

Right Now is the Best Time to make your start.

Bottom-up is Slow and Frustrating,
...don't you want a faster way?

Can you really go on another day, week or month...

  • Scraping ideas and code from incomplete posts.
  • Skimming theory and insight from short videos.
  • Parsing Greek letters from academic textbooks.

Targeted Training is your Shortest Path to a result.

Professionals Use Training To Stay On Top Of Their Field
Get The Training You Need!

You don't want to fall behind or miss the opportunity.

Frequently Asked Questions

Why doesn't my payment work?

I am sorry to hear that you're having difficulty.

Some ideas:

  • Perhaps you can double check that your details are correct, just in case of a typo?
  • Perhaps you could try a different payment method, such as PayPal or Credit Card?
  • Perhaps you could try my alternative secure payment processor, click here?
  • Perhaps you're able to talk to your bank, just in case they blocked the transaction?

If you're still having difficulty, please contact me and I can help investigate further.

Can I get your books for free?


Sorry, I don’t give away free copies of my books.

You can access all of my best free material on my blog.

Can I get a hard copy of your book?


Sorry, I don't have hard copies by design.

The books are written for immediate use, rather than references to sit on the shelf.

My students like to have the PDF open on their screen next to their editor so they can copy-paste code.

Also, the books are updated often to reflect changes to APIs. The field is moving very fast.

I hope that helps explain the rationale.

Are there Kindle or ePub versions of the books?


Sorry, just PDF Ebooks.

This is by design and I put a lot of thought into it. My rationale is as follows:

  • I use LaTeX to layout the text and code to give a professional look and I am afraid that EBook readers would mess this up.
  • The increase in supported formats would create a maintenance headache that would take a large amount of time away from updating the books and working on new books.
  • Most critically, reading on an e-reader or iPad is antithetical to the book-open-next-to-code-editor approach the PDF format was chosen to support.

My materials are playbooks intended to be open on the computer, next to a text editor and a command line. They are not reference texts to be read away from the computer.

Will I get free updates to the books?


All updates are free.

Books are usually updated once every month or two to fix bugs, typos and keep abreast of API changes.

Contact me anytime and check if there have been updates. Let me know what version of the book you have (version is listed on the copyright page).

How do I get access to any bonuses?

After you complete your purchase you will receive an email with a link to download your bundle.

The download will include the book or books and any bonus material.

Is there any digital rights management (DRM)?


Can I print the PDF for my personal use?


In what order should I read your books?

My best advice is to pick a topic that most interests you and start there.

Can I get a customized bundle of books?


Sorry, I cannot create custom bundles of books for you, it would create a maintenance nightmare for me. I’m sure you can understand.

You can see the full catalog of my books and bundles here.

Can I get an evaluation copy of your books?


Sorry, I no longer distribute evaluation copies of my books due to some past abuse of the privilege.

If you are a teacher or lecturer, I’m happy to offer you a student discount.

Contact me and ask for the discount.

Can I get an invoice for my purchase?


Email me with the details of your order (order number or email address used to make the purchase) and details you would like to appear on the invoice (your name, company name and address).

I will create a PDF invoice for you and email it back.

How long do books take to ship?

There are no physical books, therefore no shipping is required.

All books are EBooks that you can download immediately after you complete your purchase.

Do you ship to my country?

There are no physical books, therefore no shipping is required.

All books are EBooks that you can download immediately after you complete your purchase.

I support purchases from any country via PayPal or Credit Card.

Can I have a discount?

I do offer a discount to students, teachers, and retirees.

Note: I only offer discounts on individual books, not on the bundles. This is because the bundles are already heavily discounted.

If you are a student, teacher or a retiree please contact me and ask for the discount.

Do you have any sales, deals, or coupons?


I generally don't do sales.

If I do have a special, such as around the launch of a new book, I only offer it to past customers and subscribers on my email list.

I do offer book bundles that offer a discount for a collection of related books.

Can I get a refund?


I am sorry to hear that you want a refund.

Please contact me directly with your purchase details (order number or email address used to make the purchase) and I will organize a refund.

Will you help me if I have questions?


Please contact me anytime with questions about machine learning or the books.

One question at a time please.

Also, each book has a final chapter on getting more help and further reading and points to resources that you can use to get more help.

Do I need to be a good programmer?


Not at all.

My material requires that you have a programmer's mindset of thinking in procedures and learning by doing.

You do not need to be an excellent programmer to read and learn about machine learning algorithms.

How much math do I need to know?

No background in statistics, probability or linear algebra is required.

I teach using a top-down and results-first approach to machine learning. You will learn by doing, not learn by theory.

There are no derivations.

Any equations presented are explained in full and are only provided to make the explanation clearer, not more confusing.

How much machine learning do I need to know?

Only a little.

If you are a reader of my blog posts, then you know enough to get started.

I do my best to lead you through what you need to know, step-by-step.

How long will the book take me to complete?

I recommend reading one chapter per day.

Some students finish the book in a weekend.

Most students finish the book in a few weeks by working through it during nights and weekends.

How are your books different to other books?

My books are playbooks. Not textbooks.

They have no deep explanations of theory, just working examples that are laser-focused on the information that you need to know to bring machine learning to your project.

My books are not for everyone, they are carefully designed for practitioners that need to get results, fast.

How are your books different from the blog?

The books are a concentrated and more convenient version of what I put on the blog.

I design my books to be a combination of lessons and projects to teach you how to use a specific machine learning tool or library and then apply it to real predictive modeling problems.

The books get updated with bug fixes, updates for API changes and the addition of new chapters, and these updates are totally free.

I do put some of the book chapters on the blog as examples, but they are not tied to the surrounding chapters or the narrative that a book offers and do not offer the standalone code files.

With each book, you also get all of the source code files used in the book that you can use as recipes to jump-start your own predictive modeling problems.

How are the 2 algorithms books different?

The book “Master Machine Learning Algorithms” is for programmers and non-programmers alike that learn through worked examples. It teaches you how 10 top machine learning algorithms work, with worked examples in arithmetic and spreadsheets, not code, that show how each model learns and makes predictions.

The book “Machine Learning Algorithms From Scratch” is for programmers that learn by writing code to understand. It provides step-by-step tutorials on how to implement top algorithms as well as how to load data, evaluate models and more. It has less on how the algorithms work, instead focusing exclusively on how to implement each in code.

The two books can support each other.

Is there a team or company-wide license?


Due to abuse of the privilege, I only support purchases by individuals.

Is there a license for libraries?


Sorry, I only support purchases by individuals.

Do you have videos?


I only have tutorial lessons and projects in text format.

This is by design. I used to have video content and I found the completion rate much lower.

I want you to put the material into practice. I have found that text-based tutorials are the best way of achieving this.

After reading and working through the tutorials you are far more likely to apply what you have learned.

What operating systems are supported?

Linux, Mac OS X and Windows.

Can you be my mentor or coach?


Thanks for asking. I would love to help, but I just don't have the capacity.

I try to help as many people as possible through my blog and books.

Can I purchase from Amazon (or elsewhere)?


My books can only be purchased from my website.

The reason is that I am a small business and I want a direct relationship with you, my customer, so that I can offer personal support and send out updates about your book and new stuff I am working on.

I hope you can understand my rationale.

What if my download link expires?

It is possible that your link to download your purchase will expire after a few days.

This is a security precaution.

Please contact me and I will resend your purchase receipt with an updated download link.

Can I use your code in my own project?


But, understand that all code was developed and provided for educational purposes only and that I take no responsibility for it, what it might do or how you might use it.

Do you have another question?

Please contact me.