Top Books on Natural Language Processing

Natural Language Processing, or NLP for short, is the study of computational methods for working with speech and text data.

The field is dominated by the statistical paradigm, and machine learning methods are used for developing predictive models.

In this post, you will discover the top books that you can read to get started with natural language processing.

After reading this post, you will know:

  • The top books for practical natural language processing.
  • The top textbooks for the theoretical foundations of natural language processing.
  • The NLP books I have on my shelf.

Let’s get started.

Top Practical Books on Natural Language Processing

As practitioners, we do not always have to reach for a textbook when getting started on a new topic.

Code examples in the books below are in Python, Java, or R, depending on the book.

Although there are fewer practical books on NLP than textbooks, I have tried to pick the top 3 books that will help you get started and bring NLP methods to your machine learning project.

1. Natural Language Processing with Python

Written by Steven Bird, Ewan Klein and Edward Loper.

This book provides an introduction to NLP using the Python stack for practitioners.

The book focuses on using the NLTK Python library, which is very popular for common NLP tasks.

Contents include:

  1. Language Processing and Python
  2. Accessing Text Corpora and Lexical Resources
  3. Processing Raw Text
  4. Writing Structured Programs
  5. Categorizing and Tagging Words
  6. Learning to Classify Text
  7. Extracting Information from Text
  8. Analyzing Sentence Structure
  9. Building Feature-Based Grammars
  10. Analyzing the Meaning of Sentences
  11. Managing Linguistic Data

This book is perfect if you are looking to get into classical NLP using the go-to NLTK platform.
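
As a rough illustration of the kind of classical text processing the book covers (splitting text into tokens, then counting words and word pairs), here is a minimal sketch in plain Python. It deliberately uses only the standard library rather than NLTK, so `tokenize` and `bigrams` are hypothetical helper names and not NLTK's API, and the regular-expression tokenizer is an assumption that is much cruder than what the library provides.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def bigrams(tokens):
    """Return the list of adjacent token pairs."""
    return list(zip(tokens, tokens[1:]))


# Toy input, made up for this example.
text = "The cat sat on the mat. The cat slept."

tokens = tokenize(text)
word_counts = Counter(tokens)             # frequency of each word
bigram_counts = Counter(bigrams(tokens))  # frequency of each adjacent word pair

print(word_counts.most_common(2))    # [('the', 3), ('cat', 2)]
print(bigram_counts.most_common(1))  # [(('the', 'cat'), 2)]
```

A dedicated library handles punctuation, contractions, and sentence boundaries far more carefully than this; the sketch is only meant to show the shape of the task.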

2. Taming Text

This book provides an introduction to a suite of different NLP tools and problems, such as Apache Solr, Apache OpenNLP, and Apache Mahout.

Code examples are in Java.

It may be more suited to developers getting started with larger enterprise-grade NLP tools on work projects.

Written by Grant Ingersoll, Thomas Morton and Drew Farris.

Notably, Grant Ingersoll is a cofounder of the Apache Mahout project.

Contents include:

  1. Getting Started Taming Text
  2. Foundations of Taming Text
  3. Searching
  4. Fuzzy String Matching
  5. Identifying People, Places and Things
  6. Clustering Text
  7. Classification, Categorization and Tagging
  8. Building an Example Question Answering System
  9. Untaming Text: Exploring the Next Frontier

3. Text Mining with R

Written by Julia Silge and David Robinson.

This book demonstrates statistical natural language processing methods on a range of modern applications.

Code examples are in R.

The code follows the “tidy” data principles of Hadley Wickham and uses the tidytext package written by the authors.

Of the three books, this is the most recently published and has a more practical and modern feel to the demonstrations.

Contents include:

  1. The Tidy Text Format
  2. Sentiment Analysis with Tidy Data
  3. Analyzing Word and Document Frequency: tf-idf
  4. Relationships Between Words: N-grams and Correlations
  5. Converting to and from Nontidy Formats
  6. Topic Modeling
  7. Case Study: Comparing Twitter Archives
  8. Case Study: Mining NASA Metadata
  9. Case Study: Analyzing Usenet Text
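
Chapter 3 in the list above covers tf-idf. As a rough sketch of the idea, written in Python rather than the book's R and not taken from the book, the snippet below scores each term by its frequency within a document multiplied by the log of the inverse fraction of documents that contain it. The function name `tf_idf`, the toy documents, and the exact weighting (term count divided by document length, natural log for idf) are assumptions; other variants of the formula are also in common use.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per tokenized document.

    tf  = term count / number of tokens in the document
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(documents)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Toy corpus, made up for this example.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]

scores = tf_idf(docs)
for term, score in sorted(scores[0].items(), key=lambda item: -item[1]):
    print(f"{term}: {score:.3f}")
```

A word such as "the" that appears in every document gets a score of zero, while a word found in only one document, such as "mat", scores highest in that document.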

Do you know of other great practical books on natural language processing?
Let me know in the comments.

Top Textbooks on Natural Language Processing

There are a ton of textbooks on natural language processing and on specific sub-topics.

In this section, I have tried to focus on what I (and the general consensus) see as the best books on the topic for beginners, e.g. undergraduate or graduate students and practitioners looking to step deeper into the theory.

I have tried to pick a mix of general NLP books as well as books on highly studied topics like translation and speech.

The first two books in this section are essentially canon for NLP students.

1. Foundations of Statistical Natural Language Processing

Written by Christopher Manning and Hinrich Schütze.

Notably, Christopher Manning teaches NLP at Stanford and is behind the CS224n: Natural Language Processing with Deep Learning course.

This book provides an introduction to statistical methods for natural language processing covering both the required linguistics and the newer (at the time, circa 1999) statistical methods.

This book provides a strong foundation to better grasp the newer methods and encodings.

Contents include:

  1. Introduction
  2. Mathematical Foundations
  3. Linguistic Essentials
  4. Corpus-Based Work
  5. Collocations
  6. Statistical Inference: n-gram Models over Sparse Data
  7. Word Sense Disambiguation
  8. Lexical Acquisition
  9. Markov Models
  10. Part-of-Speech Tagging
  11. Probabilistic Context Free Grammars
  12. Probabilistic Parsing
  13. Statistical Alignment and Machine Translation
  14. Clustering
  15. Topics in Information Retrieval
  16. Text Categorization

2. Speech and Language Processing

Written by Daniel Jurafsky and James Martin.

This book provides coverage of NLP from both speech and text perspectives with a strong focus on applications (one in each chapter).

Coverage of the topic feels exhaustive.

Contents include:

  1. Introduction
  2. Regular Expressions and Automata
  3. Words and Transducers
  4. N-grams
  5. Part-of-Speech Tagging
  6. Hidden Markov and Maximum Entropy Models
  7. Phonetics
  8. Speech Synthesis
  9. Automatic Speech Recognition
  10. Speech Recognition: Advanced Topics
  11. Computational Phonology
  12. Formal Grammars of English
  13. Syntactic Parsing
  14. Statistical Parsing
  15. Features and Unification
  16. Language and Complexity
  17. The Representation of Meaning
  18. Computational Semantics
  19. Lexical Semantics
  20. Computational Lexical Semantics
  21. Computational Discourse
  22. Information Extraction
  23. Question Answering and Summarization
  24. Dialog and Conversational Agents
  25. Machine Translation

4. Statistical Machine Translation

Written by Philipp Koehn.

This book provides an introduction to the topic of statistical machine translation, a subfield of NLP.

Contents include:

  1. Introduction
  2. Words, Sentences, Corpora
  3. Probability Theory
  4. Word-Based Models
  5. Phrase-Based Models
  6. Decoding
  7. Language Models
  8. Evaluation
  9. Discriminative Training
  10. Integrating Linguistic Information
  11. Tree-Based Methods

5. Statistical Methods for Speech Recognition

Written by Frederick Jelinek.

This book provides an introduction to the topic of statistical speech recognition, another subfield of NLP that saw an overhaul in the 1990s with statistical approaches.

Contents include:

  1. The Speech Recognition Problem
  2. Hidden Markov Models
  3. The Acoustic Model
  4. Basic Language Modeling
  5. The Viterbi Search
  6. Hypothesis Search on a Tree and the Fast Match
  7. Elements of Information Theory
  8. The Complexity of Tasks – The Quality of Language Models
  9. The Expectation-Maximization Algorithm and Its Consequences
  10. Decision Trees and Tree Language Models
  11. Phonetics from Orthography: Spelling-to-Base Form Mappings
  12. Triphones and Allophones
  13. Maximum Entropy Probability Estimation and Language Models
  14. Three Applications of Maximum Entropy Estimation to Language Modeling
  15. Estimation of Probabilities from Counts and the Back-Off Method

NLP Books that I Own

I like to have a mixture of practical and reference texts on my shelf.

The hard part of NLP (for me) is simply the large number of sub-problems and the specialized terminology and theory used.

For this reason I have the following 3 NLP textbooks on my shelf:

I also really like the look of:

I recommend choosing the NLP books that are right for you and your needs or project.

Let me know which books you chose or own.
Leave a comment below.

Summary

In this post, you discovered the top books on natural language processing.

Specifically, you learned:

  • The top books for practical natural language processing.
  • The top textbooks for the theoretical foundations of natural language processing.
  • The NLP books I have on my shelf.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

33 Responses to Top Books on Natural Language Processing

  1. Redentor Del Rosario, September 8, 2017 at 5:29 am

    Much better if you have simulator to do this an include the guide an all the command line.

  2. Jaime B. Allen, September 8, 2017 at 5:31 am

    NLP with Python is definitely one of my first books in this area. Besides this, I also checked out ‘Text Analytics with Python’ recently from Springer (http://www.springer.com/us/book/9781484223871) which is definitely a decent read into core concepts of text processing and mining with a lot of code snippets\hands-on examples. The good part is the entire source code has also been open sourced by the author at GitHub (https://github.com/dipanjanS/text-analytics-with-python)

  3. BL, September 8, 2017 at 9:17 am

    Hi, Jason
    Do you have experience/comments on spacy vs nltk, vs textblob vs core nlp? Thank you.

    • Jason Brownlee, September 9, 2017 at 11:48 am

      I do have posts on these topics scheduled.

      Generally, I use NLTK for data prep and whip up my own models in Keras instead of spacy.

  4. Overflow012, September 12, 2017 at 12:43 am

    Hi Jason! Thank you for your post. I have a question… What is the difference between “Natural Language Processing with Python” free version and “Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit”?

  5. Tim, September 15, 2017 at 11:52 pm

    What do you think about this book:
    http://ciml.info/

    • Jason Brownlee, September 16, 2017 at 8:41 am

      I’ve not read it, sorry Tim. What do you think of it?

      • Tim, September 22, 2017 at 7:26 pm

        I haven’t read it either. But it is written by a relatively well-known computational linguist and it is freely available as well, so I thought it might deserve some attention. It seems heavy on the math side, but take this with a grain of salt as I have only skimmed through it.

  6. Irfan, September 26, 2017 at 10:54 pm

    Hello Jason,

    None of the books that you have mentioned uses a library either developed in C++ or for C++ developers. Would you be able to suggest a library similar to NLTK that has been developed in C++ and for C++ developers? Information concerning a book aimed at C++ developers would also be appreciated.
    I would greatly appreciate your response.

  7. Franco, January 30, 2018 at 8:05 am

    I own a few NLP books, but my absolute favorite NLP Book is Jason Brownlee’s ‘DL for NLP’ 🙂

    Why?
    – I love math, it doesn’t help me in building models for applied DL.
    – I love theory, but it kind of slows me down.
    – I’m an engineer; I love to build (models).

    To me, Jason focuses on what matters most to build DL models: code, code, code….

  8. Harish Yadav, March 3, 2018 at 8:39 pm

    i want to do a text-speech project …can you help me out by giving reference books or videos or code…

  9. Corey, July 30, 2018 at 7:27 am

    I know that texts can quickly become dated. From your list of top references on Amazon in NLP I saw Applied Text Analysis with Python. It’s very new. Any thoughts?

    https://www.amazon.com/Applied-Text-Analysis-Python-Language-Aware/dp/1491963042/ref=zg_bs_271581011_4?_encoding=UTF8&psc=1&refRID=PDSJSPFCWQH6N41N9RGK

  10. Tenzin Bhotia, August 5, 2018 at 2:04 pm

    Which book would you suggest, if my project is on text summarization using python and deep learning

  11. Happy Buzaaba, August 7, 2018 at 2:38 pm

    Thanks Jason for the books. Which book would you recommend if my project is Question answering with python?

    • Jason Brownlee, August 8, 2018 at 6:14 am

      I don’t think there are books on just that topic, perhaps try a general book to get started with NLP?

  12. John, September 19, 2018 at 6:19 pm

    Hi Jason – I hope you can help point me in the right direction on something. I am new to NLP and have been learning with the NLTK libraries. However, what I am really trying to understand is how we can parse sentances to extract meaning – for example, you have terms and conditions attached to using software – the user is permitted to do X and is prohibited from doing Y or can do Z if you get our permission.
    Are there any good tools out there to help with this? Can you point me in the right direction to learn more?

  13. Aqsa Zafar, June 20, 2020 at 4:49 pm

    Very helpful and great list of books is listed.
    Thanks for providing such an informative post.

    I would like to provide some more add-on books for NLP.
    https://www.mltut.com/best-books-for-natural-language-processing-you-should-read/

    • Jason Brownlee, June 21, 2020 at 6:19 am

      Thanks for sharing.

    • Flavio Mosafi, May 11, 2022 at 12:51 am

      Great Aqza, thank you for it

  14. Arnold, February 18, 2022 at 3:01 am

    It is such a complex field. I appreciate your recommendations – you are a trustworthy source!

    • James Carmichael, February 18, 2022 at 12:48 pm

      You are very welcome Arnold!