A Gentle Introduction to Optimization / Mathematical Programming

Last Updated on March 16, 2022

Whether it is a supervised learning problem or an unsupervised problem, there will be some optimization algorithm working in the background. Almost any classification, regression or clustering problem can be cast as an optimization problem.

In this tutorial, you will discover what is optimization and concepts related to it.

After completing this tutorial, you will know:

  • What is Mathematical programming or optimization
  • Difference between a maximization and minimization problems
  • Difference between local and global optimal solutions
  • Difference between constrained and unconstrained optimization
  • Difference between linear and non-linear programming
  • Examples of optimization

Let’s get started.

Picture of Hunza valley by Mehtab Farooq

A gentle introduction to optimization. Photo by Mehtab Farooq, some rights reserved.

Tutorial Overview

This tutorial is divided into two parts; they are:

  1. Various introductory topics related to optimization
    1. Constrained vs. unconstrained optimization
    2. Equality vs. inequality constraints
    3. Feasible region
  2. Examples of optimization in machine learning

What Is Optimization or Mathematical Programming?

In calculus and mathematics, the optimization problem is also termed as mathematical programming. To describe this problem in simple words, it is the mechanism through which we can find an element, variable or quantity that best fits a set of given criterion or constraints.

Maximization Vs. Minimization Problems

The simplest cases of optimization problems are minimization or maximization of scalar functions. If we have a scalar function of one or more variables, f(x_1, x_2, … x_n) then the following is an optimization problem:

Find x_1, x_2, …, x_n where f(x) is minimum

Or we can have an equivalent maximization problem.

When we define functions quantifying errors or penalties, we apply a minimization problem. On the other hand, if a learning algorithm constructs a function modeling the accuracy of a method, we would maximize this function.

Many automated software tools for optimization, generally implement either a maximization problem or a minimization task but not both. Hence, we can convert a maximization problem to a minimization problem (and vice versa) by adding a negative sign to f(x), i.e., 

Maximize f(x) w.r.r x is equivalent to Minimize -f(x) w.r.t. x

As the two problems are equivalent, we’ll only talk about either minimization or maximization problems in the rest of the tutorial. The same rules and definitions apply to its equivalent.

Want to Get Started With Calculus for Machine Learning?

Take my free 7-day email crash course now (with sample code).

Click to sign-up and also get a free PDF Ebook version of the course.

Global Vs. Local Optimum Points

In machine learning, we often encounter functions, which are highly non-linear with a complex landscape. It is possible that there is a point where the function has the lowest value within a small or local region around that point. Such a point is called a local minimum point.

This is opposed to global minimum point, which is a point where the function has the least value over its entire domain. The following figure shows local and global maximum points.

Local and global maximum points

Local and global maximum points

Unconstrained  Vs. Constrained Optimization

There are many problems in machine learning, where we are interested in finding the global optimum point without any constraints or restrictions on the region in space. Such problems are called unconstrained optimization problems.

At times we have to solve an optimization problem subject to certain constraints. Such optimization problems are termed as constrained optimization problems. For example:

Minimize x^2 + y^2     subject to.       x + y <= 1 

Examples of constrained optimization are:

  1. Find minimum of a function when the sum of variables in the domain must sum to one
  2. Find minimum of a function such that certain vectors are normal to each other
  3. Find minimum of a function such that certain domain variables lie in a certain range.

Feasible Region

All the points in space where the constraints on the problem hold true comprise the feasible region. An optimization algorithm searches for optimal points in the feasible region. The feasible region for the two types of constraints is shown in the figure of the next section.

For an unconstrained optimization problem, the entire domain of the function is a feasible region.

Equality Vs. Inequality Constraints

The constraints imposed in an optimization problem could be equality constraints or inequality constraints. The figure below shows the two types of constraints.

Equality vs. inequality constraints

Equality vs. inequality constraints

Linear Vs. Non-linear Programming

An optimization problem where the function is linear and all equality or inequality constraints are also linear constraints is called a linear programming problem.

If either the objective function is non-linear or one or more than one constraints is non-linear, then we have a non-linear programming problem.

To visualize the difference between linear and non-linear functions you can check out the figure below.

Linear vs. non-linear functions

Linear vs. non-linear functions

Examples of Optimization in Machine Learning

Listed below are some well known machine learning algorithms that employ optimization. You should keep in mind that almost all machine learning algorithms employ some kind of optimization.

  1. Gradient descent in neural networks (unconstrained optimization).
  2. Method of Lagrange multipliers in support vector machines (constrained optimization).
  3. Principal component analysis (constrained optimization) 
  4. Clustering via expectation maximization algorithm (constrained optimization)
  5. Logistic regression (unconstrained optimization)
  6. Genetic algorithms in evolutionary learning algorithms (different variants exist to solve both constrained and unconstrained optimization problems).


This section lists some ideas for extending the tutorial that you may wish to explore.

  • Method of Lagrange multipliers
  • Non-linear optimization techniques
  • The simplex method

If you explore any of these extensions, I’d love to know. Post your findings in the comments below.

Further Reading

This section provides more resources on the topic if you are looking to go   deeper.




  • Thomas’ Calculus, 14th edition, 2017. (based on the original works of George B. Thomas, revised by Joel Hass, Christopher Heil, Maurice Weir)
  • Calculus, 3rd Edition, 2017. (Gilbert Strang)
  • Calculus, 8th edition, 2015. (James Stewart)


In this tutorial, you discovered what is mathematical programming or optimization problem. Specifically, you learned:

  • Maximization vs. minimization
  • Constrained vs. unconstrained optimization
  • Why optimization is important in machine learning

Do you have any questions?

Ask your questions in the comments below and I will do my best to answer

Get a Handle on Calculus for Machine Learning!

Calculus For Machine Learning

Feel Smarter with Calculus Concepts

...by getting a better sense on the calculus symbols and terms

Discover how in my new Ebook:
Calculus for Machine Learning

It provides self-study tutorials with full working code on:
differntiation, gradient, Lagrangian mutiplier approach, Jacobian matrix, and much more...

Bring Just Enough Calculus Knowledge to
Your Machine Learning Projects

See What's Inside

, , , , , ,

8 Responses to A Gentle Introduction to Optimization / Mathematical Programming

  1. Avatar
    Sachin August 11, 2021 at 11:46 am #

    Nice article. Can you write an article with source code regarding Interpolation in Machine Learning?

    • Avatar
      Adrian Tam August 12, 2021 at 5:53 am #

      Thanks for your suggestion.

    • Avatar
      John Richards August 14, 2021 at 9:43 am #

      I love the math and optimization tutorials. Very nicely done!

      • Avatar
        Adrian Tam August 14, 2021 at 11:37 am #

        Glad you like it!

  2. Avatar
    S. Bhubaneswari August 11, 2021 at 10:41 pm #

    Sir. Please explain all the six algorithm with example

    • Avatar
      Adrian Tam August 12, 2021 at 5:56 am #

      We have most of these in our blog. Please use the search function and you should find them easily.

  3. Avatar
    Sean August 14, 2021 at 11:28 pm #

    Hi, how can I subscribe to this network. I’m an engineer and decide to branch out, excuse the pun, into ML. Thx, Sean.

Leave a Reply