Monday, January 24, 2022
No menu items!
HomeArtificial Intelligence and Machine LearningA Gentle Introduction to Optimization / Mathematical Programming

A Gentle Introduction to Optimization / Mathematical Programming

Last Updated on August 10, 2021

Whether it is a supervised learning problem or an unsupervised problem, there will be some optimization algorithm working in the background. Almost any classification, regression or clustering problem can be cast as an optimization problem.

In this tutorial, you will discover what is optimization and concepts related to it.

After completing this tutorial, you will know:

What is Mathematical programming or optimization
Difference between a maximization and minimization problems
Difference between local and global optimal solutions
Difference between constrained and unconstrained optimization
Difference between linear and non-linear programming
Examples of optimization

Let’s get started.

A gentle introduction to optimization. Photo by Mehtab Farooq, some rights reserved.

Tutorial Overview

This tutorial is divided into two parts; they are:

Various introductory topics related to optimization
Constrained vs. unconstrained optimization
Equality vs. inequality constraints
Feasible region

Examples of optimization in machine learning

What Is Optimization or Mathematical Programming?

In calculus and mathematics, the optimization problem is also termed as mathematical programming. To describe this problem in simple words, it is the mechanism through which we can find an element, variable or quantity that best fits a set of given criterion or constraints.

Maximization Vs. Minimization Problems

The simplest cases of optimization problems are minimization or maximization of scalar functions. If we have a scalar function of one or more variables, f(x_1, x_2, … x_n) then the following is an optimization problem:

Find x_1, x_2, …, x_n where f(x) is minimum

Or we can have an equivalent maximization problem.

When we define functions quantifying errors or penalties, we apply a minimization problem. On the other hand, if a learning algorithm constructs a function modeling the accuracy of a method, we would maximize this function.

Many automated software tools for optimization, generally implement either a maximization problem or a minimization task but not both. Hence, we can convert a maximization problem to a minimization problem (and vice versa) by adding a negative sign to f(x), i.e., 

Maximize f(x) w.r.r x is equivalent to Minimize -f(x) w.r.t. x

As the two problems are equivalent, we’ll only talk about either minimization or maximization problems in the rest of the tutorial. The same rules and definitions apply to its equivalent.

Global Vs. Local Optimum Points

In machine learning, we often encounter functions, which are highly non-linear with a complex landscape. It is possible that there is a point where the function has the lowest value within a small or local region around that point. Such a point is called a local minimum point.

This is opposed to global minimum point, which is a point where the function has the least value over its entire domain. The following figure shows local and global maximum points.

Local and global maximum points

Unconstrained  Vs. Constrained Optimization

There are many problems in machine learning, where we are interested in finding the global optimum point without any constraints or restrictions on the region in space. Such problems are called unconstrained optimization problems.

At times we have to solve an optimization problem subject to certain constraints. Such optimization problems are termed as constrained optimization problems. For example:

Minimize x^2 + y^2     subject to.       x + y <= 1 

Examples of constrained optimization are:

Find minimum of a function when the sum of variables in the domain must sum to one
Find minimum of a function such that certain vectors are normal to each other
Find minimum of a function such that certain domain variables lie in a certain range.

Feasible Region

All the points in space where the constraints on the problem hold true comprise the feasible region. An optimization algorithm searches for optimal points in the feasible region. The feasible region for the two types of constraints is shown in the figure of the next section.

For an unconstrained optimization problem, the entire domain of the function is a feasible region.

Equality Vs. Inequality Constraints

The constraints imposed in an optimization problem could be equality constraints or inequality constraints. The figure below shows the two types of constraints.

Equality vs. inequality constraints

Linear Vs. Non-linear Programming

An optimization problem where the function is linear and all equality or inequality constraints are also linear constraints is called a linear programming problem.

If either the objective function is non-linear or one or more than one constraints is non-linear, then we have a non-linear programming problem.

To visualize the difference between linear and non-linear functions you can check out the figure below.

Linear vs. non-linear functions

Examples of Optimization in Machine Learning

Listed below are some well known machine learning algorithms that employ optimization. You should keep in mind that almost all machine learning algorithms employ some kind of optimization.

Gradient descent in neural networks (unconstrained optimization).
Method of Lagrange multipliers in support vector machines (constrained optimization).
Principal component analysis (constrained optimization) 
Clustering via expectation maximization algorithm (constrained optimization)
Logistic regression (unconstrained optimization)
Genetic algorithms in evolutionary learning algorithms (different variants exist to solve both constrained and unconstrained optimization problems).


This section lists some ideas for extending the tutorial that you may wish to explore.

Method of Lagrange multipliers
Non-linear optimization techniques
The simplex method

If you explore any of these extensions, I’d love to know. Post your findings in the comments below.

Further Reading

This section provides more resources on the topic if you are looking to go   deeper.


Function of several variables and gradient vectors
Gradient descent for machine learning
Why optimization is important in machine learning
How to choose an optimization algorithm


Additional resources on Calculus Books for Machine Learning


Thomas’ Calculus, 14th edition, 2017. (based on the original works of George B. Thomas, revised by Joel Hass, Christopher Heil, Maurice Weir)
Calculus, 3rd Edition, 2017. (Gilbert Strang)
Calculus, 8th edition, 2015. (James Stewart)


In this tutorial, you discovered what is mathematical programming or optimization problem. Specifically, you learned:

Maximization vs. minimization
Constrained vs. unconstrained optimization
Why optimization is important in machine learning

Do you have any questions?

Ask your questions in the comments below and I will do my best to answer

The post A Gentle Introduction to Optimization / Mathematical Programming appeared first on Machine Learning Mastery.

Read MoreMachine Learning Mastery



Please enter your comment!
Please enter your name here

Most Popular

Recent Comments