At the core of many machine learning models lies a very simple mathematical problem: fit an appropriate function to a given set of data points, then make sure this function performs well on new data.
Structure of the machine learning part of an AI problem
Our goal in this blog is to internalize the following structure of the machine learning part of an AI problem:
Identify the problem
The problem depends on the specific use case: classify images, classify documents, predict house prices, detect fraud or anomalies, recommend the next product, predict the likelihood of a criminal reoffending, predict the internal structure of a building given external images, convert speech to text, generate audio, generate images, generate video, etc.
Acquire the appropriate data
This is about training our models to do the right thing; we say that our models learn from data. We have to make sure the data is clean, complete, and, if necessary, depending on the specific model we are implementing, transformed (normalized, standardized, some features aggregated, etc.). This step is usually far more time-consuming than implementing and training the machine learning models themselves.
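One of the transformations mentioned above, standardization, can be sketched in a few lines. This is a minimal illustration with made-up numbers, not a full preprocessing pipeline: each feature column is shifted to zero mean and scaled to unit variance.

```python
import numpy as np

# Hypothetical raw feature matrix: rows are samples, columns are features
# (the numbers are made up purely for illustration)
X = np.array([[150.0, 3.0],
              [200.0, 4.0],
              [120.0, 2.0],
              [180.0, 3.0]])

# Standardize each feature: subtract the column mean, divide by the column std
mean = X.mean(axis=0)
std = X.std(axis=0)
X_std = (X - mean) / std

print(X_std.mean(axis=0))  # each column now has mean ~0
print(X_std.std(axis=0))   # each column now has std ~1
```

In practice a library utility would handle this (and remember the training-set mean and std so the same shift and scale can be applied to new data), but the arithmetic is exactly this simple.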
Create a hypothesis function
We use the terms hypothesis function, learning function, prediction function, training function, and model interchangeably. Our main assumption is that this input/output mathematical function explains the observed data and can later be used to make predictions on new data.
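For concreteness, here is about the simplest possible hypothesis function: a linear map from a feature vector to a number. The weights `w` and bias `b` below are arbitrary placeholder values; in a real model they would be learned from data.

```python
import numpy as np

# Parameters of a linear hypothesis h(x) = w . x + b
# (values are placeholders; learning them is the job of training)
w = np.array([2.0, -1.0])
b = 0.5

def hypothesis(x):
    """Map an input feature vector to a prediction."""
    return w @ x + b

# Prediction on a new input: 2*1 + (-1)*3 + 0.5 = -0.5
print(hypothesis(np.array([1.0, 3.0])))  # -0.5
```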
I’ve always been amazed by how (supervised) machine learning algorithms are built from simple ideas:
fit an appropriate function to a given set of data points, then make sure this function performs well on new data.
Training Function (Model)
Loss Function
Optimization
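The three ingredients above can be sketched end to end in a toy example. The sketch below, on synthetic data invented for illustration, fits a one-variable linear model by minimizing a mean-squared-error loss with plain gradient descent; real systems swap in richer models, losses, and optimizers, but the structure is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for illustration: y = 3x + 1 plus a little noise
x = rng.uniform(-1, 1, 50)
y = 3 * x + 1 + rng.normal(0, 0.1, 50)

# 1. Training function (model): h(x) = w*x + b
w, b = 0.0, 0.0

# 2. Loss function: mean squared error between predictions and targets
def loss(w, b):
    return np.mean((w * x + b - y) ** 2)

# 3. Optimization: gradient descent on the loss
lr = 0.1
for _ in range(500):
    err = w * x + b - y
    w -= lr * np.mean(2 * err * x)  # d(loss)/dw
    b -= lr * np.mean(2 * err)      # d(loss)/db

print(round(w, 1), round(b, 1))  # recovers values close to 3 and 1
```

Every supervised method in this post, from linear regression to neural networks, is some elaboration of this loop: propose a parameterized function, measure how badly it fits, and nudge the parameters to fit better.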
Classical Machine Learning Algorithms:
Bias / Variance Trade-Off
The inability of a machine learning method to capture the true relationship in the data is called bias.
In machine learning lingo, the amount by which a model's fit changes from one data set to another is called variance.
In machine learning, the ideal algorithm has low bias, so it can accurately model the true relationship, and low variance, so it produces consistent predictions across different data sets.
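The trade-off is easy to see numerically. In the sketch below (a toy setup, not from the original text), a rigid straight line and a flexible high-degree polynomial are each fit to many different noisy samples of the same underlying curve; the flexible model's predictions swing far more from sample to sample, which is exactly what high variance means.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 20)
true_y = np.sin(2 * np.pi * x)  # the "true relationship" (chosen for illustration)

def fit_predict(degree, x0=0.5):
    """Fit a degree-`degree` polynomial to one noisy sample of the data,
    then predict at the point x0."""
    y = true_y + rng.normal(0, 0.3, x.size)
    coeffs = np.polyfit(x, y, degree)
    return np.polyval(coeffs, x0)

# Predictions at x = 0.5 across 100 different data sets
line_preds = [fit_predict(1) for _ in range(100)]  # high bias, low variance
poly_preds = [fit_predict(9) for _ in range(100)]  # low bias, high variance

print(np.var(line_preds) < np.var(poly_preds))
```

The straight line misses the sine curve (bias) but barely moves between data sets, while the degree-9 polynomial tracks each noisy sample closely and so its predictions scatter widely: the sweet spot is a model complex enough to capture the true relationship but no more.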