understanding machine learning from theory to algorithms pdf

The term machine learning refers to the automated detection of meaningful patterns
in data. In the past couple of decades it has become a common tool in almost any
task that requires information extraction from large data sets. We are surrounded
by a machine learning based technology:

Search engines learn how to bring us the
best results (while placing profitable ads), antispam software learns to filter our email messages, and credit card transactions are secured by a software that learns
how to detect frauds.

Digital cameras learn to detect faces and intelligent personal
assistance applications on smart-phones learn to recognize voice commands. Cars
are equipped with accident prevention systems that are built using machine learning
algorithms. Machine learning is also widely used in scientific applications such as
bioinformatics, medicine, and astronomy.

One common feature of all of these applications is that, in contrast to more traditional uses of computers, in these cases, due to the complexity of the patterns that
need to be detected, a human programmer cannot provide an explicit, fine-detailed
specification of how such tasks should be executed. Taking example from intelligent
beings, many of our skills are acquired or refined through learning from our experience (rather than following explicit instructions given to us). Machine learning tools
are concerned with endowing programs with the ability to “learn” and adapt.

The first goal of this book is to provide a rigorous, yet easy to follow, introduction
to the main concepts underlying machine learning: What is learning? How can a
machine learn? How do we quantify the resources needed to learn a given concept?
Is learning always possible? Can we know whether the learning process succeeded or
failed?

The second goal of this book is to present several key machine learning algorithms. We chose to present algorithms that on one hand are successfully used in
practice and on the other hand give a wide spectrum of different learning techniques. Additionally, we pay specific attention to algorithms appropriate for large
scale learning (a.k.a. “Big Data”), since in recent years, our world has become
increasingly “digitized” and the amount of data available for learning is dramatically increasing.

As a result, in many applications data is plentiful and computationtime is the main bottleneck. We therefore explicitly quantify both the amount of
data and the amount of computation time needed to learn a given concept.