McGraw-Hill, 1997. 414 p. — ISBN 0070428077.
This book covers the field of machine learning, which is the study of algorithms that allow computer programs to automatically improve through experience. The book is intended to support upper level undergraduate and introductory level graduate courses in machine learning.
The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience. In recent years many successful machine learning applications have been developed, ranging from data-mining programs that learn to detect fraudulent credit card transactions, to information-filtering systems that learn users' reading preferences, to autonomous vehicles that learn to drive on public highways. At the same time, there have been important advances in the theory and algorithms that form the foundations of this field.
The goal of this textbook is to present the key algorithms and theory that form the core of machine learning. Machine learning draws on concepts and results from many fields, including statistics, artificial intelligence, philosophy, information theory, biology, cognitive science, computational complexity, and control theory. My belief is that the best way to learn about machine learning is to view it from all of these perspectives and to understand the problem settings, algorithms, and assumptions that underlie each. In the past, this has been difficult due to the absence of a broad-based single source introduction to the field. The primary goal of this book is to provide such an introduction.
Because of the interdisciplinary nature of the material, this book makes few assumptions about the background of the reader. Instead, it introduces basic concepts from statistics, artificial intelligence, information theory, and other disciplines as the need arises, focusing on just those concepts most relevant to machine learning. The book is intended for both undergraduate and graduate students in fields such as computer science, engineering, statistics, and the social sciences, and as a reference for software professionals and practitioners. Two principles that guided the writing of the book were that it should be accessible to undergraduate students and that it should contain the material I would want my own Ph.D. students to learn before beginning their doctoral research in machine learning.
A third principle that guided the writing of this book was that it should present a balance of theory and practice. Machine learning theory attempts to answer questions such as "How does learning performance vary with the number of training examples presented? " and "Which learning algorithms are most appropriate for various types of learning tasks? " This book includes discussions of these and other theoretical issues, drawing on theoretical constructs from statistics, computational complexity, and Bayesian analysis. The practice of machine learning
is covered by presenting the major algorithms in the field, along with illustrative traces of their operation. Online data sets and implementations of several algorithms are available via the World Wide Web at http://www.cs.cmu.edu/-tom1 mlbook.html. These include neural network code and data for face recognition, decision tree learning, code and data for financial loan analysis, and Bayes classifier code and data for analyzing text documents.
Tom M. Mitchell
Ever since computers were invented, we have wondered whether they might be
made to learn. If we could understand how to program them to learn-to improve
automatically with experience-the impact would be dramatic. Imagine comput-
ers learning from medical records which treatments are most effective for new
diseases, houses learning from experience to optimize energy costs based on the
particular usage patterns of their occupants, or personal software assistants learn-
ing the evolving interests of their users in order to highlight especially relevant
stories from the online morning newspaper. A successful understanding of how to
make computers learn would open up many new uses of computers and new levels
of competence and customization. And a detailed understanding of information-
processing algorithms for machine learning might lead to a better understanding
of human learning abilities (and disabilities) as well.
We do not yet know how to make computers learn nearly as well as people
learn. However, algorithms have been invented that are effective for certain types
of learning tasks, and a theoretical understanding of learning is beginning to
emerge. Many practical computer programs have been developed to exhibit use-
ful types of learning, and significant commercial applications have begun to ap-
pear. For problems such as speech recognition, algorithms based on machine
learning outperform all other approaches that have been attempted to date. In
the field known as data mining, machine learning algorithms are being used rou-
tinely to discover valuable knowledge from large commercial databases containing
equipment maintenance records, loan applications, financial transactions, medical
records, and the like. As our understanding of computers continues to mature, it
seems inevitable that machine learning will play an increasingly central role in
computer science and computer technology.