Learning From Data 2nd PDF Download
Downloads compiled by this site:
Extraction code: n1ru
Related screenshots:
Main content:
There are two problems in modern science:
too many people use different terminology to solve the same problems;
even more people use the same terminology to address completely different
issues.
Anonymous
In recent years, there has been an explosive growth of methods for learning
(or estimating dependencies) from data. This is not surprising given the proliferation of
- low-cost computers (for implementing such methods in software)
- low-cost sensors and database technology (for collecting and storing data)
- highly computer-literate application experts (who can pose "interesting" application problems)
A learning method is an algorithm (usually implemented in software) that estimates an unknown mapping (dependency) between a system’s inputs and outputs
from the available data, namely from known (input, output) samples. Once such
a dependency has been accurately estimated, it can be used for prediction of future
system outputs from the known input values. This book provides a unified description of principles and methods for learning dependencies from data.
Methods for estimating dependencies from data have been traditionally explored
in diverse fields such as statistics (multivariate regression and classification), engineering (pattern recognition), and computer science (artificial intelligence, machine
learning, and, more recently, data mining). Recent interest in learning from data has
resulted in the development of biologically motivated methodologies, such as
artificial neural networks, fuzzy systems, and wavelets.
Unfortunately, developments in each field are seldom related to other fields,
despite the apparent commonality of issues and methods. The mere fact that
hundreds of "new" methods are being proposed each year at various conferences
and in numerous journals suggests a certain lack of understanding of the basic
issues common to all such methods.
The premise of this book is that there are just a handful of important principles
and issues in the field of learning dependencies from data. Any researcher or
practitioner in this field needs to be aware of these issues in order to successfully
apply a particular methodology, understand a method’s limitations, or develop new
techniques.
This book is an attempt to present and discuss such issues and principles (common to all methods) and then describe representative popular methods originating
from statistics, neural networks, and pattern recognition. Often methods developed
in different fields can be related to a common conceptual framework. This approach
enables better understanding of a method’s properties, and it has methodological
advantages over traditional "cookbook" descriptions of various learning algorithms.
Many aspects of learning methods can be addressed under a traditional statistical
framework. At the same time, many popular learning algorithms and learning
methodologies have been developed outside classical statistics. This happened
for several reasons:
1. Traditionally, the statistician’s role has been to analyze the inferential
limitations of the structural model constructed (proposed) by the application-domain expert. Consequently, the conceptual approach (adopted in
statistics) is parameter estimation for model identification. For many real-life problems that require flexible estimation with finite samples, the
statistical approach is fundamentally flawed. As shown in this book, learning
with finite samples should be based on the framework known as risk
minimization, rather than density estimation (see the brief sketch after this list).
2. Statisticians have been late to recognize and appreciate the importance of
computer-intensive approaches to data analysis. The growing use of computers has fundamentally changed the traditional boundaries between a statistician (data modeler) and a user (application expert). Nowadays, engineers
and computer scientists successfully use sophisticated empirical data-modeling techniques (e.g., neural networks) to estimate complex nonlinear
dependencies from the data.
3. Statistics (being part of mathematics) has developed into a closed discipline,
with its own scientific jargon and academic objectives that favor analytic
proofs rather than practical methods for learning from data.
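For readers unfamiliar with the term used in item 1, the following is a brief sketch of the standard risk-minimization formulation (a common textbook statement, not quoted from this book): a learning method selects, from a set of candidate mappings f, one that minimizes the expected risk, which with only a finite sample is approximated by the empirical risk.

% Sketch of the standard risk-minimization setting (not quoted from the book).
% Expected risk of a candidate mapping f, for loss L and unknown distribution P(x, y):
R(f) = \int L\bigl(y, f(x)\bigr)\, dP(x, y)
% With only a finite sample (x_1, y_1), \dots, (x_n, y_n), minimize the empirical risk instead:
R_{\mathrm{emp}}(f) = \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr)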