Pattern Recognition and Machine Learning PDF Download - Java Knowledge Sharing Network - Free Java Resource Downloads



Download compiled by this site:

Link: https://pan.baidu.com/s/1J8k5ZWp8unqMEW97SD_cxg

Extraction code: 0x2b


Main content:

The problem of searching for patterns in data is a fundamental one and has a long and successful history. For instance, the extensive astronomical observations of Tycho Brahe in the 16th century allowed Johannes Kepler to discover the empirical laws of planetary motion, which in turn provided a springboard for the development of classical mechanics. Similarly, the discovery of regularities in atomic spectra played a key role in the development and verification of quantum physics in the early twentieth century. The field of pattern recognition is concerned with the automatic discovery of regularities in data through the use of computer algorithms and with the use of these regularities to take actions such as classifying the data into different categories.

Consider the example of recognizing handwritten digits, illustrated in Figure 1.1. Each digit corresponds to a 28×28 pixel image and so can be represented by a vector x comprising 784 real numbers. The goal is to build a machine that will take such a vector x as input and that will produce the identity of the digit 0, ..., 9 as the output. This is a nontrivial problem due to the wide variability of handwriting. It could be tackled using handcrafted rules or heuristics for distinguishing the digits based on the shapes of the strokes, but in practice such an approach leads to a proliferation of rules and of exceptions to the rules and so on, and invariably gives poor results.

Figure 1.1 Examples of hand-written digits taken from US zip codes.
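As a minimal sketch of the representation described above, the following flattens a 28×28 image into a single vector of 784 numbers. The nested-list image, the row-by-row ordering, and the names `image`, `flatten`, `ROWS`, and `COLS` are illustrative assumptions, not something the text prescribes.

```python
# Hypothetical 28x28 grayscale image: 28 rows of 28 pixel intensities.
ROWS, COLS = 28, 28
image = [[0.0] * COLS for _ in range(ROWS)]
image[10][14] = 1.0  # a single "ink" pixel, purely for illustration


def flatten(img):
    """Concatenate the rows of a 2-D image into one input vector x."""
    return [pixel for row in img for pixel in row]


x = flatten(image)
print(len(x))  # 28 * 28 = 784
```

With this row-by-row ordering, the pixel at row r and column c ends up at index r * 28 + c of x.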

Far better results can be obtained by adopting a machine learning approach in which a large set of N digits {x1, ..., xN} called a training set is used to tune the parameters of an adaptive model. The categories of the digits in the training set are known in advance, typically by inspecting them individually and hand-labelling them. We can express the category of a digit using a target vector t, which represents the identity of the corresponding digit. Suitable techniques for representing categories in terms of vectors will be discussed later. Note that there is one such target vector t for each digit image x.
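The passage leaves the encoding of target vectors for later. One common choice, assumed here purely for illustration, is a one-hot (1-of-K) vector with a single 1 in the position of the digit's class. The names `one_hot` and `NUM_CLASSES` are hypothetical.

```python
NUM_CLASSES = 10  # digits 0 through 9


def one_hot(digit, num_classes=NUM_CLASSES):
    """Return a target vector t with a 1 at index `digit` and 0 elsewhere."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit must lie in the range 0..num_classes-1")
    t = [0] * num_classes
    t[digit] = 1
    return t


# One target vector per training image, built from hand-assigned labels.
labels = [3, 0, 9]
targets = [one_hot(d) for d in labels]
print(targets[0])
```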

The result of running the machine learning algorithm can be expressed as a function y(x) which takes a new digit image x as input and that generates an output vector y, encoded in the same way as the target vectors. The precise form of the function y(x) is determined during the training phase, also known as the learning phase, on the basis of the training data. Once the model is trained it can then determine the identity of new digit images, which are said to comprise a test set. The ability to categorize correctly new examples that differ from those used for training is known as generalization. In practical applications, the variability of the input vectors will be such that the training data can comprise only a tiny fraction of all possible input vectors, and so generalization is a central goal in pattern recognition.
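The split between a training phase and evaluation on a test set can be sketched with a deliberately simple model. The nearest-centroid classifier below is a stand-in chosen for brevity, not the method the text has in mind, and the two-dimensional data points are invented for illustration. For simplicity `y` returns a class label rather than an encoded output vector.

```python
def train(inputs, labels):
    """Training phase: estimate one mean vector (centroid) per class."""
    sums, counts = {}, {}
    for x, t in zip(inputs, labels):
        acc = sums.setdefault(t, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[t] = counts.get(t, 0) + 1
    return {t: [s / counts[t] for s in acc] for t, acc in sums.items()}


def y(x, centroids):
    """Trained function y(x): the class whose centroid lies closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda t: sq_dist(centroids[t]))


# Hypothetical toy data: the test points never appear in the training set.
train_x = [(0.0, 0.1), (0.2, 0.0), (1.0, 0.9), (0.8, 1.0)]
train_t = [0, 0, 1, 1]
test_x = [(0.1, 0.2), (0.9, 0.8)]
test_t = [0, 1]

centroids = train(train_x, train_t)
predictions = [y(x, centroids) for x in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_t)) / len(test_t)
print(predictions, accuracy)
```

Measuring accuracy only on points held out of training is what makes the number an estimate of generalization rather than of memorization.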

For most practical applications, the original input variables are typically preprocessed to transform them into some new space of variables where, it is hoped, the pattern recognition problem will be easier to solve. For instance, in the digit recognition problem, the images of the digits are typically translated and scaled so that each digit is contained within a box of a fixed size. This greatly reduces the variability within each digit class, because the location and scale of all the digits are now the same, which makes it much easier for a subsequent pattern recognition algorithm to distinguish between the different classes. This pre-processing stage is sometimes also called feature extraction. Note that new test data must be pre-processed using the same steps as the training data.
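The translate-and-scale step can be sketched as follows: crop each image to the bounding box of its nonzero pixels, then resample that box to a fixed size. Nearest-neighbour resampling, the 4×4 output size, and the helpers `normalize` and `place` are assumptions made for this illustration.

```python
def normalize(image, size=4):
    """Crop to the bounding box of the nonzero pixels, then resample the
    box to size x size using nearest-neighbour sampling."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:  # blank image: nothing to crop
        return [[0] * size for _ in range(size)]
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1
    return [[image[top + (r * height) // size][left + (c * width) // size]
             for c in range(size)]
            for r in range(size)]


def place(pattern, top, left, canvas=6):
    """Hypothetical helper: draw `pattern` onto a blank canvas at an offset."""
    img = [[0] * canvas for _ in range(canvas)]
    for r, row in enumerate(pattern):
        for c, value in enumerate(row):
            img[top + r][left + c] = value
    return img


# The same shape at two different positions and scales.
small = place([[1, 0], [1, 1]], top=0, left=0)
large = place([[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1]],
              top=2, left=1)

# Apply the identical function to training and test images alike.
print(normalize(small) == normalize(large))
```

After normalization the two images become identical, which is the reduction in within-class variability that the passage describes.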

Pre-processing might also be performed in order to speed up computation. For example, if the goal is real-time face detection in a high-resolution video stream, the computer must handle huge numbers of pixels per second, and presenting these directly to a complex pattern recognition algorithm may be computationally infeasible. Instead, the aim is to find useful features that are fast to compute, and yet that also preserve useful discriminatory information enabling faces to be distinguished from non-faces. These features are then used as the inputs to the pattern recognition algorithm. For instance, the average value of the image intensity over a rectangular subregion can be evaluated extremely efficiently (Viola and Jones, 2004), and a set of such features can prove very effective in fast face detection. Because the number of such features is smaller than the number of pixels, this kind of pre-processing represents a form of dimensionality reduction. Care must be taken during pre-processing because often information is discarded, and if this information is important to the solution of the problem then the overall accuracy of the system can suffer.
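The efficiency of rectangular averages comes from the integral image (summed-area table) used by Viola and Jones: after one pass over the image, the sum over any rectangle takes four table lookups. The sketch below is a plain-Python illustration. The function names and the half-open index convention are assumptions, not taken from the text.

```python
def integral_image(image):
    """Return an (H+1) x (W+1) table whose entry [r][c] is the sum of all
    pixels in rows 0..r-1 and columns 0..c-1 of `image`."""
    h, w = len(image), len(image[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += image[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def rect_mean(table, top, left, bottom, right):
    """Mean intensity over rows top..bottom-1 and columns left..right-1,
    computed from four lookups regardless of the rectangle's size."""
    total = (table[bottom][right] - table[top][right]
             - table[bottom][left] + table[top][left])
    return total / ((bottom - top) * (right - left))


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
print(rect_mean(table, 1, 1, 3, 3))  # mean of 5, 6, 8, 9 -> 7.0
```

Building the table costs one pass over the pixels. Each feature afterwards costs constant time, which is what makes evaluating many rectangles per frame practical.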

Applications in which the training data comprises examples of the input vectors along with their corresponding target vectors are known as supervised learning problems. Cases such as the digit recognition example, in which the aim is to assign each input vector to one of a finite number of discrete categories, are called classification problems. If the desired output consists of one or more continuous variables, then the task is called regression. An example of a regression problem would be the prediction of the yield in a chemical manufacturing process in which the inputs consist of the concentrations of reactants, the temperature, and the pressure.
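To contrast regression with classification, here is a minimal sketch that fits a straight line by least squares to a single continuous input. The temperature and yield numbers are invented toy values, and a real process model would use several inputs. `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ts):
    """Least-squares fit of t = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    slope = sxt / sxx
    return slope, mean_t - slope * mean_x


# Invented toy data: a scaled temperature as input, yield as continuous target.
temperatures = [0.0, 1.0, 2.0, 3.0]
yields = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(temperatures, yields)
predicted = slope * 4.0 + intercept  # continuous output for a new input
print(slope, intercept, predicted)
```

The prediction is a real number instead of one of a finite set of labels, which is the defining difference from the digit classification example.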

In other pattern recognition problems, the training data consists of a set of input vectors x without any corresponding target values. The goal in such unsupervised learning problems may be to discover groups of similar examples within the data, where it is called clustering, or to determine the distribution of data within the input space, known as density estimation, or to project the data from a high-dimensional space down to two or three dimensions for the purpose of visualization.
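Clustering can be illustrated with k-means, used here only as one familiar example of an unsupervised method. The passage itself does not name an algorithm. The points, the fixed initial centres, and the iteration count are assumptions chosen to keep the run deterministic.

```python
def kmeans(points, centres, iterations=10):
    """Alternate between assigning each point to its nearest centre and
    moving each centre to the mean of the points assigned to it."""
    groups = [[] for _ in centres]
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for p in points:
            nearest = min(
                range(len(centres)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centres[k])),
            )
            groups[nearest].append(p)
        # An empty group keeps its previous centre.
        centres = [
            [sum(col) / len(g) for col in zip(*g)] if g else list(c)
            for g, c in zip(groups, centres)
        ]
    return centres, groups


# Unlabelled toy inputs: no target values are supplied.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
centres, groups = kmeans(points, centres=[(0.0, 0.0), (5.0, 5.0)])
print(centres)
print(groups)
```

The algorithm recovers the two groups from the inputs alone, without ever seeing a label.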
