
evaluating-machine-learning-models PDF Download


Posted: 2020-09-14 10:19 | Source: http://www.java1234.com | Author: 小锋
Main content:

This report on evaluating machine learning models arose out of a sense of need. The content was first published as a series of six technical posts on the Dato Machine Learning Blog. I was the editor of the blog, and I needed something to publish for the next day. Dato builds machine learning tools that help users build intelligent data products. In our conversations with the community, we sometimes ran into confusion over terminology. For example, people would ask for cross-validation as a feature, when what they really meant was hyperparameter tuning, a feature we already had. So I thought, "Aha! I'll just quickly explain what these concepts mean and point folks to the relevant sections in the user guide."
So I sat down to write a blog post to explain cross-validation, hold-out datasets, and hyperparameter tuning. After the first two paragraphs, however, I realized that it would take a lot more than a single blog post. The three terms sit at different depths in the concept hierarchy of machine learning model evaluation. Cross-validation and hold-out validation are ways of chopping up a dataset in order to measure the model's performance on "unseen" data. Hyperparameter tuning, on the other hand, is a more "meta" process of model selection. But why does the model need "unseen" data, and what's meta about hyperparameters? In order to explain all of that, I needed to start from the basics. First, I needed to explain the high-level concepts and how they fit together. Only then could I dive into each one in detail.
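To make the distinction concrete, here is a minimal sketch that puts the three ideas side by side. It uses scikit-learn, a toy dataset, and a logistic regression model purely as illustrative assumptions; these choices are not taken from the report, which works with GraphLab Create.

# A minimal sketch contrasting hold-out validation, cross-validation, and
# hyperparameter tuning (illustrative assumptions: scikit-learn, iris data,
# logistic regression; not the library or examples used in the report).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)

# Hold-out validation: set aside one chunk of "unseen" data to score the model.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("hold-out accuracy:", model.score(X_test, y_test))

# Cross-validation: split the data into k folds and rotate which fold is "unseen".
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("5-fold cross-validation accuracy:", scores.mean())

# Hyperparameter tuning: a "meta" search over model settings (here, the
# regularization strength C), using cross-validation to score each candidate.
grid = GridSearchCV(LogisticRegression(max_iter=1000),
                    param_grid={"C": [0.01, 0.1, 1.0, 10.0]}, cv=5)
grid.fit(X_train, y_train)
print("best setting:", grid.best_params_,
      "hold-out accuracy of tuned model:", grid.score(X_test, y_test))

Note how the tuning step relies on cross-validation internally to score each candidate setting, which is why the report later adds a cautionary note on nested hyperparameter tuning.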
Machine learning is a child of statistics, computer science, and mathematical optimization. Along the way, it took inspiration from information theory, neural science, theoretical physics, and many other fields. Machine learning papers are often full of impenetrable mathematics and technical jargon. To make matters worse, sometimes the same methods were invented multiple times in different fields, under different names. The result is a new language that is unfamiliar even to experts in any one of the originating fields.
As a field, machine learning is relatively young. Large-scale applications of machine learning only started to appear in the last two decades. This aided the development of data science as a profession. Data science today is like the Wild West: there is endless opportunity and excitement, but also a lot of chaos and confusion. Certain helpful tips are known to only a few.
Clearly, more clarity is needed. But a single report cannot possibly cover all of the worthy topics in machine learning. I am not covering problem formulation or feature engineering, which many people consider to be the most difficult and crucial tasks in applied machine learning. Problem formulation is the process of matching a dataset and a desired output to a well-understood machine learning task. This is often trickier than it sounds. Feature engineering is also extremely important. Having good features can make a big difference in the quality of the machine learning models, even more so than the choice of the model itself. Feature engineering takes knowledge, experience, and ingenuity. We will save that topic for another time.
This report focuses on model evaluation. It is for folks who are starting out with data science and applied machine learning. Some seasoned practitioners may also benefit from the latter half of the report, which focuses on hyperparameter tuning and A/B testing. I certainly learned a lot from writing it, especially about how difficult it is to do A/B testing right. I hope it will help many others build measurably better machine learning models!
This report includes new text and illustrations not found in the original blog posts. In Chapter 1, Orientation, there is a clearer explanation of the landscape of offline versus online evaluations, with new diagrams to illustrate the concepts. In Chapter 2, Evaluation Metrics, there's a revised and clarified discussion of the statistical bootstrap. I added cautionary notes about the difference between training objectives and validation metrics, interpreting metrics when the data is skewed (which always happens in the real world), and nested hyperparameter tuning. Lastly, I added pointers to various software packages that implement some of these procedures. (Soft plugs for GraphLab Create, the library built by Dato, my employer.)
I'm grateful to be given the opportunity to put it all together into a single report. Blogs do not go through the rigorous process of academic peer reviewing. But my coworkers and the community of readers have made many helpful comments along the way. A big thank you to Antoine Atallah for illuminating discussions on A/B testing. Chris DuBois, Brian Kent, and Andrew Bruce provided careful reviews of some of the drafts. Ping Wang and Toby Roseman found bugs in the examples for classification metrics. Joe McCarthy provided many thoughtful comments, and Peter Rudenko shared a number of new papers on hyperparameter tuning. All the awesome infographics are done by Eric Wolfe and Mark Enomoto; all the average-looking ones are done by me.
If you notice any errors or glaring omissions, please let me know: alicez@dato.com. Better an errata than never!
Last but not least, without the cheerful support of Ben Lorica and Shannon Cutt at O'Reilly, this report would not have materialized. Thank you!

 
