Book Details
  • Building Machine Learning Powered Applications (构建机器学习应用)

  • Publisher: Southeast University Press, Nanjing
  • Publication date: 2020-07
  • Popularity: 6727
  • Listed on: 2024-06-30 09:38:03

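About the Book

Editor's Recommendation

Author Emmanuel Ameisen, an experienced data scientist who has led an AI education program, demonstrates practical machine learning concepts through code snippets, illustrations, screenshots, and interviews with industry leaders.

Part I teaches you how to plan a machine learning application and measure its success; Part II explains how to build a working machine learning model; Part III demonstrates ways to improve the model until it fulfills your original vision; Part IV covers deployment and monitoring strategies.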

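This book will help you:

Define your product goal and frame it as a machine learning problem

Quickly build an end-to-end machine learning pipeline and acquire an initial dataset

Train and evaluate your machine learning models and address performance bottlenecks

Deploy and monitor your models in a production environment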

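Description

Learn the skills needed to design, build, and deploy applications powered by machine learning (ML). In this hands-on tutorial, you will build an example ML-driven application, taking it from an initial idea to a deployable product. Data scientists, software engineers, and product managers, whether seasoned experts or newcomers, can learn step by step the tools, best practices, and technical challenges involved in building a real-world ML application.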

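About the Author

Emmanuel Ameisen is a machine learning engineer at Stripe. He previously implemented and deployed predictive analytics and machine learning solutions for Local Motion and Zipcar. Most recently, he has been leading the AI program at Insight Data Science, where he has overseen more than 100 machine learning projects. He holds master's degrees in artificial intelligence, computer engineering, and management from three universities in France.

Table of Contents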

Preface

Part I. Find the Correct ML Approach

1. From Product Goal to ML Framing

Estimate What Is Possible

Models

Data

Framing the ML Editor

Trying to Do It All with ML: An End-to-End Framework

The Simplest Approach: Being the Algorithm

Middle Ground: Learning from Our Experience

Monica Rogati: How to Choose and Prioritize ML Projects

Conclusion

2. Create a Plan

Measuring Success

Business Performance

Model Performance

Freshness and Distribution Shift

Speed

Estimate Scope and Challenges

Leverage Domain Expertise

Stand on the Shoulders of Giants

ML Editor Planning

Initial Plan for an Editor

Always Start with a Simple Model

To Make Regular Progress: Start Simple

Start with a Simple Pipeline

Pipeline for the ML Editor

Conclusion

Part II. Build a Working Pipeline

3. Build Your First End-to-End Pipeline

The Simplest Scaffolding

Prototype of an ML Editor

Parse and Clean Data

Tokenizing Text

Generating Features

Test Your Workflow

User Experience

Modeling Results

ML Editor Prototype Evaluation

Model

User Experience

Conclusion

4. Acquire an Initial Dataset

Iterate on Datasets

Do Data Science

Explore Your First Dataset

Be Efficient, Start Small

Insights Versus Products

A Data Quality Rubric

Label to Find Data Trends

Summary Statistics

Explore and Label Efficiently

Be the Algorithm

Data Trends

Let Data Inform Features and Models

Build Features Out of Patterns

ML Editor Features

Robert Munro: How Do You Find, Label, and Leverage Data?

Conclusion

Part III. Iterate on Models

5. Train and Evaluate Your Model

The Simplest Appropriate Model

Simple Models

From Patterns to Models

Split Your Dataset

ML Editor Data Split

Judge Performance

Evaluate Your Model: Look Beyond Accuracy

Contrast Data and Predictions

Confusion Matrix

ROC Curve

Calibration Curve

Dimensionality Reduction for Errors

The Top-k Method

Other Models

Evaluate Feature Importance

Directly from a Classifier

Black-Box Explainers

Conclusion

6. Debug Your ML Problems

Software Best Practices

ML-Specific Best Practices

Debug Wiring: Visualizing and Testing

Start with One Example

Test Your ML Code

Debug Training: Make Your Model Learn

Task Difficulty

Optimization Problems

Debug Generalization: Make Your Model Useful

Data Leakage

Overfitting

Consider the Task at Hand

Conclusion

7. Using Classifiers for Writing Recommendations

Extracting Recommendations from Models

What Can We Achieve Without a Model?

Extracting Global Feature Importance

Using a Model's Score

Extracting Local Feature Importance

Comparing Models

Version 1: The Report Card

Version 2: More Powerful, More Unclear

Version 3: Understandable Recommendations

Generating Editing Recommendations

Conclusion

Part IV. Deploy and Monitor

8. Considerations When Deploying Models

Data Concerns

Data Ownership

Data Bias

Systemic Bias

Modeling Concerns

Feedback Loops

Inclusive Model Performance

Considering Context

Adversaries

Abuse Concerns and Dual-Use

Chris Harland: Shipping Experiments

Conclusion

9. Choose Your Deployment Option

Server-Side Deployment

Streaming Application or API

Batch Predictions

Client-Side Deployment

On Device

Browser Side

Federated Learning: A Hybrid Approach

Conclusion

10. Build Safeguards for Models

Engineer Around Failures

Input and Output Checks

Model Failure Fallbacks


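Praise for the Book

"Many books on machine learning skip the hardest parts: framing the problem, debugging models, and deploying to customers. This book focuses on exactly those parts, so you can take your project from an idea to an impactful product."

- Alexander Gude (Data Scientist at Intuit)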