Hands-On Machine Learning 目录
第1章 机器学习概览
第2章 一个完整的机器学习项目
第3章 分类
第4章 训练模型
第5章 支持向量机
第6章 决策树
第7章 集成学习和随机森林
第8章 降维
第9章 安装运行TensorFlow
第10章 人工神经网络简介
第11章 深层神经网络训练
第12章 跨设备和服务器的分布式TensorFlow
第13章 卷积神经网络
第14章 循环神经网络
第15章 自编码器
第16章 强化学习
书籍简介
第1章 机器学习概览
第2章 一个完整的机器学习项目 (作者暂未翻译完)
其他部分暂时准备看英文版
Preface 序言
Part I. The Fundamentals of Machine Learning 机器学习基本原理
1. The Machine Learning Landscape 机器学习概览
- What Is Machine Learning? 什么是机器学习?
- Why Use Machine Learning? 为什么使用机器学习?
- Types of Machine Learning Systems 机器学习系统的类型
- Supervised/Unsupervised Learning 监督/非监督学习
- Batch and Online Learning 批量和在线学习
- Instance-Based Versus Model-Based Learning 基于实例与基于模型学习
- Main Challenges of Machine Learning 机器学习的主要挑战
- Insufficient Quantity of Training Data 训练数据数量不足
- Nonrepresentative Training Data 没有代表性的训练数据
- Poor-Quality Data 劣质数据
- Irrelevant Features 不相关的特征
- Overfitting the Training Data 过拟合训练数据
- Underfitting the Training Data 欠拟合训练数据
- Stepping Back 回顾
- Testing and Validating 测试和验证
- Exercises 练习
2. End-to-End Machine Learning Project 一个完整的机器学习项目
- Working with Real Data 使用真实数据
- Look at the Big Picture 项目概览
- Frame the Problem 划定问题
- Select a Performance Measure 选择性能指标
- Check the Assumptions 核实假设
- Get the Data 获取数据
- Create the Workspace 创建工作空间
- Download the Data 下载数据
- Take a Quick Look at the Data Structure 速览数据结构
- Create a Test Set 创建测试集
- Discover and Visualize the Data to Gain Insights 探索并可视化数据以获得洞见
- Visualizing Geographical Data 可视化地理数据
- Looking for Correlations 寻找关联
- Experimenting with Attribute Combinations 尝试属性组合
- Prepare the Data for Machine Learning Algorithms 为机器学习算法准备数据
- Data Cleaning 数据清洗
- Handling Text and Categorical Attributes 处理文本和分类属性
- Custom Transformers 自定义Transformers
- Feature Scaling 特征缩放
- Transformation Pipelines 转化管道
- Select and Train a Model 选择和训练模型
- Training and Evaluating on the Training Set 在训练集上训练和评估
- Better Evaluation Using Cross-Validation 使用交叉验证做更好的评估
- Fine-Tune Your Model 微调模型
- Grid Search 网格搜索
- Randomized Search 随机搜索
- Ensemble Methods 集成方法
- Analyze the Best Models and Their Errors 分析最佳模型及其误差
- Evaluate Your System on the Test Set 在测试集评估系统
- Launch, Monitor, and Maintain Your System 发布、监控和维护系统
- Try It Out! 试试看!
- Exercises 练习
3. Classification 分类
- MNIST (Mixed National Institute of Standards and Technology database)
- Training a Binary Classifier 训练一个二元分类器
- Performance Measures 性能指标
- Measuring Accuracy Using Cross-Validation 使用交叉验证测量准确率
- Confusion Matrix 混淆矩阵
- Precision and Recall 精确率和召回率
- Precision/Recall Tradeoff 精确率/召回率权衡
- The ROC Curve ROC曲线
- Multiclass Classification 多类别分类
- Error Analysis 错误分析
- Multilabel Classification 多标签分类
- Multioutput Classification 多输出分类
- Exercises 练习
4. Training Models 训练模型
- Linear Regression 线性回归
- The Normal Equation 标准方程
- Computational Complexity 计算复杂度
- Gradient Descent 梯度下降
- Batch Gradient Descent 批量梯度下降
- Stochastic Gradient Descent 随机梯度下降
- Mini-batch Gradient Descent 小批量梯度下降
- Polynomial Regression 多项式回归
- Learning Curves 学习曲线
- Regularized Linear Models 正则化线性模型
- Ridge Regression 岭回归
- Lasso Regression 套索回归
- Elastic Net 弹性网络
- Early Stopping 提前停止
- Logistic Regression 逻辑回归
- Estimating Probabilities 估计概率
- Training and Cost Function 训练和成本函数
- Decision Boundaries 决策边界
- Softmax Regression Softmax回归
- Exercises 练习
5. Support Vector Machines (SVM) 支持向量机
- Linear SVM Classification 线性支持向量机分类
- Soft Margin Classification 软间隔分类
- Nonlinear SVM Classification 非线性支持向量机分类
- Polynomial Kernel 多项式核
- Adding Similarity Features 添加相似度特征
- Gaussian RBF Kernel 高斯径向基核
- Computational Complexity 计算复杂度
- SVM Regression SVM回归
- Under the Hood 底层原理
- Decision Function and Predictions 决策函数和预测
- Training Objective 训练目标
- Quadratic Programming 二次规划
- The Dual Problem 对偶问题
- Kernelized SVM 核化SVM
- Online SVMs 在线SVM
- Exercises 练习
6. Decision Trees 决策树
- Training and Visualizing a Decision Tree
- Making Predictions
- Estimating Class Probabilities
- The CART Training Algorithm
- Computational Complexity
- Gini Impurity or Entropy?
- Regularization Hyperparameters
- Regression
- Instability
- Exercises
7. Ensemble Learning and Random Forests 集成学习和随机森林
- Voting Classifiers
- Bagging and Pasting
- Bagging and Pasting in Scikit-Learn
- Out-of-Bag Evaluation
- Random Patches and Random Subspaces
- Random Forests
- Extra-Trees
- Feature Importance
- Boosting
- AdaBoost
- Gradient Boosting
- Stacking
- Exercises
8. Dimensionality Reduction 降维
- The Curse of Dimensionality
- Main Approaches for Dimensionality Reduction
- Projection
- Manifold Learning
- PCA
- Preserving the Variance
- Principal Components
- Projecting Down to d Dimensions
- Using Scikit-Learn
- Explained Variance Ratio
- Choosing the Right Number of Dimensions
- PCA for Compression
- Incremental PCA
- Randomized PCA
- Kernel PCA
- Selecting a Kernel and Tuning Hyperparameters
- LLE
- Other Dimensionality Reduction Techniques
- Exercises
Part II. Neural Networks and Deep Learning 神经网络和深度学习
9. Up and Running with TensorFlow 安装运行TensorFlow
- Installation
- Creating Your First Graph and Running It in a Session
- Managing Graphs
- Lifecycle of a Node Value
- Linear Regression with TensorFlow
- Implementing Gradient Descent
- Manually Computing the Gradients
- Using autodiff
- Using an Optimizer
- Feeding Data to the Training Algorithm
- Saving and Restoring Models
- Visualizing the Graph and Training Curves Using TensorBoard
- Name Scopes
- Modularity
- Sharing Variables
- Exercises
10. Introduction to Artificial Neural Networks 人工神经网络简介
- From Biological to Artificial Neurons
- Biological Neurons
- Logical Computations with Neurons
- The Perceptron
- Multi-Layer Perceptron and Backpropagation
- Training an MLP with TensorFlow’s High-Level API
- Training a DNN Using Plain TensorFlow
- Construction Phase
- Execution Phase
- Using the Neural Network
- Fine-Tuning Neural Network Hyperparameters
- Number of Hidden Layers
- Number of Neurons per Hidden Layer
- Activation Functions
- Exercises
11. Training Deep Neural Nets 深层神经网络训练
- Vanishing/Exploding Gradients Problems
- Xavier and He Initialization
- Nonsaturating Activation Functions
- Batch Normalization
- Gradient Clipping
- Reusing Pretrained Layers
- Reusing a TensorFlow Model
- Reusing Models from Other Frameworks
- Freezing the Lower Layers
- Caching the Frozen Layers
- Tweaking, Dropping, or Replacing the Upper Layers
- Model Zoos
- Unsupervised Pretraining
- Pretraining on an Auxiliary Task
- Faster Optimizers
- Momentum optimization
- Nesterov Accelerated Gradient
- AdaGrad
- RMSProp
- Adam Optimization
- Learning Rate Scheduling
- Avoiding Overfitting Through Regularization
- Early Stopping
- ℓ1 and ℓ2 Regularization
- Dropout
- Max-Norm Regularization
- Data Augmentation
- Practical Guidelines
- Exercises
12. Distributing TensorFlow Across Devices and Servers 跨设备和服务器的分布式TensorFlow
- Multiple Devices on a Single Machine
- Installation
- Managing the GPU RAM
- Placing Operations on Devices
- Parallel Execution
- Control Dependencies
- Multiple Devices Across Multiple Servers
- Opening a Session
- The Master and Worker Services
- Pinning Operations Across Tasks
- Sharding Variables Across Multiple Parameter Servers
- Sharing State Across Sessions Using Resource Containers
- Asynchronous Communication Using TensorFlow Queues
- Loading Data Directly from the Graph
- Parallelizing Neural Networks on a TensorFlow Cluster
- One Neural Network per Device
- In-Graph Versus Between-Graph Replication
- Model Parallelism
- Data Parallelism
- Exercises
13. Convolutional Neural Networks 卷积神经网络
- The Architecture of the Visual Cortex
- Convolutional Layer
- Filters
- Stacking Multiple Feature Maps
- TensorFlow Implementation
- Memory Requirements
- Pooling Layer
- CNN Architectures
- LeNet-5
- AlexNet
- GoogLeNet
- ResNet
- Exercises
14. Recurrent Neural Networks 循环神经网络
- Recurrent Neurons
- Memory Cells
- Input and Output Sequences
- Basic RNNs in TensorFlow
- Static Unrolling Through Time
- Dynamic Unrolling Through Time
- Handling Variable-Length Input Sequences
- Handling Variable-Length Output Sequences
- Training RNNs
- Training a Sequence Classifier
- Training to Predict Time Series
- Creative RNN
- Deep RNNs
- Distributing a Deep RNN Across Multiple GPUs
- Applying Dropout
- The Difficulty of Training over Many Time Steps
- LSTM Cell
- Peephole Connections
- GRU Cell
- Natural Language Processing
- Word Embeddings
- An Encoder–Decoder Network for Machine Translation
- Exercises
15. Autoencoders 自编码器
- Efficient Data Representations
- Performing PCA with an Undercomplete Linear Autoencoder
- Stacked Autoencoders
- TensorFlow Implementation
- Tying Weights
- Training One Autoencoder at a Time
- Visualizing the Reconstructions
- Visualizing Features
- Unsupervised Pretraining Using Stacked Autoencoders
- Denoising Autoencoders
- TensorFlow Implementation
- Sparse Autoencoders
- TensorFlow Implementation
- Variational Autoencoders
- Generating Digits
- Other Autoencoders
- Exercises
16. Reinforcement Learning 强化学习
- Learning to Optimize Rewards
- Policy Search
- Introduction to OpenAI Gym
- Neural Network Policies
- Evaluating Actions: The Credit Assignment Problem
- Policy Gradients
- Markov Decision Processes
- Temporal Difference Learning and Q-Learning
- Exploration Policies
- Approximate Q-Learning
- Learning to Play Ms. Pac-Man Using Deep Q-Learning
- Exercises
Thank You! 感谢!