Probabilistic Machine Learning (《概率机器学习》), by Zhu Jun (朱军)
  • Publisher (as listed): 书香神州图书专营店
  • Listed on: 2024-06-30 09:38:03
About the Book

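Editorial Reviews
Organized around probabilistic modeling and inference, this book systematically presents the basic principles, representative models, and algorithms of machine learning. It covers classical machine learning models and algorithms and learning theory, as well as frontier topics such as deep neural networks, probabilistic graphical models, deep generative models, and reinforcement learning. It is accessible, logically organized, and practical.
* The book simplifies complex mathematical proofs and derivations and pairs them with many representative examples and diagrams. Theory and applications are interleaved, and the basic principles, algorithms, and applications of machine learning are explained with illustrations, moving from simple to advanced. Readers need only a foundation in college-level mathematics.
* The book draws on the author's twenty years of machine learning research and more than ten years of teaching the course "Statistical Machine Learning", which makes it particularly suitable as an introductory machine learning textbook.
* It is intended for senior undergraduates, graduate students, and teachers at science and engineering universities, as well as researchers and engineers working in machine learning.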
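Content Description
With advances in deep learning, large-scale pre-trained models, and generative AI, machine learning has become the method of choice for many engineering and scientific problems. Probabilistic Machine Learning systematically introduces the basic concepts, classical algorithms, and recent advances of machine learning from the perspective of probabilistic modeling and statistical inference. The main topics include the foundations of probabilistic machine learning, learning theory, probabilistic graphical models, approximate probabilistic inference, Gaussian processes, deep generative models, and reinforcement learning. The book starts from examples, progresses from simple to advanced, combines intuition with rigor, and provides further reading and an extensive bibliography.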
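Author Biography
Zhu Jun (朱军) is the Bosch AI Professor in the Department of Computer Science and Technology at Tsinghua University, an IEEE Fellow, and deputy director of the Tsinghua University Institute for Artificial Intelligence. He was previously an adjunct professor at Carnegie Mellon University. His research focuses on machine learning. He serves as an associate editor of IEEE TPAMI and has served as (senior) area chair for ICML, NeurIPS, ICLR, and other conferences more than twenty times. His awards include the Qiushi Outstanding Young Scholar Award of the China Association for Science and Technology, the XPLORER Prize (科学探索奖), the China Computer Federation (CCF) Natural Science First Prize, the Wu Wenjun AI Natural Science First Prize, and an ICLR Outstanding Paper Award. He has been selected as a leading talent of the Ten Thousand Talents Program, a CCF Young Scientist, and an MIT TR35 China honoree.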
Table of Contents
Foundations
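Chapter 1 Introduction
1.1 Machine Learning
1.1.1 What Is Machine Learning
1.1.2 Basic Tasks of Machine Learning
1.1.3 K-Nearest Neighbors: A "Lazy" Learning Method
1.2 Probabilistic Machine Learning
1.2.1 Why Probabilistic Machine Learning Is Needed
1.2.2 What Probabilistic Machine Learning Covers
1.3 Further Reading
1.4 Exercises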
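Chapter 2 Fundamentals of Probability and Statistics
2.1 Probability
2.1.1 Event Spaces and Probability
2.1.2 Continuous and Discrete Random Variables
2.1.3 Change of Variables
2.1.4 Joint, Marginal, and Conditional Distributions
2.1.5 Independence and Conditional Independence
2.1.6 Bayes' Rule
2.2 Common Probability Distributions and Their Numerical Characteristics
2.2.1 Common Numerical Characteristics of Random Variables
2.2.2 Probability Distributions of Discrete Variables
2.2.3 Probability Distributions of Continuous Variables
2.3 Statistical Inference
2.3.1 Maximum Likelihood Estimation
2.3.2 Error
2.4 Bayesian Inference
2.4.1 Basic Procedure
2.4.2 Common Applications and Methods
2.4.3 Online Bayesian Inference
2.4.4 Conjugate Priors
2.5 Foundations of Information Theory
2.5.1 Entropy
2.5.2 Mutual Information
2.5.3 Relative Entropy
2.6 Exercises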
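Chapter 3 Linear Regression Models
3.1 Basic Model
3.1.1 Basic Model of Statistical Decision-Making
3.1.2 Linear Regression and Least Squares
3.1.3 Probabilistic Model and Maximum Likelihood Estimation
3.1.4 Linear Regression with Basis Functions
3.2 Regularized Linear Regression
3.2.1 Ridge Regression
3.2.2 Lasso
3.2.3 Linear Regression with Lp-Norm Regularization
3.3 Bayesian Linear Regression
3.3.1 Maximum a Posteriori Estimation
3.3.2 Bayesian Predictive Distribution
3.3.3 Bayesian Model Selection
3.3.4 Empirical Bayes and the Relevance Vector Machine
3.4 Model Evaluation
3.4.1 Evaluation Metrics
3.4.2 Cross-Validation
3.5 Further Reading
3.6 Exercises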
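Chapter 4 Naive Bayes Classifiers
4.1 Basic Classification Models
4.1.1 Bayes Classifier
4.1.2 Kernel Density Estimation
4.1.3 Curse of Dimensionality
4.2 Naive Bayes Model
4.2.1 Generative Models
4.2.2 The Naive Bayes Assumption
4.2.3 Maximum Likelihood Estimation
4.2.4 Maximum a Posteriori Estimation
4.3 Extensions of Naive Bayes
4.3.1 Multi-Valued Features
4.3.2 Multiclass Classification
4.3.3 Continuous Features
4.3.4 Semi-Supervised Naive Bayes Classifier
4.3.5 Tree-Augmented Naive Bayes Classifier
4.4 Analysis of Naive Bayes
4.4.1 Decision Boundary
4.4.2 Predicted Probabilities
4.5 Further Reading
4.6 Exercises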
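Chapter 5 Logistic Regression and Generalized Linear Models
5.1 Logistic Regression
5.1.1 Model Definition
5.1.2 Latent-Variable Representation of Logistic Regression
5.1.3 Maximum Conditional Likelihood Estimation
5.1.4 Regularization Methods
5.1.5 Discriminative versus Generative Models
5.2 Stochastic Gradient Descent
5.2.1 Basic Method
5.2.2 Momentum
5.2.3 AdaGrad
5.2.4 RMSProp
5.2.5 Adam
5.3 Bayesian Logistic Regression
5.3.1 Laplace Approximation
5.3.2 Predictive Distribution
5.4 Generalized Linear Models
5.4.1 Exponential Family Distributions
5.4.2 Properties of Exponential Family Distributions
5.4.3 Generalized Linear Models
5.5 Further Reading
5.6 Exercises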
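Chapter 6 Deep Neural Networks
6.1 Basic Principles of Neural Networks
6.1.1 A Basic Framework for Nonlinear Learning
6.1.2 Perceptron
6.1.3 Multilayer Perceptron
6.1.4 Backpropagation
6.2 Convolutional Neural Networks
6.2.1 Basic Components
6.2.2 Batch Normalization
6.2.3 Residual Networks
6.3 Recurrent Neural Networks
6.3.1 Basic Principles
6.3.2 Long Short-Term Memory Networks
6.4 Further Reading
6.5 Exercises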
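Chapter 7 Support Vector Machines and Kernel Methods
7.1 Hard-Margin Support Vector Machines
7.1.1 Decision Boundary
7.1.2 Support Vector Machines for Linearly Separable Data
7.1.3 Dual Problem of the Hard-Margin SVM
7.2 Soft-Margin Support Vector Machines
7.2.1 Soft Constraints and Loss Functions
7.2.2 Dual Problem of the Soft-Margin SVM
7.2.3 Support Vector Regression
7.3 Kernel Methods
7.3.1 Basic Properties of Kernel Functions
7.3.2 Representer Theorem
7.3.3 Common Kernel Functions
7.3.4 Kernels Induced by Probabilistic Generative Models
7.3.5 Neural Tangent Kernel
7.4 Multiclass Support Vector Machines
7.4.1 One-vs-Rest
7.4.2 One-vs-One
7.4.3 Joint Optimization
7.5 Probabilistic Interpretation of Support Vector Machines
7.5.1 Platt Scaling
7.5.2 Maximum Entropy Discrimination Learning
7.6 Further Reading
7.7 Exercises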
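Chapter 8 Clustering
8.1 The Clustering Problem
8.1.1 Task Description
8.1.2 Distance Metrics
8.2 K-Means Algorithm
8.2.1 Optimization Objective
8.2.2 Introduction to the K-Means Algorithm
8.2.3 Initialization and Stopping Criteria
8.2.4 Model Selection in K-Means
8.3 Gaussian Mixture Models
8.3.1 Latent Variable Models
8.3.2 Mixture Distribution Models
8.3.3 Mixture Distribution Models and Clustering
8.4 EM Algorithm
8.4.1 EM Algorithm for Gaussian Mixture Models
8.4.2 Convergence of the EM Algorithm
8.4.3 Connection between EM and K-Means
8.5 Evaluation Metrics
8.5.1 External Evaluation Metrics
8.5.2 Internal Evaluation Metrics
8.6 Further Reading
8.7 Exercises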
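Chapter 9 Dimensionality Reduction
9.1 The Dimensionality Reduction Problem
9.2 Principal Component Analysis
9.2.1 Basic Principles
9.2.2 High-Dimensional PCA
9.3 Principles of Principal Component Analysis
9.3.1 Variance Maximization
9.3.2 Reconstruction Error Minimization
9.3.3 Probabilistic Principal Component Analysis
9.4 Autoencoders
9.4.1 Basic Autoencoder Model
9.4.2 Sparse Autoencoders
9.4.3 Denoising Autoencoders
9.5 Locally Linear Embedding
9.5.1 Basic Procedure of Locally Linear Embedding
9.5.2 Optimal Local Linear Reconstruction
9.5.3 Embeddings That Preserve the Locally Optimal Reconstruction
9.5.4 Parameter Selection
9.6 Word Embeddings
9.6.1 Latent Semantic Analysis
9.6.2 Neural Language Models
9.7 Further Reading
9.8 Exercises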
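Chapter 10 Ensemble Learning
10.1 Decision Trees
10.1.1 ID3 Algorithm
10.1.2 C4.5 Algorithm
10.1.3 CART Algorithm
10.2 Bagging
10.2.1 Basic Method
10.2.2 Random Forests
10.3 Boosting
10.3.1 AdaBoost Algorithm
10.3.2 AdaBoost from an Optimization Perspective
10.3.3 Gradient Boosting
10.3.4 Gradient-Boosted Decision Trees
10.3.5 XGBoost Algorithm
10.4 Probabilistic Ensemble Learning
10.4.1 Mixtures of Linear Models
10.4.2 Hierarchical Mixtures of Experts
10.5 Ensembles of Deep Models
10.5.1 Dropout: A Model Ensembling Strategy
10.5.2 Deep Ensembles
10.6 Further Reading
10.7 Exercises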
Chapter 11 Learning Theory
11.1 Basic Concepts
11.1.1 Bias-Complexity Decomposition
11.1.2 Structural Risk Minimization
11.1.3 PAC Theory
11.1.4 Basic Inequalities
11.2 Finite Hypothesis Spaces
11.2.1 Hoeffding's Inequality
11.2.2 Union Bound
11.3 Infinite Hypothesis Spaces
11.3.1 VC Dimension
11.3.2 Rademacher Complexity
11.3.3 Margin Theory
11.3.4 PAC-Bayes
11.4 Deep Learning Theory
11.4.1 Double Descent
11.4.2 Benign Overfitting
11.4.3 Implicit Regularization
11.5 Further Reading
11.6 Exercises
Advanced Topics
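Chapter 12 Probabilistic Graphical Models
12.1 Overview
12.2 Representation of Probabilistic Graphical Models
12.2.1 Bayesian Networks
12.2.2 Markov Random Fields
12.2.3 Relationship between Directed and Undirected Graphs
12.3 Inference in Probabilistic Graphical Models
12.3.1 Variable Elimination
12.3.2 Message Passing
12.3.3 Factor Graphs
12.3.4 Most Probable Assignment
12.3.5 Junction Trees
12.4 Parameter Learning
12.4.1 Parameter Learning for Bayesian Networks
12.4.2 Parameter Learning for Markov Random Fields
12.4.3 Conditional Random Fields
12.5 Structure Learning
12.5.1 Tree-Structured Bayesian Networks
12.5.2 Gaussian Markov Random Fields
12.6 Further Reading
12.7 Exercises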
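Chapter 13 Variational Inference
13.1 Basic Principles
13.1.1 Basic Principles of Variational Methods
13.1.2 Inference Tasks
13.2 Variational Inference
13.2.1 Variational Lower Bound on the Log-Likelihood
13.2.2 Mean-Field Method
13.2.3 Belief Propagation
13.3 Variational EM
13.3.1 From EM to Variational EM
13.3.2 Variational EM Algorithm for the Exponential Family
13.3.3 Probabilistic Latent Semantic Analysis
13.3.4 Stochastic EM Algorithm
13.4 Variational Bayes
13.4.1 Variational Formulation of Bayes' Theorem
13.4.2 Bayesian Gaussian Mixture Models
13.5 Expectation Propagation
13.5.1 Basic EP Algorithm
13.5.2 EP Algorithm for Graphical Models
13.6 Further Reading
13.7 Exercises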
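Chapter 14 Monte Carlo Methods
14.1 Overview
14.2 Basic Sampling Algorithms
14.2.1 Sampling via Reparameterization
14.2.2 Rejection Sampling
14.2.3 Importance Sampling
14.2.4 Importance Resampling
14.2.5 Ancestral Sampling (原始采样)
14.3 Markov Chain Monte Carlo
14.3.1 Markov Chains
14.3.2 Metropolis-Hastings Sampling
14.3.3 Gibbs Sampling
14.3.4 Variants of Gibbs Sampling
14.4 Auxiliary-Variable Sampling
14.4.1 Slice Sampling
14.4.2 Auxiliary-Variable Sampling
14.5 MCMC Sampling Based on Dynamical Systems
14.5.1 Dynamical Systems
14.5.2 Discretization of Hamilton's Equations
14.5.3 Hamiltonian Monte Carlo
14.5.4 Stochastic Gradient MCMC Sampling
14.6 Further Reading
14.7 Exercises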
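Chapter 15 Gaussian Processes
15.1 Bayesian Neural Networks
15.1.1 Bayesian Linear Regression
15.1.2 Bayesian Neural Networks
15.1.3 Infinitely Wide Bayesian Neural Networks
15.2 Gaussian Process Regression
15.2.1 Definition
15.2.2 Prediction in the Noise-Free Case
15.2.3 Prediction with Noise
15.2.4 Residual Modeling
15.2.5 Covariance Functions
15.3 Gaussian Process Classification
15.3.1 Basic Model
15.3.2 Inference with the Laplace Approximation
15.3.3 Inference with Expectation Propagation
15.3.4 Relationship to Support Vector Machines
15.4 Sparse Gaussian Processes
15.4.1 Sparse Approximation Based on Inducing Points
15.4.2 Sparse Variational Gaussian Processes
15.5 Further Reading
15.6 Exercises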
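Chapter 16 Deep Generative Models
16.1 Basic Framework
16.1.1 Basic Concepts of Generative Models
16.1.2 Modeling Based on Hierarchical Bayes
16.1.3 Modeling Based on Deep Neural Networks
16.2 Flow Models
16.2.1 Affine Coupling Flows
16.2.2 Residual Flows
16.2.3 Dequantization
16.3 Autoregressive Generative Models
16.3.1 Neural Autoregressive Density Estimator
16.3.2 Continuous-Valued Neural Autoregressive Density Estimator
16.4 Variational Autoencoders
16.4.1 Model Definition
16.4.2 Parameter Estimation via Reparameterization
16.5 Generative Adversarial Networks
16.5.1 Basic Model
16.5.2 Wasserstein Generative Adversarial Networks
16.6 Diffusion Probabilistic Models
16.6.1 Model Definition
16.6.2 Model Training
16.6.3 Parameter Sharing
16.7 Further Reading
16.8 Exercises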
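Chapter 17 Reinforcement Learning
17.1 Decision-Making Tasks
17.2 Multi-Armed Bandits
17.2.1 Bernoulli Multi-Armed Bandits
17.2.2 Upper Confidence Bound Algorithm
17.2.3 Thompson Sampling Algorithm
17.2.4 Contextual Multi-Armed Bandits
17.3 Markov Decision Processes
17.3.1 Basic Definitions
17.3.2 Bellman Equations
17.3.3 Optimal Value Functions and Optimal Policies
17.3.4 Policy Evaluation
17.3.5 Policy Iteration Algorithm
17.3.6 Value Iteration Algorithm
17.4 Reinforcement Learning
17.4.1 Monte Carlo Sampling Methods
17.4.2 Temporal-Difference Learning
17.4.3 Sarsa Algorithm
17.4.4 Q-Learning
17.4.5 Value Function Approximation
17.4.6 Policy Search
17.5 Further Reading
17.6 Exercises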
References