Category: Cybernetics: Artificial Intelligence
Edition: draft
Authors: Shalev-Shwartz S., Ben-David S.
Series:
ISBN: 9781107057135
Publisher: CUP
Year of publication: 2014
Number of pages: 416
Language: English
File format: PDF (can be converted to PDF, EPUB, or AZW3 upon request)
File size: 3 MB
Keywords for Understanding Machine Learning: From Theory to Algorithms: Computer Science and Engineering, Artificial Intelligence, Data Mining
If you would like the file for Understanding Machine Learning: From Theory to Algorithms converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify support and the file will be converted for you.
Note that Understanding Machine Learning: From Theory to Algorithms is the original-language edition, not a Persian translation. The International Library website offers books in their original language only and does not provide books translated into or written in Persian.
Cover Halftitle Title Copyright Dedication Contents Preface
1 Introduction: 1.1 What Is Learning? 1.2 When Do We Need Machine Learning? 1.3 Types of Learning 1.4 Relations to Other Fields 1.5 How to Read This Book 1.5.1 Possible Course Plans Based on This Book 1.6 Notation
Part 1 Foundations
2 A Gentle Start: 2.1 A Formal Model – The Statistical Learning Framework 2.2 Empirical Risk Minimization 2.2.1 Something May Go Wrong – Overfitting 2.3 Empirical Risk Minimization with Inductive Bias 2.3.1 Finite Hypothesis Classes 2.4 Exercises
3 A Formal Learning Model: 3.1 PAC Learning 3.2 A More General Learning Model 3.2.1 Releasing the Realizability Assumption – Agnostic PAC Learning 3.2.2 The Scope of Learning Problems Modeled 3.3 Summary 3.4 Bibliographic Remarks 3.5 Exercises
4 Learning via Uniform Convergence: 4.1 Uniform Convergence Is Sufficient for Learnability 4.2 Finite Classes Are Agnostic PAC Learnable 4.3 Summary 4.4 Bibliographic Remarks 4.5 Exercises
5 The Bias-Complexity Tradeoff: 5.1 The No-Free-Lunch Theorem 5.1.1 No-Free-Lunch and Prior Knowledge 5.2 Error Decomposition 5.3 Summary 5.4 Bibliographic Remarks 5.5 Exercises
6 The VC-Dimension: 6.1 Infinite-Size Classes Can Be Learnable 6.2 The VC-Dimension 6.3 Examples 6.3.1 Threshold Functions 6.3.2 Intervals 6.3.3 Axis Aligned Rectangles 6.3.4 Finite Classes 6.3.5 VC-Dimension and the Number of Parameters 6.4 The Fundamental Theorem of PAC Learning 6.5 Proof of Theorem 6.7 6.5.1 Sauer's Lemma and the Growth Function 6.5.2 Uniform Convergence for Classes of Small Effective Size 6.6 Summary 6.7 Bibliographic Remarks 6.8 Exercises
7 Nonuniform Learnability: 7.1 Nonuniform Learnability 7.1.1 Characterizing Nonuniform Learnability 7.2 Structural Risk Minimization 7.3 Minimum Description Length and Occam's Razor 7.3.1 Occam's Razor 7.4 Other Notions of Learnability – Consistency 7.5 Discussing the Different Notions of Learnability 7.5.1 The No-Free-Lunch Theorem Revisited 7.6 Summary 7.7 Bibliographic Remarks 7.8 Exercises
8 The Runtime of Learning: 8.1 Computational Complexity of Learning 8.1.1 Formal Definition* 8.2 Implementing the ERM Rule 8.2.1 Finite Classes 8.2.2 Axis Aligned Rectangles 8.2.3 Boolean Conjunctions 8.2.4 Learning 3-Term DNF 8.3 Efficiently Learnable, but Not by a Proper ERM 8.4 Hardness of Learning* 8.5 Summary 8.6 Bibliographic Remarks 8.7 Exercises
Part 2 From Theory to Algorithms
9 Linear Predictors: 9.1 Halfspaces 9.1.1 Linear Programming for the Class of Halfspaces 9.1.2 Perceptron for Halfspaces 9.1.3 The VC Dimension of Halfspaces 9.2 Linear Regression 9.2.1 Least Squares 9.2.2 Linear Regression for Polynomial Regression Tasks 9.3 Logistic Regression 9.4 Summary 9.5 Bibliographic Remarks 9.6 Exercises
10 Boosting: 10.1 Weak Learnability 10.1.1 Efficient Implementation of ERM for Decision Stumps 10.2 AdaBoost 10.3 Linear Combinations of Base Hypotheses 10.3.1 The VC-Dimension of L(B,T) 10.4 AdaBoost for Face Recognition 10.5 Summary 10.6 Bibliographic Remarks 10.7 Exercises
11 Model Selection and Validation: 11.1 Model Selection Using SRM 11.2 Validation 11.2.1 Hold Out Set 11.2.2 Validation for Model Selection 11.2.3 The Model-Selection Curve 11.2.4 k-Fold Cross Validation 11.2.5 Train-Validation-Test Split 11.3 What to Do If Learning Fails 11.4 Summary 11.5 Exercises
12 Convex Learning Problems: 12.1 Convexity, Lipschitzness, and Smoothness 12.1.1 Convexity 12.1.2 Lipschitzness 12.1.3 Smoothness 12.2 Convex Learning Problems 12.2.1 Learnability of Convex Learning Problems 12.2.2 Convex-Lipschitz/Smooth-Bounded Learning Problems 12.3 Surrogate Loss Functions 12.4 Summary 12.5 Bibliographic Remarks 12.6 Exercises
13 Regularization and Stability: 13.1 Regularized Loss Minimization 13.1.1 Ridge Regression 13.2 Stable Rules Do Not Overfit 13.3 Tikhonov Regularization as a Stabilizer 13.3.1 Lipschitz Loss 13.3.2 Smooth and Nonnegative Loss 13.4 Controlling the Fitting-Stability Tradeoff 13.5 Summary 13.6 Bibliographic Remarks 13.7 Exercises
14 Stochastic Gradient Descent: 14.1 Gradient Descent 14.1.1 Analysis of GD for Convex-Lipschitz Functions 14.2 Subgradients 14.2.1 Calculating Subgradients 14.2.2 Subgradients of Lipschitz Functions 14.2.3 Subgradient Descent 14.3 Stochastic Gradient Descent (SGD) 14.3.1 Analysis of SGD for Convex-Lipschitz-Bounded Functions 14.4 Variants 14.4.1 Adding a Projection Step 14.4.2 Variable Step Size 14.4.3 Other Averaging Techniques 14.4.4 Strongly Convex Functions* 14.5 Learning with SGD 14.5.1 SGD for Risk Minimization 14.5.2 Analyzing SGD for Convex-Smooth Learning Problems 14.5.3 SGD for Regularized Loss Minimization 14.6 Summary 14.7 Bibliographic Remarks 14.8 Exercises
15 Support Vector Machines: 15.1 Margin and Hard-SVM 15.1.1 The Homogenous Case 15.1.2 The Sample Complexity of Hard-SVM 15.2 Soft-SVM and Norm Regularization 15.2.1 The Sample Complexity of Soft-SVM 15.2.2 Margin and Norm-Based Bounds versus Dimension 15.2.3 The Ramp Loss* 15.3 Optimality Conditions and "Support Vectors"* 15.4 Duality* 15.5 Implementing Soft-SVM Using SGD 15.6 Summary 15.7 Bibliographic Remarks 15.8 Exercises
16 Kernel Methods: 16.1 Embeddings into Feature Spaces 16.2 The Kernel Trick 16.2.1 Kernels as a Way to Express Prior Knowledge 16.2.2 Characterizing Kernel Functions* 16.3 Implementing Soft-SVM with Kernels 16.4 Summary 16.5 Bibliographic Remarks 16.6 Exercises
17 Multiclass, Ranking, and Complex Prediction Problems: 17.1 One-versus-All and All-Pairs 17.2 Linear Multiclass Predictors 17.2.1 How to Construct Ψ 17.2.2 Cost-Sensitive Classification 17.2.3 ERM 17.2.4 Generalized Hinge Loss 17.2.5 Multiclass SVM and SGD 17.3 Structured Output Prediction 17.4 Ranking 17.4.1 Linear Predictors for Ranking 17.5 Bipartite Ranking and Multivariate Performance Measures 17.5.1 Linear Predictors for Bipartite Ranking 17.6 Summary 17.7 Bibliographic Remarks 17.8 Exercises
18 Decision Trees: 18.1 Sample Complexity 18.2 Decision Tree Algorithms 18.2.1 Implementations of the Gain Measure 18.2.2 Pruning 18.2.3 Threshold-Based Splitting Rules for Real-Valued Features 18.3 Random Forests 18.4 Summary 18.5 Bibliographic Remarks 18.6 Exercises
19 Nearest Neighbor: 19.1 k Nearest Neighbors 19.2 Analysis 19.2.1 A Generalization Bound for the 1-NN Rule 19.2.2 The "Curse of Dimensionality" 19.3 Efficient Implementation* 19.4 Summary 19.5 Bibliographic Remarks 19.6 Exercises
20 Neural Networks: 20.1 Feedforward Neural Networks 20.2 Learning Neural Networks 20.3 The Expressive Power of Neural Networks 20.3.1 Geometric Intuition 20.4 The Sample Complexity of Neural Networks 20.5 The Runtime of Learning Neural Networks 20.6 SGD and Backpropagation 20.7 Summary 20.8 Bibliographic Remarks 20.9 Exercises
Part 3 Additional Learning Models
21 Online Learning: 21.1 Online Classification in the Realizable Case 21.1.1 Online Learnability 21.2 Online Classification in the Unrealizable Case 21.2.1 Weighted-Majority 21.3 Online Convex Optimization 21.4 The Online Perceptron Algorithm 21.5 Summary 21.6 Bibliographic Remarks 21.7 Exercises
22 Clustering: 22.1 Linkage-Based Clustering Algorithms 22.2 k-Means and Other Cost Minimization Clusterings 22.2.1 The k-Means Algorithm 22.3 Spectral Clustering 22.3.1 Graph Cut 22.3.2 Graph Laplacian and Relaxed Graph Cuts 22.3.3 Unnormalized Spectral Clustering 22.4 Information Bottleneck* 22.5 A High Level View of Clustering 22.6 Summary 22.7 Bibliographic Remarks 22.8 Exercises
23 Dimensionality Reduction: 23.1 Principal Component Analysis (PCA) 23.1.1 A More Efficient Solution for the Case d ≫ m 23.1.2 Implementation and Demonstration 23.2 Random Projections 23.3 Compressed Sensing 23.3.1 Proofs* 23.4 PCA or Compressed Sensing? 23.5 Summary 23.6 Bibliographic Remarks 23.7 Exercises
24 Generative Models: 24.1 Maximum Likelihood Estimator 24.1.1 Maximum Likelihood Estimation for Continuous Random Variables 24.1.2 Maximum Likelihood and Empirical Risk Minimization 24.1.3 Generalization Analysis 24.2 Naive Bayes 24.3 Linear Discriminant Analysis 24.4 Latent Variables and the EM Algorithm 24.4.1 EM as an Alternate Maximization Algorithm 24.4.2 EM for Mixture of Gaussians (Soft k-Means) 24.5 Bayesian Reasoning 24.6 Summary 24.7 Bibliographic Remarks 24.8 Exercises
25 Feature Selection and Generation: 25.1 Feature Selection 25.1.1 Filters 25.1.2 Greedy Selection Approaches 25.1.3 Sparsity-Inducing Norms 25.2 Feature Manipulation and Normalization 25.2.1 Examples of Feature Transformations 25.3 Feature Learning 25.3.1 Dictionary Learning Using Auto-Encoders 25.4 Summary 25.5 Bibliographic Remarks 25.6 Exercises
Part 4 Advanced Theory
26 Rademacher Complexities: 26.1 The Rademacher Complexity 26.1.1 Rademacher Calculus 26.2 Rademacher Complexity of Linear Classes 26.3 Generalization Bounds for SVM 26.4 Generalization Bounds for Predictors with Low ℓ1 Norm 26.5 Bibliographic Remarks
27 Covering Numbers: 27.1 Covering 27.1.1 Properties 27.2 From Covering to Rademacher Complexity via Chaining 27.3 Bibliographic Remarks
28 Proof of the Fundamental Theorem of Learning Theory: 28.1 The Upper Bound for the Agnostic Case 28.2 The Lower Bound for the Agnostic Case 28.2.1 Showing That m(ε,δ) ≥ 0.5 log(1/(4δ))/ε² 28.2.2 Showing That m(ε,1/8) ≥ 8d/ε² 28.3 The Upper Bound for the Realizable Case 28.3.1 From ε-Nets to PAC Learnability
29 Multiclass Learnability: 29.1 The Natarajan Dimension 29.2 The Multiclass Fundamental Theorem 29.2.1 On the Proof of Theorem 29.3 29.3 Calculating the Natarajan Dimension 29.3.1 One-vs.-All Based Classes 29.3.2 General Multiclass-to-Binary Reductions 29.3.3 Linear Multiclass Predictors 29.4 On Good and Bad ERMs 29.5 Bibliographic Remarks 29.6 Exercises
30 Compression Bounds: 30.1 Compression Bounds 30.2 Examples 30.2.1 Axis Aligned Rectangles 30.2.2 Halfspaces 30.2.3 Separating Polynomials 30.2.4 Separation with Margin 30.3 Bibliographic Remarks
31 PAC-Bayes: 31.1 PAC-Bayes Bounds 31.2 Bibliographic Remarks 31.3 Exercises
A Technical Lemmas
B Measure Concentration: B.1 Markov's Inequality B.2 Chebyshev's Inequality B.3 Chernoff's Bounds B.4 Hoeffding's Inequality B.5 Bennet's and Bernstein's Inequalities B.5.1 Application B.6 Slud's Inequality B.7 Concentration of χ² Variables
C Linear Algebra: C.1 Basic Definitions C.2 Eigenvalues and Eigenvectors C.3 Positive Definite Matrices C.4 Singular Value Decomposition (SVD)
References
Index