Edition:
Authors: Marc G. Bellemare, Will Dabney, Mark Rowland
Series: Adaptive Computation and Machine Learning
ISBN: 9780262374019, 9780262048019
Publisher: MIT Press
Publication year: 2023
Number of pages: 379
Language: English
File format: EPUB (converted to PDF, EPUB, or AZW3 on user request)
File size: 13 MB
To have the book Distributional Reinforcement Learning converted to PDF, EPUB, AZW3, MOBI, or DJVU, you can notify support and they will convert the file for you.
Please note that Distributional Reinforcement Learning is the original-language (English) edition, not a Persian translation. The International Library website provides original-language books only and does not offer books translated into or written in Persian.
The first comprehensive guide to distributional reinforcement learning, providing a new mathematical formalism for thinking about decisions from a probabilistic perspective. Distributional reinforcement learning is a new mathematical formalism for thinking about decisions. Going beyond the common approach to reinforcement learning and expected values, it focuses on the total reward or return obtained as a consequence of an agent's choices—specifically, how this return behaves from a probabilistic perspective. In this first comprehensive guide to distributional reinforcement learning, Marc G. Bellemare, Will Dabney, and Mark Rowland, who spearheaded development of the field, present its key concepts and review some of its many applications. They demonstrate its power to account for many complex, interesting phenomena that arise from interactions with one's environment. The authors present core ideas from classical reinforcement learning to contextualize distributional topics and include mathematical proofs pertaining to major results discussed in the text. They guide the reader through a series of algorithmic and mathematical developments that, in turn, characterize, compute, estimate, and make decisions on the basis of the random return. Practitioners in disciplines as diverse as finance (risk management), computational neuroscience, computational psychiatry, psychology, macroeconomics, and robotics are already using distributional reinforcement learning, paving the way for its expanding applications in mathematical finance, engineering, and the life sciences. More than a mathematical approach, distributional reinforcement learning represents a new perspective on how intelligent agents make predictions and decisions.
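As a quick illustration of the object described above, the random return satisfies a distributional analogue of the Bellman equation. The sketch below uses common notation chosen here for illustration (it is not quoted from the book): $G^\pi$ for the random return under policy $\pi$, $\gamma$ for the discount factor, and $\overset{\mathcal{D}}{=}$ for equality in distribution.

$$
G^\pi(x) \;\overset{\mathcal{D}}{=}\; R + \gamma\, G^\pi(X'),
\qquad A \sim \pi(\cdot \mid x),\quad (R, X') \sim p(\cdot,\cdot \mid x, A),
$$

where $p$ denotes the environment's joint reward-and-transition dynamics (illustrative notation). Classical reinforcement learning summarizes this random quantity by its expectation, the value function $V^\pi(x) = \mathbb{E}[G^\pi(x)]$; the distributional approach instead characterizes, computes, and estimates its full probability distribution, which is what enables risk-sensitive and other probabilistic analyses.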
1 Introduction
1.1 Why Distributional Reinforcement Learning?
1.2 An Example: Kuhn Poker
1.3 How Is Distributional Reinforcement Learning Different?
1.4 Intended Audience and Organization
1.5 Bibliographical Remarks
2 The Distribution of Returns
2.1 Random Variables and Their Probability Distributions
2.2 Markov Decision Processes
2.3 The Pinball Model
2.4 The Return
2.5 The Bellman Equation
2.6 Properties of the Random Trajectory
2.7 The Random-Variable Bellman Equation
2.8 From Random Variables to Probability Distributions
2.9 Alternative Notions of the Return Distribution*
2.10 Technical Remarks
2.11 Bibliographical Remarks
2.12 Exercises
3 Learning the Return Distribution
3.1 The Monte Carlo Method
3.2 Incremental Learning
3.3 Temporal-Difference Learning
3.4 From Values to Probabilities
3.5 The Projection Step
3.6 Categorical Temporal-Difference Learning
3.7 Learning to Control
3.8 Further Considerations
3.9 Technical Remarks
3.10 Bibliographical Remarks
3.11 Exercises
4 Operators and Metrics
4.1 The Bellman Operator
4.2 Contraction Mappings
4.3 The Distributional Bellman Operator
4.4 Wasserstein Distances for Return Functions
4.5 ℓp Probability Metrics and the Cramér Distance
4.6 Sufficient Conditions for Contractivity
4.7 A Matter of Domain
4.8 Weak Convergence of Return Functions*
4.9 Random-Variable Bellman Operators*
4.10 Technical Remarks
4.11 Bibliographical Remarks
4.12 Exercises
5 Distributional Dynamic Programming
5.1 Computational Model
5.2 Representing Return-Distribution Functions
5.3 The Empirical Representation
5.4 The Normal Representation
5.5 Fixed-Size Empirical Representations
5.6 The Projection Step
5.7 Distributional Dynamic Programming
5.8 Error Due to Diffusion
5.9 Convergence of Distributional Dynamic Programming
5.10 Quality of the Distributional Approximation
5.11 Designing Distributional Dynamic Programming Algorithms
5.12 Technical Remarks
5.13 Bibliographical Remarks
5.14 Exercises
6 Incremental Algorithms
6.1 Computation and Statistical Estimation
6.2 From Operators to Incremental Algorithms
6.3 Categorical Temporal-Difference Learning
6.4 Quantile Temporal-Difference Learning
6.5 An Algorithmic Template for Theoretical Analysis
6.6 The Right Step Sizes
6.7 Overview of Convergence Analysis
6.8 Convergence of Incremental Algorithms*
6.9 Convergence of Temporal-Difference Learning*
6.10 Convergence of Categorical Temporal-Difference Learning*
6.11 Technical Remarks
6.12 Bibliographical Remarks
6.13 Exercises
7 Control
7.1 Risk-Neutral Control
7.2 Value Iteration and Q-Learning
7.3 Distributional Value Iteration
7.4 Dynamics of Distributional Optimality Operators
7.5 Dynamics in the Presence of Multiple Optimal Policies*
7.6 Risk and Risk-Sensitive Control
7.7 Challenges in Risk-Sensitive Control
7.8 Conditional Value-at-Risk*
7.9 Technical Remarks
7.10 Bibliographical Remarks
7.11 Exercises
8 Statistical Functionals
8.1 Statistical Functionals
8.2 Moments
8.3 Bellman Closedness
8.4 Statistical Functional Dynamic Programming
8.5 Relationship to Distributional Dynamic Programming
8.6 Expectile Dynamic Programming
8.7 Infinite Collections of Statistical Functionals
8.8 Moment Temporal-Difference Learning*
8.9 Technical Remarks
8.10 Bibliographical Remarks
8.11 Exercises
9 Linear Function Approximation
9.1 Function Approximation and Aliasing
9.2 Optimal Linear Value Function Approximations
9.3 A Projected Bellman Operator for Linear Value Function Approximation
9.4 Semi-Gradient Temporal-Difference Learning
9.5 Semi-Gradient Algorithms for Distributional Reinforcement Learning
9.6 An Algorithm Based on Signed Distributions*
9.7 Convergence of the Signed Algorithm*
9.8 Technical Remarks
9.9 Bibliographical Remarks
9.10 Exercises
10 Deep Reinforcement Learning
10.1 Learning with a Deep Neural Network
10.2 Distributional Reinforcement Learning with Deep Neural Networks
10.3 Implicit Parameterizations
10.4 Evaluation of Deep Reinforcement Learning Agents
10.5 How Predictions Shape State Representations
10.6 Technical Remarks
10.7 Bibliographical Remarks
10.8 Exercises
11 Two Applications and a Conclusion
11.1 Multiagent Reinforcement Learning
11.2 Computational Neuroscience
11.3 Conclusion
11.4 Bibliographical Remarks
Notation
References
Index