
To get in touch with us, you can call or text (SMS) the mobile numbers below:

09117307688
09117179751

If your call goes unanswered, contact the support team via SMS.

Unlimited access

For registered users

Money-back guarantee

If the description does not match the book

Support

Available from 7 a.m. to 10 p.m.

Download the book Distributional Reinforcement Learning

Book Details

Distributional Reinforcement Learning

Edition:
Authors: Marc G. Bellemare, Will Dabney, Mark Rowland
Series: Adaptive Computation and Machine Learning
ISBN: 9780262374019, 9780262048019
Publisher: MIT Press
Publication year: 2023
Number of pages: 379
Language: English
File format: EPUB (can be converted to PDF, EPUB, or AZW3 on request)
File size: 13 MB

Book price (Toman): 46,000




Average rating for this book:
Number of ratings: 10


If you would like the file of the book Distributional Reinforcement Learning converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify the support team and they will convert the file for you.

Please note that the book Distributional Reinforcement Learning is the original-language edition, not a Persian translation. The International Library website offers original-language books only and does not provide any books translated into or written in Persian.


About the book Distributional Reinforcement Learning

The first comprehensive guide to distributional reinforcement learning, providing a new mathematical formalism for thinking about decisions from a probabilistic perspective. Distributional reinforcement learning is a new mathematical formalism for thinking about decisions. Going beyond the common approach to reinforcement learning and expected values, it focuses on the total reward or return obtained as a consequence of an agent's choices—specifically, how this return behaves from a probabilistic perspective. In this first comprehensive guide to distributional reinforcement learning, Marc G. Bellemare, Will Dabney, and Mark Rowland, who spearheaded development of the field, present its key concepts and review some of its many applications. They demonstrate its power to account for many complex, interesting phenomena that arise from interactions with one's environment. The authors present core ideas from classical reinforcement learning to contextualize distributional topics and include mathematical proofs pertaining to major results discussed in the text. They guide the reader through a series of algorithmic and mathematical developments that, in turn, characterize, compute, estimate, and make decisions on the basis of the random return. Practitioners in disciplines as diverse as finance (risk management), computational neuroscience, computational psychiatry, psychology, macroeconomics, and robotics are already using distributional reinforcement learning, paving the way for its expanding applications in mathematical finance, engineering, and the life sciences. More than a mathematical approach, distributional reinforcement learning represents a new perspective on how intelligent agents make predictions and decisions.
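To make the idea described above concrete, the short sketch below (not taken from the book) approximates the return distribution with a fixed categorical support and pushes it through one distributional Bellman backup followed by a projection step, in the spirit of the categorical temporal-difference methods listed in the table of contents. All names and parameter values (num_atoms, v_min, v_max, the use of NumPy) are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of the distributional idea: instead of tracking only the
# expected return Q(s, a), track a categorical approximation of the full
# return distribution and back it up through a distributional Bellman update.
# Parameter names and values are illustrative assumptions.
import numpy as np

num_atoms = 51                      # number of atoms in the categorical support
v_min, v_max = -10.0, 10.0          # assumed bounds on the return
support = np.linspace(v_min, v_max, num_atoms)
delta_z = (v_max - v_min) / (num_atoms - 1)

def categorical_backup(next_probs, reward, gamma):
    """One distributional Bellman backup followed by projection onto the
    fixed support (a C51-style projection step).

    next_probs: probabilities over `support` for the next state's return.
    Returns the projected target distribution over the same support.
    """
    target = np.zeros(num_atoms)
    # Shift and shrink the support through the Bellman relation Z = R + gamma * Z'.
    tz = np.clip(reward + gamma * support, v_min, v_max)
    # Distribute each shifted atom's probability onto its two nearest atoms.
    b = (tz - v_min) / delta_z
    lower, upper = np.floor(b).astype(int), np.ceil(b).astype(int)
    for j in range(num_atoms):
        if lower[j] == upper[j]:                      # lands exactly on an atom
            target[lower[j]] += next_probs[j]
        else:
            target[lower[j]] += next_probs[j] * (upper[j] - b[j])
            target[upper[j]] += next_probs[j] * (b[j] - lower[j])
    return target

# Example: back up a uniform next-state return distribution through one step.
uniform = np.full(num_atoms, 1.0 / num_atoms)
projected = categorical_backup(uniform, reward=1.0, gamma=0.99)
print(projected.sum())  # still a valid probability distribution (sums to 1)
```

Iterating this backup, or using its output as a learning target, is what separates the distributional approach from tracking only the expected return.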



Table of Contents

1 Introduction
1.1 Why Distributional Reinforcement Learning?
1.2 An Example: Kuhn Poker
1.3 How Is Distributional Reinforcement Learning Different?
1.4 Intended Audience and Organization
1.5 Bibliographical Remarks
2 The Distribution of Returns
2.1 Random Variables and Their Probability Distributions
2.2 Markov Decision Processes
2.3 The Pinball Model
2.4 The Return
2.5 The Bellman Equation
2.6 Properties of the Random Trajectory
2.7 The Random-Variable Bellman Equation
2.8 From Random Variables to Probability Distributions
2.9 Alternative Notions of the Return Distribution*
2.10 Technical Remarks
2.11 Bibliographical Remarks
2.12 Exercises
3 Learning the Return Distribution
3.1 The Monte Carlo Method
3.2 Incremental Learning
3.3 Temporal-Difference Learning
3.4 From Values to Probabilities
3.5 The Projection Step
3.6 Categorical Temporal-Difference Learning
3.7 Learning to Control
3.8 Further Considerations
3.9 Technical Remarks
3.10 Bibliographical Remarks
3.11 Exercises
4 Operators and Metrics
4.1 The Bellman Operator
4.2 Contraction Mappings
4.3 The Distributional Bellman Operator
4.4 Wasserstein Distances for Return Functions
4.5 ℓp Probability Metrics and the Cramér Distance
4.6 Sufficient Conditions for Contractivity
4.7 A Matter of Domain
4.8 Weak Convergence of Return Functions*
4.9 Random-Variable Bellman Operators*
4.10 Technical Remarks
4.11 Bibliographical Remarks
4.12 Exercises
5 Distributional Dynamic Programming
5.1 Computational Model
5.2 Representing Return-Distribution Functions
5.3 The Empirical Representation
5.4 The Normal Representation
5.5 Fixed-Size Empirical Representations
5.6 The Projection Step
5.7 Distributional Dynamic Programming
5.8 Error Due to Diffusion
5.9 Convergence of Distributional Dynamic Programming
5.10 Quality of the Distributional Approximation
5.11 Designing Distributional Dynamic Programming Algorithms
5.12 Technical Remarks
5.13 Bibliographical Remarks
5.14 Exercises
6 Incremental Algorithms
6.1 Computation and Statistical Estimation
6.2 From Operators to Incremental Algorithms
6.3 Categorical Temporal-Difference Learning
6.4 Quantile Temporal-Difference Learning
6.5 An Algorithmic Template for Theoretical Analysis
6.6 The Right Step Sizes
6.7 Overview of Convergence Analysis
6.8 Convergence of Incremental Algorithms*
6.9 Convergence of Temporal-Difference Learning*
6.10 Convergence of Categorical Temporal-Difference Learning*
6.11 Technical Remarks
6.12 Bibliographical Remarks
6.13 Exercises
7 Control
7.1 Risk-Neutral Control
7.2 Value Iteration and Q-Learning
7.3 Distributional Value Iteration
7.4 Dynamics of Distributional Optimality Operators
7.5 Dynamics in the Presence of Multiple Optimal Policies*
7.6 Risk and Risk-Sensitive Control
7.7 Challenges in Risk-Sensitive Control
7.8 Conditional Value-at-Risk*
7.9 Technical Remarks
7.10 Bibliographical Remarks
7.11 Exercises
8 Statistical Functionals
8.1 Statistical Functionals
8.2 Moments
8.3 Bellman Closedness
8.4 Statistical Functional Dynamic Programming
8.5 Relationship to Distributional Dynamic Programming
8.6 Expectile Dynamic Programming
8.7 Infinite Collections of Statistical Functionals
8.8 Moment Temporal-Difference Learning*
8.9 Technical Remarks
8.10 Bibliographical Remarks
8.11 Exercises
9 Linear Function Approximation
9.1 Function Approximation and Aliasing
9.2 Optimal Linear Value Function Approximations
9.3 A Projected Bellman Operator for Linear Value Function Approximation
9.4 Semi-Gradient Temporal-Difference Learning
9.5 Semi-Gradient Algorithms for Distributional Reinforcement Learning
9.6 An Algorithm Based on Signed Distributions*
9.7 Convergence of the Signed Algorithm*
9.8 Technical Remarks
9.9 Bibliographical Remarks
9.10 Exercises
10 Deep Reinforcement Learning
10.1 Learning with a Deep Neural Network
10.2 Distributional Reinforcement Learning with Deep Neural Networks
10.3 Implicit Parameterizations
10.4 Evaluation of Deep Reinforcement Learning Agents
10.5 How Predictions Shape State Representations
10.6 Technical Remarks
10.7 Bibliographical Remarks
10.8 Exercises
11 Two Applications and a Conclusion
11.1 Multiagent Reinforcement Learning
11.2 Computational Neuroscience
11.3 Conclusion
11.4 Bibliographical Remarks
Notation
References
Index




User Comments