Last updated: 2021-06-18 19:12:48
Cover
Copyright information
Packt Upsell
Why subscribe?
PacktPub.com
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Introduction to Reinforcement Learning
What is RL?
RL algorithm
How RL differs from other ML paradigms
Elements of RL
Agent
Policy function
Value function
Model
Agent-environment interface
Types of RL environment
Deterministic environment
Stochastic environment
Fully observable environment
Partially observable environment
Discrete environment
Continuous environment
Episodic and non-episodic environment
Single and multi-agent environment
RL platforms
OpenAI Gym and Universe
DeepMind Lab
RL-Glue
Project Malmo
ViZDoom
Applications of RL
Education
Medicine and healthcare
Manufacturing
Inventory management
Finance
Natural Language Processing and Computer Vision
Summary
Questions
Further reading
Getting Started with OpenAI and TensorFlow
Setting up your machine
Installing Anaconda
Installing Docker
Installing OpenAI Gym and Universe
Common error fixes
OpenAI Gym
Basic simulations
Training a robot to walk
OpenAI Universe
Building a video game bot
TensorFlow
Variables, constants, and placeholders
Variables
Constants
Placeholders
Computation graph
Sessions
TensorBoard
Adding scope
The Markov Decision Process and Dynamic Programming
The Markov chain and Markov process
Markov Decision Process
Rewards and returns
Episodic and continuous tasks
Discount factor
The policy function
State value function
State-action value function (Q function)
The Bellman equation and optimality
Deriving the Bellman equation for value and Q functions
Solving the Bellman equation
Dynamic programming
Value iteration
Policy iteration
Solving the frozen lake problem
Gaming with Monte Carlo Methods
Monte Carlo methods
Estimating the value of pi using Monte Carlo