Hands-On Reinforcement Learning with Python

Sudharsan Ravichandiran

Last updated: 2021-06-18 19:12:48

Latest chapter: Leave a review - let other readers know what you think

Cover

Copyright information

Packt Upsell

Why subscribe?

PacktPub.com

Contributors

About the author

About the reviewers

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Introduction to Reinforcement Learning

What is RL?

RL algorithm

How RL differs from other ML paradigms

Elements of RL

Agent

Policy function

Value function

Model

Agent environment interface

Types of RL environment

Deterministic environment

Stochastic environment

Fully observable environment

Partially observable environment

Discrete environment

Continuous environment

Episodic and non-episodic environment

Single and multi-agent environment

RL platforms

OpenAI Gym and Universe

DeepMind Lab

RL-Glue

Project Malmo

ViZDoom

Applications of RL

Education

Medicine and healthcare

Manufacturing

Inventory management

Finance

Natural Language Processing and Computer Vision

Summary

Questions

Further reading

Getting Started with OpenAI and TensorFlow

Setting up your machine

Installing Anaconda

Installing Docker

Installing OpenAI Gym and Universe

Common error fixes

OpenAI Gym

Basic simulations

Training a robot to walk

OpenAI Universe

Building a video game bot

TensorFlow

Variables, constants, and placeholders

Variables

Constants

Placeholders

Computation graph

Sessions

TensorBoard

Adding scope

Summary

Questions

Further reading

The Markov Decision Process and Dynamic Programming

The Markov chain and Markov process

Markov Decision Process

Rewards and returns

Episodic and continuous tasks

Discount factor

The policy function

State value function

State-action value function (Q function)

The Bellman equation and optimality

Deriving the Bellman equation for value and Q functions

Solving the Bellman equation

Dynamic programming

Value iteration

Policy iteration

Solving the frozen lake problem

Value iteration

Policy iteration

Summary

Questions

Further reading

Gaming with Monte Carlo Methods

Monte Carlo methods

Estimating the value of pi using Monte Carlo