00726c8b57ad409363ede0e754c0137d3fcc71cc,chapter04/gamblers_problem.py,,figure_4_3,#,23
Before Change
            action_returns.append(
                HEAD_PROB * state_value[state + action] + (1 - HEAD_PROB) * state_value[state - action])
        new_value = np.max(action_returns)
        delta += np.abs(state_value[state] - new_value)
        # update state value
        state_value[state] = new_value
    if delta < 1e-9:
        break

# compute the optimal policy
policy = np.zeros(GOAL + 1)
for state in STATES[1:GOAL]:
After Change
# value iteration
while True:
    old_state_value = state_value.copy()
    sweeps_history.append(old_state_value)
    for state in STATES[1:GOAL]:
        # get possible actions for current state
        actions = np.arange(min(state, GOAL - state) + 1)
        action_returns = []
        for action in actions:
            action_returns.append(
                HEAD_PROB * state_value[state + action] + (1 - HEAD_PROB) * state_value[state - action])
        new_value = np.max(action_returns)
        state_value[state] = new_value
    delta = abs(state_value - old_state_value).max()
    if delta < 1e-9:
        sweeps_history.append(state_value)
        break
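The after-change version measures convergence over a whole sweep, comparing the updated values against a copy taken at the start of the sweep, and records each sweep in sweeps_history for plotting; the before-change version instead accumulated per-state deltas inside the sweep. The snippet relies on names defined elsewhere in chapter04/gamblers_problem.py (HEAD_PROB, GOAL, STATES, state_value, sweeps_history). A minimal self-contained sketch, assuming the book's Figure 4.3 setup (GOAL = 100, HEAD_PROB = 0.4; these constants are illustrative assumptions, not part of this diff):

import numpy as np

# Assumed setup for the gambler's problem (Figure 4.3);
# constants are illustrative, not taken from this diff.
GOAL = 100
STATES = np.arange(GOAL + 1)
HEAD_PROB = 0.4

# Reaching the goal is worth 1; all other rewards are 0.
state_value = np.zeros(GOAL + 1)
state_value[GOAL] = 1.0
sweeps_history = []

# Value iteration: sweep all non-terminal states until values stop changing.
while True:
    old_state_value = state_value.copy()
    sweeps_history.append(old_state_value)
    for state in STATES[1:GOAL]:
        # A stake can be at most the capital held, or what is needed to reach GOAL.
        actions = np.arange(min(state, GOAL - state) + 1)
        action_returns = [
            HEAD_PROB * state_value[state + action]
            + (1 - HEAD_PROB) * state_value[state - action]
            for action in actions]
        state_value[state] = np.max(action_returns)
    delta = abs(state_value - old_state_value).max()
    if delta < 1e-9:
        sweeps_history.append(state_value)
        break

print('converged after', len(sweeps_history) - 1, 'sweeps')

Because state_value is updated in place during a sweep, later states already see the refreshed values of earlier states (Gauss-Seidel style), which typically speeds convergence.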
In pattern: SUPERPATTERN
Frequency: 3
Non-data size: 4
Instances
Project Name: ShangtongZhang/reinforcement-learning-an-introduction
Commit Name: 00726c8b57ad409363ede0e754c0137d3fcc71cc
Time: 2019-06-12
Author: wlbksy@126.com
File Name: chapter04/gamblers_problem.py
Class Name:
Method Name: figure_4_3
Project Name: nilearn/nilearn
Commit Name: dd7c34ea3480f2ffd8843171676aaa22b1777bd8
Time: 2014-05-28
Author: bertrand.thirion@inria.fr
File Name: nilearn/decomposition/tests/test_canica.py
Class Name:
Method Name: test_canica_square_img
Project Name: studioml/studio
Commit Name: 5064e583f4d372e3a4038df3123efe454d34fd7d
Time: 2018-06-29
Author: karlmutch@users.noreply.github.com
File Name: studio/rabbit_queue.py
Class Name: RMQueue
Method Name: enqueue