5bb63d23a7722d060f7df345a9d2b51dc98c035a,ch09/02_cartpole_reinforce.py,,,#,34

Before Change


        step_rewards = step_rewards[-1000:]

        baseline = np.mean(step_rewards)
        writer.add_scalar("baseline", baseline, step_idx)
        batch_states.append(exp.state)
        batch_actions.append(int(exp.action))
        batch_scales.append(exp.reward - baseline)
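
The Before Change lines scale each step by its reward minus a baseline, where the baseline is the mean of the most recent 1000 step rewards. A minimal standalone sketch of that idea follows. The function name `baseline_scales`, the `window` parameter, and the assumption that the current reward is already in the window when the mean is taken are illustrative and not taken from the commit.

```python
from collections import deque


def baseline_scales(rewards, window=1000):
    """Return reward - baseline for each step (hypothetical helper).

    The baseline is the mean of the last `window` rewards seen so far,
    including the current one (an assumption; the hunk does not show
    where the reward is appended to step_rewards).
    """
    recent = deque(maxlen=window)  # keeps only the newest `window` rewards
    scales = []
    for r in rewards:
        recent.append(r)
        baseline = sum(recent) / len(recent)
        scales.append(r - baseline)
    return scales
```

Using a bounded `deque` gives the same effect as trimming the list with `step_rewards[-1000:]` on every step, without re-slicing.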

After Change



    batch_episodes = 0
    batch_states, batch_actions, batch_qvals = [], [], []
    cur_states, cur_actions, cur_rewards = [], [], []

    for step_idx, exp in enumerate(exp_source):
        cur_states.append(exp.state)
        cur_actions.append(int(exp.action))
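
The After Change lines accumulate per-episode lists (`cur_states`, `cur_actions`, `cur_rewards`) and a batch-level `batch_qvals`, which suggests that step scales are now computed from whole-episode returns rather than from a moving baseline. The hunk does not show how `batch_qvals` is filled, so the following is only a sketch under that assumption. The name `discounted_returns` and the default `gamma` are hypothetical.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return the discounted return from every step of one episode.

    Hypothetical helper: walks the episode's rewards backwards so each
    return reuses the one computed for the following step.
    """
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()  # restore chronological order
    return returns
```

Under this assumption, the result for a finished episode would be appended to `batch_qvals` (for example with `batch_qvals.extend(discounted_returns(cur_rewards))`) before the `cur_*` lists are cleared for the next episode.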
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: PacktPublishing/Deep-Reinforcement-Learning-Hands-On
Commit Name: 5bb63d23a7722d060f7df345a9d2b51dc98c035a
Time: 2017-12-03
Author: max.lapan@gmail.com
File Name: ch09/02_cartpole_reinforce.py
Class Name:
Method Name:


Project Name: PacktPublishing/Deep-Reinforcement-Learning-Hands-On
Commit Name: 8acf099847ebf73ad8cdae1341d0f768dbe1c094
Time: 2017-12-04
Author: max.lapan@gmail.com
File Name: ch09/04_pong_pg.py
Class Name:
Method Name:


Project Name: lufficc/SSD
Commit Name: 94a995defe223eed0898f25d2332ba6178a92abe
Time: 2018-12-19
Author: luffy.lcc@gmail.com
File Name: ssd/engine/trainer.py
Class Name:
Method Name: do_train