e70bdb2d089ae283781c45b8d97963823a984baa,ch10/00_pong_pg.py,,,#,51

Before Change


            grad_vars = 0.0
            grad_count = 0
            for p in net.parameters():
                grad_max = max(grad_max, p.grad.abs().max().data.cpu().numpy()[0])
                grad_means += (p.grad ** 2).mean().sqrt().data.cpu().numpy()[0]
                grad_vars += torch.var(p.grad).data.cpu().numpy()[0]
                grad_count += 1

After Change


            if USE_MEAN_BASELINE:
                batch_scales.append(exp.reward - baseline)
            else:
                batch_scales.append(exp.reward)

            # handle new rewards
            new_rewards = exp_source.pop_total_rewards()
            if new_rewards:
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 4

Instances


Project Name: PacktPublishing/Deep-Reinforcement-Learning-Hands-On
Commit Name: e70bdb2d089ae283781c45b8d97963823a984baa
Time: 2017-12-15
Author: max.lapan@gmail.com
File Name: ch10/00_pong_pg.py
Class Name:
Method Name:


Project Name: lcswillems/torch-rl
Commit Name: 87232aa39c159f04a501ad268012da45a1ff537c
Time: 2018-05-14
Author: lcswillems@gmail.com
File Name: torch_rl/torch_rl/algos/base.py
Class Name: BaseAlgo
Method Name: collect_experiences


Project Name: Pinafore/qb
Commit Name: e604607e0a26bd5ca244b60dba8769779f2f07a4
Time: 2018-04-19
Author: sjtufs@gmail.com
File Name: qanta/guesser/dan.py
Class Name: DanGuesser
Method Name: _guess_batch