e70bdb2d089ae283781c45b8d97963823a984baa,ch10/00_pong_pg.py,,,#,51
Before Change
grad_vars = 0.0
grad_count = 0
for p in net.parameters():
    grad_max = max(grad_max, p.grad.abs().max().data.cpu().numpy()[0])
    grad_means += (p.grad ** 2).mean().sqrt().data.cpu().numpy()[0]
    grad_vars += torch.var(p.grad).data.cpu().numpy()[0]
    grad_count += 1
After Change
if USE_MEAN_BASELINE:
    batch_scales.append(exp.reward - baseline)
else:
    batch_scales.append(exp.reward)
# handle new rewards
new_rewards = exp_source.pop_total_rewards()
if new_rewards:
In pattern: SUPERPATTERN
Frequency: 3
Non-data size: 4
Instances
Project Name: PacktPublishing/Deep-Reinforcement-Learning-Hands-On
Commit Name: e70bdb2d089ae283781c45b8d97963823a984baa
Time: 2017-12-15
Author: max.lapan@gmail.com
File Name: ch10/00_pong_pg.py
Class Name:
Method Name:
Project Name: lcswillems/torch-rl
Commit Name: 87232aa39c159f04a501ad268012da45a1ff537c
Time: 2018-05-14
Author: lcswillems@gmail.com
File Name: torch_rl/torch_rl/algos/base.py
Class Name: BaseAlgo
Method Name: collect_experiences
Project Name: Pinafore/qb
Commit Name: e604607e0a26bd5ca244b60dba8769779f2f07a4
Time: 2018-04-19
Author: sjtufs@gmail.com
File Name: qanta/guesser/dan.py
Class Name: DanGuesser
Method Name: _guess_batch