ffcfc369a1fafea7e22a2cff6dfa1c58646544c6,agent.py,Agent,learn,#Agent#Any#,49

Before Change


    loss = -torch.sum(m * log_ps_a, 1)  # Cross-entropy loss (minimises DKL(m||p(s_t, a_t)))
    loss = weights * loss  # Importance weight losses before prioritised experience replay (done after for original/non-distributional version)
    self.online_net.zero_grad()
    loss.mean().backward()  # Backpropagate minibatch loss
    self.optimiser.step()
    nn.utils.clip_grad_norm_(self.online_net.parameters(), self.norm_clip)  # Clip gradients by L2 norm

After Change



    loss = -torch.sum(m * log_ps_a, 1)  # Cross-entropy loss (minimises DKL(m||p(s_t, a_t)))
    self.online_net.zero_grad()
    (weights * loss).mean().backward()  # Backpropagate importance-weighted minibatch loss
    self.optimiser.step()

    mem.update_priorities(idxs, loss.detach())  # Update priorities of sampled transitions
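
The After Change snippet keeps the per-transition loss unweighted so that it can be passed to `update_priorities`, and applies the importance weights only to the scalar that is backpropagated. A minimal pure-Python sketch of that separation follows. It assumes a hypothetical minibatch of two transitions over three atoms; the helper names and the numbers are illustrative and do not come from the commit.

```python
import math


def per_sample_cross_entropy(m, log_ps_a):
    """Per-transition cross-entropy: -sum over atoms of m_i * log p_i."""
    return [
        -sum(m_i * log_p_i for m_i, log_p_i in zip(m_row, log_p_row))
        for m_row, log_p_row in zip(m, log_ps_a)
    ]


def weighted_mean(weights, losses):
    """Scalar objective: mean of (importance weight * per-transition loss)."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Hypothetical data (assumption): projected target distributions m and
# log-probabilities of the chosen actions, one row per sampled transition.
m = [[0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]
log_ps_a = [
    [math.log(0.25), math.log(0.5), math.log(0.25)],
    [math.log(0.5), math.log(0.25), math.log(0.25)],
]
weights = [1.0, 0.5]  # importance-sampling weights (assumed values)

loss = per_sample_cross_entropy(m, log_ps_a)  # unweighted, one value per transition
objective = weighted_mean(weights, loss)      # weighted, used only for the gradient step
new_priorities = list(loss)                   # priorities come from the unweighted loss
```

Here `objective` stands in for the value on which `backward()` is called, and `new_priorities` stands in for the second argument of `mem.update_priorities`.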
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: Kaixhin/Rainbow
Commit Name: ffcfc369a1fafea7e22a2cff6dfa1c58646544c6
Time: 2018-08-11
Author: design@kaixhin.com
File Name: agent.py
Class Name: Agent
Method Name: learn


Project Name: mariogeiger/se3cnn
Commit Name: e3883e98057396d886f90328ea32e0f0a01f2535
Time: 2018-08-01
Author: geiger.mario@gmail.com
File Name: examples/tetris.py
Class Name:
Method Name: train


Project Name: Kaixhin/Rainbow
Commit Name: d6538df32693a9f3cabd13c852abbcc1e7cfe349
Time: 2018-06-18
Author: design@kaixhin.com
File Name: agent.py
Class Name: Agent
Method Name: learn