ec849adaf4ceb42ed52ca142c839f627c34b9434,slm_lab/agent/algorithm/reinforce.py,Reinforce,calc_advantage,#Reinforce#Any#,158
Before Change
for epi_rewards in raw_rewards:
    rewards = []
    big_r = 0
    for r in epi_rewards[::-1]:
        big_r = r + self.gamma * big_r
        rewards.insert(0, big_r)
    rewards = torch.Tensor(rewards)
    logger.debug3(f"Rewards: {rewards}")
    rewards = (rewards - rewards.mean()) / (rewards.std() + np.finfo(np.float32).eps)
    logger.debug3(f"Normalized rewards: {rewards}")
After Change
big_r = 0
T = len(epi_rewards)
returns = np.empty(T, "float32")
for t in reversed(range(T)):
    big_r = epi_rewards[t] + self.gamma * big_r
    returns[t] = big_r
logger.debug3(f"Rewards: {returns}")
returns = (returns - returns.mean()) / (returns.std() + 1e-08)
returns = torch.from_numpy(returns)
logger.debug3(f"Normalized returns: {returns}")
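For reference, the After Change logic can be sketched as a standalone function. This is a minimal illustration, not the project's code: the function names `discounted_returns` and `normalize`, the `gamma` argument (standing in for `self.gamma`), and the `eps` default are assumptions made here, and the `torch.from_numpy` conversion and logging are left out so the sketch depends only on NumPy.

```python
import numpy as np


def discounted_returns(epi_rewards, gamma):
    """Compute the discounted return at each step of one episode.

    Hypothetical helper: walks the episode backwards and writes into a
    preallocated array, as the After Change snippet does, instead of
    calling list.insert(0, ...) at every step.
    """
    T = len(epi_rewards)
    returns = np.empty(T, dtype=np.float32)
    big_r = 0.0
    for t in reversed(range(T)):
        big_r = epi_rewards[t] + gamma * big_r
        returns[t] = big_r
    return returns


def normalize(returns, eps=1e-8):
    """Shift to zero mean and scale to unit standard deviation.

    eps guards against division by zero when every return is equal.
    """
    return (returns - returns.mean()) / (returns.std() + eps)


# Three steps with reward 1.0 and gamma 0.5 give raw returns [1.75, 1.5, 1.0].
raw = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
normalized = normalize(raw)
```

Filling a preallocated array by index keeps each step constant-time, whereas `rewards.insert(0, big_r)` in the Before Change version shifts the whole list on every iteration.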
In pattern: SUPERPATTERN
Frequency: 4
Non-data size: 5
Instances
Project Name: kengz/SLM-Lab
Commit Name: ec849adaf4ceb42ed52ca142c839f627c34b9434
Time: 2018-05-21
Author: kengzwl@gmail.com
File Name: slm_lab/agent/algorithm/reinforce.py
Class Name: Reinforce
Method Name: calc_advantage
Project Name: biocore/scikit-bio
Commit Name: 791c934318c81fb768275a9abb2f53e919cb9813
Time: 2015-03-26
Author: jai.rideout@gmail.com
File Name: skbio/sequence/tests/test_sequence.py
Class Name: SequenceTests
Method Name: test_reversed
Project Name: pantsbuild/pants
Commit Name: 38d994a74f40e481f10a7dd90fbddeb0196f0b4a
Time: 2014-07-02
Author: john.sirois@gmail.com
File Name: src/python/pants/engine/round_engine.py
Class Name: RoundEngine
Method Name: _prepare
Project Name: GoogleCloudPlatform/PerfKitBenchmarker
Commit Name: abc343bce8266d2867528956ddbe78d6b83d300b
Time: 2015-12-01
Author: hildrum@google.com
File Name: perfkitbenchmarker/linux_benchmarks/iperf_benchmark.py
Class Name:
Method Name: Run