f34e4fbad1b40627dfdc92c6eaf56969cba77c06,test/data/test_builtin_datasets.py,TestDataset,test_wikitext2,#TestDataset#,17

Before Change


    def test_wikitext2(self):
        # smoke test to ensure wikitext2 works properly
        ds = WikiText2
        TEXT = data.Field(lower=True, batch_first=True)
        train, valid, test = ds.splits(TEXT)
        TEXT.build_vocab(train)
        train_iter, valid_iter, test_iter = data.BPTTIterator.splits(
            (train, valid, test), batch_size=3, bptt_len=30)

        train_iter, valid_iter, test_iter = ds.iters(batch_size=4,

After Change


        self.assertEqual(len(valid_dataset), 214417)

        vocab = train_dataset.get_vocab()
        tokens_ids = [vocab[token] for token in "the player characters rest".split()]
        self.assertEqual(tokens_ids, [2, 286, 503, 700])

        # Delete the dataset after we're done to save disk space on CI
In pattern: SUPERPATTERN

Frequency: 4

Non-data size: 4

Instances


Project Name: pytorch/text
Commit Name: f34e4fbad1b40627dfdc92c6eaf56969cba77c06
Time: 2019-11-25
Author: 6156351+zhangguanheng66@users.noreply.github.com
File Name: test/data/test_builtin_datasets.py
Class Name: TestDataset
Method Name: test_wikitext2


Project Name: pytorch/examples
Commit Name: 9108041562e1b7a4fb159a8c0afe0caf54fe2a6d
Time: 2017-02-06
Author: bryan.mccann.is@gmail.com
File Name: snli/train.py
Class Name:
Method Name:


Project Name: pytorch/text
Commit Name: f34e4fbad1b40627dfdc92c6eaf56969cba77c06
Time: 2019-11-25
Author: 6156351+zhangguanheng66@users.noreply.github.com
File Name: test/data/test_builtin_datasets.py
Class Name: TestDataset
Method Name: test_penntreebank