0e52a77af80ef1aefb4958564d67ffbcdd24cc84,torchnlp/encoders/text/spacy_encoder.py,SpacyEncoder,batch_encode,#SpacyEncoder#Any#,79

Before Change



    def batch_encode(self, sequences):
        return_ = []
        for tokens in self.spacy.pipe(sequences, n_threads=-1):
            text = [token.text for token in tokens]
            vector = [self.stoi.get(token, self.unknown_index) for token in text]
            if self.append_eos:
                vector.append(self.eos_index)
            return_.append(torch.tensor(vector))
        return return_

After Change


        # Batch tokenization is handled by ``self.spacy.pipe``
        original = self.tokenize
        self.tokenize = lambda sequence: [token.text for token in sequence]
        return_ = super().batch_encode(self.spacy.pipe(sequences))
        self.tokenize = original
        return return_
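
The change above replaces a hand-written encoding loop with a call to the parent class's `batch_encode`, temporarily swapping `self.tokenize` so that documents already tokenized by the pipe are not tokenized a second time. The following is a minimal sketch of that pattern under stated assumptions, not the project's actual code: `BaseEncoder`, `PipeEncoder`, and `_pipe` are hypothetical stand-ins (a whitespace splitter plays the role of `self.spacy.pipe`), and plain lists are used instead of tensors so the example runs without spaCy or PyTorch. It also wraps the swap in `try`/`finally`, which the original commit does not do, so the tokenizer is restored even if encoding raises.

```python
class BaseEncoder:
    """Hypothetical stand-in for the parent text encoder."""

    def __init__(self, vocab, unknown_index=0):
        self.stoi = {token: index for index, token in enumerate(vocab)}
        self.unknown_index = unknown_index

    def tokenize(self, sequence):
        # Default tokenizer: split a raw string on whitespace.
        return sequence.split()

    def encode(self, sequence):
        tokens = self.tokenize(sequence)
        return [self.stoi.get(token, self.unknown_index) for token in tokens]

    def batch_encode(self, sequences):
        # Eagerly encode every item, so the tokenizer in effect
        # at call time is the one that gets used.
        return [self.encode(sequence) for sequence in sequences]


class PipeEncoder(BaseEncoder):
    """Hypothetical subclass whose batch path receives pre-tokenized input."""

    def _pipe(self, sequences):
        # Stand-in for a batch tokenizer such as ``self.spacy.pipe``:
        # yields one already-tokenized document per input string.
        for sequence in sequences:
            yield sequence.split()

    def batch_encode(self, sequences):
        original = self.tokenize
        # Documents from the pipe are already tokenized; pass them through.
        self.tokenize = lambda document: list(document)
        try:
            return super().batch_encode(self._pipe(sequences))
        finally:
            # Restore the tokenizer even if encoding fails.
            self.tokenize = original


encoder = PipeEncoder(["<unk>", "hello", "world"])
encoded = encoder.batch_encode(["hello world", "hello there"])
```

One caveat with this design: the swap is only safe because the parent's `batch_encode` consumes its input before returning. If the parent returned a lazy generator, the original tokenizer would be restored before any item was encoded.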
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 5

Instances


Project Name: PetrochukM/PyTorch-NLP
Commit Name: 0e52a77af80ef1aefb4958564d67ffbcdd24cc84
Time: 2019-04-09
Author: petrochukm@gmail.com
File Name: torchnlp/encoders/text/spacy_encoder.py
Class Name: SpacyEncoder
Method Name: batch_encode


Project Name: explosion/spaCy
Commit Name: a2745b0e84f15867758fca2867500fba9784623c
Time: 2017-11-14
Author: ligser@gmail.com
File Name: spacy/tests/regression/test_issue1506.py
Class Name:
Method Name: test_issue1506


Project Name: explosion/spaCy
Commit Name: 46628d88903edaa2c3614339a0d464b9fcdcc690
Time: 2020-02-12
Author: sofie.vanlandeghem@gmail.com
File Name: spacy/tests/regression/test_issue4903.py
Class Name:
Method Name: test_issue4903