2c5f22199b46837b9801378ce172dcd13f125769,gluonnlp/vocab/subwords.py,NGramHashes,_word_to_hashes,#NGramHashes#Any#,211

Before Change


            if word not in self.special_tokens:
                hashes = nd.array([
                    self.fasttext_hash_asbytes((u"<" + word + u">")[i:i + N]) % self.num_subwords
                    for N in self.ngrams for i in range((len(word) + 2) - N + 1)
                ])
            else:
                hashes = nd.zeros(shape=0)

After Change



    def _word_to_hashes(self, word):
        if word not in self.special_tokens:
            word_enc = bytearray((u"<" + word + u">").encode("utf-8"))
            hashes = _fasttext_ngram_hashes(
                memoryview(word_enc), ns=self._ngrams,
                bucket_size=self.num_subwords)
        else:
            hashes = []
        return hashes
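
For readers who want to see what the hashing step computes, here is a minimal pure-Python sketch. It is not the `_fasttext_ngram_hashes` helper from the commit, whose body is not shown above. It enumerates character n-grams the same way as the "Before Change" comprehension and assumes the hash is the 32-bit FNV-1a variant used by fastText, in which each byte is sign-extended before the XOR. The names `fasttext_hash` and `word_to_hashes` and their parameters are hypothetical.

```python
FNV_OFFSET_BASIS = 2166136261
FNV_PRIME = 16777619


def fasttext_hash(data: bytes) -> int:
    """32-bit FNV-1a over bytes, with int8 sign extension of each byte (assumed)."""
    h = FNV_OFFSET_BASIS
    for b in data:
        if b >= 0x80:
            # Mimic casting the byte to int8 and then to uint32.
            b |= 0xFFFFFF00
        h = ((h ^ b) * FNV_PRIME) & 0xFFFFFFFF
    return h


def word_to_hashes(word, ngrams, num_subwords, special_tokens=()):
    """Return bucketed hashes of all character n-grams of '<word>'."""
    if word in special_tokens:
        return []
    wrapped = u"<" + word + u">"
    return [
        fasttext_hash(wrapped[i:i + n].encode("utf-8")) % num_subwords
        for n in ngrams
        for i in range(len(wrapped) - n + 1)
    ]
```

Calling `word_to_hashes(u"ab", ngrams=[3], num_subwords=1000)` hashes the two trigrams `<ab` and `ab>`. The reference fastText implementation also skips some boundary n-grams, so the output of this sketch is not guaranteed to match the library's.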
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: dmlc/gluon-nlp
Commit Name: 2c5f22199b46837b9801378ce172dcd13f125769
Time: 2018-07-17
Author: leonard@lausen.nl
File Name: gluonnlp/vocab/subwords.py
Class Name: NGramHashes
Method Name: _word_to_hashes


Project Name: ekzhu/datasketch
Commit Name: cd91a294f32206728436890be3e697b6c1325841
Time: 2015-04-08
Author: erkangzhu@gmail.com
File Name: datasketch/minhash.py
Class Name: MinHash
Method Name: __getstate__


Project Name: dask/distributed
Commit Name: c67705f3f513de5bc09b897c400011b543ff0f7c
Time: 2020-07-17
Author: jakirkham@gmail.com
File Name: distributed/protocol/utils.py
Class Name:
Method Name: merge_frames