19a2292b8b21da8fd83e5f8129debb4b9ab4c14f,gensim/corpora/dictionary.py,Dictionary,from_corpus,#Any#Any#,340

Before Change


                result.dfs[wordid] = result.dfs.get(wordid, 0) + 1

        # now make sure length(result) == get_max_id(corpus) + 1
        if (id2word is None): id2word = list(map(str, xrange(max_id + 1)))
        for i in xrange(max_id + 1):
            result.token2id[id2word[i]] = i
            result.dfs[i] = result.dfs.get(i, 0)

After Change



        if id2word is None:
            # make sure length(result) == get_max_id(corpus) + 1
            result.token2id = dict((unicode(i), i) for i in xrange(max_id + 1))
        else:
            # id=>word mapping given: simply copy it
            result.token2id = dict((utils.to_unicode(token), id) for id, token in iteritems(id2word))
        for id in itervalues(result.token2id):
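
The change above can be sketched as a standalone Python 3 example. This is a minimal illustration, not gensim's implementation. The function names `build_token2id` and `fill_missing_dfs` are hypothetical. Built-in `range`, `str`, `.items()` and `.values()` stand in for `xrange`, `unicode`, `iteritems` and `itervalues`, and plain `str` is a simplification of `utils.to_unicode`. The body of the final loop is cut off in the snippet, so the default-to-zero step here is an assumption borrowed from the "Before Change" code.

```python
def build_token2id(max_id, id2word=None):
    """Return a token -> id mapping, mirroring the if/else branch above."""
    if id2word is None:
        # No mapping given: use the stringified id as the token for 0..max_id,
        # so the mapping has exactly max_id + 1 entries.
        return {str(i): i for i in range(max_id + 1)}
    # Mapping given: invert it (id -> token becomes token -> id).
    return {str(token): idx for idx, token in id2word.items()}


def fill_missing_dfs(token2id, dfs):
    """Give every known id a document-frequency entry, defaulting to 0.

    Assumption: this is what the truncated loop does, based on the
    corresponding line in the "Before Change" code.
    """
    for idx in token2id.values():
        dfs[idx] = dfs.get(idx, 0)
    return dfs


if __name__ == "__main__":
    token2id = build_token2id(2)
    print(token2id)
    print(fill_missing_dfs(token2id, {1: 4}))
```

One visible difference from the earlier version: when `id2word` is supplied, only the ids present in that mapping receive entries, rather than every integer up to `max_id`.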
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: RaRe-Technologies/gensim
Commit Name: 19a2292b8b21da8fd83e5f8129debb4b9ab4c14f
Time: 2014-07-01
Author: radimrehurek@seznam.cz
File Name: gensim/corpora/dictionary.py
Class Name: Dictionary
Method Name: from_corpus


Project Name: WZBSocialScienceCenter/tmtoolkit
Commit Name: 1070ee6fe00f2a3b03273e6a6dbf5625ab4dffc7
Time: 2019-03-12
Author: markus.konrad@wzb.eu
File Name: tmtoolkit/preprocess/_tmpreproc.py
Class Name: TMPreproc
Method Name: vocabulary_abs_doc_frequency


Project Name: khaotik/DaNet-Tensorflow
Commit Name: de00082780be884fc90e0113d323bfd63006ffba
Time: 2017-08-07
Author: junkkhaotik@gmail.com
File Name: main.py
Class Name: Model
Method Name: train