2299d6fb7437903bf421884c0191c38d59d06f7b,src/gensim/corpora/bleicorpus.py,BleiCorpus,saveCorpus,#Any#Any#Any#,72
Before Change
if id2word is None:
    logging.info("no word id mapping provided; initializing from corpus")
    maxId = -1
    for document in corpus:
        maxId = max(maxId, max([-1] + [fieldId for fieldId, _ in document]))
    numTerms = 1 + maxId
    id2word = dict(zip(xrange(numTerms), xrange(numTerms))) # word id mapping will be identity
else:
    numTerms = 1 + max([-1] + id2word.keys())
After Change
if id2word is None:
    logging.info("no word id mapping provided; initializing from corpus")
    id2word = utils.dictFromCorpus(corpus)
    numTerms = len(id2word)
else:
    numTerms = 1 + max([-1] + id2word.keys())
logging.info("storing corpus in Blei's LDA-C format: %s" % fname)
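
The change moves the inline scan for the largest term id into a shared helper. The body of utils.dictFromCorpus is not shown in this record, so the following is only a sketch of what such a helper might look like, reconstructed from the "Before Change" logic and written for Python 3. The name dict_from_corpus and the sample corpus are illustrative assumptions, not the library's code.

```python
def dict_from_corpus(corpus):
    """Hypothetical stand-in for utils.dictFromCorpus.

    Scans a corpus of (term_id, weight) documents and returns an
    identity mapping {term_id: term_id} covering ids 0..max_id.
    """
    max_id = -1
    for document in corpus:
        # The [-1] guard keeps max() from failing on an empty document.
        max_id = max(max_id, max([-1] + [field_id for field_id, _ in document]))
    num_terms = 1 + max_id
    return {term_id: term_id for term_id in range(num_terms)}


# Illustrative corpus: three documents, one of them empty.
corpus = [[(0, 1.0), (2, 3.0)], [], [(4, 1.0)]]
id2word = dict_from_corpus(corpus)
num_terms = len(id2word)
```

Because the mapping covers every id from 0 up to the largest one seen, len(id2word) equals 1 + maxId, which is why the "After Change" version can compute numTerms with len().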
In pattern: SUPERPATTERN
Frequency: 4
Non-data size: 9
Instances
Project Name: RaRe-Technologies/gensim
Commit Name: 2299d6fb7437903bf421884c0191c38d59d06f7b
Time: 2010-03-12
Author: piskvorky@92d0401f-a546-4972-9173-107b360ed7e5
File Name: src/gensim/corpora/bleicorpus.py
Class Name: BleiCorpus
Method Name: saveCorpus
Project Name: RaRe-Technologies/gensim
Commit Name: 2e07c2d2743bc80fe0a2b9c8ec5a8460b2f5d6dd
Time: 2010-03-12
Author: radimrehurek@seznam.cz
File Name: src/gensim/corpora/bleicorpus.py
Class Name: BleiCorpus
Method Name: saveCorpus
Project Name: RaRe-Technologies/gensim
Commit Name: 6e5ac39b4247082efdf934e0e03cc234ddcef529
Time: 2010-04-02
Author: piskvorky@92d0401f-a546-4972-9173-107b360ed7e5
File Name: src/gensim/corpora/dmlcorpus.py
Class Name: DmlCorpus
Method Name: loadDictionary
Project Name: RaRe-Technologies/gensim
Commit Name: aaa0d4fcdff881ccbd69d4be0e370ac55b930f10
Time: 2010-04-02
Author: radimrehurek@seznam.cz
File Name: src/gensim/corpora/dmlcorpus.py
Class Name: DmlCorpus
Method Name: loadDictionary