9bd7ddae2acd48f26344b7b6e905ab3ab7a81a60,docsim.py,SimilarityABC,__getitem__,#SimilarityABC#Any#,37
Before Change
doc may be either a bag-of-words iterable (corpus document), or a numpy
array, or a scipy.sparse matrix.
raise NotImplementedError("cannot instantiate Abstract Base Class")
def __iter__(self):
After Change
def __getitem__(self, doc):
    # get similarities of doc to all documents in the corpus
    allSims = self.getSimilarities(doc)
    # return either all similarities as a list, or only the self.numBest most similar, depending on settings from the constructor
    if self.numBest is None:
        return allSims
    else:
        tops = [(docNo, sim) for docNo, sim in enumerate(allSims) if sim > 0]
        tops = sorted(tops, key=lambda item: -item[1])  # sort by -sim => highest cossim first
        return tops[:self.numBest]  # return at most numBest top 2-tuples (docId, docSim)
def __iter__(self):
For each document, compute cosine similarity against all other documents
and yield the result.
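The top-N selection that the changed `__getitem__` performs can be tried in isolation. The following is a minimal sketch, not the gensim implementation: the standalone function `top_similarities` and its parameters `all_sims` and `num_best` are hypothetical names introduced here, standing in for `self.getSimilarities(doc)` and `self.numBest` from the snippet.

```python
def top_similarities(all_sims, num_best=None):
    """Return all similarities, or only the num_best highest positive ones.

    Hypothetical standalone version of the logic in __getitem__;
    all_sims plays the role of self.getSimilarities(doc).
    """
    if num_best is None:
        # no limit configured: hand back every similarity unchanged
        return all_sims
    # keep only documents with positive similarity, as (doc_no, sim) pairs
    tops = [(doc_no, sim) for doc_no, sim in enumerate(all_sims) if sim > 0]
    # sort by descending similarity so the best matches come first
    tops.sort(key=lambda item: -item[1])
    # return at most num_best pairs
    return tops[:num_best]


sims = [0.1, 0.0, 0.9, 0.4]
print(top_similarities(sims, num_best=2))  # [(2, 0.9), (3, 0.4)]
```

Documents with zero similarity are dropped before sorting, so the result can contain fewer than `num_best` entries.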
In pattern: SUPERPATTERN
Frequency: 4
Non-data size: 14
Instances
Project Name: RaRe-Technologies/gensim
Commit Name: 9bd7ddae2acd48f26344b7b6e905ab3ab7a81a60
Time: 2010-02-03
Author: piskvorky@92d0401f-a546-4972-9173-107b360ed7e5
File Name: docsim.py
Class Name: SimilarityABC
Method Name: __getitem__
Project Name: RaRe-Technologies/gensim
Commit Name: 3fd9809dfdfcf47bf34a9f9a780277abaae76105
Time: 2010-02-03
Author: radimrehurek@seznam.cz
File Name: docsim.py
Class Name: SimilarityABC
Method Name: __getitem__
Project Name: RaRe-Technologies/gensim
Commit Name: 474a7aa3c5300446a2d471d24c6c66dee8fc46c7
Time: 2010-02-03
Author: piskvorky@92d0401f-a546-4972-9173-107b360ed7e5
File Name: docsim.py
Class Name: SimilarityABC
Method Name: __iter__
Project Name: RaRe-Technologies/gensim
Commit Name: 51a01331d5a71c638d421f0ac45de19cda251a81
Time: 2010-02-03
Author: radimrehurek@seznam.cz
File Name: docsim.py
Class Name: SimilarityABC
Method Name: __iter__