9bd7ddae2acd48f26344b7b6e905ab3ab7a81a60,docsim.py,SimilarityABC,__getitem__,#SimilarityABC#Any#,37

Before Change


        doc may be either a bag-of-words iterable (corpus document), or a numpy 
        array, or a scipy.sparse matrix.
        
        raise NotImplementedError("cannot instantiate Abstract Base Class")


    def __iter__(self):
        

After Change



    def __getitem__(self, doc):
        # get similarities of doc to all documents in the corpus
        allSims = self.getSimilarities(doc)

        # return either all similarities as a list, or only the self.numBest
        # most similar, depending on settings from the constructor
        if self.numBest is None:
            return allSims
        else:
            tops = [(docNo, sim) for docNo, sim in enumerate(allSims) if sim > 0]
            tops = sorted(tops, key=lambda item: -item[1])  # sort by -sim => highest cosine similarity first
            return tops[:self.numBest]  # return at most numBest top 2-tuples (docId, docSim)


    def __iter__(self):
        
        For each document, compute cosine similarity against all other documents 
        and yield the result.
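
The top-N selection in the changed `__getitem__` can be tried in isolation with the sketch below. `ToySimilarity`, its constructor arguments, and the fixed scores are hypothetical stand-ins used only for illustration; they are not part of gensim, and a real subclass would compute the similarities from `doc` inside `getSimilarities`.

```python
class ToySimilarity:
    """Hypothetical stand-in for a SimilarityABC subclass (illustration only)."""

    def __init__(self, sims, numBest=None):
        self.sims = sims        # assumed: precomputed scores, one per corpus document
        self.numBest = numBest  # None means "return every similarity"

    def getSimilarities(self, doc):
        # A real subclass would compare `doc` against the corpus here;
        # this toy version ignores `doc` and returns the fixed scores.
        return self.sims

    def __getitem__(self, doc):
        allSims = self.getSimilarities(doc)
        if self.numBest is None:
            return allSims
        # keep only positive similarities, paired with their document index
        tops = [(docNo, sim) for docNo, sim in enumerate(allSims) if sim > 0]
        # highest similarity first
        tops = sorted(tops, key=lambda item: -item[1])
        return tops[:self.numBest]


index = ToySimilarity([0.1, 0.0, 0.9, 0.5], numBest=2)
top_two = index[None]  # list of (docNo, sim) pairs, at most numBest long

full = ToySimilarity([0.1, 0.0, 0.9, 0.5])
all_scores = full[None]  # numBest is None, so the raw score list comes back
```

Note that the two modes return different shapes: a flat list of scores when `numBest` is `None`, and a list of `(docNo, sim)` tuples otherwise, with zero and negative scores dropped before truncation.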
In pattern: SUPERPATTERN

Frequency: 4

Non-data size: 14

Instances


Project Name: RaRe-Technologies/gensim
Commit Name: 9bd7ddae2acd48f26344b7b6e905ab3ab7a81a60
Time: 2010-02-03
Author: piskvorky@92d0401f-a546-4972-9173-107b360ed7e5
File Name: docsim.py
Class Name: SimilarityABC
Method Name: __getitem__


Project Name: RaRe-Technologies/gensim
Commit Name: 3fd9809dfdfcf47bf34a9f9a780277abaae76105
Time: 2010-02-03
Author: radimrehurek@seznam.cz
File Name: docsim.py
Class Name: SimilarityABC
Method Name: __getitem__


Project Name: RaRe-Technologies/gensim
Commit Name: 474a7aa3c5300446a2d471d24c6c66dee8fc46c7
Time: 2010-02-03
Author: piskvorky@92d0401f-a546-4972-9173-107b360ed7e5
File Name: docsim.py
Class Name: SimilarityABC
Method Name: __iter__


Project Name: RaRe-Technologies/gensim
Commit Name: 51a01331d5a71c638d421f0ac45de19cda251a81
Time: 2010-02-03
Author: radimrehurek@seznam.cz
File Name: docsim.py
Class Name: SimilarityABC
Method Name: __iter__