a31eab99dfbc6dcb7fe2ef149c59a30910adbbbd,src/gensim/similarities/docsim.py,SparseMatrixSimilarity,getSimilarities,#SparseMatrixSimilarity#Any#,206

Before Change



        # compute cosine similarity against every other document in the collection
        allSims = self.corpus * vec.tocsc() # N x T * T x 1 = N x 1
        allSims = list(allSims.toarray().flat) # convert to plain python list
        assert len(allSims) == self.corpus.shape[0] # make sure no document got lost!
        return allSims
#endclass SparseMatrixSimilarity

After Change


        faster than processing each document in turn).
        
        is_corpus, query = utils.isCorpus(query)
        if is_corpus:
            query = matutils.corpus2csc(query)
        else:
            if scipy.sparse.issparse(query):
                query = query.T # convert documents=rows to documents=columns
            elif isinstance(query, numpy.ndarray):
                if query.ndim == 1:
                    query.shape = (len(query), 1)
                query = scipy.sparse.csc_matrix(query)
            else:
                # default case: query is a single vector, in sparse gensim format
                query = matutils.corpus2csc([query], self.corpus.shape[1])

        # compute cosine similarity against every other document in the collection
        result = self.corpus * query.tocsc() # N x T * T x C = N x C
        if result.shape[1] == 1:
            # for queries of one document, return a 1d array
            result = result.toarray().flatten()
            result = result.toarray().flatten()
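
The change above can be sketched as a standalone example. This is a minimal illustration, not the gensim implementation: the helper names to_column_queries and similarities are hypothetical, the gensim helpers (utils.isCorpus, matutils.corpus2csc) are replaced by plain SciPy calls, and the corpus-as-query branch is omitted. It assumes the index is an N x T sparse matrix with unit-length rows, so that the matrix product yields cosine similarities.

```python
import numpy as np
import scipy.sparse


def to_column_queries(query, num_terms):
    """Return `query` as a T x C sparse CSC matrix, one column per query document.

    Hypothetical helper: it mirrors the branching of the changed code,
    but uses only SciPy and NumPy instead of gensim's utilities.
    """
    if scipy.sparse.issparse(query):
        # rows are documents here, so transpose to get documents as columns
        return query.T.tocsc()
    if isinstance(query, np.ndarray):
        if query.ndim == 1:
            # a single dense vector becomes a T x 1 column
            query = query.reshape(-1, 1)
        return scipy.sparse.csc_matrix(query)
    # assumed default: one document as a list of (term_id, weight) pairs
    term_ids = [term_id for term_id, _ in query]
    weights = [weight for _, weight in query]
    columns = [0] * len(term_ids)
    return scipy.sparse.csc_matrix(
        (weights, (term_ids, columns)), shape=(num_terms, 1)
    )


def similarities(index, query):
    """Multiply an N x T index by T x C queries, giving N x C similarities."""
    query = to_column_queries(query, index.shape[1])
    result = index @ query  # N x T times T x C = N x C
    if result.shape[1] == 1:
        # for a single query document, return a 1d array
        return result.toarray().flatten()
    return result


# three unit-length documents over a vocabulary of three terms
index = scipy.sparse.csr_matrix(
    [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.6, 0.8, 0.0]]
)
single = similarities(index, [(0, 1.0)])
batch = similarities(index, scipy.sparse.csr_matrix([[1.0, 0.0, 0.0],
                                                     [0.0, 1.0, 0.0]]))
```

Passing several query documents at once yields one N x C product instead of C separate N x 1 products, which is the batching the docstring fragment refers to.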
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 5

Instances


Project Name: RaRe-Technologies/gensim
Commit Name: a31eab99dfbc6dcb7fe2ef149c59a30910adbbbd
Time: 2011-05-15
Author: radimrehurek@seznam.cz
File Name: src/gensim/similarities/docsim.py
Class Name: SparseMatrixSimilarity
Method Name: getSimilarities


Project Name: markovmodel/PyEMMA
Commit Name: cf1e0faf5c3d04fd54bceeb7fa8c51dfd8120299
Time: 2017-05-26
Author: simols@hotmail.com
File Name: pyemma/msm/estimators/maximum_likelihood_msm.py
Class Name: AugmentedMarkovModel
Method Name: _estimate


Project Name: scikit-learn-contrib/sklearn-pandas
Commit Name: 1c7a87e96c4b6180e423586193c26e5ccd2f6bfd
Time: 2015-08-02
Author: mahmoud@thehumangeo.com
File Name: sklearn_pandas/__init__.py
Class Name: DataFrameMapper
Method Name: transform