cac5f014c09193f7a5ad6b71e4012defa0a96baa,src/gensim/matutils.py,,corpus2csc,#Any#Any#Any#,24

Before Change


    logger.debug("constructing sparse document matrix")
    # construct the sparse matrix as lil_matrix first, convert to csc later
    # lil_matrix can quickly update rows, so initialize it transposed (documents=rows)
    mat = scipy.sparse.lil_matrix((1, 1), dtype = dtype)
    mat.rows, mat.data = [], []
    for i, doc in enumerate(corpus):
        doc = sorted(doc)
        mat.rows.append([fid for fid, _ in doc])
        mat.data.append([val for _, val in doc])
    docs = i + 1
    mat._shape = (docs, m)
    mat = mat.tocsr().transpose()  # transpose back to documents=columns
    assert isinstance(mat, scipy.sparse.csc_matrix)
    return mat

After Change


    with documents as columns.
    
    logger.debug("constructing sparse document matrix")
    docs, data, indices, indptr = 0, [], [], [0]
    for doc in corpus:
        indptr.append(len(doc))
        indices.extend([feature_id for feature_id, _ in doc])
        data.extend([feature_weight for _, feature_weight in doc])
        docs += 1
    indptr = numpy.cumsum(indptr)
    data = numpy.asarray(data)
    indices = numpy.asarray(indices)
    return scipy.sparse.csc_matrix((data, indices, indptr), shape = (num_terms, docs), dtype = dtype)
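
The after-change code builds the three arrays of the compressed sparse column layout directly, instead of going through an intermediate lil_matrix. A minimal standard-library sketch of that construction follows. The helper names corpus_to_csc_arrays and csc_to_dense and the sample corpus are hypothetical and not part of gensim, and itertools.accumulate stands in for numpy.cumsum.

```python
from itertools import accumulate


def corpus_to_csc_arrays(corpus):
    """Build the (data, indices, indptr) triple of a CSC matrix, one column per document.

    Hypothetical helper for illustration only; it mirrors the loop in the snippet above.
    """
    data, indices, col_lengths = [], [], []
    for doc in corpus:
        # Each document is a list of (feature_id, feature_weight) pairs.
        col_lengths.append(len(doc))
        indices.extend(feature_id for feature_id, _ in doc)
        data.extend(weight for _, weight in doc)
    # Running total of column lengths: entries indptr[j]:indptr[j + 1]
    # of data/indices belong to column (document) j.
    indptr = [0] + list(accumulate(col_lengths))
    return data, indices, indptr


def csc_to_dense(data, indices, indptr, num_terms):
    """Expand the triple into a dense num_terms x num_docs list of lists (for checking)."""
    num_docs = len(indptr) - 1
    dense = [[0.0] * num_docs for _ in range(num_terms)]
    for col in range(num_docs):
        for k in range(indptr[col], indptr[col + 1]):
            dense[indices[k]][col] = data[k]
    return dense


# Assumed toy corpus: three documents over three terms, the second one empty.
corpus = [[(0, 1.0), (2, 3.0)], [], [(1, 2.0)]]
data, indices, indptr = corpus_to_csc_arrays(corpus)
dense = csc_to_dense(data, indices, indptr, num_terms=3)
```

The same triple can be passed to scipy.sparse.csc_matrix((data, indices, indptr), shape=(num_terms, docs)), as the snippet above does. An empty document contributes a zero-length column, so consecutive indptr entries are equal for it.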

In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 11

Instances

Link

Project Name: RaRe-Technologies/gensim

Commit Name: cac5f014c09193f7a5ad6b71e4012defa0a96baa

Time: 2010-09-05

Author: radimrehurek@seznam.cz

File Name: src/gensim/matutils.py

Class Name:

Method Name: corpus2csc

Link

Project Name: CellProfiler/CellProfiler

Commit Name: 862c3ed042ada33d2824e67e47bd92ef0881cf00

Time: 2017-04-19

Author: mcquin@users.noreply.github.com

File Name: cellprofiler/modules/relateobjects.py

Class Name: RelateObjects

Method Name: calculate_minimum_distances

Link

Project Name: RaRe-Technologies/gensim

Commit Name: c55d1b295cb6717ba6494917b88183e8d3f284a9

Time: 2010-09-05

Author: piskvorky@92d0401f-a546-4972-9173-107b360ed7e5

File Name: src/gensim/matutils.py

Class Name:

Method Name: corpus2csc