88b754261ee28f8e4143a573135a0f33da42d249,bugbug/similarity.py,Word2VecWmdSimilarity,__init__,#Word2VecWmdSimilarity#Any#Any#,230

Before Change


            self.bug_ids.append(bug["id"])

        indexes = list(range(len(self.corpus)))
        random.shuffle(indexes)
        self.corpus = [self.corpus[idx] for idx in indexes]
        self.bug_ids = [self.bug_ids[idx] for idx in indexes]

        self.w2vmodel = Word2Vec(self.corpus, size=100, min_count=5)
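
The shuffle in the snippet above reorders `self.corpus` and `self.bug_ids` with one shared permutation, so each document stays aligned with its bug id. A minimal standalone sketch of that idea follows; the helper name `shuffle_in_parallel`, the `seed` parameter, and the sample data are illustrative assumptions, not part of bugbug.

```python
import random


def shuffle_in_parallel(corpus, bug_ids, seed=None):
    """Shuffle two parallel lists using a single shared permutation."""
    rng = random.Random(seed)  # local RNG so the global random state is untouched
    indexes = list(range(len(corpus)))
    rng.shuffle(indexes)
    # Apply the same index order to both lists so pairs stay aligned.
    return [corpus[i] for i in indexes], [bug_ids[i] for i in indexes]


# Hypothetical sample data: tokenised bug texts and their ids.
corpus = [["crash", "on", "startup"], ["ui", "glitch"], ["memory", "leak"]]
bug_ids = [101, 102, 103]

shuffled_corpus, shuffled_ids = shuffle_in_parallel(corpus, bug_ids, seed=0)
```

Shuffling each list separately would break the pairing between a document and its id, which is why a single index list is shuffled and then applied to both.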

After Change


            self.corpus.append([bug["id"], textual_features])

        # Assign a unique integer id to every word
        self.dictionary = Dictionary(text for bug_id, text in self.corpus)

        # Convert each document to a bag-of-words (BoW) vector
        corpus_final = [self.dictionary.doc2bow(text) for bug_id, text in self.corpus]

        # Initialize the TF-IDF model and apply it to the same corpus; the resulting corpus has the same dimensions
        tfidf = models.TfidfModel(corpus_final)
        corpus_tfidf = tfidf[corpus_final]

        # Transform the TF-IDF corpus to a latent 300-D space via Latent Semantic Indexing
        self.lsi = models.LsiModel(
            corpus_tfidf, id2word=self.dictionary, num_topics=300
        )
        corpus_lsi = self.lsi[corpus_tfidf]

        # Index the LSI corpus for similarity queries
        self.index = similarities.Similarity(
            output_prefix="simdata.shdat", corpus=corpus_lsi, num_features=300
        )
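
The first stages of the pipeline above (dictionary, bag-of-words, TF-IDF) can be sketched with the standard library alone. This is a rough stand-in, not gensim's implementation: the function names are hypothetical, the log2 weighting with L2 normalisation is an assumption meant to approximate `TfidfModel`'s defaults, and the LSI projection and on-disk `Similarity` index are omitted entirely.

```python
import math
from collections import Counter


def build_dictionary(texts):
    """Assign a unique integer id to every distinct token."""
    token2id = {}
    for text in texts:
        for token in text:
            token2id.setdefault(token, len(token2id))
    return token2id


def doc2bow(token2id, text):
    """Convert a token list into sorted (token_id, count) pairs."""
    counts = Counter(token2id[t] for t in text if t in token2id)
    return sorted(counts.items())


def tfidf_vectors(bows):
    """Weight counts by log2(N / document frequency), then L2-normalise."""
    n_docs = len(bows)
    doc_freq = Counter(token_id for bow in bows for token_id, _ in bow)
    vectors = []
    for bow in bows:
        weights = {tid: cnt * math.log2(n_docs / doc_freq[tid]) for tid, cnt in bow}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm > 0:
            weights = {tid: w / norm for tid, w in weights.items()}
        vectors.append(weights)
    return vectors


def cosine(u, v):
    """Dot product of two sparse vectors; equals cosine when both are unit length."""
    return sum(w * v.get(tid, 0.0) for tid, w in u.items())


# Hypothetical corpus in the same [bug_id, tokens] shape as self.corpus.
corpus = [
    [1, ["crash", "on", "startup"]],
    [2, ["crash", "when", "opening", "tab"]],
    [3, ["font", "rendering", "glitch"]],
]
texts = [text for _bug_id, text in corpus]
dictionary = build_dictionary(texts)
bows = [doc2bow(dictionary, text) for text in texts]
vectors = tfidf_vectors(bows)

# Similarity of bug 1 against every bug in the corpus.
sims = [cosine(vectors[0], v) for v in vectors]
```

In this sketch, bug 2 scores above bug 3 against bug 1 because the two share the token "crash"; the LSI step in the real pipeline would additionally let documents match through correlated terms rather than exact overlap.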
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 6

Instances


Project Name: mozilla/bugbug
Commit Name: 88b754261ee28f8e4143a573135a0f33da42d249
Time: 2019-07-29
Author: ayush.shridhar1506@gmail.com
File Name: bugbug/similarity.py
Class Name: Word2VecWmdSimilarity
Method Name: __init__


Project Name: mozilla/bugbug
Commit Name: 4ace4ef2fb1956ec4df46f78c9edd02154780913
Time: 2019-07-24
Author: cklyyung@users.noreply.github.com
File Name: bugbug/similarity.py
Class Name: Word2VecWmdSimilarity
Method Name: __init__


Project Name: YerevaNN/mimic3-benchmarks
Commit Name: 4034844b505f20da652a417e605d391eae8d6c0a
Time: 2017-03-17
Author: harhro@gmail.com
File Name: scripts/split_train_and_test.py
Class Name:
Method Name: