2573c649518391ada6214cfc72d20421dfac4072,src/preprocess.py,,get_embeddings,#Any#Any#Any#,194

Before Change


    embeddings = np.zeros((word_v_size, d_word))
    for idx in range(word_v_size):  # kind of hacky
        word = vocab.get_token_from_index(idx)
        if word == "@@PADDING@@" or word == "@@UNKNOWN@@":
            continue
        try:
            assert word in word2vec
        except AssertionError as error:
            log.debug(error)

After Change


    word_v_size, unk_idx = vocab.get_vocab_size("tokens"), vocab.get_token_index(vocab._oov_token)
    embeddings = np.random.randn(word_v_size, d_word)  # previously np.zeros((word_v_size, d_word))
    with open(vec_file) as vec_fh:
        for line in vec_fh:
            word, vec = line.split(" ", 1)
            idx = vocab.get_token_index(word)
            if idx != unk_idx:
                idx = vocab.get_token_index(word)
                embeddings[idx] = np.array(list(map(float, vec.split())))
    embeddings[vocab.get_token_index("@@PADDING@@")] = 0.
    embeddings = torch.FloatTensor(embeddings)
    log.info("\tFinished loading embeddings")
    return embeddings
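
The change above swaps the loop direction: instead of iterating over the vocabulary and looking each word up in a preloaded word2vec table, it streams the vector file once and keeps only the rows whose word is in the vocabulary. Words with no pretrained vector keep a random initialization, and the padding row is zeroed. A minimal, dependency-free sketch of that idea follows. It is not the project's code: plain Python lists stand in for the NumPy array and torch tensor, a plain dict (`token_to_index`) stands in for the vocabulary object, and the function name, the `seed` parameter, and the skipping of malformed lines are assumptions added for illustration.

```python
import io
import random

PAD, UNK = "@@PADDING@@", "@@UNKNOWN@@"


def load_embeddings(vec_fh, token_to_index, d_word, seed=0):
    """Build an embedding table by streaming a text vector file (hypothetical helper)."""
    rng = random.Random(seed)
    # Random init, so in-vocabulary words missing from the file still get a usable vector.
    embeddings = [
        [rng.gauss(0.0, 1.0) for _ in range(d_word)]
        for _ in range(len(token_to_index))
    ]
    for line in vec_fh:
        parts = line.split()
        # Assumption: skip header or malformed lines that lack exactly d_word values.
        if len(parts) != d_word + 1:
            continue
        idx = token_to_index.get(parts[0])
        if idx is None:
            # The word is not in the vocabulary, so its vector is not needed.
            continue
        embeddings[idx] = [float(x) for x in parts[1:]]
    # The padding row is always all zeros.
    embeddings[token_to_index[PAD]] = [0.0] * d_word
    return embeddings


# Usage with an in-memory "file"; "bird" is ignored because it is out of vocabulary.
token_to_index = {PAD: 0, UNK: 1, "cat": 2, "dog": 3}
vec_file = io.StringIO("cat 0.1 0.2 0.3\nbird 0.4 0.5 0.6\n")
emb = load_embeddings(vec_file, token_to_index, d_word=3)
```

Streaming the file this way avoids holding every pretrained vector in memory, at the cost of no longer reporting which vocabulary words were missing from the file.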
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 4

Instances


Project Name: jsalt18-sentence-repl/jiant
Commit Name: 2573c649518391ada6214cfc72d20421dfac4072
Time: 2018-03-16
Author: wang.alex.c@gmail.com
File Name: src/preprocess.py
Class Name:
Method Name: get_embeddings


Project Name: jsalt18-sentence-repl/jiant
Commit Name: c2d5bde2b2d100b5d11e2c7f2e58186bca296f0d
Time: 2018-07-26
Author: wang.alex.c@gmail.com
File Name: src/beamsearch.py
Class Name:
Method Name: write_translation_preds


Project Name: jsalt18-sentence-repl/jiant
Commit Name: 852d7bd6143faa1acdd4ef47a2fe84372f3b48c9
Time: 2018-07-26
Author: yu.katherin@gmail.com
File Name: src/beamsearch.py
Class Name:
Method Name: write_translation_preds