63bde2f7cf28dbf6bca6e77fe0b0a9966dc6aee6,finetune/encoding/input_encoder.py,,tokenize_context,#Any#Any#Any#,203

Before Change


        # (this would not be the case if multiple context spans make up the same token)
        if char_loc == -1:
            tokenized_context.append(default_context)
        elif token in ["\n"]:
            tokenized_context.append(context_by_char_loc[current_char_loc][1])
        else:
            if char_loc > context_by_char_loc[current_char_loc][0]:

After Change


                    # TODO: this is a workaround that has no guarantees of being correct
                    raise ValueError("Context cannot be fully matched as it appears to not cover the end of the sequence for token {}".format(token))
            if token.strip() not in context_by_char_loc[current_char_loc][2]:
                warnings.warn("subtoken: {} has matched up with the context for token: {}".format(repr(token), repr(context_by_char_loc[current_char_loc][2])))
            tokenized_context.append(context_by_char_loc[current_char_loc][1])

    assert len(tokenized_context) == len(encoded_output.token_ends)
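
The snippets above show only fragments of tokenize_context, so the following is a minimal, self-contained sketch of the pattern they suggest, not the project's actual implementation. It assumes (this is not confirmed by the excerpt) that each entry of context_by_char_loc is a tuple of (span end offset, context, span text) sorted by offset, and that a token end of -1 marks a special token. The function name align_context and its parameters are hypothetical.

```python
import warnings


def align_context(tokens, token_ends, context_by_char_loc, default_context):
    """Assign one context entry to every token (hypothetical helper).

    Assumed layout: context_by_char_loc holds tuples of
    (span_end_offset, context, span_text), sorted by span_end_offset.
    """
    aligned = []
    current = 0  # index of the context span currently being consumed
    for token, char_loc in zip(tokens, token_ends):
        if char_loc == -1:
            # Special tokens carry no character position; use the default.
            aligned.append(default_context)
            continue
        # Advance to the first span that ends at or after this token's end.
        while char_loc > context_by_char_loc[current][0]:
            current += 1
            if current >= len(context_by_char_loc):
                raise ValueError(
                    "Context does not cover the end of the sequence "
                    "for token {!r}".format(token)
                )
        _, context, span_text = context_by_char_loc[current]
        if token.strip() not in span_text:
            # Suspicious match: flag it, but keep going.
            warnings.warn(
                "subtoken: {!r} has matched up with the context for "
                "token: {!r}".format(token, span_text)
            )
        aligned.append(context)
    # Every token must have received exactly one context entry.
    assert len(aligned) == len(token_ends)
    return aligned
```

In this sketch a suspicious match only triggers a warning, while a token that falls past the last context span raises ValueError, which mirrors the split between warnings.warn and raise ValueError in the "After Change" fragment.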
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: IndicoDataSolutions/finetune
Commit Name: 63bde2f7cf28dbf6bca6e77fe0b0a9966dc6aee6
Time: 2020-05-14
Author: benlt@hotmail.co.uk
File Name: finetune/encoding/input_encoder.py
Class Name:
Method Name: tokenize_context


Project Name: brian-team/brian2
Commit Name: 2e1ca5383704af21a4285984c5d00d2c11f13c22
Time: 2017-03-14
Author: marcel.stimberg@inserm.fr
File Name: brian2/units/fundamentalunits.py
Class Name: UnitRegistry
Method Name: add


Project Name: facebookresearch/ParlAI
Commit Name: 9ad1d2da68aa4acf817562502340bf319276b283
Time: 2019-05-14
Author: jju@fb.com
File Name: parlai/mturk/core/dev/socket_manager.py
Class Name: Packet
Method Name: from_dict