4b21300999e11ba6f91952c05a936ccec0673e2e,nltk/tokenize/treebank.py,TreebankWordTokenizer,span_tokenize,#TreebankWordTokenizer#Any#,147

Before Change


                real_token = word_token
            ix = text.find(real_token, ix)
            end = ix + len(real_token)
            spans.append((ix, end))
            ix = end

        return spans
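
The fragment above locates each token by calling `str.find` from a running offset. A minimal, self-contained sketch of that idea follows. The name `spans_by_find` is hypothetical (it is not an NLTK function), and the sketch assumes every token occurs verbatim, in order, in the text:

```python
def spans_by_find(tokens, text):
    """Return (start, end) offsets for tokens that occur verbatim, in order, in text."""
    spans = []
    ix = 0  # running offset: never search before the end of the previous token
    for tok in tokens:
        start = text.find(tok, ix)
        if start == -1:
            # A token rewritten by the tokenizer (e.g. '"' turned into '``')
            # will not be found in the original text.
            raise ValueError(f"token {tok!r} not found after offset {ix}")
        end = start + len(tok)
        spans.append((start, end))
        ix = end
    return spans
```

The explicit `-1` check shows why this approach is fragile: once the tokenizer rewrites a token, a plain search of the original text can no longer find it.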

After Change


        # treated as starting quotes).
        if ('"' in text) or ("''" in text):
            # Find double quotes and converted quotes
            matched = [m.group() for m in re.finditer(r"``|'{2}|\"", text)]

            # Replace converted quotes back to double quotes
            tokens = [matched.pop(0) if tok in ['"', "``", "''"] else tok for tok in raw_tokens]
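
The quote-restoration step above can be tried in isolation. The sketch below is a hedged illustration rather than the NLTK implementation: `restore_quotes` is a hypothetical helper, and `raw_tokens` is assumed to be Treebank-style output in which quotes may have been rewritten to `` or ''.

```python
import re


def restore_quotes(raw_tokens, text):
    """Map converted quote tokens back to the quote strings found in text."""
    # Nothing to restore if the text contains no double quotes
    # and no doubled single quotes.
    if '"' not in text and "''" not in text:
        return list(raw_tokens)

    # Quote-like strings in the order they appear in the original text.
    matched = [m.group() for m in re.finditer(r"``|'{2}|\"", text)]

    # Each quote-like token consumes the next original quote string.
    return [
        matched.pop(0) if tok in ('"', "``", "''") else tok
        for tok in raw_tokens
    ]
```

Once the tokens match the original text again, offsets can be computed by aligning them against `text` from left to right.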
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: nltk/nltk
Commit Name: 4b21300999e11ba6f91952c05a936ccec0673e2e
Time: 2017-11-29
Author: lyyb46@gmail.com
File Name: nltk/tokenize/treebank.py
Class Name: TreebankWordTokenizer
Method Name: span_tokenize


Project Name: studioml/studio
Commit Name: f31b7ad689b1435e76744af4ff443607643a37fd
Time: 2017-12-28
Author: peter.zhokhov@sentient.ai
File Name: studio/experiment.py
Class Name:
Method Name: create_experiment


Project Name: akkana/scripts
Commit Name: 9a88eac5a23150345337d62e83621316cf2d986f
Time: 2019-09-21
Author: akkana@shallowsky.com
File Name: censusdata.py
Class Name:
Method Name: codesFromZipFile