4b21300999e11ba6f91952c05a936ccec0673e2e,nltk/tokenize/treebank.py,TreebankWordTokenizer,span_tokenize,#TreebankWordTokenizer#Any#,147

Before Change


        ix = 0

        spans = []
        for word_token in self.tokenize(text):
            if word_token in ("``", "''"):
                orig_idx = text.find(word_token, ix)
                quote_idx = text.find('"', ix)
                if orig_idx < 0:
                    real_token = '"'
                elif quote_idx < 0:
                    real_token = word_token
                elif orig_idx < quote_idx:
                    real_token = word_token
                else:
                    real_token = '"'
            else:
                real_token = word_token
            ix = text.find(real_token, ix)
            end = ix + len(real_token)
            spans.append((ix, end))
            ix = end

        return spans


class TreebankWordDetokenizer(TokenizerI):

After Change


            matched = [m.group() for m in re.finditer(r"``|'{2}|\"", text)]

            # Replace converted quotes back to double quotes
            tokens = [matched.pop(0) if tok in ['"', "``", "''"] else tok for tok in raw_tokens]
        else:
            tokens = raw_tokens

        return align_tokens(tokens, text)
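
The After Change snippet starts mid-branch, so the following is only a minimal, self-contained sketch of the idea it appears to implement: map the tokenizer's converted quote tokens back to whatever quote string occurs in the original text, then compute spans by aligning tokens against the text. The names align_tokens_sketch and restore_quotes are hypothetical stand-ins, the guard condition is an assumption about the part of the method that is not shown, and raw_tokens is written out by hand rather than produced by a real tokenizer.

```python
import re


def align_tokens_sketch(tokens, text):
    """Hypothetical stand-in for the align_tokens helper called in the snippet.

    Scans the text left to right and returns one (start, end) span per token.
    """
    spans = []
    pos = 0
    for tok in tokens:
        start = text.find(tok, pos)
        if start < 0:
            raise ValueError(f"token {tok!r} not found after offset {pos}")
        pos = start + len(tok)
        spans.append((start, pos))
    return spans


def restore_quotes(raw_tokens, text):
    """Replace converted quote tokens with the quote strings found in the text."""
    # Assumption: the real method guards this branch with a similar check.
    if '"' in text or "''" in text:
        # Quote-like strings in the order they occur in the original text.
        matched = [m.group() for m in re.finditer(r"``|'{2}|\"", text)]
        return [
            matched.pop(0) if tok in ('"', "``", "''") else tok
            for tok in raw_tokens
        ]
    return list(raw_tokens)


text = 'He said "hi" to me'
# Hand-written tokens in the style of a Treebank tokenizer (assumed output).
raw_tokens = ["He", "said", "``", "hi", "''", "to", "me"]

tokens = restore_quotes(raw_tokens, text)
spans = align_tokens_sketch(tokens, text)
print(tokens)
print(spans)
```

Compared with the Before Change loop, this keeps quote handling in one pass over the text and leaves offset bookkeeping to a single alignment helper, so every span can be checked by slicing the original text.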
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 9

Instances


Project Name: nltk/nltk
Commit Name: 4b21300999e11ba6f91952c05a936ccec0673e2e
Time: 2017-11-29
Author: lyyb46@gmail.com
File Name: nltk/tokenize/treebank.py
Class Name: TreebankWordTokenizer
Method Name: span_tokenize


Project Name: pyinstaller/pyinstaller
Commit Name: 65546b26dd142eb99f34c968b03987032a25888f
Time: 2011-11-23
Author: h.goebel@goebel-consult.de
File Name: PyInstaller/hooks/hook-os.py
Class Name:
Method Name: hook


Project Name: pyinstaller/pyinstaller
Commit Name: fa7a65464d94f349af68b78470777ac6b57f081b
Time: 2011-11-23
Author: h.goebel@goebel-consult.de
File Name: PyInstaller/hooks/hook-iu.py
Class Name:
Method Name: hook