bb4c3f8bc223be2a8afbaae9c8afa7f5b1a32204,pycorrector/seq2seq_attention/preprocess.py,,parse_xml_file,#Any#,16

Before Change


        correction = doc.getElementsByTagName("CORRECTION")[0]. \
            childNodes[0].data.strip()

        source = segment(text, cut_type="char")
        target = segment(correction, cut_type="char")
        pair = [source, target]
        if pair not in data_list:
            data_list.append(pair)

After Change


        correction = doc.getElementsByTagName("CORRECTION")[0]. \
            childNodes[0].data.strip()

        pair = [text.strip(), correction.strip()]
        if pair not in data_list:
            data_list.append(pair)
    return data_list
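
The effect of the change can be sketched as a standalone function: pairs are stored as raw stripped strings rather than character-segmented token lists, and duplicates are still skipped. This is a minimal sketch, not the project's actual code. The function name `parse_xml_pairs`, the `DOC` and `TEXT` tag names, and the sample input are assumptions for illustration; only the `CORRECTION` tag and the dedup logic come from the snippet above.

```python
from xml.dom import minidom


def parse_xml_pairs(xml_text):
    """Collect unique [text, correction] string pairs from an XML string."""
    dom_tree = minidom.parseString(xml_text)
    data_list = []
    # "DOC" and "TEXT" are assumed tag names; "CORRECTION" appears in the diff.
    for doc in dom_tree.getElementsByTagName("DOC"):
        text = doc.getElementsByTagName("TEXT")[0].childNodes[0].data.strip()
        correction = doc.getElementsByTagName("CORRECTION")[0]. \
            childNodes[0].data.strip()
        # Keep raw strings; segmentation is left to a later step.
        pair = [text, correction]
        if pair not in data_list:
            data_list.append(pair)
    return data_list


# Hypothetical sample: the second DOC duplicates the first after stripping.
SAMPLE = """<ROOT>
<DOC><TEXT> I has a apple </TEXT><CORRECTION> I have an apple </CORRECTION></DOC>
<DOC><TEXT>I has a apple</TEXT><CORRECTION>I have an apple</CORRECTION></DOC>
</ROOT>"""

pairs = parse_xml_pairs(SAMPLE)
```

Note that `pair not in data_list` is a linear scan, so deduplication is quadratic in the number of pairs; for a large corpus, a set of tuples would be a reasonable alternative.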
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 2

Instances


Project Name: shibing624/pycorrector
Commit Name: bb4c3f8bc223be2a8afbaae9c8afa7f5b1a32204
Time: 2019-07-29
Author: 507153809@qq.com
File Name: pycorrector/seq2seq_attention/preprocess.py
Class Name:
Method Name: parse_xml_file


Project Name: shibing624/pycorrector
Commit Name: 4e144c9f842d7415d8be5bdbb5912d88ae32cced
Time: 2018-04-16
Author: 507153809@qq.com
File Name: pycorrector/seq2seq/corpus_reader.py
Class Name: CGEDReader
Method Name: read_tokens


Project Name: shibing624/pycorrector
Commit Name: 4e144c9f842d7415d8be5bdbb5912d88ae32cced
Time: 2018-04-16
Author: 507153809@qq.com
File Name: pycorrector/seq2seq/corpus_reader.py
Class Name: CGEDReader
Method Name: read_samples_by_string