bd334ef20fdccb74d310ca00b1134388645ba0a5,vendor/readability/encoding.py,,get_encoding,#Any#,4

Before Change


    try:
        diff = text.decode(enc, "ignore").encode(enc)
        sizes = len(diff), len(text)
        if abs(len(text) - len(diff)) < max(sizes) * 0.01:  # 99% of utf-8
            return enc
    except UnicodeDecodeError:
        pass
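
The round-trip check above can be sketched as a standalone function. This is a minimal Python 3 sketch, not the project's code: the name `roundtrip_ok` and the `tolerance` parameter are assumptions introduced for illustration, and the input is taken to be `bytes` (the original snippet relies on Python 2 `str.decode`).

```python
def roundtrip_ok(data: bytes, enc: str, tolerance: float = 0.01) -> bool:
    """Return True if decoding with `enc` and re-encoding drops fewer
    than `tolerance` of the bytes (hypothetical helper)."""
    try:
        # Undecodable bytes are silently dropped by the "ignore" handler,
        # so the re-encoded result shrinks when `enc` is a poor fit.
        diff = data.decode(enc, "ignore").encode(enc)
    except (LookupError, UnicodeError):
        # Unknown codec name, or a codec that cannot round-trip the text.
        return False
    sizes = len(diff), len(data)
    return abs(len(data) - len(diff)) < max(sizes) * tolerance
```

With the "ignore" handler a `UnicodeDecodeError` is not normally raised during decoding, so the size comparison is what does the work here; the `except` clause mostly guards against an unknown codec name.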

After Change


    pragma_re = re.compile(r'<meta.*?content=["\']*;?charset=(.+?)["\'>]', flags=re.I)
    xml_re = re.compile(r'^<\?xml.*?encoding=["\']*(.+?)["\'>]')

    declared_encodings = (charset_re.findall(page) +
            pragma_re.findall(page) +
            xml_re.findall(page))

    # Try any declared encodings
    if len(declared_encodings) > 0:
        for declared_encoding in declared_encodings:
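
The excerpt stops before the loop body, so the following is only a sketch of how declared encodings might be tried, not the commit's actual code. Assumptions: the function name `declared_encoding` and its `default` parameter are hypothetical, `charset_re` is not shown in the excerpt and its definition here is a guess modeled on the other two patterns, and the page is handled as `bytes` (Python 3).

```python
import re

# charset_re is an assumed definition; pragma_re and xml_re follow the
# patterns in the snippet above, written as bytes patterns.
charset_re = re.compile(rb'<meta.*?charset=["\']*(.+?)["\'>]', flags=re.I)
pragma_re = re.compile(rb'<meta.*?content=["\']*;?charset=(.+?)["\'>]', flags=re.I)
xml_re = re.compile(rb'^<\?xml.*?encoding=["\']*(.+?)["\'>]')


def declared_encoding(page: bytes, default: str = "utf-8") -> str:
    """Return the first declared encoding that decodes `page` cleanly,
    falling back to `default` (hypothetical helper)."""
    declared = (charset_re.findall(page) +
                pragma_re.findall(page) +
                xml_re.findall(page))
    for raw in declared:
        name = raw.decode("ascii", "ignore").strip().lower()
        try:
            page.decode(name)  # strict decode: any bad byte rejects the name
        except (LookupError, ValueError):
            continue  # unknown codec name or undecodable content
        return name
    return default
```

Trying declarations in order and accepting the first one that decodes strictly is one possible policy; a stricter variant could fall through to a statistical detector instead of a fixed default.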
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: samuelclay/NewsBlur
Commit Name: bd334ef20fdccb74d310ca00b1134388645ba0a5
Time: 2014-07-21
Author: samuel@ofbrooklyn.com
File Name: vendor/readability/encoding.py
Class Name:
Method Name: get_encoding


Project Name: dPys/PyNets
Commit Name: 5aef6f1a69433fdf47a0fea132cfe0b713a5287f
Time: 2018-05-22
Author: dpisner@utexas.edu
File Name: pynets/utils.py
Class Name:
Method Name: compile_iterfields


Project Name: idaholab/raven
Commit Name: af0f15967d0e5a3743bcbf2ecea875991c8cda84
Time: 2017-10-30
Author: paul.talbot@inl.gov
File Name: framework/utils/xmlUtils.py
Class Name:
Method Name: findPath