bd334ef20fdccb74d310ca00b1134388645ba0a5,vendor/readability/encoding.py,,get_encoding,#Any#,4

Before Change


    if not text.strip() or len(text) < 10:
        return enc  # can't guess
    try:
        diff = text.decode(enc, "ignore").encode(enc)
        sizes = len(diff), len(text)
        if abs(len(text) - len(diff)) < max(sizes) * 0.01:  # 99% of utf-8
            return enc
    except UnicodeDecodeError:
        pass

After Change


            xml_re.findall(page))

    # Try any declared encodings
    if len(declared_encodings) > 0:
        for declared_encoding in declared_encodings:
            try:
                page.decode(custom_decode(declared_encoding))
                return custom_decode(declared_encoding)
            except UnicodeDecodeError:
                pass

    # Fallback to chardet if declared encodings fail
    text = re.sub("</?[^>]*>\s*", " ", page)
    enc = "utf-8"
    if not text.strip() or len(text) < 10:
        return enc  # can't guess
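
The "After Change" snippet tries each encoding the page declares before it falls back to guessing. Below is a minimal, self-contained Python 3 sketch of that ordering. It rests on assumptions: the regexes `CHARSET_RE` and `XML_RE`, the function name `guess_encoding`, and the `default` parameter are hypothetical stand-ins, because the original patterns and the `custom_decode` helper are not visible in this excerpt. The chardet fallback is replaced by a plain default so the example needs no third-party package.

```python
import re

# Hypothetical patterns; the original regexes are not shown in the excerpt.
CHARSET_RE = re.compile(rb'<meta[^>]*?charset=["\']?([\w-]+)', re.I)
XML_RE = re.compile(rb'^<\?xml[^>]*?encoding=["\']([\w-]+)', re.I)


def guess_encoding(page: bytes, default: str = "utf-8") -> str:
    """Return the first declared encoding that decodes `page` without error."""
    declared = [
        name.decode("ascii")
        for name in CHARSET_RE.findall(page) + XML_RE.findall(page)
    ]
    for name in declared:
        try:
            page.decode(name)
            return name.lower()
        except (UnicodeDecodeError, LookupError):
            # LookupError covers encoding names Python does not recognise.
            continue
    # Assumption: a fixed default stands in for the chardet fallback.
    return default
```

Catching `LookupError` alongside `UnicodeDecodeError` is a choice made for this sketch: a page can declare an encoding name that Python has no codec for, and that case should fall through to the next candidate instead of raising.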
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 5

Instances


Project Name: samuelclay/NewsBlur
Commit Name: bd334ef20fdccb74d310ca00b1134388645ba0a5
Time: 2014-07-21
Author: samuel@ofbrooklyn.com
File Name: vendor/readability/encoding.py
Class Name:
Method Name: get_encoding


Project Name: shubhomoydas/ad_examples
Commit Name: f707bd92107953e6d8ba05d2dff3ee3133b2d805
Time: 2018-11-25
Author: smd.shubhomoydas@gmail.com
File Name: python/graph/simple_gcn.py
Class Name: SimpleGCN
Method Name: _fit


Project Name: tiberiu44/TTS-Cube
Commit Name: 9cf2bcdb24f23a17ec11e69b8885851771dfd3d8
Time: 2018-10-25
Author: boros@adobe.com
File Name: cube/models/vocoder.py
Class Name: BeeCoder
Method Name: learn