51bd98a75fc1fc01b79f30ffba96e9ed1837c68d,textacy/corpora/reddit_reader.py,RedditReader,__iter__,#RedditReader#,51
Before Change
try:
    with bzip_open(path, mode="rt") as f:
        for line in f:
            yield json.loads(line)
except ValueError:  # Python 2 sucks and can't open bzip in text mode
    with bzip_open(path, mode="rb") as f:
        for line in f:
            yield json.loads(line)
After Change
def __iter__(self):
    for path in self.paths:
        if PY2 is False:
            for json_line in read_json_lines(path, mode="rt"):
                yield json_line
        else:  # Python 2 can't open json in text mode
            for json_line in read_json_lines(path, mode="rb"):
                yield json_line

def _clean_content(self, content):
    # strip out link markup, e.g. [foo](http://foo.com)
    content = REDDIT_LINK_RE.sub(r"\1", content)
    # clean up basic HTML cruft
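
The After Change code calls a `read_json_lines` helper whose body is not shown in this record. Below is a minimal Python 3 sketch of what such a helper could look like, assuming the input is a bzip2-compressed file with one JSON object per line (as `bzip_open` in the Before Change code suggests). The function body, the blank-line skipping, and the `iter_records` wrapper are illustrative assumptions, not textacy's actual implementation.

```python
import bz2
import json
import sys

# Assumption: PY2 is a simple interpreter-version flag, as in the record.
PY2 = sys.version_info[0] == 2


def read_json_lines(path, mode="rt"):
    """Yield one decoded JSON object per non-blank line of a bzip2 file.

    Hypothetical stand-in for the helper named in the After Change code.
    Uses bz2.open, which exists only on Python 3.3+.
    """
    with bz2.open(path, mode=mode) as f:
        for line in f:
            # Skip blank lines; works for both str ("rt") and bytes ("rb").
            if line.strip():
                yield json.loads(line)


def iter_records(paths):
    """Mirror the refactored __iter__: choose the mode once per path."""
    for path in paths:
        # The "rb" branch only mirrors the record; this sketch runs on Python 3.
        mode = "rb" if PY2 else "rt"
        for json_line in read_json_lines(path, mode=mode):
            yield json_line
```

Moving the file handling into one helper leaves the caller with a single decision (the mode), which is the duplication the change removes from the `try`/`except` version.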
In pattern: SUPERPATTERN
Frequency: 3
Non-data size: 3
Instances
Project Name: chartbeat-labs/textacy
Commit Name: 51bd98a75fc1fc01b79f30ffba96e9ed1837c68d
Time: 2016-07-18
Author: burton@chartbeat.com
File Name: textacy/corpora/reddit_reader.py
Class Name: RedditReader
Method Name: __iter__
Project Name: streamlit/streamlit
Commit Name: bd163732d8c83ad9c643f319d648cccf6dbc185b
Time: 2018-06-18
Author: adrien.g.treuille@gmail.com
File Name: lib/streamlit/proxy/ClientWebSocket.py
Class Name: ClientWebSocket
Method Name: on_message
Project Name: chakki-works/doccano
Commit Name: 49d41416e440926f0a9a8243b4d77f6f5468efe9
Time: 2019-03-12
Author: light.tree.1.13@gmail.com
File Name: app/server/utils.py
Class Name: JsonHandler
Method Name: parse