1b43ea3d9f7db02075536d9578891af06e324b9a,scattertext/WhitespaceNLP.py,,whitespace_nlp_with_sentences,#Any#Any#Any#,111

Before Change


		toks = []
		for tok in re.split(r"(\W)", sent):
			pos = "WORD"
			if tok.strip() == "":
				pos = "SPACE"
			elif re.match("^\W+$", tok):
				pos = "PUNCT"
			toks.append(Tok(pos,
			                tok[:2].lower(),
			                tok.lower(),
			                ent_type="" if entity_type is None else entity_type.get(tok, ""),

After Change


	pat = re.compile(r"([^\.!?]*?[\.!?$])", re.M)
	sents = []
	raw_sents = pat.findall(doc)
	if len(raw_sents) == 0:
		raw_sents = [doc]
	for sent in raw_sents:
		toks = []
		for tok in re.split(r"(\W)", sent):
			if len(tok) > 0:
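
The snippet above is truncated, so the following is only a minimal sketch of the two steps it shows: sentence splitting with the compiled pattern, and tokenization with the new empty-token guard. The regular expressions are copied from the snippet. The names `SENTENCE_PAT`, `split_sentences`, and `tokenize` are hypothetical and do not come from the commit.

```python
import re

# Pattern copied from the "After Change" snippet. Inside a character class
# "$" is a literal dollar sign, not an end-of-line anchor.
SENTENCE_PAT = re.compile(r"([^\.!?]*?[\.!?$])", re.M)


def split_sentences(doc):
    """Hypothetical helper: split doc into raw sentences, as in the snippet."""
    raw_sents = SENTENCE_PAT.findall(doc)
    if len(raw_sents) == 0:
        # No terminator found, so treat the whole document as one sentence.
        raw_sents = [doc]
    return raw_sents


def tokenize(sent):
    """Hypothetical helper: split on non-word characters, dropping empty tokens.

    re.split with a capturing group keeps the separators and yields an empty
    string between two adjacent separators; the len(tok) > 0 guard skips those.
    """
    return [tok for tok in re.split(r"(\W)", sent) if len(tok) > 0]


if __name__ == "__main__":
    for sent in split_sentences("Hello world. How are you?"):
        print(repr(sent), tokenize(sent))
```

Without the guard, the empty strings produced by `re.split` would reach the `tok.strip() == ""` branch of the "Before Change" code and be tagged as `SPACE` tokens.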
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 7

Instances


Project Name: JasonKessler/scattertext
Commit Name: 1b43ea3d9f7db02075536d9578891af06e324b9a
Time: 2017-12-04
Author: jason.kessler@gmail.com
File Name: scattertext/WhitespaceNLP.py
Class Name:
Method Name: whitespace_nlp_with_sentences


Project Name: thenetcircle/dino
Commit Name: f509a520f72375acc486c1d9d22c8a2ac9fa1e8b
Time: 2016-12-26
Author: oscar.eriks@gmail.com
File Name: dino/api.py
Class Name:
Method Name: on_message


Project Name: biotite-dev/biotite
Commit Name: 80e52e88aa0e975d9dfd74887108ed11e5cf9d38
Time: 2018-02-01
Author: patrick.kunzm@gmail.com
File Name: src/biotite/sequence/io/genbank/file.py
Class Name: GenBankFile
Method Name: read