d1196006be574c16473df6efed448f9fa308a680,tests/test_preprocess.py,,test_tmpreproc_en_lemmatize,#Any#,455
Before Change
def test_tmpreproc_en_lemmatize(tmpreproc_en):
    tokens = tmpreproc_en.tokenize().tokens
    lemmata = tmpreproc_en.pos_tag().lemmatize().tokens
    assert set(tokens.keys()) == set(lemmata.keys())
After Change
dt_ = lemmata[dl]
assert len(dt) == len(dt_)
assert len(tmpreproc_en.vocabulary) < len(vocab)
_check_save_load_state(tmpreproc_en)
In pattern: SUPERPATTERN
Frequency: 7
Non-data size: 3
Instances
Project Name: WZBSocialScienceCenter/tmtoolkit
Commit Name: d1196006be574c16473df6efed448f9fa308a680
Time: 2019-03-06
Author: markus.konrad@wzb.eu
File Name: tests/test_preprocess.py
Class Name:
Method Name: test_tmpreproc_en_lemmatize
Project Name: WZBSocialScienceCenter/tmtoolkit
Commit Name: bbca1fca586636e0bf90336893937956c9962c7d
Time: 2019-03-07
Author: markus.konrad@wzb.eu
File Name: tests/test_preprocess.py
Class Name:
Method Name: test_tmpreproc_en_clean_tokens
Project Name: WZBSocialScienceCenter/tmtoolkit
Commit Name: 1273d579c5ce666aaf8ff20942ce83681bf5eb06
Time: 2019-03-13
Author: markus.konrad@wzb.eu
File Name: tests/test_preprocess.py
Class Name:
Method Name: test_tmpreproc_de_lemmatize
Project Name: WZBSocialScienceCenter/tmtoolkit
Commit Name: 1070ee6fe00f2a3b03273e6a6dbf5625ab4dffc7
Time: 2019-03-12
Author: markus.konrad@wzb.eu
File Name: tests/test_preprocess.py
Class Name:
Method Name: test_tmpreproc_en_get_dtm
Project Name: WZBSocialScienceCenter/tmtoolkit
Commit Name: ab1359b176f8ac95ac443735395d8a316be2df16
Time: 2019-03-06
Author: markus.konrad@wzb.eu
File Name: tests/test_preprocess.py
Class Name:
Method Name: test_tmpreproc_en_vocabulary
Project Name: WZBSocialScienceCenter/tmtoolkit
Commit Name: 1273d579c5ce666aaf8ff20942ce83681bf5eb06
Time: 2019-03-13
Author: markus.konrad@wzb.eu
File Name: tests/test_preprocess.py
Class Name:
Method Name: test_tmpreproc_de_tokenize
Project Name: WZBSocialScienceCenter/tmtoolkit
Commit Name: bbca1fca586636e0bf90336893937956c9962c7d
Time: 2019-03-07
Author: markus.konrad@wzb.eu
File Name: tests/test_preprocess.py
Class Name:
Method Name: test_tmpreproc_en_remove_special_chars_in_tokens