f819a34e2fbea2dab4997b3b236b517fa12d115d,examples/03_midwest_survey.py,,,#,110

Before Change


import matplotlib.pyplot as plt

f, ax = plt.subplots()
ax.boxplot(all_scores, vert=False)
ax.set_yticklabels(["one-hot\nencoding", "similarity\nencoding"])
###############################################################################
# We can see that encoding the data using a SimilarityEncoder instead of
# OneHotEncoder helps a lot in improving the cross validation score!

After Change


    pipeline = make_pipeline(method)
    # Now predict the census region of each participant
    scores = cross_val_score(pipeline, df, y, cv=cv)
    all_scores[method] = scores

    print("%s encoding" % method)
    print("Accuracy score:  mean: %.3f; std: %.3f\n"
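The After Change snippet stores one array of cross-validation scores per encoding method and then reports a mean and a standard deviation. A minimal, self-contained sketch of that reporting step follows. It assumes `all_scores` is a dict mapping a method name to a list of per-fold accuracies. The helper `summarize_scores`, the sample numbers, and the use of the population standard deviation (`statistics.pstdev`) are illustrative assumptions, not taken from the commit.

```python
from statistics import mean, pstdev


def summarize_scores(all_scores):
    """Return {method: (mean, std)} for a dict of per-fold scores.

    Hypothetical helper for illustration; the original example prints
    the summary inline instead.
    """
    return {
        method: (mean(scores), pstdev(scores))
        for method, scores in all_scores.items()
    }


# Made-up per-fold accuracies, for illustration only (not real results).
all_scores = {
    "one-hot": [0.50, 0.60, 0.70],
    "similarity": [0.60, 0.70, 0.80],
}

summary = summarize_scores(all_scores)
for method, (avg, std) in summary.items():
    print("%s encoding" % method)
    print("Accuracy score:  mean: %.3f; std: %.3f\n" % (avg, std))

# Deriving the tick labels from the dict keys keeps them in the same
# order as the score lists that would be passed to a box plot.
labels = ["%s\nencoding" % method for method in all_scores]
```

Building the labels from the same dict that holds the scores avoids a mismatch between the hard-coded tick labels and the order of the plotted boxes.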
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: dirty-cat/dirty_cat
Commit Name: f819a34e2fbea2dab4997b3b236b517fa12d115d
Time: 2018-06-08
Author: gael.varoquaux@normalesup.org
File Name: examples/03_midwest_survey.py
Class Name:
Method Name:


Project Name: dirty-cat/dirty_cat
Commit Name: f819a34e2fbea2dab4997b3b236b517fa12d115d
Time: 2018-06-08
Author: gael.varoquaux@normalesup.org
File Name: examples/02_predict_employee_salaries.py
Class Name:
Method Name:


Project Name: ellisdg/3DUnetCNN
Commit Name: d194d8abd924932caab53d6e858918a84f3e5b64
Time: 2017-12-18
Author: david.ellis@unmc.edu
File Name: brats/evaluate.py
Class Name:
Method Name: main