68c9bab4a5c6821c2a4395763c676ba796406c49,dask_ml/preprocessing/_encoders.py,OneHotEncoder,_transform,#OneHotEncoder#Any#Any#,220
Before Change
import dask.dataframe as dd
# Validate that all are categorical.
if not (X.dtypes == "category").all():
    raise ValueError("Must be all categorical.")
if not len(X.columns) == len(self.categories_):
    raise ValueError(
        "Number of columns ({}) does not match number "
        "of categories_ ({})".format(len(X.columns), len(self.categories_))
    )
After Change
for i, (col, dtype) in enumerate(zip(X.columns, self.dtypes_)):
    Xi = X.iloc[:, i]
    if not pd.api.types.is_categorical_dtype(Xi.dtype):
        Xi = Xi.astype(dtype)
        X[col] = Xi
    if Xi.dtype != dtype:
        raise ValueError(
            "Different CategoricalDtype for fit and transform. "
In pattern: SUPERPATTERN
Frequency: 3
Non-data size: 5
Instances
Project Name: dask/dask-ml
Commit Name: 68c9bab4a5c6821c2a4395763c676ba796406c49
Time: 2020-08-17
Author: TomAugspurger@users.noreply.github.com
File Name: dask_ml/preprocessing/_encoders.py
Class Name: OneHotEncoder
Method Name: _transform
Project Name: openai/gym
Commit Name: cee92691ad858952b4ed46c08cad6cc682868d22
Time: 2019-03-24
Author: zuoxingdong@users.noreply.github.com
File Name: gym/spaces/box.py
Class Name: Box
Method Name: __init__
Project Name: dask/dask-ml
Commit Name: 68c9bab4a5c6821c2a4395763c676ba796406c49
Time: 2020-08-17
Author: TomAugspurger@users.noreply.github.com
File Name: dask_ml/preprocessing/_encoders.py
Class Name: OneHotEncoder
Method Name: _fit