ea692ade194392250df2e4681764090868bdca82,horovod/spark/torch/estimator.py,TorchModel,_transform,#TorchModel#Any#,410

Before Change


        # Spark has to infer whether a field is nullable or not from a limited number of samples.
        # It does not always get it right. We copy the nullable boolean variable for the fields
        # from the original dataframe to the final DF schema.
        nullables = {field.name: field.nullable for field in df.schema.fields}
        for field in final_output_schema.fields:
            if field.name in nullables:
                field.nullable = nullables[field.name]
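
The nullable-copying step above can be sketched without a Spark installation. This is a minimal illustration, not the Horovod implementation: `Field` is a hypothetical stand-in for `pyspark.sql.types.StructField`, and `copy_nullables` and the column names are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Field:
    """Hypothetical stand-in for pyspark.sql.types.StructField (name and nullable only)."""
    name: str
    nullable: bool


def copy_nullables(source_fields: List[Field], target_fields: List[Field]) -> None:
    """Overwrite the nullable flag of each target field that also exists in the source."""
    nullables = {field.name: field.nullable for field in source_fields}
    for field in target_fields:
        if field.name in nullables:
            field.nullable = nullables[field.name]


# Schema of the original dataframe: both columns are declared non-nullable.
original = [Field("features", False), Field("label", False)]
# Schema inferred from a few samples: everything was guessed to be nullable.
inferred = [Field("features", True), Field("label", True), Field("label__output", True)]

copy_nullables(original, inferred)
flags = {f.name: f.nullable for f in inferred}
```

Fields that exist only in the inferred schema (here the hypothetical `label__output`) keep their inferred flag, since the original dataframe has nothing to say about them.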

After Change



        # append output schema
        override_fields = df.limit(1).rdd.mapPartitions(predict).toDF().schema.fields[-len(output_cols):]
        for name, override, label in zip(output_cols, override_fields, label_cols):
            # default data type as label type
            data_type = metadata[label]["spark_data_type"]()

            if type(override.dataType) == VectorUDT:
                # Override output to vector. This is mainly for torch's classification loss
                # where label is a scalar but model output is a vector.
                data_type = VectorUDT()
            final_output_fields.append(StructField(name=name, dataType=data_type, nullable=True))

        final_output_schema = StructType(final_output_fields)

        pred_rdd = df.rdd.mapPartitions(predict)
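
The type-selection logic in the changed code (default to the label's type, switch to a vector type when the sampled prediction is a vector) can be sketched in plain Python. Everything here is an assumption for illustration: `DoubleType` and `VectorUDT` are empty stand-in classes rather than the PySpark types, `OutputField` and `build_output_fields` are hypothetical names, and `isinstance` is used where the commit compares with `type(...) ==`.

```python
from typing import Dict, List, NamedTuple


class DoubleType:
    """Stand-in for a scalar Spark data type (assumption, not the PySpark class)."""


class VectorUDT:
    """Stand-in for the Spark ML vector type (assumption, not the PySpark class)."""


class OutputField(NamedTuple):
    """Hypothetical stand-in for StructField."""
    name: str
    data_type: object
    nullable: bool


def build_output_fields(output_cols: List[str], override_types: List[object],
                        label_cols: List[str], metadata: Dict[str, dict]) -> List[OutputField]:
    """Pick one data type per output column, always marking the field nullable."""
    fields = []
    for name, override, label in zip(output_cols, override_types, label_cols):
        # Default: the output column has the same type as its label column.
        data_type = metadata[label]["spark_data_type"]()
        if isinstance(override, VectorUDT):
            # The sampled prediction is a vector (e.g. class scores), so use a vector type.
            data_type = VectorUDT()
        fields.append(OutputField(name, data_type, True))
    return fields


metadata = {"y": {"spark_data_type": DoubleType}, "cls": {"spark_data_type": DoubleType}}
fields = build_output_fields(
    output_cols=["y__output", "cls__output"],
    override_types=[DoubleType(), VectorUDT()],  # types observed on a one-row sample
    label_cols=["y", "cls"],
    metadata=metadata,
)
```

In this sketch the regression-style column keeps the scalar label type, while the classification-style column is widened to a vector even though its label is a scalar.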

In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 7

Instances

Link

Project Name: horovod/horovod

Commit Name: ea692ade194392250df2e4681764090868bdca82

Time: 2021-02-04

Author: irasit@users.noreply.github.com

File Name: horovod/spark/torch/estimator.py

Class Name: TorchModel

Method Name: _transform

Link

Project Name: gboeing/osmnx

Commit Name: 313b79ce9cc8538a78edfc82ccc7b02c23766287

Time: 2020-10-20

Author: 44049940+Labulitiolle@users.noreply.github.com

File Name: osmnx/utils_graph.py

Class Name:

Method Name: graph_from_gdfs

Link

Project Name: GPflow/GPflow

Commit Name: 0b9e1f064ab1ce1d994f86686e7d662a46095e36

Time: 2020-03-30

Author: st--@users.noreply.github.com

File Name: doc/source/notebooks/advanced/mcmc.pct.py

Class Name:

Method Name: marginal_samples