6367318f455aa8c27b6341c9b98794351dfd168e,examples/pytorch/pinsage/data_utils.py,,train_test_split_by_time,#Any#Any#Any#,9

Before Change


    df["test_mask"] = np.zeros((len(df),), dtype=np.bool)
    df = df.sort_values([item, timestamp])
    for track_id in df[item].unique():
        idx = (df[item] == track_id).to_numpy().nonzero()[0]
        idx = df.index[idx]
        if len(idx) > 1:
            df.loc[idx[-1], "train_mask"] = False
            df.loc[idx[-1], "test_mask"] = True
        if len(idx) > 2:

After Change


            df.iloc[-2, -2] = True
        return df
    df = df.groupby(item).apply(train_test_split).compute(scheduler="processes").sort_index()
    print(df[df[item] == df[item].unique()[0]].sort_values(timestamp))
    return df["train_mask"].to_numpy().nonzero()[0], \
           df["val_mask"].to_numpy().nonzero()[0], \
           df["test_mask"].to_numpy().nonzero()[0]
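
The two fragments above are cut off, so the full split rule is not visible in this record. The following is a minimal pandas-only sketch of the per-item, time-ordered split they appear to implement. It is not the code from the commit. It assumes that the latest interaction of an item goes to the test set when the item has more than one row (as in the "Before Change" lines), and that the second-latest goes to the validation set when the item has more than two rows (inferred from the truncated `if len(idx) > 2:` branch and the `val_mask` in the return statement). It also assumes a unique DataFrame index, and the sample column names `track_id` and `ts` are hypothetical.

```python
import pandas as pd


def train_test_split_by_time(df, timestamp, item):
    """Return positional indices of the train, validation and test rows."""
    # Order rows by item, then by time, so "latest" is well defined per item.
    ordered = df.sort_values([item, timestamp])
    grouped = ordered.groupby(item)

    # 0 for the latest row of each item, 1 for the second latest, and so on.
    rank_from_end = grouped.cumcount(ascending=False)
    # Number of rows belonging to the same item, repeated on every row.
    group_size = grouped[item].transform("size")

    # Assumed rule: hold out the latest row only if the item has > 1 rows,
    # and the second-latest row only if the item has > 2 rows.
    test_mask = ((rank_from_end == 0) & (group_size > 1)).reindex(df.index)
    val_mask = ((rank_from_end == 1) & (group_size > 2)).reindex(df.index)
    train_mask = ~(test_mask | val_mask)

    return (train_mask.to_numpy().nonzero()[0],
            val_mask.to_numpy().nonzero()[0],
            test_mask.to_numpy().nonzero()[0])


# Hypothetical sample data: item "a" has 3 rows, "b" has 2, "c" has 1.
sample = pd.DataFrame({
    "track_id": ["a", "a", "a", "b", "b", "c"],
    "ts": [3, 1, 2, 5, 4, 7],
})
train_idx, val_idx, test_idx = train_test_split_by_time(sample, "ts", "track_id")
```

Computing the masks with `cumcount` and `transform` avoids both the per-item Python loop of the old version and the `groupby(...).apply(...)` call of the new one, at the cost of departing from how the commit itself is structured.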
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: dmlc/dgl
Commit Name: 6367318f455aa8c27b6341c9b98794351dfd168e
Time: 2020-08-17
Author: coin2028@hotmail.com
File Name: examples/pytorch/pinsage/data_utils.py
Class Name:
Method Name: train_test_split_by_time


Project Name: deepchem/deepchem
Commit Name: d62d3a866dcb8f937d6aa5e869c309cee8a784ce
Time: 2016-07-11
Author: bharath.ramsundar@gmail.com
File Name: deepchem/models/multitask.py
Class Name: SingletaskToMultitask
Method Name: fit


Project Name: deepchem/deepchem
Commit Name: 2109dc81a70406c98305256587ab57e64135b96a
Time: 2016-07-29
Author: apappu97@gmail.com
File Name: deepchem/splits/tests/test_splitter.py
Class Name: TestSplitters
Method Name: test_stratified_multitask_split