doenut.data.modifiers.duplicate_averager

Module Contents

Classes

DuplicateAverager

Parses a dataset and removes all but the first instance of any row that has duplicate values for the inputs.

class doenut.data.modifiers.duplicate_averager.DuplicateAverager(inputs: pandas.DataFrame, responses: pandas.DataFrame)

Bases: doenut.data.modifiers.duplicate_remover.DuplicateRemover

Parses a dataset and removes all but the first instance of any row that has duplicate values for the inputs. The corresponding rows in the responses are removed as well, and the remaining response is replaced with the average of the duplicates’ values.

Parameters:
  • inputs (pd.DataFrame) – The dataset’s inputs

  • responses (pd.DataFrame) – The dataset’s responses
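
The behaviour described above can be illustrated with plain pandas. The following is a minimal sketch, not doenut's actual implementation: the column names, the sample values, and the variables `group_ids`, `deduped_inputs` and `averaged_responses` are all made up for illustration, and the library's internal bookkeeping may differ.

```python
import pandas as pd

# Hypothetical dataset: rows 0 and 2 share the same input values.
inputs = pd.DataFrame({"temp": [20, 30, 20, 40], "conc": [1.0, 2.0, 1.0, 3.0]})
responses = pd.DataFrame({"yield": [10.0, 50.0, 14.0, 70.0]})

# Keep only the first instance of each distinct input row.
deduped_inputs = inputs[~inputs.duplicated(keep="first")]

# Number each group of identical input rows in order of first appearance,
# then average the responses within each group.
group_ids = inputs.groupby(list(inputs.columns), sort=False).ngroup()
averaged_responses = responses.groupby(group_ids, sort=False).mean()

# Re-label the averaged rows with the index of the surviving input rows.
averaged_responses.index = deduped_inputs.index

print(deduped_inputs)
print(averaged_responses)
```

In this sketch the two runs at `temp=20, conc=1.0` collapse into a single input row, and its response becomes the mean of the two measured yields.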

classmethod _apply(data: pandas.DataFrame, duplicate_dict: Dict[int, Iterable[int]], non_duplicate_rows: List[int]) → pandas.DataFrame

apply_to_inputs(data: pandas.DataFrame) → pandas.DataFrame

Applies the modifier to the inputs of the dataset.

Parameters:

data (pd.DataFrame) – The input data

Returns:

The modified input data

Return type:

pd.DataFrame

apply_to_responses(data: pandas.DataFrame) → pandas.DataFrame

Applies the modifier to the responses of the dataset.

Parameters:

data (pd.DataFrame) – The response data

Returns:

The modified response data

Return type:

pd.DataFrame