doenut.data.modifiers.duplicate_remover
Module Contents
Classes
DuplicateRemover: Parses a dataset and removes all but the first instance of any row that has duplicate values for the inputs.
- class doenut.data.modifiers.duplicate_remover.DuplicateRemover(inputs: pandas.DataFrame, responses: pandas.DataFrame)
Bases: doenut.data.modifiers.data_set_modifier.DataSetModifier
Parses a dataset and removes all but the first instance of any row that has duplicate values for the inputs. The corresponding row in the responses is also removed.
- Parameters:
inputs (pd.DataFrame) – The dataset’s inputs
responses (pd.DataFrame) – The dataset’s responses
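The behaviour described above can be illustrated with plain pandas. This is only a sketch of the documented effect, not DuplicateRemover's actual implementation: the column names and values are made up, and it assumes that inputs and responses share the same row index.

```python
import pandas as pd

# Hypothetical example data: rows 0 and 2 have identical input values.
inputs = pd.DataFrame(
    {"temperature": [20, 30, 20, 40], "pressure": [1, 2, 1, 2]}
)
responses = pd.DataFrame({"yield": [0.51, 0.62, 0.55, 0.70]})

# Flag every row whose inputs repeat an earlier row; the first occurrence
# is not flagged, so it is the one that survives.
is_duplicate = inputs.duplicated(keep="first")

# Apply the same mask to both frames so inputs and responses stay aligned.
deduplicated_inputs = inputs.loc[~is_duplicate]
deduplicated_responses = responses.loc[~is_duplicate]

print(deduplicated_inputs.index.tolist())  # [0, 1, 3]
```

Row 2 is dropped from both frames because its inputs match row 0, even though its response value differs.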
- classmethod _get_non_duplicate_rows(data: pandas.DataFrame, duplicates_dict: Dict[int, Iterable[int]] = None) → List[int]