doenut.data.modifiers
DoENUT: data.modifiers
These classes provide ways to manipulate a dataset (filtering, scaling, etc.).
Package Contents
Classes
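OrthoScaler | Takes a dataset and scales it per column using an ortho scaling to the range -1 … 1
ColumnSelector | DataSet Modifier to remove columns from the dataset
DuplicateRemover | Parses a dataset and removes all but the first instance of any row that has duplicate values for the inputs
DuplicateAverager | Parses a dataset and removes all but the first instance of any row that has duplicate values for the inputs, averaging the duplicates' responses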
- class doenut.data.modifiers.OrthoScaler(inputs: pandas.DataFrame, responses: pandas.DataFrame, scale_responses: bool = False)
Bases: doenut.data.modifiers.data_set_modifier.DataSetModifier
Takes a dataset and scales it per column using an ortho scaling to the range -1 … 1.
- Parameters:
inputs (pd.DataFrame) – The dataset’s inputs
responses (pd.DataFrame) – The dataset’s responses
scale_responses (bool, default False) – Whether to also scale the responses.
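To make the idea of per-column scaling to the range -1 … 1 concrete, the sketch below uses plain pandas rather than OrthoScaler itself. The helper name `ortho_scale` and the midpoint/half-range formula are assumptions chosen for illustration; they show one common way to do this kind of scaling and are not taken from doenut's source.

```python
import pandas as pd


def ortho_scale(df: pd.DataFrame) -> pd.DataFrame:
    """Linearly map each column so its minimum becomes -1 and its maximum +1.

    Hypothetical helper for illustration only, not part of doenut.
    A constant column has a half-range of zero and would produce NaN.
    """
    col_min = df.min()
    col_max = df.max()
    midpoint = (col_max + col_min) / 2
    half_range = (col_max - col_min) / 2
    # pandas aligns the per-column Series with the DataFrame's columns
    return (df - midpoint) / half_range


# Made-up example data
inputs = pd.DataFrame({"temp": [20.0, 40.0, 60.0], "conc": [1.0, 2.0, 5.0]})
scaled = ortho_scale(inputs)
print(scaled)
```

Each column is scaled independently, so columns with very different units end up on the same -1 … 1 footing.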
- class doenut.data.modifiers.ColumnSelector(inputs: pandas.DataFrame, responses: pandas.DataFrame, input_selector: List[str | int] = None, response_selector: List[str | int] = None)
Bases: doenut.data.modifiers.data_set_modifier.DataSetModifier
DataSet Modifier to remove columns from the dataset.
- Parameters:
inputs (pd.DataFrame) – The dataset’s inputs
responses (pd.DataFrame) – The dataset’s responses
input_selector (List["str | int"], optional) – A list to filter the inputs by
response_selector (List["str | int"], optional) – A list to filter the responses by
Warning
At least one of input_selector and response_selector must be specified.
- classmethod _parse_selector(data: pandas.DataFrame, selector: List[str | int]) -> Tuple[List[str], List[int]]
Internal helper function to take either a list of column names or column indices and convert it to the other.
- Parameters:
data (pd.DataFrame) – The data set the list applies to
selector (List["str | int"]) – The known selector list
- Returns:
List[str] – The list of column names selected
List[int] – The list of column indices selected
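The name/index conversion described above can be sketched in plain pandas. The function `parse_selector` below is a hypothetical stand-in written for illustration. It is not doenut's `_parse_selector`, and it assumes the DataFrame's column labels are strings, so that integers in the selector can be read unambiguously as positions.

```python
from typing import List, Tuple, Union

import pandas as pd


def parse_selector(
    data: pd.DataFrame, selector: List[Union[str, int]]
) -> Tuple[List[str], List[int]]:
    """Return (column names, column positions) for the given selector.

    Hypothetical helper for illustration only, not part of doenut.
    """
    columns = list(data.columns)
    names: List[str] = []
    indices: List[int] = []
    for item in selector:
        if isinstance(item, str):
            # A name was given: look up its position
            names.append(item)
            indices.append(columns.index(item))
        else:
            # A position was given: look up its name
            names.append(columns[item])
            indices.append(item)
    return names, indices


# Made-up example data
inputs = pd.DataFrame({"temp": [20, 40], "conc": [1, 2], "time": [5, 10]})

names, indices = parse_selector(inputs, ["temp", "time"])
selected = inputs[names]  # keep only the selected columns

names_from_idx, _ = parse_selector(inputs, [1])
```

Here `names` and `indices` describe the same columns in both forms, so later code can use whichever is more convenient.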
- class doenut.data.modifiers.DuplicateRemover(inputs: pandas.DataFrame, responses: pandas.DataFrame)
Bases: doenut.data.modifiers.data_set_modifier.DataSetModifier
Parses a dataset and removes all but the first instance of any row that has duplicate values for the inputs. Also removes the corresponding rows in the responses.
- Parameters:
inputs (pd.DataFrame) – The dataset’s inputs
responses (pd.DataFrame) – The dataset’s responses
- classmethod _get_non_duplicate_rows(data: pandas.DataFrame, duplicates_dict: Dict[int, Iterable[int]] = None) -> List[int]
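The behaviour described for DuplicateRemover, keeping the first of each set of duplicated input rows and dropping the matching response rows, can be approximated in plain pandas as follows. This is a sketch of the idea using `DataFrame.duplicated`, with made-up data. It assumes inputs and responses share the same index, and it does not show how doenut implements the class internally.

```python
import pandas as pd

# Made-up example data: rows 0 and 2 have identical inputs
inputs = pd.DataFrame({"temp": [20, 40, 20, 60], "conc": [1, 2, 1, 3]})
responses = pd.DataFrame({"yield": [50.0, 70.0, 54.0, 90.0]})

# True for every row that is NOT a repeat of an earlier input row
keep = ~inputs.duplicated(keep="first")

# Apply the same mask to both frames so they stay aligned
dedup_inputs = inputs.loc[keep]
dedup_responses = responses.loc[keep]

print(dedup_inputs)
print(dedup_responses)
```

Only the response of the first occurrence survives, so the value 54.0 recorded for the repeated run is discarded.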
- class doenut.data.modifiers.DuplicateAverager(inputs: pandas.DataFrame, responses: pandas.DataFrame)
Bases: doenut.data.modifiers.duplicate_remover.DuplicateRemover
Parses a dataset and removes all but the first instance of any row that has duplicate values for the inputs. The corresponding rows in the responses are removed as well, and the remaining response is replaced with the average of the duplicates' values.
- Parameters:
inputs (pd.DataFrame) – The dataset’s inputs
responses (pd.DataFrame) – The dataset’s responses
- classmethod _apply(data: pandas.DataFrame, duplicate_dict: Dict[int, Iterable[int]], non_duplicate_rows: List[int]) -> pandas.DataFrame
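The averaging behaviour described for DuplicateAverager can be illustrated with a pandas `groupby`. The sketch below uses made-up data and assumes that inputs and responses share the same index and that the inputs contain no missing values. It shows the general technique, not doenut's internal implementation.

```python
import pandas as pd

# Made-up example data: rows 0 and 2 have identical inputs
inputs = pd.DataFrame({"temp": [20, 40, 20, 60], "conc": [1, 2, 1, 3]})
responses = pd.DataFrame({"yield": [50.0, 70.0, 54.0, 90.0]})

# Group the responses by the full input row and compute each group's mean,
# broadcast back onto the original rows
group_keys = [inputs[col] for col in inputs.columns]
mean_responses = responses.groupby(group_keys).transform("mean")

# Keep the first occurrence of each distinct input row
keep = ~inputs.duplicated(keep="first")
avg_inputs = inputs.loc[keep]
avg_responses = mean_responses.loc[keep]

print(avg_responses)
```

The duplicated inputs now carry the mean of their two responses (52.0) instead of only the first measurement.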