FOCI module
FOCI
Class for computing FOCI.
Source code in xicorpy/foci.py
__init__(y, x)
Initialize and validate the FOCI object.
You can then use the select_features method to select features.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
y | npt.ArrayLike | A single list, 1D array, or pandas Series. | required |
x | npt.ArrayLike | A single list, list of lists, 1D/2D numpy array, pd.Series, or pd.DataFrame. | required |
Raises:
Type | Description |
---|---|
ValueError | If y is not 1d. |
ValueError | If x is not 1d or 2d. |
ValueError | If y and x have different lengths. |
ValueError | If there are <= 2 valid y values. |
Source code in xicorpy/foci.py
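The four documented ValueError conditions can be illustrated with a small standalone check. This is only a sketch, not the code in xicorpy/foci.py: the helper name check_foci_inputs is hypothetical, and it assumes that a "valid" y value means a non-NaN value and that the first axis of x indexes observations.

```python
import numpy as np


def check_foci_inputs(y, x):
    """Hypothetical helper mirroring the documented ValueError conditions."""
    y_arr = np.asarray(y, dtype=float)
    x_arr = np.asarray(x, dtype=float)

    if y_arr.ndim != 1:
        raise ValueError("y must be 1d")
    if x_arr.ndim not in (1, 2):
        raise ValueError("x must be 1d or 2d")
    # Assumption: rows of x (its first axis) correspond to entries of y.
    if x_arr.shape[0] != y_arr.shape[0]:
        raise ValueError("y and x must have the same length")
    # Assumption: "valid" is taken to mean "not NaN".
    if np.count_nonzero(~np.isnan(y_arr)) <= 2:
        raise ValueError("need more than 2 valid y values")

    return y_arr, x_arr
```

Under these assumptions, a y with four entries of which two are NaN is rejected, because only two valid values remain.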
select_features(num_features=None, init_selection=None, get_conditional_dependency=False)
Selects features using the Feature Ordering by Conditional Independence (FOCI) algorithm from Azadkia and Chatterjee (2021), "A simple measure of conditional dependence", Annals of Statistics.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_features | int | Maximum number of features to select. Defaults to the number of features in x. | None |
init_selection | list | Initial selection of features. | None |
get_conditional_dependency | bool | If True, also returns the conditional dependency measures. Defaults to False. | False |
Returns:
Name | Type | Description |
---|---|---|
list | Union[List[Union[int, str]], Tuple[List[Union[int, str]], List[float]]] | List of selected features. If x was |
list | Union[List[Union[int, str]], Tuple[List[Union[int, str]], List[float]]] | Conditional dependency measure recorded as each feature was selected. Returned only when get_conditional_dependency is True. |
Source code in xicorpy/foci.py
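Because the documented return type is a Union (a plain list, or a tuple of two lists when get_conditional_dependency is True), calling code may want to normalize the result. The helper below is a hypothetical convenience, not part of xicorpy; it relies only on the return shapes listed in the table above.

```python
from typing import List, Optional, Tuple, Union

Feature = Union[int, str]


def unpack_selection(result) -> Tuple[List[Feature], Optional[List[float]]]:
    """Hypothetical helper: normalize either documented return shape.

    Returns (features, dependencies), where dependencies is None when the
    result was a plain list of selected features.
    """
    if isinstance(result, tuple):
        # Shape returned when get_conditional_dependency is True.
        features, dependencies = result
        return list(features), list(dependencies)
    # Shape returned when get_conditional_dependency is False.
    return list(result), None
```

With this in place, downstream code can always unpack two values and check the second for None.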
select_features_using_foci(y, x, num_features=None, init_selection=None, get_conditional_dependency=False)
Implements the FOCI algorithm for feature selection.
Azadkia and Chatterjee (2021). "A simple measure of conditional dependence", Annals of Statistics. https://arxiv.org/abs/1910.12327.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
y | npt.ArrayLike | The dependent variable. A single list, 1D array, or pandas Series. | required |
x | npt.ArrayLike | The independent variables. A single list, list of lists, 1D/2D numpy array, pd.Series, or pd.DataFrame. | required |
num_features | int | Maximum number of features to select. Defaults to all features. | None |
init_selection | list | Initial selection of features. If | None |
get_conditional_dependency | bool | If True, also returns the conditional dependency measures. | False |
Returns:
Name | Type | Description |
---|---|---|
list | Union[List[Union[int, str]], Tuple[List[Union[int, str]], List[float]]] | List of selected features. If x was |
list | Union[List[Union[int, str]], Tuple[List[Union[int, str]], List[float]]] | Conditional dependency measure recorded as each feature was selected. Returned only when get_conditional_dependency is True. |
Raises:
Type | Description |
---|---|
ValueError | If y is not 1d. |
ValueError | If x is not 1d or 2d. |
ValueError | If y and x have different lengths. |
ValueError | If there are <= 2 valid y values. |
Source code in xicorpy/foci.py
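The greedy structure of FOCI, as described by Azadkia and Chatterjee, can be sketched independently of the dependence measure itself: at each step, add the candidate with the largest conditional dependence given the features already chosen, and stop once that value is no longer positive. The sketch below is not the xicorpy implementation. The function name and the score callable are hypothetical stand-ins (the real algorithm uses the paper's conditional dependence estimate), and treating num_features as a cap that includes init_selection is an assumption.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def greedy_forward_selection(
    candidates: Sequence[str],
    score: Callable[[str, List[str]], float],
    num_features: Optional[int] = None,
    init_selection: Optional[Sequence[str]] = None,
) -> Tuple[List[str], List[float]]:
    """Sketch of FOCI-style forward selection with a pluggable score.

    score(candidate, selected) is a hypothetical stand-in for the estimated
    conditional dependence of y on `candidate` given `selected`.
    """
    selected = list(init_selection or [])
    gains: List[float] = []
    # Assumption: the cap counts features supplied via init_selection.
    limit = len(candidates) if num_features is None else num_features

    while len(selected) < limit:
        remaining = [c for c in candidates if c not in selected]
        if not remaining:
            break
        best = max(remaining, key=lambda c: score(c, selected))
        best_score = score(best, selected)
        # Stopping rule: no remaining feature adds positive dependence.
        if best_score <= 0:
            break
        selected.append(best)
        gains.append(best_score)

    return selected, gains
```

The second return value plays the role of the conditional dependency list in the Returns table: one entry per feature added during the loop, in selection order.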