Workflow module

ConditionalDependence

Class containing methods for calculating conditional dependence.

__init__(y, z)

Initialize and validate a ConditionalDependence object.

You can then pass any x values to compute_conditional_dependence and compute_conditional_dependence_1d.

Parameters:

Name Type Description Default
y npt.ArrayLike

A list, 1D array, or pandas Series.

required
z npt.ArrayLike

A list, list of lists, 1D/2D numpy array, pd.Series, or pd.DataFrame.

required

Raises:

Type Description
ValueError

If y is not 1d.

ValueError

If z is not 1d or 2d.

ValueError

If y and z have different lengths.

ValueError

If there are <= 2 valid y values.
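The checks listed above can be sketched with plain NumPy. This is only an illustration, not the library's code: the helper name validate_y_z is hypothetical, and the reading of "valid" as "not NaN" is an assumption.

```python
import numpy as np


def validate_y_z(y, z):
    """Hypothetical helper mirroring the documented ValueError conditions."""
    y_arr = np.asarray(y, dtype=float)
    z_arr = np.asarray(z, dtype=float)

    if y_arr.ndim != 1:
        raise ValueError("y must be 1d")
    if z_arr.ndim not in (1, 2):
        raise ValueError("z must be 1d or 2d")
    if y_arr.shape[0] != z_arr.shape[0]:
        raise ValueError("y and z must have the same length")

    # Assumption: a "valid" y value is one that is not NaN.
    if np.count_nonzero(~np.isnan(y_arr)) <= 2:
        raise ValueError("need more than 2 valid y values")

    # Promote a 1D z to a single-column 2D array so later code sees one shape.
    return y_arr, z_arr.reshape(len(z_arr), -1)


y_checked, z_checked = validate_y_z([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(z_checked.shape)
```

Returning z as a 2D array is a design choice of this sketch; the class may store its inputs differently.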

compute_conditional_dependence(x=None)

Compute the conditional dependence coefficient of Azadkia and Chatterjee (2021), "A simple measure of conditional dependence", Annals of Statistics.

If X is passed, computes T(Y, Z|X) where T is the conditional dependence coefficient. Otherwise, computes T(Y, Z).

The conditional dependence coefficient lies between 0 and 1, and is

0 if Y is conditionally independent of Z given X
1 if Y is almost surely a measurable function of Z given X
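For intuition about the case without X, the following is a minimal sketch of the rank and nearest-neighbor estimator of T(Y, Z) from the cited paper. It is not this class's implementation: the function names are hypothetical, the neighbor search is brute force, and ties are not broken at random as the paper prescribes.

```python
import numpy as np


def nearest_neighbor_index(points):
    """Index of each row's nearest other row (brute force, O(n^2) memory)."""
    pts = np.asarray(points, dtype=float)
    pts = pts.reshape(len(pts), -1)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    return dist.argmin(axis=1)


def unconditional_coefficient(y, z):
    """Sketch of the estimate of T(Y, Z); assumes no missing values."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    rank_le = (y[None, :] <= y[:, None]).sum(axis=1)  # R_i = #{j : y_j <= y_i}
    rank_ge = (y[None, :] >= y[:, None]).sum(axis=1)  # L_i = #{j : y_j >= y_i}
    m = nearest_neighbor_index(z)                     # M(i): neighbor of z_i
    numerator = np.sum(n * np.minimum(rank_le, rank_le[m]) - rank_ge ** 2)
    denominator = np.sum(rank_ge * (n - rank_ge))
    return numerator / denominator


z_demo = np.arange(100.0) ** 2  # unevenly spaced, so nearest neighbors are unique
t_function = unconditional_coefficient(z_demo, z_demo)  # y is a function of z
print(t_function)  # close to, but below, 1 on a finite sample
```

On independent data the same function returns values scattered around 0, and they can be slightly negative on finite samples.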

Parameters:

Name Type Description Default
x npt.ArrayLike

A list, list of lists, 1D/2D numpy array, pd.Series, or pd.DataFrame.

None

Returns:

Name Type Description
float

Conditional Dependence Coefficient.

Raises:

Type Description
ValueError

If x is passed and does not have the same number of rows as y.

compute_conditional_dependence_1d(x=None)

Computes conditional dependence of y on each column of z individually.

Use when you want to compute T(Y, Z_j|X) for each column of Z.
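The shape of the returned dict can be illustrated with a small stand-alone sketch. The names per_column_scores and abs_pearson are hypothetical, conditioning on x is left out, and abs_pearson is only a stand-in score, not the conditional dependence coefficient.

```python
import numpy as np


def per_column_scores(y, z, score_fn):
    """Apply score_fn(y, column) to each column of z and collect a dict."""
    if hasattr(z, "columns"):
        # DataFrame-like input: key the result by column name.
        return {name: float(score_fn(y, np.asarray(z[name]))) for name in z.columns}
    z2d = np.asarray(z, dtype=float)
    z2d = z2d.reshape(len(z2d), -1)  # a 1D z becomes a single column
    # Array-like input: key the result by integer column index.
    return {j: float(score_fn(y, z2d[:, j])) for j in range(z2d.shape[1])}


def abs_pearson(y, column):
    """Stand-in score for illustration only."""
    return abs(np.corrcoef(y, column)[0, 1])


scores = per_column_scores(
    [1.0, 2.0, 3.0, 4.0],
    [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]],
    abs_pearson,
)
print(scores)
```

Passing the score as a callable keeps the sketch independent of any particular coefficient implementation.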

Parameters:

Name Type Description Default
x npt.ArrayLike

A list, list of lists, 1D/2D numpy array, pd.Series, or pd.DataFrame.

None

Returns:

Name Type Description
dict

Keys are column names of z (or column indices if z is not a pandas object), and values are conditional dependence coefficients.

Raises:

Type Description
ValueError

If x is passed and does not have the same number of rows as y.

compute_conditional_dependence(y, z, x=None)

Compute the conditional dependence coefficient of Azadkia and Chatterjee (2021), "A simple measure of conditional dependence", Annals of Statistics.

If X is passed, computes T(Y, Z|X) where T is the conditional dependence coefficient. Otherwise, computes T(Y, Z).

The conditional dependence coefficient lies between 0 and 1, and is

0 if Y is conditionally independent of Z given X
1 if Y is almost surely a measurable function of Z given X
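When x is supplied, the estimator in the cited paper compares each point's nearest neighbor in X alone with its nearest neighbor in (X, Z) jointly. The following is a rough sketch of that estimate of T(Y, Z|X) with hypothetical function names. It uses brute-force neighbor search and no random tie-breaking, and it is not presented as this function's actual code.

```python
import numpy as np


def _as_2d(a):
    a = np.asarray(a, dtype=float)
    return a.reshape(len(a), -1)


def _nearest_neighbor(points):
    """Index of each row's nearest other row (brute force)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)
    return dist.argmin(axis=1)


def conditional_coefficient(y, z, x):
    """Sketch of the estimate of T(Y, Z | X); assumes no missing values."""
    y = np.asarray(y, dtype=float)
    x2d, z2d = _as_2d(x), _as_2d(z)
    rank = (y[None, :] <= y[:, None]).sum(axis=1)     # R_i = #{j : y_j <= y_i}
    n_idx = _nearest_neighbor(x2d)                    # N(i): neighbor in X only
    m_idx = _nearest_neighbor(np.hstack([x2d, z2d]))  # M(i): neighbor in (X, Z)
    min_n = np.minimum(rank, rank[n_idx])
    min_m = np.minimum(rank, rank[m_idx])
    return np.sum(min_m - min_n) / np.sum(rank - min_n)


rng = np.random.default_rng(0)
x_demo = rng.normal(size=1000)
z_demo = rng.normal(size=1000)
t_dependent = conditional_coefficient(x_demo + z_demo, z_demo, x_demo)
t_independent = conditional_coefficient(x_demo + rng.normal(size=1000), z_demo, x_demo)
print(t_dependent, t_independent)
```

In the first call y is determined by (x, z), so the estimate should be well above 0. In the second, z carries no information about y beyond x, so the estimate should be near 0.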

Parameters:

Name Type Description Default
y npt.ArrayLike

A list, 1D array, or pandas Series.

required
z npt.ArrayLike

A list, list of lists, 1D/2D numpy array, pd.Series, or pd.DataFrame.

required
x npt.ArrayLike

A list, list of lists, 1D/2D numpy array, pd.Series, or pd.DataFrame.

None

Returns:

Name Type Description
float

Conditional Dependence Coefficient.

Raises:

Type Description
ValueError

If y is not 1d.

ValueError

If z is not 1d or 2d.

ValueError

If y and z have different lengths.

ValueError

If there are <= 2 valid y values.

ValueError

If x is passed and does not have the same number of rows as y.

compute_conditional_dependence_1d(y, z, x=None)

Computes conditional dependence of y on each column of z individually.

Use when you want to compute T(Y, Z_j|X) for each column of Z.

Parameters:

Name Type Description Default
y npt.ArrayLike

A list, 1D array, or pandas Series.

required
z npt.ArrayLike

A list, list of lists, 1D/2D numpy array, pd.Series, or pd.DataFrame.

required
x npt.ArrayLike

A list, list of lists, 1D/2D numpy array, pd.Series, or pd.DataFrame.

None

Returns:

Name Type Description
dict Dict[Union[str, int], float]

Keys are column names of z (or column indices if z is not a pandas object), and values are conditional dependence coefficients.

Raises:

Type Description
ValueError

If y is not 1d.

ValueError

If z is not 1d or 2d.

ValueError

If y and z have different lengths.

ValueError

If there are <= 2 valid y values.

ValueError

If x is passed, and does not have the same number of rows as y.