Learn
- lassobbn.learn.do_learn(df_path: str, nodes: List[str], seen: Dict[str, List[str]], ordering_map: Dict[str, List[str]], n_way=3, ignore_neg_gt=-0.1, ignore_pos_lt=0.1, n_regressions=10, solver='liblinear', penalty='l1', C=0.2, robust_threshold=0.9) → None
Recursively learns parents or robust independent variables associated with each variable.
- Parameters
df_path – CSV path.
nodes – List of variables.
seen – Dictionary storing processed/seen variables.
ordering_map – Ordering map.
n_way – Number of n-way interactions. Default is 3.
ignore_neg_gt – Threshold for ignoring negative coefficients. Default is -0.1.
ignore_pos_lt – Threshold for ignoring positive coefficients. Default is 0.1.
n_regressions – The number of regressions to do. Default is 10.
solver – Solver. Default is liblinear.
penalty – Penalty. Default is l1.
C – Inverse of regularization strength (smaller values regularize more strongly). Default is 0.2.
robust_threshold – Robustness threshold. Default is 0.9.
- Returns
None.
- lassobbn.learn.do_regression(X_cols: List[str], y_col: str, df: pandas.core.frame.DataFrame, solver='liblinear', penalty='l1', C=0.2) → sklearn.linear_model._logistic.LogisticRegression
Performs logistic regression.
- Parameters
X_cols – Independent variables.
y_col – Dependent variable.
df – Data frame.
solver – Solver. Default is liblinear.
penalty – Penalty. Default is l1.
C – Inverse of regularization strength (smaller values regularize more strongly). Default is 0.2.
- Returns
Logistic regression model.
- lassobbn.learn.do_robust_regression(X_cols: List[str], y_col: str, df_path: str, n_way=3, ignore_neg_gt=-0.1, ignore_pos_lt=0.1, n_regressions=10, solver='liblinear', penalty='l1', C=0.2, robust_threshold=0.9) → Dict[str, Union[str, List]]
Performs robust regression.
- Parameters
X_cols – List of independent variables.
y_col – Dependent variable.
df_path – Path of CSV file.
n_way – Number of n-way interactions. Default is 3.
ignore_neg_gt – Threshold for ignoring negative coefficients. Default is -0.1.
ignore_pos_lt – Threshold for ignoring positive coefficients. Default is 0.1.
n_regressions – The number of regressions to do. Default is 10.
solver – Solver. Default is liblinear.
penalty – Penalty. Default is l1.
C – Inverse of regularization strength (smaller values regularize more strongly). Default is 0.2.
robust_threshold – Robustness threshold. Default is 0.9.
- Returns
A dictionary storing parents of a child. The parents are said to be robust.
- lassobbn.learn.expand_data(df_path: str, parents: Dict[str, List[str]]) → pandas.core.frame.DataFrame
Expands data with additional columns defined by parent-child relationships.
- Parameters
df_path – CSV path.
parents – Parent-child relationships.
- Returns
Data frame.
- lassobbn.learn.extract_meta(meta_path: str) → Tuple[Dict[str, List[str]], List[str]]
Extracts metadata.
- Parameters
meta_path – Metadata path (JSON file).
- Returns
Tuple; (ordering map, start nodes).
- lassobbn.learn.extract_model_params(independent_cols: List[str], y_col: str, model: sklearn.linear_model._logistic.LogisticRegression) → Dict[str, Union[str, float]]
Extracts parameters from models (e.g. coefficients).
- Parameters
independent_cols – List of independent variables.
y_col – Dependent variable.
model – Logistic regression model.
- Returns
Parameters (e.g. coefficients of each independent variable).
- lassobbn.learn.get_data(df_path: str, X_cols: List[str], y_col: str, n_way=3) → pandas.core.frame.DataFrame
Gets a data frame with additional columns representing the n-way interactions.
- Parameters
df_path – Path to CSV file.
X_cols – List of variables.
y_col – The dependent variable.
n_way – Number of n-way interactions. Default is 3.
- Returns
Data frame.
- lassobbn.learn.get_graph(parents: Dict[str, List[str]]) → networkx.classes.digraph.DiGraph
Gets a graph (nx.DiGraph).
- Parameters
parents – Dictionary; keys are children, values are lists of parents.
- Returns
Graph.
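As a rough illustration of the parents dictionary described above, the following sketch turns a child-to-parents mapping into a list of directed edges. It is not the library's code: `parent_edges_sketch` is a hypothetical helper, and the parent-to-child edge direction is an assumption based on the usual Bayesian network convention. An edge list of this shape is what `networkx.DiGraph.add_edges_from` accepts.

```python
from typing import Dict, List, Tuple


def parent_edges_sketch(parents: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Hypothetical helper: flatten {child: [parents]} into (parent, child) edges."""
    # Assumption: each edge points from a parent to its child.
    return [(pa, ch) for ch, pas in parents.items() for pa in pas]


# Illustrative variable names, not taken from any real data set.
edges = parent_edges_sketch({"c": ["a", "b"], "d": ["c"]})
print(edges)  # [('a', 'c'), ('b', 'c'), ('c', 'd')]
```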
- lassobbn.learn.get_n_way(X_cols: List[str], n_way=3) → List[Tuple[str, ...]]
Gets all interactions of up to n_way variables.
- Parameters
X_cols – List of variables.
n_way – Maximum n-way interactions. Default is 3.
- Returns
List of n-way interactions.
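One plausible way to enumerate such interactions is with `itertools.combinations`. The sketch below is an illustration only, not the library's implementation: `n_way_sketch` is a hypothetical name, and it assumes interactions start at pairs (2-way); whether the real function also returns single variables is not stated here.

```python
from itertools import combinations
from typing import List, Tuple


def n_way_sketch(x_cols: List[str], n_way: int = 3) -> List[Tuple[str, ...]]:
    """Hypothetical stand-in: all combinations of 2..n_way variables."""
    interactions: List[Tuple[str, ...]] = []
    # Assumption: sizes run from 2 up to n_way, capped at the number of variables.
    for size in range(2, min(n_way, len(x_cols)) + 1):
        interactions.extend(combinations(x_cols, size))
    return interactions


print(n_way_sketch(["a", "b", "c"], n_way=3))
# [('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'b', 'c')]
```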
- lassobbn.learn.get_ordering_map(meta: Dict[str, any]) → Dict[str, List[str]]
Gets a dictionary specifying the ordering. Each key is a variable; its value is the list of variables that come before it.
- Parameters
meta – Metadata.
- Returns
Ordering.
- lassobbn.learn.get_robust_stats(robust: pandas.core.frame.DataFrame, robust_threshold=0.9) → pandas.core.frame.DataFrame
Computes the robustness statistics.
- Parameters
robust – Data frame of robustness indicators.
robust_threshold – Threshold for robustness. Default is 0.9.
- Returns
Data frame of variables that are robust.
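The idea of a robustness threshold can be sketched without pandas: given one row of 1/0 indicators per regression, compute for each variable the fraction of regressions in which it was flagged, and keep the variables that reach the threshold. This is a sketch under assumptions, not the library's code: `robust_fraction_sketch` is a hypothetical name, plain dictionaries stand in for the data frame, and the use of `>=` (rather than a strict `>`) is a guess.

```python
from typing import Dict, List


def robust_fraction_sketch(
    indicators: List[Dict[str, int]], robust_threshold: float = 0.9
) -> Dict[str, float]:
    """Hypothetical stand-in: keep variables flagged robust often enough."""
    n = len(indicators)
    totals: Dict[str, int] = {}
    for row in indicators:
        for name, flag in row.items():
            totals[name] = totals.get(name, 0) + flag
    # Fraction of regressions in which each variable was flagged as robust.
    fractions = {name: total / n for name, total in totals.items()}
    # Assumption: a variable is kept when its fraction is >= the threshold.
    return {name: f for name, f in fractions.items() if f >= robust_threshold}


# Ten illustrative regressions: "a" flagged 10 times, "b" 9 times, "c" 5 times.
rows = [{"a": 1, "b": 1 if i < 9 else 0, "c": 1 if i < 5 else 0} for i in range(10)]
print(robust_fraction_sketch(rows))  # {'a': 1.0, 'b': 0.9}
```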
- lassobbn.learn.get_start_nodes(meta: Dict[str, any]) → List[str]
Gets a list of start variables/nodes to kick off the algorithm.
- Parameters
meta – Metadata.
- Returns
Start nodes.
- lassobbn.learn.learn_parameters(df_path: str, pas: Dict[str, List[str]]) → Tuple[Dict[str, List[str]], networkx.classes.digraph.DiGraph, Dict[str, List[float]]]
Learns the parameters (domains, graph, and probabilities) for a given structure.
- Parameters
df_path – CSV file.
pas – Parent-child relationships (structure).
- Returns
Tuple; first item is dictionary of domains; second item is a graph; third item is dictionary of probabilities.
- lassobbn.learn.learn_structure(df_path: str, meta_path: str, n_way=3, ignore_neg_gt=-0.1, ignore_pos_lt=0.1, n_regressions=10, solver='liblinear', penalty='l1', C=0.2, robust_threshold=0.9) → Dict[str, List[str]]
Kicks off the learning process.
- Parameters
df_path – CSV path.
meta_path – Metadata path.
n_way – Number of n-way interactions. Default is 3.
ignore_neg_gt – Threshold for ignoring negative coefficients. Default is -0.1.
ignore_pos_lt – Threshold for ignoring positive coefficients. Default is 0.1.
n_regressions – The number of regressions to do. Default is 10.
solver – Solver. Default is liblinear.
penalty – Penalty. Default is l1.
C – Inverse of regularization strength (smaller values regularize more strongly). Default is 0.2.
robust_threshold – Robustness threshold. Default is 0.9.
- Returns
Dictionary where keys are children and values are lists of parents.
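The functions on this page appear designed to chain together: structure, then parameters, then a BBN, then a join tree, then posteriors. The sketch below wires them up using only the signatures documented here. `run_pipeline_sketch` is a hypothetical wrapper, the CSV and JSON paths are supplied by the caller, and the import is done inside the function so the definition itself does not require lassobbn to be installed.

```python
def run_pipeline_sketch(df_path: str, meta_path: str):
    """Hypothetical wrapper chaining the documented functions end to end."""
    # Lazy import: only needed when the pipeline is actually run.
    from lassobbn.learn import (
        learn_parameters,
        learn_structure,
        posteriors_to_df,
        to_bbn,
        to_join_tree,
    )

    # Step 1: learn the structure (children mapped to lists of parents).
    parents = learn_structure(df_path, meta_path, n_way=3, robust_threshold=0.9)
    # Step 2: learn domains, graph, and probabilities for that structure.
    d, g, p = learn_parameters(df_path, parents)
    # Step 3: build the BBN and its join tree, then collect the posteriors.
    bbn = to_bbn(d, g, p)
    jt = to_join_tree(bbn)
    return posteriors_to_df(jt)
```

Calling `run_pipeline_sketch("data.csv", "meta.json")` with your own files (the file names here are placeholders) should return a data frame of posteriors, provided the package and its dependencies are installed.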
- lassobbn.learn.posteriors_to_df(jt: pybbn.graph.jointree.JoinTree) → pandas.core.frame.DataFrame
Converts posteriors to data frame.
- Parameters
jt – Join tree.
- Returns
Data frame.
- lassobbn.learn.to_bbn(d: Dict[str, List[str]], s: networkx.classes.digraph.DiGraph, p: Dict[str, List[float]]) → pybbn.graph.dag.Bbn
Converts the structure and parameters to a BBN.
- Parameters
d – Domain of each variable.
s – Structure.
p – Parameters (probabilities of each variable).
- Returns
BBN.
- lassobbn.learn.to_join_tree(bbn: pybbn.graph.dag.Bbn) → pybbn.graph.jointree.JoinTree
Converts a BBN to a Join Tree.
- Parameters
bbn – BBN.
- Returns
Join Tree.
- lassobbn.learn.to_robustness_indication(params: pandas.core.frame.DataFrame, ignore_neg_gt=-0.1, ignore_pos_lt=0.1) → pandas.core.frame.DataFrame
Checks if each coefficient value is “robust”. A coefficient is NOT robust if it is a negative value greater than ignore_neg_gt or a positive value less than ignore_pos_lt.
- Parameters
params – Data frame of parameters.
ignore_neg_gt – Threshold. Default is -0.1.
ignore_pos_lt – Threshold. Default is 0.1.
- Returns
Data frame (all 1’s and 0’s) indicating robustness.
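A minimal sketch of this thresholding, with plain dictionaries standing in for the data frame. It assumes, from the parameter names, that a coefficient counts as robust (1) when it is below `ignore_neg_gt` or above `ignore_pos_lt`, and as not robust (0) otherwise. `robustness_sketch` is a hypothetical name, and the library's exact comparison at the boundaries may differ.

```python
from typing import Dict, List


def robustness_sketch(
    rows: List[Dict[str, float]],
    ignore_neg_gt: float = -0.1,
    ignore_pos_lt: float = 0.1,
) -> List[Dict[str, int]]:
    """Hypothetical stand-in: map each coefficient to 1 (robust) or 0."""

    def is_robust(v: float) -> int:
        # Assumption: values strictly outside [ignore_neg_gt, ignore_pos_lt] are robust.
        return 1 if (v < ignore_neg_gt or v > ignore_pos_lt) else 0

    return [{name: is_robust(v) for name, v in row.items()} for row in rows]


# Two illustrative regressions with made-up coefficients.
coefs = [
    {"a": 0.35, "b": -0.02, "c": -0.4},
    {"a": 0.05, "b": 0.0, "c": -0.25},
]
print(robustness_sketch(coefs))
# [{'a': 1, 'b': 0, 'c': 1}, {'a': 0, 'b': 0, 'c': 1}]
```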
- lassobbn.learn.trim_parents(parents: List[str]) → List[str]
Prunes or trims down the list of parents. There might be duplicates as a result of compound or n-way interactions.
- Parameters
parents – List of parents.
- Returns
List of (pruned/trimmed) parents.
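To see why duplicates can arise, suppose an interaction is named by joining its variable names with a separator. That naming scheme, the `"!"` separator, and the name `trim_parents_sketch` are all assumptions made for illustration; the library's actual encoding is not documented here. Under that assumption, trimming amounts to splitting compound names and de-duplicating while preserving order:

```python
from typing import List


def trim_parents_sketch(parents: List[str], sep: str = "!") -> List[str]:
    """Hypothetical stand-in: split compound names and drop duplicates."""
    seen = set()
    trimmed: List[str] = []
    for name in parents:
        # Assumption: a compound parent such as "a!b" stands for variables a and b.
        for part in name.split(sep):
            if part not in seen:
                seen.add(part)
                trimmed.append(part)
    return trimmed


print(trim_parents_sketch(["a", "a!b", "b!c"]))  # ['a', 'b', 'c']
```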
- lassobbn.learn.trim_relationships(rels: Dict[str, List[str]]) → Dict[str, List[str]]
Trims/prunes parent-child relationships.
- Parameters
rels – Dictionary of parent-child relationships.
- Returns
Dictionary of trimmed parent-child relationships.