Given a fixed dependency graph $G$ that describes a Bayesian network of binary variables $X_1, \dots, X_n$, our main result is a tight bound on the mutual information $I_c(Y_1, \dots, Y_k) = \sum_{j=1}^k H(Y_j)/c - H(Y_1, \dots, Y_k)$ of an observed subset $Y_1, \dots, Y_k$ of the variables $X_1, \dots, X_n$. Our bound depends on certain quantities that can be computed from the connective structure of the nodes in $G$. It therefore allows one to discriminate between different candidate dependency graphs for a given probability distribution, as we demonstrate in numerical experiments.
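Note that for $c=1$ and $k=2$ the quantity $I_c$ reduces to the ordinary mutual information $H(Y_1)+H(Y_2)-H(Y_1,Y_2)$. As a minimal illustration (not part of the paper's results), the following Python sketch computes $I_c$ directly from the definition above; it assumes the joint distribution of the observed variables is given as a NumPy array with one axis per variable, and the function names are ours.

\begin{verbatim}
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability array;
    zero entries are ignored by convention 0 log 0 = 0."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def I_c(joint, c=1.0):
    """I_c(Y_1, ..., Y_k) = sum_j H(Y_j)/c - H(Y_1, ..., Y_k),
    with the joint pmf given as a k-dimensional array."""
    k = joint.ndim
    joint_entropy = entropy(joint.ravel())
    # Marginal of Y_j: sum the joint pmf over all other axes.
    marginals = [
        entropy(joint.sum(axis=tuple(i for i in range(k) if i != j)))
        for j in range(k)
    ]
    return sum(marginals) / c - joint_entropy

# Example: two perfectly correlated fair binary variables.
p = np.array([[0.5, 0.0],
              [0.0, 0.5]])
print(I_c(p, c=1.0))  # H(Y1) + H(Y2) - H(Y1,Y2) = 1 + 1 - 1 = 1 bit
\end{verbatim}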
Bayesian networks, causal Markov condition, information theory, information inequalities, common ancestors, causal inference
60A08, 62B09