The ππππππ R wrapper package makes efficient Rust implementations of graph adjustment identification distances (AID) available in R. These distances (based on ancestor, optimal, and parent adjustment) count how often the respective adjustment identification strategy leads to causal inferences that are incorrect relative to a ground-truth graph when applied to a candidate graph instead. There is also a ππππππ Python wrapper package.
If you publish research using ππππππ, please cite our UAI paper
@inproceedings{henckel2024adjustment,
    title = {{Adjustment Identification Distance: A gadjid for Causal Structure Learning}},
    author = {Leonard Henckel and Theo WΓΌrtzen and Sebastian Weichwald},
    booktitle = {{Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence (UAI)}},
    year = {2024},
    doi = {10.48550/arXiv.2402.08616},
} Evaluating graphs learned by causal discovery algorithms is difficult: The number of edges that differ between two graphs does not reflect how the graphs differ with respect to the identifying formulas they suggest for causal effects. We introduce a framework for developing causal distances between graphs which includes the structural intervention distance for directed acyclic graphs as a special case. We use this framework to develop improved adjustment-based distances as well as extensions to completed partially directed acyclic graphs and causal orders. We develop new reachability algorithms to compute the distances efficiently and to prove their low polynomial time complexity. In our package ππππππ, we provide implementations of our distances; they are orders of magnitude faster with proven lower time complexity than the structural intervention distance and thereby provide a success metric for causal discovery that scales to graph sizes that were previously prohibitive.
install.packages("gadjid")A source package install requires the rust toolchain to be installed.
Just install.packages("gadjid") to install the latest
release of ππππππ
library(gadjid)
?ancestor_aid
g_true <- rbind(c(0, 1, 1, 1),
                c(0, 0, 1, 1),
                c(0, 0, 0, 1),
                c(0, 0, 0, 0))
g_guess <- rbind(c(0, 1, 0, 0),
                 c(0, 0, 1, 0),
                 c(0, 0, 0, 1),
                 c(0, 0, 0, 0))
ancestor_aid(g_true, g_guess, edge_direction = "from row to column")
oset_aid(g_true, g_guess, edge_direction = "from row to column")
parent_aid(g_true, g_guess, edge_direction = "from row to column")
shd(g_true, g_guess)
sid(g_true, g_guess, edge_direction = "from row to column")ππππππ uses rayon
for parallelism using, per default, as many threads as there are
physical CPU cores. The number of threads to use can be set via the
environment variable RAYON_NUM_THREADS. We recommend to do
so and to set the number of threads manually, not least to be explicit
and to avoid the small runtime overhead for determining the number of
physical CPU cores.
ancestor_aid(g_true, g_guess, edge_direction)oset_aid(g_true, g_guess, edge_direction)parent_aid(g_true, g_guess, edge_direction)shd(g_true, g_guess)sid(g_true, g_guess, edge_direction) β only for
DAGs!where g_true and g_guess are adjacency
matrices of a DAG or CPDAG and edge_direction determines
whether a 1 at r-th row and c-th column of an adjacency
matrix codes the edge r β c
(edge_direction="from row to column") or c β r
(edge_direction="from column to row"); see the
documentation pages, such as ?ancestor_aid, for more
information. The functions are not symmetric in their inputs: To
calculate a distance, identifying formulas for causal effects are
inferred in the graph g_guess and verified against the
graph g_true. Distances return a 2-element vector
c(normalised_distance, mistake_count) of the fraction of
causal effects inferred in g_guess that are wrong relative to g_true,
normalised_distance, and the number of wrongly inferred
causal effects, mistake_count. There are \(p(p-1)\) pairwise causal effects to infer
in graphs with \(p\) nodes and we
define normalisation as
normalised_distance = mistake_count / p(p-1).
You may also calculate the SID between DAGs via
parent_aid(DAG_true, DAG_guess, edge_direction), but we
recommend ancestor_aid and oset_aid and for
CPDAG inputs the parent_aid does not coincide with the SID
(see also our UAI
paper).
ππππππ is available in source code form at https://github.com/CausalDisco/gadjid.
This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at https://mozilla.org/MPL/2.0/.
See also the MPL-2.0 FAQ.