summary

scio.eval.summary(confs_ind, confs_oods, *, scores_and_layers=None, oods_title=None, metrics=None, topk=0, baseline=None, legend=True, convex_hull=False, show=True, block=None, **hist_kw)

Print evaluation table, plot and show histograms and ROCs.

Parameters:

topk (int) – Use to prune the summary. If metrics is provided and 0 < topk <= n_scores, only the scores achieving top topk performance for at least one OoD scenario and one metric are shown. See topk_evals() for more details and for the interaction with baseline, which is passed on to it. Defaults to 0, which shows all the results.
[...] – For the specification of the other arguments, refer to compute_metrics(), summary_table() and summary_plot().

Note

If metrics is not provided, no evaluation table is computed, in which case this is equivalent to a simpler summary_plot() call.

Tip

When evaluating many scores at once, we recommend combining the topk argument with several complementary metrics that together capture every behaviour of interest, such as:

metrics = (AUC(kind="convex_hull"), TPR(max_fpr=0.05), TNR(min_tpr=0.95), MCC())

The “complementarity” of the metrics is meant to avoid hiding a suboptimal score that is only “above average” in many OoD scenarios but in fact provides a good compromise. The resulting summary should be easier to read and analyze.
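The pruning rule described for topk can be illustrated with a minimal NumPy sketch. It is not scio's implementation: the helper name topk_mask, the (n_scores, n_oods, n_metrics) layout, the "higher is better" convention, the tie handling and the made-up values are all assumptions for illustration, and the interaction with baseline is deliberately left out (see topk_evals() for the actual behaviour).

```python
import numpy as np


def topk_mask(evals: np.ndarray, topk: int) -> np.ndarray:
    """Hypothetical helper: flag scores in the top `topk` for any (OoD, metric) pair.

    `evals` is assumed to have shape (n_scores, n_oods, n_metrics),
    with higher values meaning better performance.
    """
    n_scores = evals.shape[0]
    if not 0 < topk <= n_scores:
        # Assumption: outside the valid range, nothing is pruned.
        return np.ones(n_scores, dtype=bool)
    # Rank scores within each (OoD, metric) column; rank 0 is the best.
    order = np.argsort(-evals, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Keep a score if it reaches the top `topk` in at least one column.
    return (ranks < topk).any(axis=(1, 2))


# Made-up evaluations: 4 scores, 2 OoD scenarios, 2 metrics.
evals = np.array([
    [[0.90, 0.40], [0.85, 0.30]],
    [[0.80, 0.70], [0.60, 0.20]],
    [[0.70, 0.50], [0.80, 0.25]],
    [[0.60, 0.45], [0.70, 0.35]],
])
kept = topk_mask(evals, topk=1)
print(kept)  # [ True  True False  True]
```

In this toy data, the third score is never the best for any (OoD scenario, metric) pair, so topk=1 prunes it even though it ranks second in several columns. Raising topk to 2 would keep it.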

Example

summary(
    confs_ind,
    confs_oods,
    scores_and_layers=scores_and_layers,
    oods_title=oods_title,
    metrics=metrics,
    baseline=0,
)