Don't warn about slightly large khats #234

@avehtari

Description

When the loo package was created, I assumed we might need to be worried when 0.5 < khat < 0.7, so loo gives warnings like

Warning: Some Pareto k diagnostic values are slightly high. See help('pareto-k-diagnostic') for details.

If we look at the diagnostics, we can see something like

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     34    85.0%   558       
 (0.5, 0.7]   (ok)        6    15.0%   226       
   (0.7, 1]   (bad)       0     0.0%   <NA>      
   (1, Inf)   (very bad)  0     0.0%   <NA>      

All Pareto k estimates are ok (k < 0.7).
See help('pareto-k-diagnostic') for details.

Based on the newer results in the 2022 version of the PSIS paper, I would drop the ok category and extend good up to 0.7. Since we recommend additional computation such as moment matching only for bad or very bad values anyway, the warnings we currently give for values in the ok range are unnecessary.

The best option would be to replace the fixed 0.7 threshold with khat_threshold = min(1 - 1/log10(S), 0.7), where S is the sample size (ESS would be better, but since each fold may have a different ESS, this would be difficult). This threshold is justified in Section 3.2.3 of the PSIS paper (2022 or later). With the default 4000 draws in Stan, the threshold would still be 0.7, so most people would not see a change. With fewer draws, e.g. S=100, the threshold would be 0.5. It's unlikely that someone would run loo with fewer than 100 draws.
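
To make the arithmetic concrete, here is a minimal sketch of the proposed threshold. It is written in Python purely for illustration and is not the loo package's implementation; the function name `khat_threshold` mirrors the name used above, and the guard against S <= 1 is an added assumption.

```python
import math


def khat_threshold(S: int) -> float:
    """Proposed sample-size-dependent threshold: min(1 - 1/log10(S), 0.7).

    S is the number of posterior draws. The S > 1 check is an assumption
    added here so that log10(S) is positive and the division is defined.
    """
    if S <= 1:
        raise ValueError("S must be greater than 1")
    return min(1.0 - 1.0 / math.log10(S), 0.7)


# Thresholds for a few sample sizes, including the cases discussed above.
for S in (100, 1000, 4000):
    print(S, round(khat_threshold(S), 3))
```

With S=100 this gives 0.5 and with S=4000 it gives 0.7, matching the values above. The cap at 0.7 takes effect once 1 - 1/log10(S) reaches 0.7, which happens at roughly S = 2150 draws.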
