Skip to content

Rely on posterior for pareto smooth tails#290

Open
VisruthSK wants to merge 12 commits intomasterfrom
posterior-pareto-smooth-tail
Open

Rely on posterior for pareto smooth tails#290
VisruthSK wants to merge 12 commits intomasterfrom
posterior-pareto-smooth-tail

Conversation

@VisruthSK
Copy link
Copy Markdown
Member

@VisruthSK VisruthSK commented Jun 12, 2025

Rely on posterior for pareto smoothening tails.

@VisruthSK
Copy link
Copy Markdown
Member Author

Make sure to bump version in DESCRIPTION when posterior gets updated

@VisruthSK
Copy link
Copy Markdown
Member Author

Updated snaps just to see if tests pass, but there are a lot of (repeated) warnings in the new version.

@VisruthSK VisruthSK marked this pull request as ready for review April 7, 2026 17:26
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.76%. Comparing base (23f2117) to head (b16435a).
⚠️ Report is 8 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #290      +/-   ##
==========================================
- Coverage   92.78%   92.76%   -0.02%     
==========================================
  Files          31       31              
  Lines        2992     2998       +6     
==========================================
+ Hits         2776     2781       +5     
- Misses        216      217       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@VisruthSK VisruthSK requested a review from jgabry April 7, 2026 17:36
VisruthSK and others added 2 commits April 7, 2026 13:06
Co-authored-by: Jonah Gabry <jgabry@gmail.com>
@avehtari
Copy link
Copy Markdown
Member

avehtari commented Apr 8, 2026

Unfortunately posterior has many warnings and the progress on having better control of the warnings with warning levels has been slow stan-dev/posterior#309

@jgabry
Copy link
Copy Markdown
Member

jgabry commented Apr 8, 2026

Unfortunately posterior has many warnings and the progress on having better control of the warnings with warning levels has been slow stan-dev/posterior#309

Yeah and warnings add some overhead (even if suppressed), so I expect this to be a bit slower than the current implementation in the case that there are many warnings. I wish there was just a quiet=TRUE argument to posterior::ps_tail() that we could set when calling it from loo since we already have a better warning in loo for this. But it's probably not so much slower, although answering questions like that is what prompted my suggestion in #348.

@avehtari
Copy link
Copy Markdown
Member

avehtari commented Apr 8, 2026

Just in case if you had not noticed posterior::pareto_smooth() and posterior::ps_tail() were fixed about two weeks ago stan-dev/posterior#442

@jgabry
Copy link
Copy Markdown
Member

jgabry commented Apr 8, 2026

No I missed that PR, I hadn't seen that

@jgabry
Copy link
Copy Markdown
Member

jgabry commented Apr 9, 2026

@avehtari it would also be good if posterior::ps_tail() had an option to accept already sorted input. Right now we have to sort in loo to figure out if there are enough tail samples but then ps_tail() sorts again internally. Previously since we were already sorting in loo we could just reuse the sorted vector, but now we have to do two sorts, which adds overhead. I'm not sure how much, but #352 can maybe help figure that out.

@jgabry
Copy link
Copy Markdown
Member

jgabry commented Apr 9, 2026

I think this looks good now, the only question I have is whether we care about the extra sort that we need to do (see previous comment). I don't think there's a way to avoid it unless posterior::ps_tail is updated.

@VisruthSK
Copy link
Copy Markdown
Member Author

Seems like its probably worth opening a PR to modify ps_tail then to add two flags, sorted and silent?

@avehtari
Copy link
Copy Markdown
Member

avehtari commented Apr 9, 2026

Before adding silent flag to any function, I think we should decide whether we would in general prefer to use per function flags or global package options. See also stan-dev/posterior#309, stan-dev/cmdstanr#1175, and #353

@avehtari
Copy link
Copy Markdown
Member

avehtari commented Apr 9, 2026

Performance

Using the 4000x1359 log_lik_matrix from #340 (comment), current posterior::ps_tail() is 12% slower than current loo:::do_psis_i

> tic();q=replicate(10,apply(log_lik_matrix, 2, \(x) loo:::do_psis_i(x, 190)));toc()
6.654 sec elapsed
> tic();q=replicate(10,apply(log_lik_matrix, 2, \(x) ps_tail(x, 190, tail="right", are_log_weights=TRUE)));toc()
7.443 sec elapsed

I did not test with this PR. As that log_lik_matrix is considered to be added for tests in that PR, maybe the same can be used for performance testing.

(I'll add later comments about desired behavior which helps to clarify which warnings we want to see)

@VisruthSK
Copy link
Copy Markdown
Member Author

Thanks! I forgot about that example from that PR, good idea to add it to #352.

@avehtari
Copy link
Copy Markdown
Member

avehtari commented Apr 9, 2026

  • I rerun the speed experiment and now posterior::ps_tail() has the same speed (without any modifications)
  • In psis-loo, constant tail is problem only for computing Pareto-k diagnostic. Only problematic constant tail case would be if all log liks are -Inf or Inf, but that is very unlikely and would cause other errors. If log_liks are finite then constan tail means many clearly non-zero weights which is fine for importance sampling. Thus we don't need to give a warning about constant tail, we just return the tail as is, and Pareto-k as NA. It's a later problem how to handle that NA in diagnostic messages and print.
  • In posterior package, the ndraws_tail < 5 warning in ps_tail() is never happening as pareto_smooth() sets ndraws_tail to be at least 5. I think that the ndraws_tail < 5 warning in ps_tail() could be just removed.
  • Years ago I did use r_eff_i to compute tail_len_i, but I would not do that anymore. This would simplify the code as we would need to compute tail_len only once. We can get tail_len < 5 only if the size of the posterior sample S < 21, which is very unlikely. With just one tail_len, if S < 21, we could just return all log_liks as they are and do_psis_i doesn't need to check for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants