This page is not created by, affiliated with, or supported by Slack Technologies, Inc.

## 2019-11-21

Does anyone here know of a tool for plotting 'confidence regions' of 2D probability distributions?

More precisely, I'd like to draw (posterior) probability densities as 2D heat maps, with 'contour lines' delineating regions of probability mass 95%, 99% etc.

Does that make sense, and does it have a name?

I think you can achieve something similar with ggplot2: https://ggplot2.tidyverse.org/reference/geom_contour.html

Might need to do something with `stat_contour` to get the specific regions you’re interested in.
No idea about clj, I’m afraid.

Thanks. Thinking out loud, I guess I could also find the appropriate density levels, either by numerical integration + dichotomic search, or by filling a 2D array with densities, sorting the values and searching for quantiles. Then draw the contours at the appropriate level lines.
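A minimal sketch of the second idea above (fill a grid with densities, sort, accumulate mass), assuming the density can be evaluated pointwise on a regular grid that covers essentially all of the probability mass. The function name `density_level_for_mass`, the test density `std_normal_2d`, and the grid bounds are illustrative choices, not something taken from an existing library.

```python
import math

def density_level_for_mass(pdf, xs, ys, mass):
    """Approximate the density level whose super-level set {pdf >= level}
    contains a fraction `mass` of the total probability.

    Assumes xs and ys are evenly spaced and cover (almost) all of the mass.
    """
    cell = (xs[1] - xs[0]) * (ys[1] - ys[0])  # area of one grid cell
    # Densities at every grid point, highest first.
    values = sorted((pdf(x, y) for x in xs for y in ys), reverse=True)
    total = sum(values) * cell  # close to 1 if the grid is wide enough
    acc = 0.0
    for v in values:
        acc += v * cell  # mass of the highest-density cells seen so far
        if acc >= mass * total:
            return v     # first level at which the target mass is reached
    return values[-1]

def std_normal_2d(x, y):
    """Standard bivariate normal density, used here only as a test case."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)

n = 241
grid = [-6.0 + 12.0 * i / (n - 1) for i in range(n)]
levels = {p: density_level_for_mass(std_normal_2d, grid, grid, p)
          for p in (0.90, 0.95, 0.99)}
```

The resulting `levels` can then be passed as the contour levels to whatever plotting tool draws the heat map. For the standard normal used here the exact answer is known, `(1 - p) / (2 * pi)`, which gives a quick sanity check of the grid resolution.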

I'm also wondering about the relevance of this approach for data analysis - are there alternative approaches to choosing / viewing 2D confidence regions that make this one uninteresting?

Yes, but the point is not to just plot contour lines, rather precisely those contour lines which delineate regions of prescribed probability mass (90%, 95%, 99%, etc.). Finding which density levels correspond to those regions may not be trivial!

This is not trivial. Contours are usually drawn from a kernel density estimate, which is typically just a Gaussian blur (in 2D) or a specific kernel function (in 1D). I don't see an easy way to estimate the inverse CDF with that approach.

@U1EP3BZ3Q In this case, I can evaluate the density at any point, so it seems doable: https://clojurians.slack.com/archives/C0BQDEJ8M/p1574352117165200

Still, integrating over an area is much trickier than integrating over a 1D range for a symmetric distribution.

@val_waeselynck

> by filling a 2D array with densities, sorting the values and searching for quantiles

To find quantiles you want to use the ICDF (inverse cumulative distribution function), not the PDF (density). In 2D you want to find the area that covers, say, 95% of the total volume under the density.

For distributions like the multivariate normal, some numerical algorithms exist, but I suppose they can't be applied to the general case and to any distribution (especially a multidimensional empirical one).

> to find quantiles you want to use icdf (inverse cumulative distribution) not pdf (density).

Yes of course, just forgot to mention it :)

> For distributions like multivariate normal some numerical algorithms exist but I suppose they can't be applied to general case and any distribution (especially multidimensional empirical)

Yes, for 2D Gaussians this can be solved analytically: once you have an eigendecomposition of the covariance matrix you're good, and even that may not be mandatory.
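For the 2D Gaussian case, a rough sketch of the analytic route could look like the following. It relies on the standard fact that the squared Mahalanobis distance of a bivariate normal follows a chi-square distribution with 2 degrees of freedom, so the region of mass `p` is the ellipse with squared Mahalanobis radius `-2 * ln(1 - p)`. The function name `gaussian_region` and the returned dictionary layout are assumptions made for illustration.

```python
import math

def gaussian_region(cov, mass):
    """Highest-density region of a 2D Gaussian with covariance `cov`
    (a symmetric positive-definite 2x2 matrix given as nested pairs).

    Returns the density level on the boundary, the ellipse semi-axes
    (major first) and the orientation of the major axis in radians.
    """
    (a, b), (_, d) = cov
    # Chi-square quantile with 2 degrees of freedom has a closed form.
    c = -2.0 * math.log(1.0 - mass)
    det = a * d - b * b
    # Density value on the boundary of the region.
    level = (1.0 - mass) / (2.0 * math.pi * math.sqrt(det))
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean_diag = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    lam_major, lam_minor = mean_diag + half_gap, mean_diag - half_gap
    angle = 0.5 * math.atan2(2.0 * b, a - d)  # direction of the major axis
    return {
        "level": level,
        "semi_axes": (math.sqrt(c * lam_major), math.sqrt(c * lam_minor)),
        "angle": angle,
    }

region_95 = gaussian_region(((4.0, 0.0), (0.0, 1.0)), 0.95)
```

As hinted in the message above, the eigendecomposition is only needed to draw the ellipse; the density level itself depends only on the determinant of the covariance matrix.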
