.. title: Function plotting and the inverse cdf
.. slug: function-plotting-and-the-inverse-cdf
.. date: 2024-03-06 23:19:41 UTC
.. tags: 
.. category: 
.. link: 
.. description: 
.. type: text
.. has_math: true

I previously wrote about `using the inverse-cdf
transform <link://post_path/posts/optimal-binning-and-the-inverse-cdf>`_
not just for its classic application of efficient random sampling, but
as a generator of optimal histogram binnings, mapping bin edges
exactly to expected population quantiles.

(If you think of the random process that populates the histogram as
itself being uniform random sampling of the unit cdf interval, mapped
back into coordinate values using the inverse cdf, it's even clearer
why a uniform binning of the cdf interval would be optimal.)
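
As a quick refresher on that classic trick, here's a minimal sketch
(my toy example, not from the earlier post), using the unit-rate
exponential distribution because its cdf $F(x) = 1 - e^{-x}$ inverts
in closed form:

.. code-block:: python
   :number-lines:

   import numpy as np

   # Classic inverse-cdf sampling: the unit-rate exponential has cdf
   # F(x) = 1 - exp(-x), which inverts in closed form as x = -log(1 - u)
   us = np.random.uniform(size=100000)  # uniform samples on the unit cdf interval
   samples = -np.log(1 - us)            # mapped back to exponentially distributed x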

In this post I'd like to show that we can usefully bend the inverse
cdf at least a little further, into applications where there isn't
traditionally even a non-negative pdf to work with. My motivation is
efficient function plotting.

In `YODA <https://yoda.hepforge.org/>`_'s new plotting system, I'd
like us to be able to overlay analytic -- or at least Pythonic --
functional forms on top of data- or MC-populated histograms. But the
usual approach of just creating a ``linspace()`` of *lots* of points
in $x$, and hoping that'll be enough to smoothly render the most
rapidly varying bits of a plot, is unsatisfying.

If the plot has a mixture of curvy bits and straight sections, we
could distribute a given number of points more efficiently: straight
parts need only a point at either end to draw a (PostScript or PDF)
line between, so the point budget would be better spent in the curvy
parts where straight lines are nowhere a good approximation. If we're
stuck with straight-line segments -- which we more or less are with
matplotlib -- then distributing their endpoints more efficiently
means we can get a good visual effect everywhere on the plot with
fewer points, and hence a smaller file size.

Let's think a little: as intimated above, the key issue is not the
gradient $df/dx$ of the function $f(x)$ -- I can draw a
steep-gradient straight line just as effectively with two points as I
can a flat one -- but the second derivative $d^2f/dx^2$, aka the
"curvature". So let's guess that what our brains want to see is a
point density in proportion to that curvature. Then we can use the
inverse-cdf method again, right: the cdf is
$$F(x) = \\int \\frac{d^2 f}{dx^2} dx = \\frac{df}{dx} ,$$
right?

Ah, but curvature can be both positive *and negative*: that messes
things up. We really need
$$F(x) = \\int \\left| \\frac{d^2 f}{dx^2} \\right| dx ,$$
but that's a more awkward object: we'd need to decompose the range
into the sets of positive-curvature and negative-curvature intervals
$\\omega_i$, and add up the integrals from each, i.e.
$$F(x) = \\sum_i \\pm \\int_{\\omega_i} \\frac{d^2 f}{dx^2} \\, dx = \\sum_i \\pm \\left[ \\frac{df}{dx} \\right]_{\\omega_i} ,$$
with appropriately identified $\\pm$ signs.
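
As a quick sanity check of that decomposition, take $f(x) = \\sin x$
on $[0, 2\\pi]$: the curvature $f'' = -\\sin x$ changes sign at
$x = \\pi$, so the two intervals pick up opposite signs:
$$F(2\\pi) = -\\left[ \\frac{df}{dx} \\right]_0^{\\pi} + \\left[ \\frac{df}{dx} \\right]_{\\pi}^{2\\pi} = -(\\cos\\pi - \\cos 0) + (\\cos 2\\pi - \\cos\\pi) = 4 = \\int_0^{2\\pi} |\\sin x| \\, dx .$$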

We could maybe do this for arbitrary functions with a symbolic
algebra library like SymPy, but it's awkward. And it doesn't solve
the elephant-in-the-room issue that most monotonic functions, and
hence most cdfs, don't have an analytic inverse. (This is also the
major limitation of the inverse-cdf method in general, of course.)
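
To make the awkwardness concrete, here's a quick SymPy probe (my
illustration, with a deliberately tame toy function): even here the
absolute-value integral tends to come back unevaluated or as an
unwieldy ``Piecewise``, and symbolically inverting the result would
be a further fight.

.. code-block:: python
   :number-lines:

   import sympy as sp

   x, u = sp.symbols("x u", real=True)
   fpp = sp.diff(sp.sin(u), u, 2)            # f'' for the toy case f(u) = sin(u)
   F = sp.integrate(sp.Abs(fpp), (u, 0, x))  # attempt the |f''| cdf symbolically
   print(F)  # typically an unevaluated Integral or a clunky Piecewise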

When in doubt, sample to victory! Let's just throw lots of evenly
spaced points at our function, numerically compute an array of second
derivatives and take their absolute values, then compute a numerical
cdf by taking the cumulative sum of the array:

.. code-block:: python
   :number-lines:

   import numpy as np

   # f, XMIN and XMAX are assumed already defined: the function and plot range
   xs = np.linspace(XMIN, XMAX, 1000)
   ys = f(xs)
   ypps = np.diff(ys, 2)        # second differences ~ f''(x), up to a constant dx^2
   ayps = np.cumsum(abs(ypps))  # running sum of |f''|: an unnormalised numerical cdf
   ayps /= ayps[-1]

where the final line normalises so that the cdf really ends at 1 as
intended. Then we can throw a more modest number of points uniformly
into the cdf interval, and numerically invert via straight-line
interpolation:

.. code-block:: python
   :number-lines:

   def G(q):
       # numerical inverse cdf: interpolate quantiles q back to x positions
       # (xs[1:-1], since the double difference trimmed one point from each end)
       return np.interp(q, ayps, xs[1:-1])

   N = 50                     # the modest point budget used below
   qs = np.linspace(0, 1, N)  # uniform in the cdf interval
   xs_opt = G(qs)             # mapped back through the numerical inverse cdf
   ys_opt = f(xs_opt)

Let's see the result on a pretty nasty function,
$$f(x) = (2.5 - x) + e^{-2x} \\cos(20 x^{0.7}) ,$$
using just 50 points:

.. image:: /images/curvplot.png

Not bad, huh? The red line is this optimal sampling approach, the
green (with visibly bad-approximation straight lines in the wibbly
left-hand part) is the uniform straight-line approximation, and the
true function is hidden underneath in blue. The grey vertical lines
in the background show the "optimal" sampling points, uniform in the
$y$-axis of the cdf curve, which is overlaid in orange.
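
For completeness, here is that test function in NumPy form -- with a
plotting range that is my assumption, since the real one lives in the
full script listed at the end:

.. code-block:: python
   :number-lines:

   def f(x):
       return (2.5 - x) + np.exp(-2*x) * np.cos(20 * x**0.7)

   XMIN, XMAX = 0.0, 3.0  # assumed plot range, for illustration only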

Now, I've cheated a little here, because to my eyes the assumption
that the (magnitude of the) second derivative is proportional to the
optimal sampling density isn't quite correct. Unsurprisingly, our
visual cortexes are a bit better than that; in particular we seem to
be fairly good at compensating for scale differences, so we perceive
deviations from low-amplitude stretches of smooth curve much as we do
on high-amplitude versions of the same shape. In the full code,
reproduced below, I added a mixture fraction $f$, so you can fade
smoothly between purely "optimal polling" and "uniform polling"
strategies. There are probably smarter ways.
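
One way to implement such a fade -- a sketch of the idea, not
necessarily the exact form in the listing below, and with the mixture
fraction renamed ``MIX`` to avoid clashing with the function $f(x)$
-- is to blend the curvature cdf with a uniform one before inverting;
a convex combination of two cdfs is still a valid cdf:

.. code-block:: python
   :number-lines:

   MIX = 0.7  # 1 = purely "optimal polling", 0 = purely "uniform polling"
   ayps_mix = MIX * ayps + (1 - MIX) * np.linspace(0, 1, len(ayps))
   xs_opt = np.interp(qs, ayps_mix, xs[1:-1])
   ys_opt = f(xs_opt)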

I hope this was interesting. It feels to me like there's more
potential to play and improve here, but hopefully we'll deploy this feature
into YODA's plotting before too long!

.. listing:: curvplot.py python
