Design

Goals

Likelihood maximization frameworks tend to have lots of moving parts. There are many ways to peel this potato [1]. Here are some of the goals that drive the design of csky:

Modularity: csky should be extensible, both internally by developers and externally by the user. Separation of concerns should be taken seriously.

Performance: csky should be fast. Profilers can guide our eyes towards computational “hot spots”; we should use that information to optimize for speed. It’s ok to introduce tighter coupling than desired between the various parts, for the sake of performance, as long as the ugliness can still be limited in scope.

Brevity: csky usage should be concise. If you are interested in quickly reproducing a result you saw in some slides, it should be as straightforward as possible to do so; ideally you could write it out without referring to an example script.

Lightning Tour

csky is organized into several modules with simple names. They are listed below, roughly in order of subjective importance, which correlates well with how carefully each module has been thought out and thus how stable it is likely to be:

  • Analysis configuration is found in csky.conf, with several items imported directly under the top level of csky.

  • The signal vs. background discrimination PDFs are found in csky.pdf.

  • The likelihood and the routines that maximize its logarithm are found in csky.llh.

  • Real data, scrambled data, and simulated signal injection are found in csky.inj.

  • Trial operations are implemented in csky.trial.

  • Various event selections are described in, and can be loaded using, csky.selections.

  • The basic building blocks common to many analyses – background space PDFs, signal acceptance parameterization, and energy PDF ratios – are organized using csky.analysis.

  • Spectral hypotheses are characterized using csky.hyp. (TODO: it turned out that other features of signal hypotheses never fit well in this module; it should probably be renamed to e.g. csky.fluxes or csky.spectra.)

  • Test statistic distribution fitting is handled by csky.dists.

  • Data manipulation and random state management are implemented in csky.utils.

  • Inspection of implementation details is made a little easier by csky.inspect.

  • A few plotting helpers are implemented in csky.plotting.

  • A simple timing tool is given in csky.timing.

The following additional modules are highly stable, but don’t seem to belong near the top of the “lightning tour”:

  • Coordinate transformations are handled in csky.coord.

  • Some relatively generic bookkeeping assistance (useful for getting information out of cluster job outputs) is found in csky.bk.

  • Noisy healpy outputs are silenced using csky.quiet_healpy.

Likelihood Implementation

The core task of this library is to define and evaluate the source search likelihood; pretty much everything else serves only to get data into and out of that machinery. The likelihood implementation in csky is structured as follows:

  1. csky.pdf defines PDF ratio models (the specifications) and evaluators (plug-and-chug calculators).

  2. csky.llh defines log likelihood ratio (LLH) models and evaluators, largely in terms of the PDF ratio models and evaluators. This module also provides a framework for parameter fitting via likelihood maximization.

  3. csky.inj provides an interface for generating dataset realizations using actual or randomized (thus background-like) data and/or simulated signals.

  4. csky.trial gives the user-level interface for generating actual or randomized datasets and evaluating and/or maximizing likelihoods. This module also includes tools for performing batches of trials; estimating threshold quantities such as sensitivities, discovery potentials, and upper limits; and distributing trials over multiple local cores.
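The split between models (specifications) and evaluators (calculators bound to a concrete dataset) can be illustrated with a small toy. The sketch below is not csky code and does not use the csky API: every class and function name in it is hypothetical, the geometry is one-dimensional, and it assumes the common signal-plus-background mixture form of the likelihood ratio, which the text above does not spell out. It only shows how a model can hand out an evaluator that caches per-event signal-to-background ratios, and how a fit then maximizes the test statistic over the number of signal events.

```python
import math
import random


class GaussianRatioModel:
    """Specification of a signal/background PDF ratio; holds no event data.

    Hypothetical toy class, not part of csky.
    """

    def __init__(self, sigma, width):
        self.sigma = sigma  # Gaussian signal width around a source at x = 0
        self.width = width  # background is uniform over an interval this wide

    def evaluator(self, xs):
        """Bind the model to concrete event positions."""
        return GaussianRatioEvaluator(self, xs)


class GaussianRatioEvaluator:
    """Calculator: computes S/B once per dataset, then reuses it."""

    def __init__(self, model, xs):
        norm = 1.0 / (model.sigma * math.sqrt(2.0 * math.pi))
        background = 1.0 / model.width
        self.ratios = [
            norm * math.exp(-0.5 * (x / model.sigma) ** 2) / background
            for x in xs
        ]

    def ts(self, ns):
        """Assumed form: TS(ns) = 2 * sum_i log(1 + (ns / N) * (S_i / B_i - 1))."""
        n = len(self.ratios)
        return 2.0 * sum(math.log1p(ns / n * (r - 1.0)) for r in self.ratios)

    def fit(self, steps=1000):
        """Maximize TS over 0 <= ns < N with a plain grid scan.

        ns = N is excluded so the logarithm stays finite even when S_i / B_i
        underflows to zero for events far from the source.
        """
        n = len(self.ratios)
        best_ns, best_ts = 0.0, 0.0
        for k in range(1, steps):
            ns = n * k / steps
            ts = self.ts(ns)
            if ts > best_ts:
                best_ns, best_ts = ns, ts
        return best_ns, best_ts


def run_toy_trial(n_bg=1000, n_sig=30, seed=0):
    """Generate one toy dataset (background plus injected signal) and fit it."""
    rng = random.Random(seed)
    model = GaussianRatioModel(sigma=0.2, width=20.0)
    xs = [rng.uniform(-10.0, 10.0) for _ in range(n_bg)]
    xs += [rng.gauss(0.0, model.sigma) for _ in range(n_sig)]
    return model.evaluator(xs).fit()


if __name__ == "__main__":
    ns_hat, ts_hat = run_toy_trial()
    print("best-fit ns = %.1f, TS = %.2f" % (ns_hat, ts_hat))
```

The grid scan stands in for a real optimizer purely to keep the sketch dependency-free. The point of the structure is that the expensive per-event work happens once, when the evaluator is constructed, so repeated likelihood evaluations during a fit only touch the cached ratios.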