API Reference

Pure python library for calculating the weights of Monte Carlo simulation for IceCube.

SimWeights was designed with goal of calculating weights for IceCube simulation in a way that it is easy to combine combine datasets with different generation parameters into a single sample. It was also designed to be a stand alone project which does not depend on IceTray in any way so that it can be installed easily on laptops. SimWeights gathers all the information it needs form information in the hdf5 file so there is no need for access to the simulation production database. SimWeights works with files produced with corsika-reader, neutrino-generator, and genie-reader.

class simweights.CircleInjector(radius: float, cos_zen_min: float, cos_zen_max: float, colname: str | None = None)

Bases: object

Particles are generated on a circle perpendicular to the direction of the particle.

Spatial distribution when particles are injected on the surfused by older neutrino-generator versions where the particle is injected in a cylinder that is parallel to momentum vector of the primary. The etendue is just the area of the circle times the solid angle.

pdf(cos_zen: ArrayLike) → NDArray[np.float64]: The probability density function for the given zenith angle.

projected_area(cos_zen: float) → float: Returns the cross sectional area of the injection area in cm^2.

simweights.CorsikaWeighter(file_obj: Any, nfiles: float | None = None) → Weighter

Weighter for CORSIKA-in-ice simulation made with I3CORSIKAReader.

I3CORSIKAReader not use S-Frames and stores the surface information in an I3MapStringDouble so that the user does not know how many jobs contributed to the current sample, so it needs to know the number of files.

class simweights.FixedFractionFlux(fractions: Mapping[PDGCode, float], basis: CosmicRayFlux | None = None, normalized: bool = True)

Bases: CosmicRayFlux

Total energy per particle flux flux split among mass groups with a constant fraction.

By default, this takes as basis the flux from Gaisser H4a summed over all its mass groups, then multiplies it by the given fraction. This is a quick way to consider different weightings for systematic checks.

pdgids: Sequence[PDGCode] = ()

class simweights.GaisserH3a

Bases: CosmicRayFlux

Spectral fits from Gaisser.

Internal report[4] and in Astropart. Phys[5] by Tom Gaisser. The model H3a with a mixed extra-galactic population (Fig. 2) has all iron at the highest energy and would represent a scenario in which the cutoff is not an effect of energy loss in propagation over cosmic distances through the CMB but is instead just the extra-galactic accelerators reaching their highest energy.

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.N14Nucleus, PDGCode.Al27Nucleus, PDGCode.Fe56Nucleus)

class simweights.GaisserH4a

Bases: CosmicRayFlux

Spectral fits from Gaisser.

Internal report[4] and in Astropart. Phys[5] by Tom Gaisser. In the model H4a, on the other hand, the extra-galactic component is assumed to be all protons.

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.N14Nucleus, PDGCode.Al27Nucleus, PDGCode.Fe56Nucleus)

class simweights.GaisserH4a_IT

Bases: CosmicRayFlux

Variation of Gaisser’s H4a flux using only four components.

This is not a very physical flux: The Oxygen group is the sum of H4a’s Nitrogen and Aluminum groups. This is the flux used as an “a priori” estimate of mass-composition to produce the IceTop-only flux[6].

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.O16Nucleus, PDGCode.Fe56Nucleus)

class simweights.GaisserHillas

Bases: CosmicRayFlux

Spectral fits from Gaisser.

From an internal report[4] and in Astropart. Phys[5] by Tom Gaisser.

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.N14Nucleus, PDGCode.Al27Nucleus, PDGCode.Fe56Nucleus)

class simweights.GenerationSurface(*spectra: SurfaceTuple)

Bases: object

Represents a surface on which Monte Carlo simulation was generated on.

The number of events thrown, the spatial distribution, and the energy distribution are stored in this class. Each particle type is stored separately.

get_cos_zenith_range(pdgid: PDGCode | None) → tuple[float, float]: Return the cos zenith range for given particle type over all surfaces.

get_energy_range(pdgid: PDGCode | None) → tuple[float, float]: Return the energy range for given particle type over all surfaces.

get_epdf(pdgid: ArrayLike, **kwargs: ArrayLike) → NDArray[np.float64]

Get the extended pdf of an event.

The pdf is the probability that an event with these parameters is generated. The pdf is multiplied by the number of events.

get_keys() → list[str]: Get a list of the available keys needed for weighting this surface.

get_pdgids() → list[int | PDGCode]: Return a list of pdgids that this surface represents.

simweights.GenieWeighter(file_obj: Any) → Weighter

Weighter for GENIE simulation.

Reads I3GenieInfo from S-Frames and I3GenieResult from Q-Frames.

class simweights.GlobalFitGST

Bases: CosmicRayFlux

Spectral fits by Gaisser, Stanev and Tilav[10].

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.N14Nucleus, PDGCode.Al27Nucleus, PDGCode.Fe56Nucleus)

class simweights.GlobalFitGST_IT

Bases: CosmicRayFlux

GlobalFitGST for four components [p, He, O, Fe].

The Oxygen group is the sum of Nitrogen and Aluminum groups of GlobalFitGST.

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.O16Nucleus, PDGCode.Fe56Nucleus)

class simweights.GlobalSplineFit

Bases: GlobalSplineFitBase

Data-driven spline fit of the cosmic ray spectrum by Dembinski et. al. [#GSFDembinski].

groups: Sequence[tuple[int, int]]

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.Li7Nucleus, PDGCode.Be9Nucleus, PDGCode.B11Nucleus, PDGCode.C12Nucleus, PDGCode.N14Nucleus, PDGCode.O16Nucleus, PDGCode.F19Nucleus, PDGCode.Ne20Nucleus, PDGCode.Na23Nucleus, PDGCode.Mg24Nucleus, PDGCode.Al27Nucleus, PDGCode.Si28Nucleus, PDGCode.P31Nucleus, PDGCode.S32Nucleus, PDGCode.Cl35Nucleus, PDGCode.Ar40Nucleus, PDGCode.K39Nucleus, PDGCode.Ca40Nucleus, PDGCode.Sc45Nucleus, PDGCode.Ti48Nucleus, PDGCode.V51Nucleus, PDGCode.Cr52Nucleus, PDGCode.Mn55Nucleus, PDGCode.Fe56Nucleus, PDGCode.Co59Nucleus, PDGCode.Ni59Nucleus)

class simweights.GlobalSplineFit5Comp

Bases: GlobalSplineFitBase

Sum of the flux of the GSF model for the standard 5 components injected by IceCube.

GSF is a Data-driven spline fit of the cosmic ray spectrum by Dembinski et. al. [#GSFDembinski].

groups: Sequence[tuple[int, int]] = ((1, 1), (2, 5), (6, 11), (12, 15), (16, 27))

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.N14Nucleus, PDGCode.Al27Nucleus, PDGCode.Fe56Nucleus)

class simweights.GlobalSplineFit_IT

Bases: GlobalSplineFitBase

Sum of the flux of the GSF model for the standard 4 components injected by IceCube.

[(H), (He), (Li, Be, B, C, N, O, F, Ne), (Na, Mg, Al, Si, P, S, Cl, Ar, K, Ca, Sc, Ti, V, Cr, Mn, Fe, Co, Ni)] GSF is a Data-driven spline fit of the cosmic ray spectrum by Dembinski et. al. [#GSFDembinski].

groups: Sequence[tuple[int, int]] = ((1, 1), (2, 2), (3, 10), (11, 28))

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.O16Nucleus, PDGCode.Fe56Nucleus)

class simweights.Hoerandel

Bases: CosmicRayFlux

All-particle spectrum (up to iron) after Hörandel[1], as implemented in dCORSIKA.

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.Li7Nucleus, PDGCode.Be9Nucleus, PDGCode.B11Nucleus, PDGCode.C12Nucleus, PDGCode.N14Nucleus, PDGCode.O16Nucleus, PDGCode.F19Nucleus, PDGCode.Ne20Nucleus, PDGCode.Na23Nucleus, PDGCode.Mg24Nucleus, PDGCode.Al27Nucleus, PDGCode.Si28Nucleus, PDGCode.P31Nucleus, PDGCode.S32Nucleus, PDGCode.Cl35Nucleus, PDGCode.Ar40Nucleus, PDGCode.K39Nucleus, PDGCode.Ca40Nucleus, PDGCode.Sc45Nucleus, PDGCode.Ti48Nucleus, PDGCode.V51Nucleus, PDGCode.Cr52Nucleus, PDGCode.Mn55Nucleus, PDGCode.Fe56Nucleus)

class simweights.Hoerandel5

Bases: CosmicRayFlux

Hoerandel with only 5 components.

After Becherini et al.[2] (These are the same as used by Arne Schoenwald’s version[3]).

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.N14Nucleus, PDGCode.Al27Nucleus, PDGCode.Fe56Nucleus)

class simweights.Hoerandel_IT

Bases: CosmicRayFlux

Modified 5-component Hoerandel spectrum with N and Al replaced by O.

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.O16Nucleus, PDGCode.Fe56Nucleus)

class simweights.Honda2004

Bases: CosmicRayFlux

Spectrum used to calculate neutrino fluxes in Honda et al. (2004).

Table 1, with modification from the text [7].

Note

the E_k notation means energy per nucleon!

pdgids: Sequence[PDGCode] = (PDGCode.PPlus, PDGCode.He4Nucleus, PDGCode.N14Nucleus, PDGCode.Al27Nucleus, PDGCode.Fe56Nucleus)

simweights.IceTopWeighter(file_obj: Any) → Weighter: Weighter for IceTop CORSIKA simulation made with I3TopSimulator.cxx.

class simweights.NaturalRateCylinder(length: float, radius: float, cos_zen_min: float, cos_zen_max: float, colname: str | None = None)

Bases: CylinderBase

Angular distribution when zenith distribution matched the natural rate of an isotropic source.

For a given zenith angle the intensity of particles thrown was proportional to the cross-sectional area perpendicular to the direction of the particle. This is the distribution generated by the icetray class I3Surfaces::Cylinder and is what is used for triggered CORSIKA in I3PrimaryInjector. It is also what CORSIKA will generate with the VOLUMECORR option, when the keyword DETCFG is set to \(l/(2r)\).

The Monte Carlo must have been generated with the following zenith angle intensity:

\[I \propto \pi\cdot r^2\cdot\sin\theta\cdot(\cos\theta+2/\pi\cdot l/r\cdot\sin\theta)\]

pdf(cos_zen: ArrayLike) → NDArray[np.float64]: The probability density function for the given zenith angle.

projected_area(cos_zen: ArrayLike) → NDArray[np.float64]

Cross sectional area of a cylinder in cm^2.

As seen from the angle described by cos_zen.

simweights.NuGenWeighter(file_obj: Any, nfiles: float) → Weighter

Weighter for neutrino-generator (NuGen) simulation.

Does not use S-Frames and stores the surface information in an I3MapStringDouble so that the user does not know how many jobs contributed to the current sample, so it needs to know the number of files. NuGen calculates the event weight in a column called TotalWeight which takes into account the neutrino cross-section, detector density, and distance traveled through the generation volume.

class simweights.PDGCode(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: IntEnum

Enumeration of the PDG particle numbering scheme for cosmic-ray primaries.

The Particle Data Group (PDG) assigns a unique code to each type of particle. The numbering includes all known elementary particles, composite particles, and atomic nuclei. However this enumeration is only used for cosmic-ray flux models and is limited to particle types in these models.

Al27Nucleus = 1000130270

Ar40Nucleus = 1000180400

B11Nucleus = 1000050110

Be9Nucleus = 1000040090

C12Nucleus = 1000060120

Ca40Nucleus = 1000200400

Cl35Nucleus = 1000170350

Co59Nucleus = 1000270590

Cr52Nucleus = 1000240520

F19Nucleus = 1000090190

Fe56Nucleus = 1000260560

Gamma = 22

He4Nucleus = 1000020040

K39Nucleus = 1000190390

Li7Nucleus = 1000030070

Mg24Nucleus = 1000120240

Mn55Nucleus = 1000250550

MuMinus = 13

MuPlus = -13

N14Nucleus = 1000070140

Na23Nucleus = 1000110230

Ne20Nucleus = 1000100200

Ni59Nucleus = 1000280590

NuE = 12

NuEBar = -12

NuMu = 14

NuMuBar = -14

NuTau = 16

NuTauBar = -16

O16Nucleus = 1000080160

P31Nucleus = 1000150310

PPlus = 2212

S32Nucleus = 1000160320

Sc45Nucleus = 1000210450

Si28Nucleus = 1000140280

Ti48Nucleus = 1000220480

V51Nucleus = 1000230510

as_integer_ratio()

Return a pair of integers, whose ratio is equal to the original int.

The ratio is in lowest terms and has a positive denominator.

>>> (10).as_integer_ratio()
(10, 1)
>>> (-10).as_integer_ratio()
(-10, 1)
>>> (0).as_integer_ratio()
(0, 1)

bit_count()

Number of ones in the binary representation of the absolute value of self.

Also known as the population count.

>>> bin(13)
'0b1101'
>>> (13).bit_count()
3

bit_length()

Number of bits necessary to represent self in binary.

>>> bin(37)
'0b100101'
>>> (37).bit_length()
6

conjugate(): Returns self, the complex conjugate of any int.

denominator: the denominator of a rational number in lowest terms

from_bytes(byteorder='big', *, signed=False)

Return the integer represented by the given array of bytes.

bytes: Holds the array of bytes to convert. The argument must either support the buffer protocol or be an iterable object producing bytes. Bytes and bytearray are examples of built-in objects that support the buffer protocol.
byteorder: The byte order used to represent the integer. If byteorder is ‘big’, the most significant byte is at the beginning of the byte array. If byteorder is ‘little’, the most significant byte is at the end of the byte array. To request the native byte order of the host system, use `sys.byteorder’ as the byte order value. Default is to use ‘big’.
signed: Indicates whether two’s complement is used to represent the integer.

imag: the imaginary part of a complex number

is_integer(): Returns True. Exists for duck type compatibility with float.is_integer.

numerator: the numerator of a rational number in lowest terms

real: the real part of a complex number

to_bytes(length=1, byteorder='big', *, signed=False)

Return an array of bytes representing an integer.

length: Length of bytes object to use. An OverflowError is raised if the integer is not representable with the given number of bytes. Default is length 1.
byteorder: The byte order used to represent the integer. If byteorder is ‘big’, the most significant byte is at the beginning of the byte array. If byteorder is ‘little’, the most significant byte is at the end of the byte array. To request the native byte order of the host system, use `sys.byteorder’ as the byte order value. Default is to use ‘big’.
signed: Determines whether two’s complement is used to represent the integer. If signed is False and a negative integer is given, an OverflowError is raised.

class simweights.PowerLaw(g: float, a: float, b: float, colname: str | None = None)

Bases: object

A power-law continuous probability distribution.

This has a similar interface to the probability distribution classes found in scipy.stats. However, it has several differences needed for weighting Monte Carlo simulation:

The support is defined from a to b rather than from 0 to 1.
Negative values of the power-law index are allowed.
No shape or location parameters are supported.

The probability density function for a PowerLaw is defined as:

\[pdf(x, \gamma) = A x^{\gamma}\quad\mathrm{for}\quad a \le x \le b.\]

Parameters:

g (float) – Power-law index
a (float) – Lower bound of the support of the distribution.
b (float) – Upper bound of the support of the distribution.

cdf(x: ArrayLike) → NDArray[np.float64]

Cumulative distribution function.

Parameters:: x (array_like) – quantiles
Returns:: Cumulative distribution function evaluated at x
Return type:: array_like

pdf(x: ArrayLike) → NDArray[np.float64]

Probability density function.

Parameters:: x (array_like) – quantiles
Returns:: Probability density function evaluated at x
Return type:: array_like

ppf(q: ArrayLike) → NDArray[np.float64]

Percent point function (inverse of cdf) at q.

Parameters:: q (array_like) – lower tail probability
Returns:: quantile corresponding to the lower tail probability q.
Return type:: array_like

rvs(size: Any = None, random_state: SeedType = None) → NDArray[np.float64]

Random variates.

Parameters:

size (int or tuple of ints, optional) – Defining number of random variates (Default is 1).
random_state ({None, int, ~np.random.RandomState, ~np.random.Generator}, optional) – This parameter defines the object to use for drawing random variates. If random_state is None the ~np.random.RandomState singleton is used. If random_state is an int, a new RandomState instance is used, seeded with random_state. If random_state is already a RandomState or Generator instance, then that object is used. Default is None.

class simweights.TIG1996

Bases: CosmicRayFlux

Spectrum used to calculate prompt neutrino fluxes in Enberg et al.

Paper: (2008)[8] (Eq. 30). The parameterization was taken directly from an earlier paper by Thunman et al[9]. Only the nucleon flux was given, so for simplicity we treat it as a proton-only flux.

pdgids: Sequence[PDGCode] = (PDGCode.PPlus,)

class simweights.UniformSolidAngleCylinder(length: float, radius: float, cos_zen_min: float, cos_zen_max: float, colname: str | None = None)

Bases: CylinderBase

Events are generated uniformly on the surface of a sphere.

A Cylinder where the the angular distribution was sampled as if it were uniform on the surface of a sphere. The area of the location surface is proportional to the cross section of the cylinder perpendicular to the direction of the primary.

The Monte Carlo must have been generated with the following zenith angle intensity:

\[I \propto \cos\theta\]

pdf(cos_zen: ArrayLike) → NDArray[np.float64]: The probability density function for the given zenith angle.

projected_area(cos_zen: ArrayLike) → NDArray[np.float64]

Cross sectional area of a cylinder in cm^2.

As seen from the angle described by cos_zen.

class simweights.Weighter(data: Iterable[Any], surface: GenerationSurface)

Bases: object

Abstract base class from which all weighers derive.

Weighters will take a file object as input and calculate the weights of the events in the file for a given flux. As well as helper functions for all columns in the file. Weighters will keep track of the generation surface for the Monte Carlo in question. Weighters can be added together to form samples with different simulation parameters

add_weight_column(name: str, column: ArrayLike) → None: Add a new column to be passed as parameters to flux models.

effective_area(energy_bins: ArrayLike, cos_zenith_bins: ArrayLike, mask: ArrayLike | None = None) → NDArray[np.float64]

Calculate The effective area for the given energy and zenith bins.

This is accomplished by histogramming the generation surface the simulation sample in energy and zenith bins and dividing by the size of the energy and solid angle of each bin. If mask is passed as a parameter, only events which are included in the mask are used. Effective areas are given units of \(\mathrm{m}^2\)

Note

If the sample contains more than one type of primary particle, then the result will be averaged over the number of particles. This is usually what you want. However, it can cause strange behavior if there is a small number of one type. In this case, the mask should be used to select the particle types individually.

Parameters:

energy_bins – array_like A length N+1 array of energy bin edges
cos_zenith_bins – array_like A length M+1 array of cos(zenith) bin edges
mask – array_like boolean array where 1 indicates to use the event in the calculation. Must have the same length as the simulation sample.

Returns:

array_like: An NxM array of effective areas. Where N is the number of energy bins and M is the number of cos(zenith) bins.

get_column(table: str, column: str) → NDArray[np.float64]: Helper function to get a specific column from the file.

get_weight_column(name: str) → NDArray[np.float64]: Helper function to get a column needed in the weight calculation.

get_weights(flux: Any) → NDArray[np.float64]

Calculate the weights for the sample for the given flux.

Parameters:: flux (Callable | FluxFunction | ArrayLike) – A model describing the flux of the primaries to weight against

The Flux argument can be one of several types:

An instance of nuflux.FluxFunction from nuflux.
A callable where the names of the arguments match the weight objects weighting columns.
An iterable with the same length as the data sample. If you have other means of calculating the flux for each event than the above options this can be useful.
A scalar number, this calculates the unrealistic scenario where all events have the same flux, this can be useful for testing or calculating effective areas. For neutrinos, If the value is 1 then the return value will be the well known quantity OneWeight.

tostring(flux: None | object | Callable[[Any], ArrayLike] | ArrayLike = None) → str

Creates a string with important information about this weighting object.

Generation surface, event map, number of events, and effective area. If optional flux is provided the event rate and livetime are added as well.

simweights.corsika_to_pdg(cid: ArrayLike) → NDArray[np.float64]

Convert CORSIKA particle code to particle data group (PDG) Monte Carlo numbering scheme.

Note

This function will only convert codes that correspond to nuclei needed for the flux models in this module. That includes PPlus (14) and He4Nucleus (402) through Fe56Nucleus (5626).

Parameters:: cid (array_like) – CORSIKA codes
Returns:: PDG codes
Return type:: array_like

simweights.generation_surface(pdgid: int | PDGCode, *dists: Dist) → GenerationSurface: Convenience function to generate a GenerationSurface for a single particle type.

References.