How to Combine Datasets

The syntax for combining multiple datasets is quite simple. Use the addition operator to combine two or more weighter object. The resulting object will correctly weight the combined sample.

# load the Medium Energy file
ME_file = tables.File("Level2_IC86.2016_corsika.021746.hdf5", "r")
ME_weighter = simweights.CorsikaWeighter(ME_file)

# load the High Energy file
HE_file = tables.File("Level2_IC86.2016_corsika.021745.hdf5", "r")
HE_weighter = simweights.CorsikaWeighter(HE_file)

# A combined weighter is created by summing two weighters
combined_weighter = ME_weighter + HE_weighter

# create a flux object and calculate the weights for all three weighters
flux_model = simweights.GaisserH3a()
ME_weights = ME_weighter.get_weights(flux_model)
HE_weights = HE_weighter.get_weights(flux_model)
combined_weights = combined_weighter.get_weights(flux_model)

# use Weighter.get_weight_column() to get the MC truth energy for each sample
ME_energy = ME_weighter.get_weight_column("energy")
HE_energy = HE_weighter.get_weight_column("energy")
combined_energy = combined_weighter.get_weight_column("energy")

# Histogram all three samples
Ebins = np.geomspace(3e4, 1e10, 64)
plt.hist(ME_energy, bins=Ebins, weights=ME_weights, histtype="step", label="Medium Energy")
plt.hist(HE_energy, bins=Ebins, weights=HE_weights, histtype="step", label="High Energy")
plt.hist(
    combined_energy,
    bins=Ebins,
    weights=combined_weights,
    histtype="step",
    color="k",
    ls="--",
    label="Combined",
)
_images/combined_mctruth.svg

To obtain values from reconstructions with combined datasets the get_column() function is quite useful. It will get the corresponding column from each hdf5 file and append them into a single numpy array.

# use get_column() to return the Qtot for each sample
HE_Qtot = HE_weighter.get_column("Homogenized_QTot", "value")
ME_Qtot = ME_weighter.get_column("Homogenized_QTot", "value")
Combined_Qtot = combined_weighter.get_column("Homogenized_QTot", "value")

# histogram the Qtot
Qbins = np.geomspace(10, 1e6, 64)
plt.hist(ME_Qtot, bins=Qbins, weights=ME_weights, histtype="step", label="Medium Energy")
plt.hist(HE_Qtot, bins=Qbins, weights=HE_weights, histtype="step", label="High Energy")
plt.hist(
    Combined_Qtot,
    bins=Qbins,
    weights=combined_weights,
    histtype="step",
    label="Combined",
    ls="--",
    color="k",
)
_images/combined_qtot.svg