.. _data_reading:

===============
Reading in Data
===============

As with most cosmic ray experiments, the analysis is all about data crunching. In this project, two types of files are expected: simulated files and measured data files. Because this project lives in icetray, it is only natural to lean heavily on the existing implementation of I3 files and the *tray* structure. The only external capability needed is, potentially, an interface to the output of third-party simulation programs (e.g. CoREAS). This section describes the file readers implemented in the project, located in ``./radcube/readers/``.

---------------------
On I3 Files For Radio
---------------------

The backbone of icetray is the frame. There are a few types of standard frames: G (geometry), C (calibration), D (detector status), Q (data acquisition), and P (physics). The first three are used to store a description of the detector status for a specified time interval. These frames, particularly the calibration one, are quite large and are thus typically stored separately from the other frames. The Q and P frames are, in a sense, the events themselves.

Within the :ref:`radcube` project, all simulated and real data will be put into the Q frame in an identical format. This is done so that any analysis routine can be run on either simulated or real data without any changes to the code. That being said, simulated data will also include various items describing the cosmic ray "truth".

.. _secCoreasSimulations:

------------------------
CoREAS Simulations
------------------------

So far, the only simulation package for which radcube has implemented readers is the `CoREAS`_ package, which is a subpackage of the `CORSIKA`_ framework.
The general flow of performing a simulation with CoREAS is to provide the program three files:

- A steering (.inp) file which defines the initial conditions of the primary cosmic ray
- A list (.list) file of the antenna locations and (arbitrary) names
- A CoREAS steering (.reas) file that specifies the granularity of the CoREAS output and the core location

Likewise, the outputs of the simulation are:

- A *particle file* or *ground file* with the name DATXXXXXX, where XXXXXX is the event ID
- A longitudinal profile description (.long)
- A directory, ``./DATXXXXXX_coreas/``, filled with 3D electric field traces at the locations specified in the .list file
- Information that gets printed to the screen. This information MUST be stored as a file named DATXXXXXX.log

Note that it is *required* that each shower be in its own directory named ``./XXXXXX/`` and that all (or at least most, see below) of these files are inside. It is sometimes standard practice to *resample* a simulation or to have one ground file contain multiple showers. However, this is not possible with CoREAS since the shower geometry must be totally defined with respect to the location of the antennas beforehand (in the .reas file).

The radcube project needs a few of these files to fully read in the simulation. The .reas file is needed to read in the true shower initial conditions, such as energy and direction. These are typically obtained by reading the ground file; however, this requires a dedicated parser of the CORSIKA binaries. The .list file is read in to find the detector geometry. The .log file is read in to get some of the other shower properties, like Xmax, which are not known before the simulation runs.

.. note::
   Since every simulation has a .list file that directly defines the detector positions, simulated GCD files for the antenna array can always be computed on the fly with minimal extra computational effort.

Finally, the electric field waveforms are read in.
Within the ``./DATXXXXXX_coreas/`` directory there is one file for each antenna in the .list file. Each file is an ASCII file where each row includes a column for the timestamp, Ex, Ey, and Ez. The time step between rows is determined by the settings in the .reas file.

.. Warning::
   All outputs of CoREAS/CORSIKA are in cgs units (yuck), meaning that lengths are given in centimeters, masses in grams, and times in seconds. But to make things even worse, the base unit of energy is not g cm^2/s^2 (the erg, :math:`10^{-7}` J), as you would expect, but GeV. So some care must be taken.

   Further, note that the base unit of voltage in cgs units is the *statvolt*, which has no direct unit conversion to the volt, in the strict sense. Coulomb's Law in CGS is :math:`F = Q_1 Q_2 / r^2`, i.e. :math:`k=1`. Thus a voltage expressed in volts is :math:`V_{SI} = c \times V_{Stat}`, where :math:`c` is the numerical value of the speed of light in units of meters per microsecond (~300). The electric field in CGS has units of statvolt per cm.

The readers for the CoREAS output files are in ``./radcube/readers`` and include all of the machinery necessary to parse the output of the simulations. They return an icetray or radcube object with the respective values, for instance, an ``I3Particle`` for the primary. In this way, they can be used independently of any tray, but it is highly recommended to simply use the radcube module ``CoreasReader``, which will call these functions in a meaningful way.

There are functions to make a G (``UpdateGeometryFromListLine()``), C (``UpdateCalibrationFromListLine()``), and D (``UpdateDetectorStatusFromListLine()``) frame. However, the C and D frames will essentially be empty. For the G frame, the .list file is read and antennas are placed at each location. The poles of the antenna are aligned with the CORSIKA coordinate system (see :ref:`rad_coords`), where the pole of channel 1 is along magnetic north and the pole of channel 2 is along magnetic east. These are then rotated into the IC coordinate system.
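
As a rough illustration of the parsing and unit handling described in this section, the following standalone sketch reads one line of a .list file and one antenna trace and converts both to SI. It is not the radcube implementation: the function names ``parse_list_line`` and ``read_trace`` are hypothetical, the ``AntennaPosition = x y z name`` line layout is an assumption about the .list format, and the rotation into the IC coordinate system is not attempted.

.. code-block:: python

   import io

   # Numerical value of the speed of light in m/us; one statvolt
   # corresponds to this many volts.
   C_M_PER_US = 299.792458
   CM_TO_M = 0.01
   # Conversion factor for the electric field: statvolt/cm -> V/m
   STATVOLT_PER_CM_TO_V_PER_M = C_M_PER_US / CM_TO_M


   def parse_list_line(line):
       """Return (name, (x, y, z) in meters), or None for any other line.

       Assumes the layout 'AntennaPosition = x y z name', lengths in cm.
       """
       key, _, rest = line.partition("=")
       fields = rest.split()
       if key.strip() != "AntennaPosition" or len(fields) < 4:
           return None
       position_m = tuple(float(v) * CM_TO_M for v in fields[:3])
       return fields[3], position_m


   def read_trace(stream):
       """Read rows of 't Ex Ey Ez' (s, statvolt/cm) and return SI values."""
       times_s, efield_v_per_m = [], []
       for row in stream:
           cols = row.split()
           if len(cols) < 4:
               continue  # skip blank or incomplete rows
           t, ex, ey, ez = (float(c) for c in cols[:4])
           times_s.append(t)
           efield_v_per_m.append(
               tuple(e * STATVOLT_PER_CM_TO_V_PER_M for e in (ex, ey, ez))
           )
       return times_s, efield_v_per_m


   # Made-up inputs, for illustration only.
   name, position_m = parse_list_line(
       "AntennaPosition = 10000.0 -5000.0 283800.0 ant_1"
   )
   times_s, efield = read_trace(
       io.StringIO("0.0 1.0 0.0 -2.0\n1e-9 0.0 0.5 0.0\n")
   )
   print(name, position_m)
   print(times_s[1], efield[1])

Keeping the conversion factors as named constants makes it explicit where the cgs values are left behind. In practice the ``CoreasReader`` module remains the recommended entry point.
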

-------------------
Measured Data Files
-------------------

Data has started arriving from the pole via a soft trigger from the DAQ board. These traces are in a potentially-not-yet-final binary format. The data can be read in using the taxi-reader project, which parses the binary files and arranges the radio data into IC data classes. For more information on the data, see `the wiki `_.

The scintillator data is not currently being processed and the DAQ software is guaranteed to be augmented in the future. However, the classes unique to this project should be able to handle the data organization for future formats.

.. _Coreas: http://www.timhuege.de/coreas
.. _CORSIKA: https://www.ikp.kit.edu/corsika/