diff --git a/.gitignore b/.gitignore index 40a5112..025e696 100644 --- a/.gitignore +++ b/.gitignore @@ -21,3 +21,7 @@ gitmeta.tex # OS-specific .DS_Store + +#githubworkflow +./github/workflows/* + diff --git a/HighEnergyObsCoreExt.tex b/HighEnergyObsCoreExt.tex index e8cecf9..262a2b8 100644 --- a/HighEnergyObsCoreExt.tex +++ b/HighEnergyObsCoreExt.tex @@ -142,7 +142,7 @@ \section{High Energy Astrophysics Data} Observations of the universe at the highest energies are based on techniques that are radically different compared to the UV through radio domains. \gls{HEA} observatories\footnote{For example, Chandra, XMM-Newton, Fermi, H.E.S.S., MAGIC, VERITAS, HAWC, LHAASO, IceCube, ANTARES, Auger, and soon CTAO, KM3NeT, and SWGO.} are generally designed to detect particles ({\em e.g.\/}, individual photons, cosmic-rays, or neutrinos) with the ability to estimate multiple observables for those particles. These detection techniques all rely on {\em event counting\/}\footnote{As opposed to signal integrating ({\em e.g.\/}, using a detector that accumulates the total photon signal during an exposure).}, where an event has some probability of being due to the interaction of a particle from an astrophysical source with the detectors, but also has some probability of being from instrumental or background effects. The data corresponding to an event are first an instrumental signal, which is then calibrated and processed to estimate physical quantities such as a time of arrival, point-of-origin on the sky, and an energy proxy associated with the event. Several other intermediate and qualifying characteristics may be associated with a detected event, depending on the detection technique. The ensemble of events detected over a given time interval and spatial field-of-view is referred to as an {\em event list\/}, which we designate an {\bf event-list} in this document. 
-Though {\bf event-list}s {\em may\/} include estimators for calibrated physical values, they typically still have to be corrected for the photometric, spectral, spatial, and/or temporal responses of the telescope and detector combination to yield scientifically interpretable information. The mappings between physical measurements of the source properties and the observables are called Instrument Response Functions (\glspl{IRF}\footnote{We try to avoid using the term \gls{IRF} in a normative sense since historical usage across the broad \gls{HEA} community (and from facility to facility) varies. In some cases, \gls{IRF} has been used to mean specifically the product of the \gls{ARF} and \gls{RMF}, whereas in other cases \gls{IRF} has been used more generally to mean any instrumental response function regardless of type.}). Some \glspl{IRF} are probabilistic in nature\footnote{For example, the energy matrix is a probability density function.}, and in addition may depend on the set of events selected for analysis by the end user. They are usually not invertible, so methods such as forward-folding fitting (using source models with any combination of spectral, spatial, temporal, and/or polarization components that are estimated) are needed to estimate physical properties, such as the true flux of particles from a source arriving at the instrument, given the measured observable quantities. The \glspl{IRF} generally evolve over time with the instrument and observation characteristics, and are usually defined for a specific time interval and may be decomposed into a standard set of independent components (see \S~3.1.5 of \citealt{2024ivoa.note.heig}), such as the spatial point-spread function or the energy-migration matrix or different messenger particle types, where each component may be stored or computed separately. 
Since both \glspl{IRF} and {\bf event-list}s are required to analyze \gls{HEA} data, some \gls{IVOA} standards must be modified in order to expose both of them via the \gls{VO}. +Though {\bf event-list}s {\em may\/} include estimators for calibrated physical values, they typically still have to be corrected for the photometric, spectral, spatial, and/or temporal responses of the telescope and detector combination to yield scientifically interpretable information. The mappings between physical measurements of the source properties and the observables are called Instrument Response Functions (\glspl{IRF}\footnote{We try to avoid using the term \gls{IRF} in a normative sense since historical usage across the broad \gls{HEA} community (and from facility to facility) varies. In some cases, \gls{IRF} has been used to mean specifically the product of the \gls{ARF} and \gls{RMF}, whereas in other cases \gls{IRF} has been used more generally to mean any instrumental response function regardless of type.}). Some \glspl{IRF} are probabilistic in nature\footnote{For example, the energy matrix is a probability density function.}, and in addition may depend on the set of events selected for analysis by the end user. They are usually not invertible, so methods such as forward-folding fitting (using source models with any combination of spectral, spatial, temporal, and/or polarization components that are estimated) are needed to estimate physical properties, such as the true flux of particles from a source arriving at the instrument, given the measured observable quantities. 
The \glspl{IRF} generally evolve over time with the instrument and observation characteristics, and are usually defined for a specific time interval and may be decomposed into a standard set of independent components (see \S~3.1.5 of \citealt{2024ivoa.note.heig}), such as the spatial point-spread function or the energy-migration matrix or different messenger particle types, where each component may be stored or computed separately. Since both \glspl{IRF} and {\bf event-list}s are required to analyze \gls{HEA} data, some \gls{IVOA} standards must be modified in order to expose both of them via the \gls{VO}. In the following, the current ObsCore standard will be discussed in \S~\ref{sec:obscore}, focusing on attributes that need to be modified. Then, we propose the creation of a \gls{HEA} extension of ObsCore in \S~\ref{sec:obscoreext}, as some attributes are very specific to our domain. In these two sections, the discussion focuses on the attribute definitions rather on the attribute values. In \S~\ref{sec:voc}, enhancement of vocabulary is proposed for some ObsCore attributes, DataLink semantics, UCDs, and MIME-types. @@ -169,7 +169,7 @@ \subsection{{\em dataproduct\_type}} We propose to add the following {\em dataproduct\_type\/} terms to ObsCore to better define a \gls{HEA} {\bf event-list} and an {\bf event-bundle} that includes the {\bf event-list} and associated data: \begin{quote} -{\bf event-list}: a dataset that records a collection of observed particle-detection events, such as incoming high-energy particles, where an event is typically characterized by a spatial position, a time, and a spectral value ({\em e.g.\/}, an energy, a channel, a pulse height). +{\bf event-list}: a dataset that records a collection of observed particle-detection events, such as incoming high-energy particles, where an event is typically characterized by a spatial position, a time, and a spectral value ({\em e.g.\/}, an energy, a channel, a height). 
{\bf event-bundle}: a compounded dataset containing an {\bf event-list} and multiple files or other substructures that are products necessary to analyze the event-list. Data in an {\bf event-bundle} may thus be used to produce higher level data products calibrated in physical units when containing \glspl{IRF} or other data products that can be used to construct \glspl{IRF}. \end{quote} @@ -253,13 +253,13 @@ \subsection{{\em o\_ucd}} For an {\bf event-list}, we can consider that all measures stored in column values are observables. This is {\em the\/} fundamental difference between \gls{HEA} {\bf event-list}s and typical pixelated datasets. The current ObsCore Recommendation suggests that {\em o\_ucd\/} be set to ``NULL'' for event lists. However this significantly hampers data discovery for \gls{HEA} datasets. Since the data content of {\bf event-list}s may vary significantly from facility to facility, meaningful discovery of \gls{HEA} datasets {\em requires\/} the user be able to query the UCDs of the set of observables included in an {\bf event-list}. -A natural way of doing this that is consistent with current usage would be to extend {\em o\_ucd\/} to allow specification of {\em multiple\/} observables for {\bf event-list}s (and {\bf event-bundle}s), for example, {\em o\_ucd\/} = {\em `pos.eq\#time\#instr.event.pulse\-Height'\/}. We propose using the {\em hash symbol\/} (`\#') to separate UCDs for the multiple observables to distinguish from the case where multiple UCD words separated by semicolons may be needed to define the UCD for a single observable. This follows a suggestion from the EPN-TAP Recommendation \citep{2022ivoa.spec.0822E} to use the hash symbol as a separator. Doing so can simplify ADQL queries since ADQL includes a {\tt ivo\_hashlist\_has} IVOA-standardized user defined function that can be used to validate if a particular UCD is included. 
One can also perform an ADQL query similar to ``o\_ucd LIKE `\%string\%'\null'' if all that is desired is to verify the presence of a specific UCD `string'. +A natural way of doing this that is consistent with current usage would be to extend {\em o\_ucd\/} to allow specification of {\em multiple\/} observables for {\bf event-list}s (and {\bf event-bundle}s), for example, {\em o\_ucd\/} = {\em `pos.eq\#time\#instr.pulse;arith.sum'\/}. We propose using the {\em hash symbol\/} (`\#') to separate UCDs for the multiple observables to distinguish from the case where multiple UCD words separated by semicolons may be needed to define the UCD for a single observable. This follows a suggestion from the EPN-TAP Recommendation \citep{2022ivoa.spec.0822E} to use the hash symbol as a separator. Doing so can simplify ADQL queries since ADQL includes a {\tt ivo\_hashlist\_has} IVOA-standardized user defined function that can be used to validate if a particular UCD is included. One can also perform an ADQL query similar to ``o\_ucd LIKE `\%string\%'\null'' if all that is desired is to verify the presence of a specific UCD `string'. We note that extending {\em o\_ucd\/} to allow specification of multiple observables would require similar adjustments to the other observable axis attributes {\em o\_unit\/}, {\em o\_calib\_status\/}, and {\em o\_stat\_err\/}. Note that real {\bf event-list}s may include an extensive set of columns ({\em e.g.\/}, a Chandra ACIS Level~1 {\bf event-list} includes $\sim\!20$ columns, depending on observing mode) and several columns may represent similar (but not identical) observables ({\em e.g.\/}, event position in detector pixel coordinates, projected onto the focal surface, corrected for geometric distortions, corrected for spacecraft dither motion, mapped to world coordinates). Currently defined UCDs are not sufficiently fine-grained to be able to differentiate between these various cases. 
But that is very likely not necessary, since for data discovery purposes the user is typically interested in the ``most calibrated'' properties in each of the spatial/spectral/time(/polarization) axes ({\em e.g.\/}, world coordinates in the above example). -In the example {\em o\_ucd\/} above, the UCD {\em instr.event.pulseHeight\/} is used to represent the detector Pulse Height Amplitude (PHA)\null. There is currently no UCD defined for a raw measure like PHA, but we propose the addition of {\em instr.event.pulseHeight\/} to the UCDs list vocabulary, together with other UCDs that are relevant for \gls{HEA} data, in \S~\ref{sec:UCDs}. Several additional UCDs, including electromagnetic spectrum, physical quantities, and statistical parameters UCDs, are also proposed in \S~\ref{sec:UCDs} that are relevant for \gls{HEA} data products but could also be of use for other domains such as cosmology. +In the example {\em o\_ucd\/} above, the UCD {\em instr.pulse;arith.sum\/} is used to represent the detector Pulse Height Amplitude (PHA)\null. There is currently no UCD defined for a raw measure like PHA, but we propose the addition of {\em instr.pulse\/} to the UCDs list vocabulary, together with other UCDs that are relevant for \gls{HEA} data, in \S~\ref{sec:UCDs}. Several additional UCDs, including electromagnetic spectrum, physical quantities, and statistical parameters UCDs, are also proposed in \S~\ref{sec:UCDs} that are relevant for \gls{HEA} data products but could also be of use for other domains such as cosmology. Advanced data products may similarly record multiple observables that can only be differentiated through their UCDs. For example, a Chandra Source Catalog {\bf pdf} dataset for a detection may include multiple marginalized probability density functions computed using a Bayesian X-ray aperture photometry algorithm in units of net counts, net count rates, photon fluxes, and energy fluxes in multiple apertures. 
The observables recorded in the different MPDFs may be distinguished by their UCDs which then become relevant for data discovery when a user is searching for specific aperture photometry datasets. @@ -290,7 +290,7 @@ \subsection{{\em t\_intervals}} \subsection{{\em energy\_min\/}/{\em energy\_max\/}} -The existing attributes {\em em\_min\/} and {\em em\_max\/} that define the coverage of the spectral axis (defined as wavelength expressed in units of m) are not user friendly for \gls{HEA} where datasets are generally selected according to an energy range ({\em i.e.\/}, inverse wavelength) in units of eV (or scaled units of eV, for example keV, MeV, GeV, TeV, PeV). Unlike the radio domain where $\lambda = c/\nu$, where $c$ is an almost universally remembered physical constant, the conversion $\lambda = hc/E$ is not simple for the user to express. As the spectral range covered by \gls{HEA} data is many decades larger than for other wavebands, the accurate numerical representations of typical \gls{HEA} spectral ranges as {\em em\_min\/}/{\em em\_max\/} requires quantities with many digits of precision and exponents ranging from $\sim\!10^{-5}$--$10^{-22}$, and are misleading when used for energy ranges of massive particles. Since specification of the spectral range is largely fundamental to data discovery in the \gls{HEA} regime, we propose to add attributes {\em energy\_min\/} and {\em energy\_max\/} that specify the minimum and maximum spectral range values in units of eV\null. 
Note that the sense of these attributes is {\em opposite\/} that of {\em em\_min\/} and {\em em\_max\/} because of the inverse wavelength relationship between energy and wavelength, so numerical comparisons must be transposed ({\em e.g.\/}, $E>E_{\rm thresh}$ becomes $\lambda<\lambda_{\rm thresh}$). [...] ($>100$ TeV) \cr -Q & {\em instr.event\/} & Particle event detection \cr -Q & {\em instr.event.grade\/} & Particle event grade \cr -Q & {\em instr.pulseHeight\/} & Pulse height amplitude measure \cr -Q & {\em instr.event.type\/} & Particle event type \cr +Q & {\em instr.detection\/} & Particle event detection \cr +%Q & {\em instr.event.grade\/} & Particle event grade \cr +Q & {\em instr.pulse\/} & Pulse height amplitude measure \cr +%Q & {\em instr.event.type\/} & Particle event type \cr E & {\em phot.count.density\/} & Count flux density (dimensionality: $\rm [L^{-2}\,T^{-1}\,E^{-1}]$) \cr E & {\em phot.count.density.sb\/} & Count flux density surface brightness (dimensionality: $\rm [L^{-2}\,T^{-1}\,E^{-1}\,\hbox{sr}^{-1}]$) \cr E & {\em phot.count.radiance\/} & Count flux radiance (dimensionality: $\rm [L^{-2}\,T^{-1}\,\hbox{sr}^{-1}]$) \cr @@ -577,16 +595,19 @@ \subsubsection{Evolution of UCD list} E & {\em phot.flux.particle.sb\/} & Particle flux surface brightness (dimensionality: $\rm [L^{-2}\,T^{-1}\,\hbox{sr}^{-1}]$) \cr S & {\em phys.particle.antiprotron\/} & Related to anti-proton \cr S & {\em phys.particle.cosmicray\/} & Related to cosmic rays particles \cr -S & {\em phys.particle.electron\/} & Related to electron \cr +%S & {\em phys.particle.electron\/} & Related to electron \cr S & {\em phys.particle.photon\/} & Related to photon \cr S & {\em phys.particle.positron\/} & Related to positron \cr -S & {\em phys.particle.pdgid\/} & Particle Data Group Identifier \cr -S & {\em phys.particle.pdgid$\pm$XX\/} & Related to a particle with PDG ID $\pm$XX \cr -P & {\em stat.distribution\/} & Type or shape of statistical distribution \cr -P & {\em 
stat.error.negative\/} & Negative statistical error \cr -P & {\em stat.error.positive\/} & Positive statistical error \cr -P & {\em stat.lowerlimit\/} & Lower limit \cr -P & {\em stat.upperlimit\/} & Upper limit \cr +% Mireille we cannot have these terms in the UCD tree ; that would imply importing all possible encoding of any kind +%S & {\em phys.particle.pdgid\/} & Particle Data Group Identifier \cr +%S & {\em phys.particle.pdgid$\pm$XX\/} & Related to a particle with PDG ID $\pm$XX \cr +%mireille +S & {\em stat.distribution\/} & Related to a statistical distribution \cr +%mireille update to latest discussed term +P & {\em stat.error.minus\/} & Negative statistical error \cr +P & {\em stat.error.plus\/} & Positive statistical error \cr +P & {\em stat.lowerlimit\/} & Lower limit value \cr +P & {\em stat.upperlimit\/} & Upper limit value \cr \sptablerule \caption{Proposed New UCD Entries} \label{tab:he_ucds} @@ -602,6 +623,7 @@ \subsubsection{Evolution of UCD list} E & {\em phot.fluence\/} & Radiant photon energy received by a surface per unit area or irradiance of a surface integrated over time of irradiation (dimensionality: $\rm [L^{-2}]$) \cr Q & {\em phot.flux.bol\/} & Bolometric flux (dimensionality: $\rm [M\,T^{-3}]$) \cr E & {\em phot.radiance\/} & Radiance as energy flux per solid angle (dimensionality: $\rm [M\,T^{-3}\,\hbox{sr}^{-1}]$) \cr +%mir the case of electron would be in a VEP to discuss the backward compatibility of this change S & {\em phys.electron\/} & Electron (not recommended/deprecate) \cr S & {\em stat.min\/} & Minimum value \cr S & {\em stat.max\/} & Maximum value \cr @@ -610,6 +632,10 @@ \subsubsection{Evolution of UCD list} \label{tab:upgrade_he_ucds} \end{longtable} +Note that including dimensional equations in the definitions of UCD terms is a useful feature +for comparing quantities across various spectral domains. 
+ + \subsection{MIME-type Enhancements}\label{sec:mimetypes} Data files used in the \gls{HEA} domain should have appropriate MIME-types, so that they can be included in ObsCore tables or elsewhere. @@ -627,13 +653,12 @@ \section{Proposed ivoa.obscore\_hea Table Attributes}\label{sec:ibscoreext} \begin{landscape} \begin{center} -%start mireille ucd update \begin{longtable}{ | m{0.15\linewidth} | m{0.23\linewidth} | m{0.07\linewidth} | m{0.07\linewidth} | m{0.4\linewidth} | m{0.05\linewidth} |} \hline {\centering \bf Column Name} &{\centering \bf UCD} &{\centering \bf Unit} &{\centering \bf Type} &{\centering \bf Description} &{\centering \bf MAN}\\ \hline - ev\_xel & \ucd{meta.number;obs.event} & unitless & int & {Number of events in an event\_list }& NO \\ + ev\_xel & \ucd{meta.number;instr.detection} & unitless & int & {Number of events in an event\_list }& NO \\ \hline s\_ref\_energy & \ucd{meta.ref;em.energy;pos} & eV & float & {Energy at which the ObsCore spatial characterization attributes s\_fov , s\_region, s\_resolution are defined} & NO \\ \hline @@ -655,57 +680,15 @@ \section{Proposed ivoa.obscore\_hea Table Attributes}\label{sec:ibscoreext} \hline analysis\_mode & \ucd{meta.code;obs.param} & unitless & string &{Data reduction/analysis mode}& NO \\ \hline - event\_type & \ucd{meta.code.qual;obs.event} & unitless & string &{Data quality flag of the events ({\em e.g.\/}, ``good psf'', ``good rejection'', ``Nhit (100,200)''} & NO \\ + event\_type & \ucd{meta.code.qual;instr.detection} & unitless & string &{Data quality flag of the events ({\em e.g.\/}, ``good psf'', ``good rejection'', ``Nhit (100,200)'')} & NO \\ \hline - messenger & \ucd{TBD} & unitless & string &{Messenger particle type ({\em e.g.\/}, ``photon'', ``cosmic-ray'', ``neutrino'', ``pdgid-13'')} & NO \\ + messenger & \ucd{meta.name;phys.particle} & unitless & string &{Messenger particle type ({\em e.g.\/}, ``photon'', ``cosmic-ray'', ``neutrino'', ``pdgid-13'')} & NO \\ \hline \end{longtable} 
%\end{center} -% end mireille ucd update \end{center} \end{landscape} -%\begin{landscape} -%\begin{center} -%%\begin{longtable}{ | m{2.5cm} | m{3em} | m{3em} | m{3em} | m{6cm} | m{2.3em} |} -%\begin{longtable}{ | p{0.125\linewidth} | p{0.075\linewidth} | p{0.075\linewidth} | p{0.075\linewidth} | p{0.6\linewidth} | p{0.05\linewidth} |} -%\hline -%{\centering \bf Column Name} &{\centering \bf UType} &{\centering \bf Unit} &{\centering \bf Type} &{\centering \bf Description} &{\centering \bf MAN}\\ -%\hline -%{\em ev\_xel\/} & TBD & unitless & integer & Number of events in an event list & NO \\ -%\hline -%{\em s\_ref\_energy\/} & TBD & eV & double & Energy at which the ObsCore spatial characterization attributes {\em s\_fov\/}, {\em s\_region\/}, {\em s\_resolution\/} are defined & NO \\ -%\hline -%{\em em\_ref\_energy\/} & TBD & eV & double & Energy at which the ObsCore spectral characterization attributes {\em em\_res\_power\/}, {\em em\_resolution} are defined & NO \\ -%\hline -%{\em s\_ref\_oaa\/} & TBD & deg & double & Off-axis angle ({\em i.e.\/}, the angular separation of the target or source from the telescope optical axis) at which the ObsCore spatial characterization attributes {\em s\_fov\/}, {\em s\_region\/}, {\em s\_resolution\/} are defined & NO \\ -%\hline -%{\em em\_ref\_oaa\/} & TBD & deg & double & Off-axis angle ({\em i.e.\/}, the angular separation of the target or source from the telescope optical axis) at which the ObsCore spectral characterization attributes {\em em\_res\_power\/}, {\em em\_resolution\/} are defined & NO \\ -%\hline -%{\em t\_intervals\/} & TBD & unitless & string & List of observation intervals or stable/good time intervals describing the exact observation time coverage as a TMOC & NO \\ -%\hline -%{\em energy\_min\/} & TBD & eV & double & Energy associated to the ObsCore attribute {\em em\_max\/}, describing the minimum energy of the dataset & NO \\ -%\hline -%{\em energy\_max\/} & TBD & eV & double & Energy associated to the 
ObsCore attribute {\em em\_min\/}, describing the maximum energy of the dataset & NO \\ -%\hline -%{\em obs\_mode\/} & TBD & unitless & string & Observation mode of an observation & NO \\ -%\hline -%{\em tracking\_type\/} & TBD & unitless & string & Tracking type of an observation & NO \\ -%\hline -%{\em scan\_mode\/} & TBD & unitless & string & Scan mode of an observation & NO \\ -%\hline -%{\em pointing\_mode\/} & TBD & unitless & string & Pointing mode of an observation & NO \\ -%\hline -%{\em analysis\_mode\/} & TBD & unitless & string & Data reduction/analysis mode & NO \\ -%\hline -%{\em event\_type\/} & TBD & unitless & string & Event subset indicator ({\em e.g.\/}, data quality flag for the events) & NO \\ -%\hline -%\caption{Attributes for the \gls{HEA} Extension of ObsCore} -%\label{tab:hea_ext_attr} -%\end{longtable} -%\end{center} -%\end{landscape} - \pagebreak %\printglossaries
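Reviewer note on the {\em energy\_min\/}/{\em energy\_max\/} hunk: the text describes the conversion $\lambda = hc/E$ and the swap of the min/max sense between the energy and wavelength attributes. A minimal sketch of that mapping is given here as an illustration only. The function name `energy_range_to_em_range` and the constant names are hypothetical and not part of the proposal, and the conversion is meaningful for photons only (the text itself notes that wavelength ranges are misleading for massive particles).

```python
# Illustrative sketch only: names below are hypothetical, not part of ObsCore.
# Exact SI constants (2019 redefinition of the SI).
PLANCK_J_S = 6.62607015e-34        # Planck constant h, in J s
LIGHT_SPEED_M_S = 299792458.0      # speed of light c, in m/s
ELECTRON_VOLT_J = 1.602176634e-19  # 1 eV expressed in J

# h*c expressed in eV m, so that wavelength [m] = HC_EV_M / energy [eV].
HC_EV_M = PLANCK_J_S * LIGHT_SPEED_M_S / ELECTRON_VOLT_J


def energy_range_to_em_range(energy_min_ev, energy_max_ev):
    """Map a photon energy range in eV to (em_min, em_max) in metres.

    The sense is inverted: the highest energy gives the shortest
    wavelength, so energy_max maps to em_min and energy_min to em_max.
    """
    if not 0.0 < energy_min_ev <= energy_max_ev:
        raise ValueError("expected 0 < energy_min_ev <= energy_max_ev")
    em_min_m = HC_EV_M / energy_max_ev
    em_max_m = HC_EV_M / energy_min_ev
    return em_min_m, em_max_m


if __name__ == "__main__":
    # A 1 keV to 100 TeV photon range; em_max is roughly 1.24e-9 m.
    em_min, em_max = energy_range_to_em_range(1.0e3, 1.0e14)
    print(f"em_min = {em_min:.4e} m, em_max = {em_max:.4e} m")
```

The same inversion applies to query constraints: a lower bound on energy becomes an upper bound on wavelength, which is why a dedicated pair of energy attributes is easier for users to query directly.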