@wdconinc
Contributor

Briefly, what does this PR introduce?

This PR adds a datatype to record the "truthiness" (as mathematically defined...) for a reconstructed event; where truthiness is the "quality of seeming or being felt to be true, even if not necessarily true," or in this case also "the amount of confidently proclaiming the wrong thing to be true."

Mathematically, truthiness is a non-negative value that is zero only for perfectly reconstructed events (positive-definite), and is radially increasing in the error of the reconstruction (greater error leads to greater truthiness).

It is possible to define truthiness in multiple ways, but we will typically use some combination of the following components:

  • a χ² measure on associated reconstructed and generated particles, with normalization given by the determined uncertainty in the reconstruction (if available) or 1 GeV otherwise,
  • a positive penalty term for discrete reconstruction errors, such as PID mis-identification (where weighting can be used to penalize some mis-identification more than others),
  • a positive penalty term for generated particles that should have been reconstructed, but weren't,
  • a positive penalty term for reconstructed particles that were not part of the original event record.
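
As a rough illustration of how these components could combine, here is a minimal Python sketch. It is not the implementation in this PR: the `Particle` and `Association` classes, the function names, and the unit penalty weights are all hypothetical, and summing the terms with equal weight is only one of the possible definitions mentioned above. The only values taken from the description are the χ² form and the 1 GeV fallback normalization.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

DEFAULT_SIGMA = 1.0  # GeV, fallback normalization when no uncertainty is available


@dataclass
class Particle:
    """Hypothetical minimal particle record (not an edm4eic/edm4hep type)."""
    pdg: int
    energy: float                          # GeV
    momentum: Tuple[float, float, float]   # GeV


@dataclass
class Association:
    """Hypothetical pairing of one reconstructed and one generated particle."""
    reco: Particle
    gen: Particle
    sigma_energy: Optional[float] = None    # GeV, reconstruction uncertainty if known
    sigma_momentum: Optional[float] = None  # GeV


def chi2_term(delta: float, sigma: Optional[float]) -> float:
    """Squared deviation normalized by sigma, or by 1 GeV if sigma is unusable."""
    if sigma is None or sigma <= 0.0:
        sigma = DEFAULT_SIGMA
    return (delta / sigma) ** 2


def association_truthiness(assoc: Association, pid_penalty: float = 1.0) -> float:
    """Chi-square terms for energy and momentum plus a discrete PID penalty."""
    d_energy = assoc.reco.energy - assoc.gen.energy
    d_momentum = math.dist(assoc.reco.momentum, assoc.gen.momentum)
    value = chi2_term(d_energy, assoc.sigma_energy)
    value += chi2_term(d_momentum, assoc.sigma_momentum)
    if assoc.reco.pdg != assoc.gen.pdg:
        value += pid_penalty  # assumed flat weight; could depend on the species pair
    return value


def event_truthiness(
    associations: List[Association],
    n_unassociated_gen: int,
    n_unassociated_reco: int,
    pid_penalty: float = 1.0,
    missing_penalty: float = 1.0,
    fake_penalty: float = 1.0,
) -> float:
    """Sum of association terms and penalties for missing and extra particles."""
    total = sum(association_truthiness(a, pid_penalty) for a in associations)
    total += missing_penalty * n_unassociated_gen
    total += fake_penalty * n_unassociated_reco
    return total
```

With this sketch, a perfectly reconstructed event (identical reco and gen particles, nothing unassociated) gives exactly zero, and every term can only add a non-negative amount, which matches the positive-definite property stated above.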

There are non-reconstruction reasons why the truthiness will be non-zero in realistic scenarios:

  • multiple-scattering effects will cause particles in the event to lose momentum relative to the true value, deviating in both direction and magnitude, with the deviation systematically biased in one direction,
  • secondary particles will be generated in materials or along bent trajectories, leading to additional reconstructed particles corresponding to e.g. hard bremsstrahlung gammas in the electromagnetic calorimeters,
  • primary particles (in particular at low energies) may be absorbed in support structures, leading to their absence in the reconstructed event.

Nevertheless, a decrease in the overall average event truthiness for the same geometry and input hit collections is intended to indicate an improved reconstruction, and conversely an increase is intended to indicate a degraded one.

What kind of change does this PR introduce?

  • Bug fix (issue #__)
  • New feature (issue: store truthiness for event reconstruction)
  • Documentation update
  • Other: __

Please check if this PR fulfills the following:

  • Tests for the changes have been added
  • Documentation has been added / updated
  • Changes have been communicated to collaborators

Does this PR introduce breaking changes? What changes might users need to make to their code?

No.

Does this PR change default behavior?

No.

@wdconinc wdconinc requested a review from a team as a code owner October 29, 2025 17:12
Contributor

@ruse-traveler ruse-traveler left a comment

Interesting! I'm really intrigued by this! I'm curious about how this type would get used in practice, so for my own understanding: the vector members should be 1-to-1 with the relations in the associations field, correct?

@wdconinc
Contributor Author

wdconinc commented Nov 4, 2025

> Interesting! I'm really intrigued by this! I'm curious about how this type would get used in practice, so for my own understanding: the vector members should be 1-to-1 with the relations in the associations field, correct?

It's not intended for analyzers but for reconstruction development (so absolutely not for selecting for your analysis only those events that are close to the truth). But I would imagine we can use this to select events that are particularly poorly reconstructed and look at what went wrong, or compare before and after PRs to make sure we don't make things worse.
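
To make those two development uses concrete, here is a small Python sketch. It assumes the per-event truthiness values have already been extracted into plain dictionaries keyed by event number; the function names and that dictionary layout are hypothetical and not part of this PR.

```python
from statistics import mean
from typing import Dict, List, Tuple


def worst_events(truthiness_by_event: Dict[int, float], n: int = 3) -> List[Tuple[int, float]]:
    """Return the n (event_id, truthiness) pairs with the largest truthiness."""
    ranked = sorted(truthiness_by_event.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


def mean_truthiness_change(before: Dict[int, float], after: Dict[int, float]) -> float:
    """Mean truthiness after minus before, over events present in both runs.

    A negative value suggests the reconstruction improved on average.
    """
    common = before.keys() & after.keys()
    if not common:
        raise ValueError("no events in common between the two runs")
    return mean(after[e] for e in common) - mean(before[e] for e in common)


# Made-up numbers, same geometry and input hits assumed for both runs.
before = {1: 2.0, 2: 4.0, 3: 0.0}
after = {1: 1.0, 2: 2.0, 3: 0.0}
print(worst_events(before, n=2))              # events to inspect by hand
print(mean_truthiness_change(before, after))  # negative means better on average
```

Restricting the comparison to events present in both runs keeps the average from shifting just because one run dropped or added events.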

@ruse-traveler
Contributor

> It's not intended for analyzers but for reconstruction development (so absolutely not for selecting for your analysis only those events that are close to the truth). But I would imagine we can use this to select events that are particularly poorly reconstructed and look at what went wrong, or compare before and after PRs to make sure we don't make things worse.

Gotcha! Makes sense! This could be really useful as a high level summary of the impact of a change...

@ruse-traveler
Contributor

Then following up on the vectors: I partly ask because I wonder if it would make sense to define a Truthiness component just to make the interface a little easier...

For example, there's an overall, energy, momentum, and PID truthiness at both the event and association level. So we could define a component sort of like:

```yaml
edm4eic::TruthinessContribution:
  Members:
    - float overall
    - float pid
    - float energy
    - float momentum
```

So that:

```yaml
edm4eic::Truthiness:
  <... description ...>
  Members:
    - edm4eic::TruthinessContribution eventContribution
    - float unassociatedMCParticleContribution
    - float unassociatedRecoParticleContribution
  VectorMembers:
    - edm4eic::TruthinessContribution associationContributions
  <... relations ...>
```
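
To show how that layout might read from the user side, here is a plain Python mirror of the proposed component. This is only a sketch: the dataclasses are hypothetical stand-ins rather than generated podio bindings, and it assumes `association_contributions[i]` lines up with the i-th association relation, which is the 1-to-1 correspondence asked about earlier and not confirmed in this thread.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TruthinessContribution:
    """Python stand-in for the proposed edm4eic::TruthinessContribution component."""
    overall: float = 0.0
    pid: float = 0.0
    energy: float = 0.0
    momentum: float = 0.0


@dataclass
class Truthiness:
    """Python stand-in for the proposed edm4eic::Truthiness layout."""
    event_contribution: TruthinessContribution = field(default_factory=TruthinessContribution)
    unassociated_mc_particle_contribution: float = 0.0
    unassociated_reco_particle_contribution: float = 0.0
    # Assumed to be index-aligned with the association relations.
    association_contributions: List[TruthinessContribution] = field(default_factory=list)


def worst_association_index(t: Truthiness) -> Optional[int]:
    """Index of the association with the largest overall contribution, if any."""
    if not t.association_contributions:
        return None
    indices = range(len(t.association_contributions))
    return max(indices, key=lambda i: t.association_contributions[i].overall)
```

Grouping the four floats into one component means the event-level and association-level values share the same field names, so a helper like `worst_association_index` can be reused for either.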

Contributor

@ruse-traveler ruse-traveler left a comment

Only comments I have so far are the component-idea for consideration above, and some suggestions for naming consistency between this and other types' fields below!

ruse-traveler previously approved these changes Nov 6, 2025
Contributor

@ruse-traveler ruse-traveler left a comment

🙌 Thanks! I think this looks good!

Contributor

@mdiefent mdiefent left a comment

This PR introduces edm4eic::TruthinessContribution, a useful metric for assessing reconstruction algorithms on an event-by-event basis. The new data type was discussed in detail during the ePIC Software & Computing meeting on November 12. Similar to the discussion in this PR, we considered potential improvements and concluded that it is appropriate to integrate the truthiness measure now, with the understanding that we can iterate on the implementation in the future. The implementation is solid, and the comments in the code are clear. The criteria for a data model addition, namely discussion and consensus in an ePIC Software & Computing meeting, have been met.

Copilot AI review requested due to automatic review settings December 1, 2025 18:47

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@veprbl veprbl enabled auto-merge (squash) December 1, 2025 18:56
@veprbl veprbl disabled auto-merge December 1, 2025 18:56
@veprbl veprbl enabled auto-merge (squash) December 1, 2025 18:56
@veprbl veprbl merged commit 916c65a into main Dec 1, 2025
3 checks passed
@veprbl veprbl deleted the truthiness branch December 1, 2025 18:56