Skip to content

support index-based selection#502

Open
goblirsc wants to merge 13 commits intoHEP-FCC:masterfrom
goblirsc:MG_indexBasedSelCP
Open

support index-based selection#502
goblirsc wants to merge 13 commits intoHEP-FCC:masterfrom
goblirsc:MG_indexBasedSelCP

Conversation

@goblirsc
Copy link
Copy Markdown

@goblirsc goblirsc commented Feb 16, 2026

Add a few features to help with index-based object selection and truth-reco matching studies within FCCAnalyses:

  • Standard MCParticle selection structs now support both element-based and index-based selections
  • A new selByPredicate base struct is introduced to simplify adding future selection criteria without copy-pasting boilerplate code
  • Add a get_lists_of_stable_particles_from_decays function extending the existing get_list_of_stable_particles_from_decay to check multiple decaying particles in one call
  • Fix documentation of selRP_matched_to_list and add an index-based version. Add optional flag to also match metastable particles
  • Add index_range helper function to obtain range(coll.size()) for any RVec collection.
  • Add count_valid_indices helper function to count valid entries in a list of (selected) indices.
  • Add sel_byIndex helper function to move from index list to particle list
  • Add getVertex_matching_recoParticles function to VertexingUtils - find vertices that contain a user-specified list of recoparticles

The changes in this PR are designed to be transparent to all existing code - should require no adaptation by users.

@kjvbrt
Copy link
Copy Markdown
Contributor

kjvbrt commented Feb 16, 2026

Hi @goblirsc, nice to see these :)

I believe quickest way to fix the tests would be to rebase on the latest master.

@goblirsc
Copy link
Copy Markdown
Author

Hi @goblirsc, nice to see these :)

I believe quickest way to fix the tests would be to rebase on the latest master.

Thanks, missed the latest commits by a tiny bit :)
I see another change went in - should I wait and do one more rebase to be fully up to date? Or were the previous PR up to 500 sufficient for the pipeline?

@kjvbrt
Copy link
Copy Markdown
Contributor

kjvbrt commented Feb 16, 2026

PRs up to 500 should be sufficient for the pipeline, the later PRs do not affect analyzers.

@goblirsc
Copy link
Copy Markdown
Author

goblirsc commented Feb 16, 2026

after running clang_format as suggested on the main page, the entire set of classes got reformatted. Are we missing some formatting style settings that need to be configured prior to running this?

Or is this genuine and the entire codebase was violating the formatting rules?

@goblirsc
Copy link
Copy Markdown
Author

Dear @kjvbrt, can we retry the clang_format test to see if the format changes are expected or something went wrong with the formatter config?

@kjvbrt
Copy link
Copy Markdown
Contributor

kjvbrt commented Feb 17, 2026

Hi @goblirsc, the test tries to do a selective format, only of the edited lines.

One can do it with the help of git

git clang-format master

You can get the clang-format git sub-command by sourcing Key4hep stack.

@goblirsc
Copy link
Copy Markdown
Author

Hi @goblirsc, the test tries to do a selective format, only of the edited lines.

One can do it with the help of git

git clang-format master

You can get the clang-format git sub-command by sourcing Key4hep stack.

Thank you! The number of format changes looks a lot more reasonable now :)

@goblirsc
Copy link
Copy Markdown
Author

Hm, the CI seems to expect a different format to the one I get when I run clang_format locally.
Can I export the artifacts of the pipeline run as a patch to directly apply the changes it requests?

@goblirsc
Copy link
Copy Markdown
Author

Dear @kjvbrt,

I now applied the diff printed by the formatter by hand, after git clang-format master applied a different formatting than the CI expected and clang-format -i -style=file /path/to/file.cpp modified the entire files.

To avoid this overhead in the future, is there a recipe for either reproducibly applying the formatter locally with changes consistent to what the CI expects, or for automatically applying the suggested patch from the CI to my branch?

Cheers, Max

@goblirsc
Copy link
Copy Markdown
Author

Dear @kjvbrt,

in addition to the above, it seems the pipeline is set to only run when you manually trigger it. Could you please do so?
This PR already lost 10 days due to the formatter, which is a bit of a pity given there are many more cool functional improvements that could be made if we want to.

Cheers,
Max

@goblirsc
Copy link
Copy Markdown
Author

goblirsc commented Mar 2, 2026

next format update, please re-run the CI.

@goblirsc
Copy link
Copy Markdown
Author

goblirsc commented Mar 4, 2026

Dear @kjvbrt,

I think I finally figured out how to run the CI formatter stage locally :)
I added this to the Readme.md, so that other users won't run into the same issue.
Could you please trigger another CI run? Now we should be able to get past the formatter and be able to start review of the changes themselves 🤞

May I make another suggestion? Could the formatter check

  • either be automatically triggered when a new commit arrives (to remove the delay until a manual trigger) or
  • be run on the client side as a pre-commit hook to ensure that submitted changes already pass the requirement (with corresponding documentation in the readme)?

Cheers, Max

Comment thread README.md Outdated
```
git clang-format --style=file $(git merge-base upstream/master HEAD)
```
to only format the lines you changed (otherwise you will reformat the entire file).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for figuring this out :)

Maybe (otherwise you will reformat the entire file) is a bit redundant, as the same information is a few lines above.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, at some point we need to run formatter across the whole repository.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the description

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, maybe MR -> PR?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed (caught a long-term gitlab user :) )

bool operator() (ROOT::VecOps::RVec<edm4hep::MCParticleData> in);
};

/// @brief Helper struct to select entries matching a certain predicate.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also provide an example of usage?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, sure - would you prefer this in the doxygen doc or on the PR?
The replacements for the selectors below this definition can serve as "inline" examples to refer to.

The general pattern is

struct mySelection: selByPredicate{
    mySelection(): selByPredicate(some_selection_function){}
};

Where some_selection_function is a function returning a "keep" decision for a given input object. Using lambda captures, the function can be configured with constructor arguments - for example, a configurable cut threshold.

This object can then be used to obtain a copy-vector of passed objects from an input object list, or a set of passing indices, or a set of passing element for each of a list of input index vectors.

The main motivation is that we save a lot of common boilerplate code when implementing selection functions (loop over containers, output allocation, copy operations, ...). The pattern also ensures that deep-copies are avoided where possible. We also gain the ability to apply the same selection functor transparently on index-based or copy-based selection logic, avoiding a need to duplicate logic if both are to be supported.

It would actually be more elegant to generalise this to a template consuming an arbitrary input object type, rather than being restricted to the MCParticleData class.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice explanation, I think having it in Doxygen would be the best. Would you also add an example how to use this when working with the dataframe in Python? I mean a snipped a user might write.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added explanation for doxygen, including snippets. Also moved the selByPredicate to Utils, making it available more generally than before.

Comment thread analyzers/dataframe/FCCAnalyses/ReconstructedParticle2MC.h
Comment thread analyzers/dataframe/FCCAnalyses/ReconstructedParticle2Track.h
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants