Follow-up items to ML extension of analysis #130

Open
7 of 8 tasks
alexander-held opened this issue May 2, 2023 · 2 comments
Labels
implementation concerns analysis implementation

Comments

@alexander-held
Member

alexander-held commented May 2, 2023

Collecting follow-up items to #122 here. They do not need to be addressed immediately in that PR but should be revisited. cc @ekauffma

  • understand the large change in event yields with the new cuts (almost an order of magnitude fewer events); the new yields are, however, consistent with what CMS had for the 2022 open data workshop in https://cms-opendata-workshop.github.io/workshop2022-lesson-ttbarljetsanalysis/02-coffea-analysis/index.html#plotting
  • investigate merging the histogram-writing code into a single function that avoids hardcoding information where possible
  • harmonize object names (probably also in the ML training notebook and documentation), e.g. from "top_hadron jet" to "b_{had top}", so that the names of the b-tagged jets focus on "b" instead of "top"
  • where did the particle dependency come from?
  • the model_even / model_odd determination would probably make a good utils function, which would remove it from the notebook
  • turn the first look at all the ML features into a grid of plots to save some space
  • make the func_adl query depend on whether inference is used (do not serve extra columns if they are not needed)
  • the last cabinetry part also needs to be wrapped in an if USE_INFERENCE check
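
As a rough illustration of the model_even / model_odd item above, the selection logic could be moved into a small helper along these lines. This is only a sketch under stated assumptions: the function name `predict_with_parity_models` is hypothetical, the models are treated as plain callables that return one score per row, and it assumes each model is named after the parity of the events it was trained on, so that it is applied to events of the opposite parity. The actual convention in the notebook may differ.

```python
import numpy as np


def predict_with_parity_models(event_numbers, features, model_even, model_odd):
    """Hypothetical helper: score each event with the model NOT trained on its parity.

    Assumption: model_even was trained on even-numbered events and model_odd on
    odd-numbered events; both are callables mapping an (n, k) array to n scores.
    """
    event_numbers = np.asarray(event_numbers)
    features = np.asarray(features)
    is_even = event_numbers % 2 == 0

    scores = np.empty(len(event_numbers), dtype=float)
    if is_even.any():
        # even events are scored by the model trained on odd events
        scores[is_even] = model_odd(features[is_even])
    if (~is_even).any():
        # odd events are scored by the model trained on even events
        scores[~is_even] = model_even(features[~is_even])
    return scores


# Toy usage with stand-in "models" that return a constant score per event
toy_even = lambda x: np.zeros(len(x))
toy_odd = lambda x: np.ones(len(x))
toy_scores = predict_with_parity_models(
    [1, 2, 3, 4], np.zeros((4, 2)), toy_even, toy_odd
)
```

Keeping the parity logic in one function would let the notebook call it once per sample, and the same helper could be skipped entirely when USE_INFERENCE is false.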
@ekauffma
Collaborator

The particle dependency is a mistake. It is used in the plotEvents.ipynb notebook, which is used in the docs. I will remove it.

@ekauffma
Collaborator

I have also realized that the func_adl query method was never updated to accommodate the new cuts. The query will become a bit more complicated because of this. I am working on it now.
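
One possible way to keep the query from serving unneeded columns is to build the column list from the USE_INFERENCE flag before constructing the query. The sketch below deliberately avoids any func_adl calls and only shows the gating logic; the helper name `columns_to_request` and the specific column names are assumptions for illustration, not the columns the analysis actually uses.

```python
# Hypothetical column lists, for illustration only
BASE_COLUMNS = ["Jet_pt", "Jet_eta", "Jet_phi", "Jet_mass", "Jet_btagCSVV2"]
INFERENCE_ONLY_COLUMNS = ["Jet_qgl"]  # assumed to be needed only for ML features


def columns_to_request(use_inference):
    """Return the columns the query should serve for the given configuration."""
    columns = list(BASE_COLUMNS)  # copy so the module-level list is not mutated
    if use_inference:
        columns += INFERENCE_ONLY_COLUMNS
    return columns


USE_INFERENCE = False
requested = columns_to_request(USE_INFERENCE)
```

The resulting list could then drive the Select step of the query, so that the inference-only columns are requested only when they are needed.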
