Add a new method to pre-selection and final-selection. Also add some features to do_plots #513

Open: tomfournier wants to merge 6 commits into HEP-FCC:pre-edm4hep1 from tomfournier:pre-edm4hep1

@tomfournier

Hi,
I have written a new method for running the pre-selection. It has the same properties as RDFanalysis, but analysers takes process_name as a second argument and returns the dataframe together with a list of histograms or TParameter objects. These objects are written to the output file in a TDirectory called custom_objects.

Inside custom_objects, the objects are placed in sub-TDirectories according to their class (TH1D, TParameter, etc.) to keep the file readable.
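The grouping by class can be sketched in plain Python. This is only an illustration, not the code from this PR: `group_by_class` is a hypothetical helper, and the two stand-in classes replace the real ROOT types so that the snippet runs without ROOT.

```python
from collections import defaultdict


def group_by_class(objects, top_dir="custom_objects"):
    """Map each object to a '<top_dir>/<ClassName>' directory path.

    Hypothetical helper: it only computes which sub-directory each
    object would go to, it does not touch any ROOT file.
    """
    layout = defaultdict(list)
    for obj in objects:
        class_name = type(obj).__name__
        layout[f"{top_dir}/{class_name}"].append(obj)
    return dict(layout)


# Stand-ins for ROOT classes, used only to keep the sketch ROOT-free.
class TH1D:
    def __init__(self, name):
        self.name = name


class TParameter:
    def __init__(self, name):
        self.name = name


objects = [TH1D("leps_iso"), TParameter("n_events"), TH1D("leps_no")]
layout = group_by_class(objects)
for path, objs in layout.items():
    print(path, [o.name for o in objs])
```

With real ROOT objects, `obj.ClassName()` would probably be a more natural way to get the class name than `type(obj).__name__`.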

I called this new class RDFgraph because I wanted to combine the advantages of RDFanalysis and build_graph, and I named the TDirectory custom_objects. If you have better ideas, feel free to change the names.

During the final-selection, a dictionary called customHists, similar to histoList, can optionally be added to include these histograms in the final-selection output file. This new variable takes this form:

customHists = {
    'leps_iso': {'name':'ConeIsolation', 'title':'I_{rel}', 'xmin':0, 'xmax':10},
    'leps_no':  {'name':'n_leptons', 'title':'N_{leptons}'}
}

The entries are used to modify the histograms. If you don't want to modify your histogram, you can just leave the dictionary empty.
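How such an entry might be applied can be sketched as follows. This is an assumption about the behaviour, not the code from this PR: `resolve_hist_options` is a hypothetical helper, the `leps_p` entry is invented for illustration, and falling back to the original name when a key is missing is only one possible reading of "leave the dictionary empty".

```python
customHists = {
    'leps_iso': {'name': 'ConeIsolation', 'title': 'I_{rel}', 'xmin': 0, 'xmax': 10},
    'leps_no':  {'name': 'n_leptons', 'title': 'N_{leptons}'},
    'leps_p':   {},  # hypothetical entry: empty, so the histogram is kept as-is
}


def resolve_hist_options(hist_name, custom_hists):
    """Return the effective settings for one histogram.

    Missing keys fall back to the original name, or to None,
    which here means "leave this property untouched".
    """
    opts = custom_hists.get(hist_name, {})
    return {
        'name': opts.get('name', hist_name),
        'title': opts.get('title'),
        'xmin': opts.get('xmin'),
        'xmax': opts.get('xmax'),
    }


for key in customHists:
    print(key, resolve_hist_options(key, customHists))
```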

I made this method so that a cutflow and histograms can easily be produced before a filter, since RDFanalysis only returns the branches after all the filters.
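A minimal sketch of what an analysis script using this interface might look like, assuming the `analysers(df, process_name)` signature and the (dataframe, list of objects) return value described above. It is not the implementation from this PR: the column names, the cut, the binning, and the use of process_name as the histogram title are hypothetical, and the `output` method is assumed to mirror RDFanalysis. `Histo1D` and `Filter` are standard RDataFrame calls.

```python
class RDFgraph:
    """Sketch of an analysis class; not the implementation from this PR."""

    @staticmethod
    def analysers(df, process_name):
        # Book the histogram BEFORE the filter so that it sees all events.
        # Column name and binning are hypothetical.
        h_iso = df.Histo1D(
            ("leps_iso", process_name, 50, 0.0, 10.0), "leps_iso"
        )
        # Apply the selection afterwards; only the surviving events
        # reach the output branches.
        df = df.Filter("n_leptons >= 2", "at least two leptons")
        # Return the dataframe together with the extra objects to write.
        return df, [h_iso]

    @staticmethod
    def output():
        # Branches written to the output tree (assumed, as in RDFanalysis).
        return ["leps_iso", "n_leptons"]
```

Because the histogram is booked on the dataframe before `Filter` is called, it is filled with the unfiltered events, which is the use case described above.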

For the final-selection, if a cutflow exists in custom_objects/TH1D, we could use it and add the new cuts from cutList to it, but I think that is too difficult for me to implement.

I also modified do_plots so that it can use a strict x-range, which avoids displaying bins with zero content, and draw a grid on the plot.
To enable these options, add strictRange = True and setGrid = True to the plot script.
They currently apply to all the variables, but we could use a list, like for rebin, to apply them only to chosen variables.
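The idea behind a strict x-range can be sketched without ROOT. This is an illustration under the assumption that only the leading and trailing empty bins are trimmed (empty bins in the middle are kept); `strict_range` is a hypothetical helper, not the function used in this PR.

```python
def strict_range(edges, contents):
    """Return (xmin, xmax) spanning the first to the last non-empty bin.

    edges must have len(contents) + 1 entries.
    Returns None if every bin is empty.
    """
    filled = [i for i, c in enumerate(contents) if c != 0]
    if not filled:
        return None
    # Lower edge of the first filled bin, upper edge of the last one.
    return edges[filled[0]], edges[filled[-1] + 1]


edges = [0, 10, 20, 30, 40, 50]
contents = [0, 0, 3, 7, 0]
print(strict_range(edges, contents))  # (20, 40)
```

With ROOT histograms, `TH1::FindFirstBinAbove` and `TH1::FindLastBinAbove` could serve the same purpose.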

We could also make a dictionary with rebin, grid, strictRange and other parameters for all the variables in this form:

variables = {
    'zll_m': {'rebin': 2, 'strict': False, 'grid': True},
    'zll_p': {'strict': True, 'grid': False}
}

with all the keys optional.
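Optional keys could be handled by merging each entry over a set of defaults. The sketch below assumes default values that are not specified in this PR, and `plot_options` is a hypothetical helper.

```python
# Hypothetical defaults, used when a key or a variable is missing.
DEFAULTS = {'rebin': 1, 'strict': False, 'grid': False}

variables = {
    'zll_m': {'rebin': 2, 'strict': False, 'grid': True},
    'zll_p': {'strict': True, 'grid': False},
}


def plot_options(variable, variables, defaults=DEFAULTS):
    """Merge the per-variable settings over the defaults.

    Variables that are not listed simply get the defaults.
    """
    return {**defaults, **variables.get(variable, {})}


print(plot_options('zll_p', variables))
```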

I ran it on my setup and it works fine, but I don't know whether it works on other setups or whether I followed the coding conventions. I timed RDFanalysis and my new method and they take the same time to run, so I don't think there is much optimization to do.

If anything is unclear or you need more information, I am available to answer your questions. I hope my modifications can be merged, as I think they could be useful for FCCAnalyses users.
