Why should you use it?
Switching to uhepp as an intermediate format between your analysis framework and plotting code can boost your productivity.
Today, data analysis in HEP takes place in a team. Intermediate results are discussed in regular meetings. If you have presented your work in such an environment, you’ve probably received follow-up questions from your peers: Can you …? Here are some examples.
Can you zoom in on the ratio plot axis to see the difference more clearly?
On slide X, you are showing the proportion of process A and B. Can you also create the plot showing the ratio of process A and C?
Do you also have the plot with a finer/coarser/non-equidistant binning?
If you want to show this plot at the conference, you need to change the branding labels.
You could merge processes A, B, and C into a single “other” category to make the plot a bit cleaner.
Can you also show the alternative Monte Carlo event generator B’s outline and compare it to generator A?
Have you looked at systematic uncertainty A? You could compare the variation to the nominal event yields to see if the effect is relevant/an artifact.
These are all fictional examples, but I am sure some sound familiar to you. Depending on your analysis framework, it might be rather tedious to satisfy these requests. In the worst case, you need to reprocess the full event dataset.
In the optimal case with uhepp, you don’t need to turn to the analysis framework anymore. You should have all the necessary data already in uhepp format. You can accomplish the requests in just a couple of lines of Python code or, in some cases, with a text editor. Best of all, you can either change the settings for a single plot only or loop over all plots and apply a change universally.
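As a rough illustration of answering the coarser-binning question from stored data alone, the sketch below merges fine bins into coarser ones in plain Python. It does not use the uhepp package: the function `rebin` and the example numbers are hypothetical, under- and overflow bins are ignored, and the per-bin uncertainties are assumed to be uncorrelated so that they add in quadrature.

```python
import math


def rebin(edges, values, errors, new_edges):
    """Merge fine bins into coarser bins.

    Every entry of `new_edges` must also appear in `edges`;
    otherwise list.index raises a ValueError.
    """
    idx = [edges.index(edge) for edge in new_edges]
    pairs = list(zip(idx, idx[1:]))
    # Yields add linearly, uncorrelated uncertainties add in quadrature.
    new_values = [sum(values[a:b]) for a, b in pairs]
    new_errors = [math.sqrt(sum(e * e for e in errors[a:b])) for a, b in pairs]
    return new_values, new_errors


# Hypothetical stored data for one process: five fine bins
edges = [0, 20, 40, 60, 80, 100]
values = [4.0, 10.0, 6.0, 3.0, 1.0]
errors = [2.0, 3.0, 2.0, 1.0, 1.0]

coarse_values, coarse_errors = rebin(edges, values, errors, [0, 40, 100])
```

The point of the sketch is that only the finely binned yields need to be stored once; coarser or non-equidistant binnings can then be derived without reprocessing the events.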
Besides optimizing your daily workflow, uhepp can also have a significant impact if you are a Ph.D. student. Typically, as a Ph.D. student in high-energy physics, you produce results over several years. Once you’ve reached a particular outcome, or after a predetermined time, you start writing the actual thesis. To document all your results from the past years, you need the plots from back then. Naturally, settings like color schemes evolve over time. In your thesis, however, you want to present the material in a uniform way (a kind of corporate identity).
If you are a Ph.D. student, ask yourself right now: how much time would it take to remake your old plots with a different color scheme? If the answer is more than 30 seconds, you should consider using uhepp.
The most prevalent storage and data format for histograms in high-energy physics is ROOT’s TH1. Although a THStack object can also contain styling options, you still need a significant amount of plotting code to go from stored histograms to the actual graphics file. The plotting code often contains hard-coded histogram names. The downside is that you need both the binary data and the plotting code to reproduce old plots.
You might track your plotting code with version control software (e.g., git); however, you probably don’t track the binary ROOT files (and you shouldn’t). It’s not unlikely that the plotting code has evolved and is incompatible with the ROOT files’ content or naming.
As a member of a large collaboration, for example ATLAS, you need to adhere to labeling and branding rules. Internal plots that have not been reviewed need to contain the label Internal. Plots presented by students at a national conference need to include the label Work in progress. Plots that end up in a publication should not contain any labels besides the branding. Plots used in a thesis that are not taken from a publication must not have any label or branding. It is, therefore, a common task to change these labels. A single plot might be used in all four of these contexts.
A secret “trick” circulating among students is to save the plots in EPS format. When you need to change the labels, open the file in a text editor and replace the text. Although this might work [1], you should not have to lock yourself into a graphics format just to have the flexibility to change the labels. With uhepp, labels can be revised naturally, either with a text editor in JSON or YAML, or with the uhepp Python package.
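To make the idea concrete, here is a minimal sketch of revising a label on a plot stored as JSON, using only Python’s standard library. The key names `badge`, `brand`, and `label`, the helper `relabel`, and the example document are assumptions made for illustration; check the uhepp format specification for the actual layout, and note that the uhepp Python package may offer a more direct way to do this.

```python
import json


def relabel(document, label):
    """Return a copy of a plot document with its badge label replaced.

    Passing None removes the label entirely. The "badge"/"label" keys
    are assumed here, not taken from the uhepp specification.
    """
    updated = json.loads(json.dumps(document))  # deep copy via JSON round trip
    badge = updated.setdefault("badge", {})
    if label is None:
        badge.pop("label", None)
    else:
        badge["label"] = label
    return updated


# Hypothetical plot document, as it might be loaded with json.load()
plot = {"badge": {"brand": "ATLAS", "label": "Internal"}}

conference = relabel(plot, "Work in progress")
publication = relabel(plot, None)
```

Because the function returns a modified copy, the same stored plot can be turned into its internal, conference, and publication variants without touching the original data. Applying it to every file in a directory is a short loop around `json.load` and `json.dump`.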
If you are not yet convinced, there is more to discover in the uhepp ecosystem: the central hub at https://uhepp.org/. The central hub offers a REST API. You can upload new plots or download existing plots in uhepp format via the API. At a superficial level, you can view the hub as a simple centralized storage service.
In high-energy physics, dataset sizes often go beyond what a single computer can handle. It is customary to employ a local computing cluster or the Worldwide LHC Computing Grid. Analysis results can be scattered across many different computers and file systems. Pushing results from the computing nodes to a centralized hub gives you a better, real-time overview of the results.
In addition, the centralized hub serves as an archive for plots. If you use uhepp throughout your time as a Ph.D. student, you have a single place where your plots are stored. You don’t need to worry that the plots might be scattered over several locations (the computing cluster, your laptop, the desktop machine in your office, your resource center). You already do the same with software: with every new commit or revision, you push the changes to a central hosting service.
The web interface at https://uhepp.org/ probably has the most significant impact. The service previews an interactive version of each plot. Sharing plots with your colleagues is an essential task in a large collaboration. With uhepp, sharing plots becomes trivial: all you need to do is send them a link. You can even link to collections of plots, which makes classical plot books in PDF format obsolete. Do you remember the Can you …? follow-up questions from above? If your peers know uhepp, they can satisfy their curiosity themselves. This creates a lot more transparency and trust in the results.
If you are convinced, or even if you are still not convinced, head over to the Getting started guide to get a more hands-on feeling of what it is like to work with uhepp.
[1] EPS files frequently contain a low-resolution preview of the rendered vector graphic. Changing the label in plain text will not update the preview.