Module reference

Universal HEP plot (uhepp) defines a universal interchange format for plots used in high-energy physics.

class uhepp.Graph(x_values, y_values, graphtype='points', label=None, **style)

Representation of arbitrary graph lines or points

The styles can be set or retrieved via properties. While setting an attribute, values are converted to the internal format, e.g. color names are converted to hex. If a style is not set, the corresponding attribute returns None.

The complete set of styles can be retrieved via the style attribute. Unset attributes do not appear in the dict.

Possible styles are:
  • linewidth (float)

  • linestyle (--, -, -. or :)

  • color (matplotlib compliant)

  • markersize

  • marker

Any legal matplotlib color may be used to set color attributes, although the uhepp standard only allows hex strings. Internally, all colors are converted to hex strings.
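The property behaviour described above can be illustrated with a small stand-in class. This is a minimal sketch and not the uhepp implementation: the names StyledSketch and rgb_to_hex are hypothetical, and only RGB float tuples are converted here, whereas uhepp accepts any matplotlib color.

```python
def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of floats in [0, 1] to '#rrggbb'."""
    return "#" + "".join(format(round(channel * 255), "02x") for channel in rgb)


class StyledSketch:
    """Hypothetical stand-in for a styled object, not the uhepp class."""

    def __init__(self, **style):
        self._style = {}
        for key, value in style.items():
            setattr(self, key, value)

    @property
    def color(self):
        # Unset styles read back as None
        return self._style.get("color")

    @color.setter
    def color(self, value):
        # Values are converted to the internal hex format on assignment
        if isinstance(value, tuple):
            value = rgb_to_hex(value)
        self._style["color"] = value

    @property
    def linewidth(self):
        return self._style.get("linewidth")

    @linewidth.setter
    def linewidth(self, value):
        self._style["linewidth"] = float(value)

    @property
    def style(self):
        # Only styles that were set appear in the returned dict
        return dict(self._style)


item = StyledSketch(color=(1.0, 0.0, 0.0))
print(item.color, item.linewidth, item.style)
```

In this sketch, the unset linewidth reads back as None and is absent from the style dict, mirroring the documented behaviour.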

property graphtype

Type of the graph

label

Name used in the legend of the plot or None

to_data()

Return the graph object as python native dicts and lists

property x_errors

Return a copy of the x errors

property x_values

Return a copy of the x values

property y_errors

Return a copy of the y errors

property y_values

Return a copy of the y values

class uhepp.HLine(pos, stretch=None, **style)

Horizontal line

property pos_y

The position on the y-axis of the line

to_data()

Return a uhepp compatible dict/list version

class uhepp.Line(pos, stretch=None, **style)

Representation of vertical or horizontal lines

The styles can be set or retrieved via properties. While setting an attribute, values are converted to the internal format, e.g. color names are converted to hex. If a style is not set, the corresponding attribute returns None.

The complete set of styles can be retrieved via the style attribute. Unset attributes do not appear in the dict.

Possible styles are:
  • linewidth (float)

  • linestyle (--, -, -. or :)

  • color (matplotlib compliant)

  • edgecolor (matplotlib compliant)

Any legal matplotlib color may be used to set color attributes, although the uhepp standard only allows hex strings. Internally, all colors are converted to hex strings.

pos

Position of the line

stretch

Start and end values on the parallel axes

class uhepp.PushReceipt(api_url, ui_url, uuid)

A PushReceipt stores the confirmation that a plot has been uploaded to a central server. The object also contains the human-readable web endpoint at the central server.

class uhepp.RatioItem(numerator, denominator=None, bartype='step', error='stat', den_error=None, x_errorbar=True, keep_zero=False, **style)

Representation of an item drawn in the ratio plot

The RatioItem object stores a list of yield names for the numerator and the denominator. The binned sum of all referenced yields is the content of the numerator and denominator of this ratio item. Additionally, each ratio item object stores style options used during plotting, a bar type defining the representation (step or points) and a method to compute the uncertainty, see Stack.

The styles can be set or retrieved via properties. While setting an attribute, values are converted to the internal format, e.g. color names are converted to hex. If a style is not set, the corresponding attribute returns None.

The complete set of styles can be retrieved via the style attribute. Unset attributes do not appear in the dict.

Possible styles are:
  • linewidth (float)

  • linestyle (--, -, -. or :)

  • color (matplotlib compliant)

  • edgecolor (matplotlib compliant)

  • markersize

  • marker

Any legal matplotlib color may be used to set color attributes, although the uhepp standard only allows hex strings. Internally, all colors are converted to hex strings.

base()

See Yield.base()

denominator

List of internal yield names used to form the denominator

Uncertainties from multiple yield objects are added in quadrature assuming the statistical uncertainties are independent.

keep_zero

If True, draw points even if the y-value is zero

numerator

List of internal yield names used to form the numerator

Uncertainties from multiple yield objects are added in quadrature assuming the statistical uncertainties are independent.

stat()

See Yield.stat()

syst()

See Yield.syst()

to_data()

Return a uhepp compliant version using dict and lists

var_down()

See Yield.var_down()

var_up()

See Yield.var_up()

vary()

See Yield.vary()

x_errorbar

If True, draw x-error bars spanning full bin width

class uhepp.Stack(content, bartype='stepfilled', error='stat', x_errorbar=True, keep_zero=False)

Representation of a collection of stacked items of the main plot

A stack defines a collection of StackItems. In each bin, the bar heights of all stack items are added and drawn vertically on top of each other to form an overall bar. The items of the stack collection are referred to as content.

A stack can be of different types: “step”, “stepfilled” or “points”. The type defines the plotting type. The default bar type is “stepfilled”.

The error property defines how the total uncertainty band of the stack is computed. Possible values are: “no”, “stat”, “syst”, “env”, “stat+syst”, “stat+env”, “syst+env” or “stat+syst+env”. Depending on the value, the uncertainty includes the statistical uncertainties of the yield objects, the total systematic uncertainties of the yield objects, the quadratic sum of all variations of the yield objects, or combinations thereof. Combinations of different uncertainties are added in quadrature. The default error computation method is “stat”.
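How the error string selects and combines components can be sketched as follows. This is an illustration only: total_uncertainty is a hypothetical helper, the three per-bin arrays are assumed to be precomputed, and returning zeros for “no” is an assumption of this sketch.

```python
import math


def total_uncertainty(error, stat, syst, env):
    """Combine the selected per-bin uncertainty components in quadrature."""
    if error == "no":
        # Assumption of this sketch: no uncertainty band means zero width
        return [0.0] * len(stat)
    components = {"stat": stat, "syst": syst, "env": env}
    selected = [components[name] for name in error.split("+")]
    # Per bin: square root of the sum of squares of the selected components
    return [math.sqrt(sum(v * v for v in per_bin)) for per_bin in zip(*selected)]


print(total_uncertainty("stat+syst", [3.0], [4.0], [12.0]))
```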

Content, type and error are accessible via properties.

base()

See Yield.base()

property content

Return list of StackItems

keep_zero

If True, draw points even if the y-value is zero

stat()

See Yield.stat()

syst()

See Yield.syst()

to_data()

Return a uhepp compatible dict/list version

var_down()

See Yield.var_down()

var_up()

See Yield.var_up()

vary()

See Yield.vary()

x_errorbar

If True, draw x-error bars spanning full bin width

class uhepp.StackItem(yield_names, label, **style)

Representation of an item within a stack of the main plot

The StackItem object stores a list of names referring to yield objects. The binned sum of all referenced yields is the content of this stack item. Additionally, each stack object stores a label and style options used during plotting.

The styles can be set or retrieved via properties. While setting an attribute, values are converted to the internal format, e.g. color names are converted to hex. If a style is not set, the corresponding attribute returns None.

The complete set of styles can be retrieved via the style attribute. Unset attributes do not appear in the dict.

Possible styles are:
  • linewidth (float)

  • linestyle (--, -, -. or :)

  • color (matplotlib compliant)

  • edgecolor (matplotlib compliant)

  • markersize

  • marker

Any legal matplotlib color may be used to set color attributes, although the uhepp standard only allows hex strings. Internally, all colors are converted to hex strings.

label

Name used in the legend of the plot or None

to_data()

Return a uhepp compliant version using dict and lists

yield_names

List of internal yield names merged to form this stack item

Uncertainties from multiple yield objects are added in quadrature assuming the statistical uncertainties are independent.

exception uhepp.UHepParseError

Invalid input data

class uhepp.UHepPlotModel

Empty base class

class uhepp.UHeppHist(symbol, bin_edges, stacks=None, yields=None)

Uhepp Histogram class

The class represents a typical stacked HEP histogram including the style information and the raw bin contents and their uncertainties. The actual object is composed of Stack, RatioItem, Line and Yield objects.

property atlas

True if the plot is ATLAS-branded

author

Name of the author of the plot or None

bin_edges

List of bin edges matching the input Yield objects

brand

Collaboration name used to brand the plot or None

brand_label

Suffix printed after the collaboration name or None

clone()

Return a deep copy of the histogram.

code_revision

Version of the analysis code used to create the plot or None

date

Plot creation date as an ISO8601 string

density_width

Normalize bin height to given width

If not None, normalize the height of the bars/points to the given bin width. This is especially useful if the histogram has non-equidistant bins.
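The normalization can be sketched as scaling each bin by density_width divided by its own width. This is an assumption about the exact formula; normalize_to_width is a hypothetical helper, and the yields passed here exclude the underflow and overflow bins.

```python
def normalize_to_width(yields, bin_edges, density_width):
    """Scale each bin's height as if it had the width density_width."""
    widths = [hi - lo for lo, hi in zip(bin_edges[:-1], bin_edges[1:])]
    return [y * density_width / w for y, w in zip(yields, widths)]


# A bin of width 30 holding 30 events has the same density as
# a bin of width 10 holding 10 events.
print(normalize_to_width([10.0, 30.0], [0, 10, 40], 10))
```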

energy

Center-of-mass energy in TeV or None

event_selection

Custom analysis-specific event selection string or None

figure_size

Tuple of the plot (width, height) in inches or None

filename

Filename used as title or default name during rendering

get_base(name)

Return the rebinned yields for name

get_stat(name)

Return the rebinned stat uncert for name

get_syst(name)

Return the rebinned syst uncert for name

graphs

List of Graph objects

h_lines

List of horizontal line objects

include_overflow

If True, merge entries in the overflow bin with last bin

Uncertainties are added in quadrature, assuming the uncertainties are statistically independent.
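A minimal sketch of the merge, assuming the arrays end with the last regular bin followed by the overflow bin. The helper merge_overflow is hypothetical, and dropping the overflow entry after merging is a choice made for this sketch, not necessarily what uhepp does internally.

```python
import math


def merge_overflow(values, stat):
    """Fold the overflow bin into the last regular bin."""
    merged_values = values[:-2] + [values[-2] + values[-1]]
    # Independent uncertainties are combined in quadrature
    merged_stat = stat[:-2] + [math.hypot(stat[-2], stat[-1])]
    return merged_values, merged_stat


# Layout: [underflow, bin 1, bin 2, overflow]
print(merge_overflow([1.0, 5.0, 7.0, 2.0], [1.0, 2.0, 3.0, 4.0]))
```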

include_underflow

See include_overflow

lumi

Luminosity of the data shown in the plots in 1/fb or None

producer

Name of the software used to create the uhepp data file or None

push(collection_id, api_url=None, api_key=None)

Upload the plot object to a central server

ratio

List of RatioItem objects shown in the bottom panel

ratio_diff

If True, show the difference in the bottom panel, otherwise show the ratio.

If this option is set to True, the bottom panel shows the difference between the numerators and the denominators. Please note that the property names ‘numerator’ and ‘denominator’ are kept for historic reasons, even though the names are not descriptive in this case.

If set to False, the bottom panel shows the ratio.
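The two modes can be sketched per bin as below. The helper bottom_panel is hypothetical, and returning None for bins with a zero denominator is an assumption of this sketch.

```python
def bottom_panel(numerator, denominator, ratio_diff=False):
    """Return per-bin differences if ratio_diff is True, else ratios."""
    if ratio_diff:
        return [n - d for n, d in zip(numerator, denominator)]
    # Assumption: bins with a zero denominator are left undefined
    return [n / d if d != 0 else None for n, d in zip(numerator, denominator)]


print(bottom_panel([12.0, 8.0], [10.0, 8.0]))
print(bottom_panel([12.0, 8.0], [10.0, 8.0], ratio_diff=True))
```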

ratio_fraction

Fraction of the plot height occupied by the ratio plot or None

ratio_label

Y-axis label in the bottom panel or None

ratio_log

If True, use log scale for the y-axis in the bottom panel

ratio_max

Upper limit of the y-axis in the bottom panel or None

ratio_min

Lower limit of the y-axis in the bottom panel or None

rebin_edges

List of subset of bin edges used for plotting or None

render(filename=None)

Render the universal plot.

The method returns the axes object. If the optional argument filename is set, the plot is written to the file.

show()

Render the plot and show it

stacks

List of Stack objects shown in the main panel

subtext

Text printed below the branding or None

The string can include \n line breaks. Each line is printed separately.

symbol

Mathematical symbol of the x-axis quantity

tags

Custom key-value tags

to_data()

Convert to uhepp compliant dicts and lists

to_json(filename)

Convert the hist to a json-encoded file

to_jsons()

Convert the hist to a json-encoded string

to_yaml(filename)

Convert the hist to a yaml-encoded file

to_yamls()

Convert the hist to a yaml-encoded string

unit

Unit of the x-axis quantity or None

unit_in_brackets

If True, prints unit as [unit] instead of x / unit

v_lines

List of vertical line objects

variable

Name of the x-axis quantity or None

property version

The version of the uhepp specification

x_log

If True, use log scale for the x-axis

y_append_unit

If True, custom y_label is appended by ‘ / bin-width unit’

y_label

Overwrite default y-axis label in the main panel or None

If the option is None, the default label is “Events / bin-width unit”.

y_log

If True, use log scale for the y-axis in the main panel

y_max

Upper limit of the y-axis in the main panel or None

y_min

Lower limit of the y-axis in the main panel or None

yields

Dictionary mapping internal yield names to Yield objects

class uhepp.VLine(pos, stretch=None, **style)

A vertical line

property pos_x

The position on the x-axis of the line

to_data()

Return a uhepp compatible dict/list version

class uhepp.Yield(base, stat=None, syst=None, var_up=None, var_down=None)

Collection of yields and uncertainties of a single process

A yield object stores binned yields for a process including underflow and overflow bins. This means the number of bins is the number of bin boundaries plus one: for n boundaries, n - 1 (regular bins) + 1 (for underflow) + 1 (for overflow) = n + 1.

Additionally, the object stores the statistical uncertainty of each bin. The values stored as uncertainties correspond to the 1-sigma deviations from the central value. Optionally, the yield object can store a precomputed, overall systematic uncertainty for each bin. The central value is referred to as base.

Besides the bin-by-bin statistical and systematic uncertainties, the yield object also provides a way to store binned systematic variations of the base histogram. Variations are identified by a string key. For each key, an up-variation and down-variation can be set as a replacement for the base values. Variations are stored as absolute yields and not as deviations from base. If a down-variation is not set, the up-variation is symmetrized such that the absolute differences to the base values are identical.

The option to store variations can be used during plotting. A total systematic uncertainty can be computed with the “env” option assuming that all variations are independent. Please note that in that case the computed uncertainties are not generally statistically independent between the bins. Alternatively, it is possible to use variations as histogram items in their own right, for example to compare the shape of a variation to nominal.

The class provides overloaded arithmetic operations. Yield objects can be multiplied and divided by integers and floats. Scaling a yield object also scales the uncertainties and variations. Two yield objects can also be added, subtracted, multiplied and divided bin-wise. Statistical and systematic uncertainties are propagated under the assumption that the two histograms involved are independent. If a variation is present in both yield objects, the varied arrays are added, subtracted, multiplied or divided. If a variation is absent in one of the yield objects, the base values are used as a fallback for the arithmetic operation.
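The propagation for addition and scaling can be sketched on plain lists. The helpers add_yields and scale_yield are hypothetical and only cover the base values and one uncertainty array; they use standard error propagation for independent inputs and are not taken from the uhepp source.

```python
import math


def add_yields(base_a, stat_a, base_b, stat_b):
    """Bin-wise sum; independent uncertainties add in quadrature."""
    base = [a + b for a, b in zip(base_a, base_b)]
    stat = [math.hypot(sa, sb) for sa, sb in zip(stat_a, stat_b)]
    return base, stat


def scale_yield(base, stat, factor):
    """Scaling by a number scales the uncertainties by its magnitude."""
    return [factor * b for b in base], [abs(factor) * s for s in stat]


print(add_yields([10.0], [3.0], [20.0], [4.0]))
print(scale_yield([10.0], [3.0], 2))
```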

Please note that the yields stored in a yield object are “number of events per bin”, not normalized to the bin width. Merging two adjacent bins yields a bin with the sum of the two yields. Normalizing the yield to the bin width should happen during visualization.

The class is intended for, and well suited to, specific use cases. The following list describes what this class cannot do:

  1. A yield object does not store the bin boundaries. This means a standalone yield object, i.e. a yield object without binning information or a UHeppHist, does not make sense.

  2. When adding variations or using arithmetic operations, only the number of bins is checked, not the actual binning. Mismatching bin edges lead to nonsensical results.

  3. The class does not provide a way to construct the yield arrays. For example, use numpy.histogram() instead to convert an array of events to a histogram.

  4. The variations cannot store an uncertainty. This is assumed to be a secondary effect. If an uncertainty is required, assess whether taking the statistical uncertainty of the base yield instead is applicable.

  5. The class does not store process names or style information. This is handled by a UHeppHist object.

  6. Asymmetric statistical and systematic uncertainties are not supported. The up- and down-variations, however, might be asymmetric.

  7. Except for adding variations, Yield objects are considered to be immutable.

add_var(var_name, var_up=None, var_down=None)

Add a new variation or set yield of an existing variation

If var_up (or var_down) is None, the var_up (or var_down) dict is left untouched. If a variation exists, it is overwritten.

property base

List of base yields of the process including under- and overflow

iter_vars()

Return an iterator for the tuples of (name, up, down)

rebin(orig_edges, new_edges)

Return a rebinned version of the yield object

Merged bins are added for the base yield and variation yields. The statistical and systematic uncertainties of merged bins are added in quadrature. This is only correct if the statistical and systematic uncertainties are not correlated between bins.
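The merging rule can be sketched on plain lists that include the underflow and overflow entries. The function below is a hypothetical stand-in, not the uhepp method; it assumes new_edges is a subset of orig_edges and that original bins outside the new range are folded into underflow or overflow.

```python
import math


def rebin(values, stat, orig_edges, new_edges):
    """Merge bins so that the result follows new_edges."""
    if not set(new_edges) <= set(orig_edges):
        raise ValueError("new_edges must be a subset of orig_edges")
    # Entry i + 1 of values holds the bin starting at orig_edges[i]
    positions = [orig_edges.index(edge) for edge in new_edges]
    # Slice boundaries: underflow, one slice per new bin, overflow
    cuts = [0] + [p + 1 for p in positions] + [len(values)]
    spans = list(zip(cuts[:-1], cuts[1:]))
    new_values = [sum(values[a:b]) for a, b in spans]
    # Uncorrelated uncertainties of merged bins add in quadrature
    new_stat = [math.sqrt(sum(s * s for s in stat[a:b])) for a, b in spans]
    return new_values, new_stat


values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
stat = [0.0, 3.0, 4.0, 0.0, 0.0, 1.0]
print(rebin(values, stat, [0, 1, 2, 3, 4], [0, 2, 4]))
```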

property stat

Return the binned absolute, statistical uncertainty of the process

sum()

Return the sum of all base bins

property syst

Return the binned precomputed systematic uncertainty of the process

to_data()

Return a uhepp compatible dict/list version

total()

Return the total sum of yields of all bins for base and variations

var_down(var_name)

Return the binned, down-varied yield for the given variation

If the variation is present in the down dict, return that variation. If the variation is only present in the up dict, compute and return the symmetrized up variation. If the variation is present in neither dict, return base.

property var_down_names

List of variation names present as down variation

var_up(var_name)

Return the binned, up-varied yield for the given variation

If the variation is present in the up dict, return that variation. If the variation is only present in the down dict, compute and return the symmetrized down variation. If the variation is present in neither dict, return base.
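The three-way lookup can be sketched as follows. The function is a hypothetical stand-alone version working on plain dicts and lists; the symmetrization formula up = 2 * base - down follows from requiring identical absolute differences to base.

```python
def var_up(base, up_vars, down_vars, name):
    """Return the up-varied yields with the documented fallbacks."""
    if name in up_vars:
        return list(up_vars[name])
    if name in down_vars:
        # Mirror the down-variation around base
        return [2 * b - d for b, d in zip(base, down_vars[name])]
    # Unknown variation: fall back to the base yields
    return list(base)


base = [10.0, 20.0]
print(var_up(base, {}, {"jes": [9.0, 18.0]}, "jes"))
```

The name "jes" is only an example key; the down-variation lookup works the same way with the roles of the two dicts swapped.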

property var_up_names

List of variation names present as up variation

property variations

List of variation names present as up and/or down variation

vary(**variations)

Returns the yield given a dict of variation pulls.

The keyword arguments must be of the form variation_name=pull_value. The pull_value defines the direction and amount of a variation. If pull_value is 1, the var_up yields are used; if pull_value is -1, the var_down yields are used; if pull_value is 0 (the default for variations not listed), the base yields are used.

For any other value, the method interpolates between the variations.

A variation passed to the method which is found in neither var_down nor var_up is not an error. It will not affect the return value.
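A possible reading of the pull semantics is sketched below. The interpolation scheme is not specified above, so this sketch assumes piecewise-linear interpolation between base and the varied yields and assumes that the shifts of several variations are added; the function is a hypothetical stand-alone version working on plain dicts and lists.

```python
def vary(base, up_vars, down_vars, **pulls):
    """Shift base by each pulled variation (linear interpolation assumed)."""
    result = list(base)
    for name, pull in pulls.items():
        up = up_vars.get(name)
        down = down_vars.get(name)
        if up is None and down is None:
            continue  # unknown variations do not affect the result
        if up is None:
            up = [2 * b - d for b, d in zip(base, down)]  # symmetrize
        if down is None:
            down = [2 * b - u for b, u in zip(base, up)]  # symmetrize
        target = up if pull >= 0 else down
        # Move from base towards the target by the magnitude of the pull
        result = [r + abs(pull) * (t - b) for r, t, b in zip(result, target, base)]
    return result


print(vary([10.0], {"jes": [12.0]}, {"jes": [9.0]}, jes=0.5))
```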

uhepp.from_caf(sample_folder, path, cut_stage, variable, include_bins=False)

Create and return a Yield object from a histogram in CAF.

The base yield and statistical uncertainty are copied from the TH1F returned by the sample folder.

Under and overflow bins are considered.

If the include_bins argument is True, the return value is a tuple of yield object and bin edges.

The from_caf() function does not provide a way to set the total systematic uncertainty or variations.

uhepp.from_coffea(coffea_hist, stacks=False)

Convert from coffea, return lists of UHeppHist and lists of datasets

The first argument must be a coffea histogram object. If the optional stacks argument is True, the method populates the stacks property using all samples as individual StackItems.

uhepp.from_data(data)

Build and return a UHepPlot from uhepp compliant dicts and lists

uhepp.from_file(filename)

Build and return a UHepPlot from a uhepp compliant JSON or YAML file

uhepp.from_json(filename)

Build and return a UHepPlot from a uhepp compliant json file

uhepp.from_jsons(json_string)

Build and return a UHepPlot from a uhepp compliant json string

uhepp.from_th1(base_th1, var_up=None, var_down=None, include_bins=False)

Create and return a Yield object from a ROOT TH1 object

The base yield and statistical uncertainty are copied from a TH1 object given as first argument. Optionally, var_up and var_down dicts mapping variation names to TH1 objects are used to extract the variation yields.

Under and overflow bins are considered.

If the include_bins argument is True, the return value is a tuple of yield object and bin edges.

The from_th1() function does not provide a way to set the total systematic uncertainty.

uhepp.from_yaml(filename)

Build and return a UHepPlot from a uhepp compliant yaml file

uhepp.from_yamls(yaml_string)

Build and return a UHepPlot from a uhepp compliant yaml string

uhepp.pull(uuid, api_url=None, api_key=None, full_link=None)

Retrieve a UHeppPlot from a central server

uhepp.pull_collection(collection_id, api_url=None, api_key=None)

Retrieve a collection of UHepPlot from a central server

uhepp.to_python(data)

Convert the data to pure python list and numbers