Building expressions

In order to efficiently process the input files, bamboo builds up an object representation of the expressions (cuts, weights, plotted variables) needed to fill the histograms, and dynamically generates C++ code that is passed to RDataFrame. The expression trees are built up through proxy classes, which mimic the final type (there are e.g. integer and floating-point number proxy classes that overload the basic mathematical operators), and generate a new proxy when called. As an example: t.Muon[0].charge gives an integer proxy to the operation corresponding to Muon_charge[0]; when the addition operator is called in t.Muon[0].charge+t.Muon[1].charge, an integer proxy to (the object representation of) Muon_charge[0]+Muon_charge[1] is returned.
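The proxy idea can be illustrated with a toy sketch in plain Python. The classes below (Expr, Collection, Item) are hypothetical and much simpler than the real bamboo.treeproxies and bamboo.treeoperations modules; they only show how operator overloading can build an expression instead of computing a value.

```python
# Toy illustration only: hypothetical classes, not bamboo's implementation.

class Expr:
    """A node in a tiny expression tree that renders itself as a C++ string."""
    def __init__(self, cpp):
        self.cpp = cpp

    def __add__(self, other):
        # '+' does not add numbers here; it returns a new expression node.
        return Expr(f"({self.cpp} + {as_expr(other).cpp})")

    def __gt__(self, other):
        return Expr(f"({self.cpp} > {as_expr(other).cpp})")


def as_expr(value):
    """Wrap plain Python constants so they can appear inside an expression."""
    return value if isinstance(value, Expr) else Expr(repr(value))


class Item:
    """One element of a collection; attribute access maps to a flat branch."""
    def __init__(self, collection, index):
        self._collection = collection
        self._index = index

    def __getattr__(self, attr):
        # Only called for unknown attributes, e.g. Muon[0].charge -> Muon_charge[0]
        return Expr(f"{self._collection}_{attr}[{self._index}]")


class Collection:
    """Stands in for something like t.Muon in this sketch."""
    def __init__(self, name):
        self._name = name

    def __getitem__(self, index):
        return Item(self._name, index)


muons = Collection("Muon")
total_charge = muons[0].charge + muons[1].charge
print(total_charge.cpp)  # (Muon_charge[0] + Muon_charge[1])
```

Nothing is evaluated when the expression is written; the string is only a description that a later step could hand to a C++ backend.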

The proxy classes try to behave as much as possible as the objects they represent, so in most cases they can be used as if they really were a number, boolean, momentum fourvector… or a muon, electron, jet etc. Simple ‘struct’ types for those are generated when decorating the tree, based on the branches that are found. Some operations, however, cannot easily be implemented in this way, for instance mathematical functions and operations on containers. Therefore, the bamboo.treefunctions module provides a set of additional helper methods (such that the user does not need to know about the implementation details in the bamboo.treeoperations and bamboo.treeproxies modules). In order to keep the analysis code compact, it is recommended to import it with

from bamboo import treefunctions as op

inside every analysis module. The available functions are listed below.

List of functions

bamboo.treefunctions.typeOf(arg)[source]

Get the inferred C++ type of a bamboo expression (proxy or TupleOp)

bamboo.treefunctions.c_bool(arg)[source]

Construct a boolean constant

bamboo.treefunctions.c_int(num, typeName='int', cast=None)[source]
Construct an integer number constant (static_cast inserted automatically if not ‘int’; a boolean can be passed to ‘cast’ to force or disable this)

bamboo.treefunctions.c_float(num, typeName='double', cast=None)[source]
Construct a floating-point number constant (static_cast inserted automatically if not ‘double’; a boolean can be passed to ‘cast’ to force or disable this)

bamboo.treefunctions.NOT(sth)[source]

Logical NOT

bamboo.treefunctions.AND(*args)[source]

Logical AND

bamboo.treefunctions.OR(*args)[source]

Logical OR

bamboo.treefunctions.switch(test, trueBranch, falseBranch, checkTypes=True)[source]

Pick one or another value, based on a third one (ternary operator in C++)

Example

>>> op.switch(runOnMC, mySF, 1.) ## incomplete pseudocode
bamboo.treefunctions.multiSwitch(*args)[source]

Construct arbitrary-length switch (if-elif-elif-…-else sequence)

Example

>>> op.multiSwitch((lepton.pt > 30, 4.), (op.AND(lepton.pt > 15, op.abs(lepton.eta) < 2.1), 5.), 3.)

is equivalent to:

>>> if lepton.pt > 30:
>>>     return 4.
>>> elif lepton.pt > 15 and abs(lepton.eta) < 2.1:
>>>     return 5.
>>> else:
>>>     return 3.
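
The if-elif-else reading of multiSwitch can also be written as a small plain-Python reference function operating on ordinary values. The helper name multi_switch_reference is hypothetical; it only mirrors the semantics described here (first matching condition wins, last argument is the default) and does not build any bamboo expression.

```python
def multi_switch_reference(*args):
    """Return the value of the first (condition, value) pair whose condition
    is true; the last positional argument is the fallback value."""
    *cases, default = args
    for condition, value in cases:
        if condition:
            return value
    return default


def lepton_category(pt, eta):
    # Same structure as the multiSwitch call, with plain floats as inputs.
    return multi_switch_reference(
        (pt > 30, 4.),
        (pt > 15 and abs(eta) < 2.1, 5.),
        3.,
    )


print(lepton_category(40., 0.5))  # 4.0
print(lepton_category(20., 1.5))  # 5.0
print(lepton_category(20., 2.3))  # 3.0
```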
bamboo.treefunctions.extMethod(name, returnType=None)[source]

Retrieve a (non-member) C(++) method

Parameters
  • name – name of the method

  • returnType – return type (otherwise deduced by introspection)

Returns

a method proxy, that can be called and returns a value decorated as the return type of the method

Example

>>> phi_0_2pi = op.extMethod("ROOT::Math::VectorUtil::Phi_0_2pi")
>>> dphi_2pi = phi_0_2pi(a.Phi()-b.Phi())
bamboo.treefunctions.extVar(typeName, name)[source]

Use a variable or object defined outside bamboo

Parameters
  • typeName – C++ type name

  • name – name in the current scope

Returns

a proxy to the variable or object

bamboo.treefunctions.construct(typeName, args)[source]

Construct an object

Parameters
  • typeName – C++ type name

  • args – constructor arguments

Returns

a proxy to the constructed object

bamboo.treefunctions.static_cast(typeName, arg)[source]

Compile-time type conversion

mostly for internal use, prefer higher-level functions where possible

Parameters
  • typeName – C++ type to cast to

  • arg – value to cast

Returns

a proxy to the casted value

bamboo.treefunctions.initList(typeName, valueType, elements)[source]

Construct a C++ initializer list

mostly for internal use, prefer higher-level functions where possible

Parameters
  • typeName – C++ type to use for the proxy (note that initializer lists do not have a type)

  • valueType – C++ type of the elements in the list

  • elements – list elements

Returns

a proxy to the list

bamboo.treefunctions.array(valueType, *elements)[source]

Helper to make constructing a std::array easier

Parameters
  • valueType – array element C++ type

  • elements – array elements

Returns

a proxy to the array

bamboo.treefunctions.define(typeName, definition, nameHint=None)[source]

Define a variable as a symbol with the interpreter

Parameters
  • typeName – result type name

  • definition – C++ definition string, with <<name>> instead of the variable name (which will be replaced by nameHint or a unique name)

  • nameHint – (optional) name for the variable

Caution

nameHint (if given) should be unique (for the sample), otherwise an exception will be thrown
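
The <<name>> placeholder convention can be pictured as a plain string substitution. The sketch below is an assumption about the idea only: the helper fill_name, the generated myVar0, myVar1, ... names, and the duplicate check are hypothetical and are not taken from bamboo.

```python
import itertools

_unique_ids = itertools.count()   # hypothetical source of unique suffixes
_used_names = set()               # names already handed out in this sketch


def fill_name(definition, name_hint=None):
    """Replace <<name>> in a C++ definition string by name_hint, or by a
    generated unique name when no hint is given."""
    name = name_hint if name_hint is not None else f"myVar{next(_unique_ids)}"
    if name in _used_names:
        # Mirrors the caution above: a repeated nameHint is an error.
        raise ValueError(f"name {name!r} is already defined")
    _used_names.add(name)
    return name, definition.replace("<<name>>", name)


name, code = fill_name("std::vector<float> <<name>>{1.f, 2.f};", name_hint="myWeights")
print(code)  # std::vector<float> myWeights{1.f, 2.f};
```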

bamboo.treefunctions.defineOnFirstUse(sth)[source]

Construct an expression that will be precalculated (with an RDataFrame::Define node) when first used

This may be useful for expensive function calls, and should prevent double work in most cases. Sometimes it is useful to insert the Define node explicitly; in that case bamboo.analysisutils.forceDefine() can be used.

bamboo.treefunctions.abs(sth)[source]

Return the absolute value

Example

>>> op.abs(t.Muon[0].p4.Eta())
bamboo.treefunctions.sign(sth)[source]

Return the sign of a number

Example

>>> op.sign(t.Muon[0].p4.Eta())
bamboo.treefunctions.sum(*args, **kwargs)[source]

Return the sum of the arguments

Example

>>> op.sum(t.Muon[0].p4.Eta(), t.Muon[1].p4.Eta())
bamboo.treefunctions.product(*args)[source]

Return the product of the arguments

Example

>>> op.product(t.Muon[0].p4.Eta(), t.Muon[1].p4.Eta())
bamboo.treefunctions.sqrt(sth)[source]

Return the square root of a number

Example

>>> m1, m2 = t.Muon[0].p4, t.Muon[1].p4
>>> m12dR = op.sqrt( op.pow(m1.Eta()-m2.Eta(), 2) + op.pow(m1.Phi()-m2.Phi(), 2) )
bamboo.treefunctions.pow(base, exp)[source]

Return a power of a number

Example

>>> m1, m2 = t.Muon[0].p4, t.Muon[1].p4
>>> m12dR = op.sqrt( op.pow(m1.Eta()-m2.Eta(), 2) + op.pow(m1.Phi()-m2.Phi(), 2) )
bamboo.treefunctions.exp(sth)[source]

Return the exponential of a number

Example

>>> op.exp(op.abs(t.Muon[0].p4.Eta()))
bamboo.treefunctions.log(sth)[source]

Return the natural logarithm of a number

Example

>>> op.log(t.Muon[0].p4.Pt())
bamboo.treefunctions.log10(sth)[source]

Return the base-10 logarithm of a number

Example

>>> op.log10(t.Muon[0].p4.Pt())
bamboo.treefunctions.sin(sth)[source]

Return the sine of a number

Example

>>> op.sin(t.Muon[0].p4.Phi())
bamboo.treefunctions.cos(sth)[source]

Return the cosine of a number

Example

>>> op.cos(t.Muon[0].p4.Phi())
bamboo.treefunctions.tan(sth)[source]

Return the tangent of a number

Example

>>> op.tan(t.Muon[0].p4.Phi())
bamboo.treefunctions.asin(sth)[source]

Return the arcsine of a number

Example

>>> op.asin(op.c_float(3.1415))
bamboo.treefunctions.acos(sth)[source]

Return the arccosine of a number

Example

>>> op.acos(op.c_float(3.1415))
bamboo.treefunctions.atan(sth)[source]

Return the arctangent of a number

Example

>>> op.atan(op.c_float(3.1415))
bamboo.treefunctions.max(a1, a2)[source]

Return the maximum of two numbers

Example

>>> op.max(op.abs(t.Muon[0].eta), op.abs(t.Muon[1].eta))
bamboo.treefunctions.min(a1, a2)[source]

Return the minimum of two numbers

Example

>>> op.min(op.abs(t.Muon[0].eta), op.abs(t.Muon[1].eta))
bamboo.treefunctions.in_range(low, arg, up)[source]

Check if a value is inside a range (boundaries excluded)

Example

>>> op.in_range(10., t.Muon[0].p4.Pt(), 20.)
bamboo.treefunctions.withMass(arg, massVal)[source]

Construct a Lorentz vector with given mass (taking the other components from the input)

Example

>>> pW = op.withMass((j1.p4+j2.p4), 80.4)
bamboo.treefunctions.invariant_mass(*args)[source]

Calculate the invariant mass of the arguments

Example

>>> mElEl = op.invariant_mass(t.Electron[0].p4, t.Electron[1].p4)

Note

Unlike in the example above, bamboo.treefunctions.combine() should be used to make N-particle combinations in most practical cases
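
As a reminder of what the invariant mass of several four-vectors means numerically, here is a plain-Python sketch using the relation m² = (ΣE)² − |Σp|². The helpers p4 and invariant_mass_reference are hypothetical and work on ordinary floats; they are not the ROOT or bamboo implementation.

```python
import math


def p4(pt, eta, phi, mass):
    """Build (E, px, py, pz) from transverse momentum, pseudorapidity,
    azimuthal angle and mass."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass_reference(*vectors):
    """Invariant mass of the sum of any number of (E, px, py, pz) tuples."""
    energy, px, py, pz = (sum(component) for component in zip(*vectors))
    m2 = energy * energy - px * px - py * py - pz * pz
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))


# Two massless, back-to-back particles with pt = 45 each.
el1 = p4(45.0, 0.0, 0.0, 0.0)
el2 = p4(45.0, 0.0, math.pi, 0.0)
m_elel = invariant_mass_reference(el1, el2)
```

For this back-to-back configuration the momenta cancel, so the result is the sum of the two energies.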

bamboo.treefunctions.invariant_mass_squared(*args)[source]

Calculate the squared invariant mass of the arguments using ROOT::Math::VectorUtil::InvariantMass2

Example

>>> m2ElEl = op.invariant_mass_squared(t.Electron[0].p4, t.Electron[1].p4)
bamboo.treefunctions.deltaPhi(a1, a2)[source]

Calculate the difference in azimuthal angles (using ROOT::Math::VectorUtil::DeltaPhi)

Example

>>> elelDphi = op.deltaPhi(t.Electron[0].p4, t.Electron[1].p4)
bamboo.treefunctions.Phi_mpi_pi(a)[source]

Return an angle between -pi and pi

bamboo.treefunctions.Phi_0_2pi(a)[source]

Return an angle between 0 and 2*pi
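
The two angle conventions can be sketched in plain Python with the modulo operator. The function names below are hypothetical stand-ins working on floats in radians; which end of each interval is included is a choice made in this sketch and may differ from the ROOT helpers.

```python
import math

TWO_PI = 2.0 * math.pi


def wrap_mpi_pi(angle):
    """Map an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % TWO_PI - math.pi


def wrap_0_2pi(angle):
    """Map an angle in radians to the interval [0, 2*pi)."""
    return angle % TWO_PI


three_half_pi = 1.5 * math.pi
as_signed = wrap_mpi_pi(three_half_pi)     # close to -pi/2
as_unsigned = wrap_0_2pi(-0.5 * math.pi)   # close to 3*pi/2
```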

bamboo.treefunctions.deltaR(a1, a2)[source]

Calculate the Delta R distance (using ROOT::Math::VectorUtil::DeltaR)

Example

>>> elelDR = op.deltaR(t.Electron[0].p4, t.Electron[1].p4)
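
For reference, the Delta R distance is sqrt(Δη² + Δφ²) with Δφ wrapped into one period. A naive difference of Phi() values (as in the sqrt and pow examples) does not apply this wrapping, which matters for objects on either side of the ±π boundary. The plain-Python sketch below uses hypothetical helper names and ordinary floats.

```python
import math


def delta_phi_reference(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r_reference(eta1, phi1, eta2, phi2):
    """Delta R distance from pseudorapidities and azimuthal angles."""
    return math.hypot(eta1 - eta2, delta_phi_reference(phi1, phi2))


# Two objects close to each other, but on opposite sides of phi = +-pi.
wrapped = delta_r_reference(0.0, 3.0, 0.0, -3.0)   # about 0.283
naive = math.hypot(0.0, 3.0 - (-3.0))              # 6.0, without wrapping
```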
bamboo.treefunctions.rng_len(sth)[source]

Get the number of elements in a range

Parameters

rng – input range

Example

>>> nElectrons = op.rng_len(t.Electron)
bamboo.treefunctions.rng_sum(rng, fun=<function <lambda>>, start=None)[source]

Sum the values of a function over a range

Parameters
  • rng – input range

  • fun – function whose value should be used (a callable that takes an element of the range and returns a number)

  • start – initial value (0. by default)

Example

>>> totalMuCharge = op.rng_sum(t.Muon, lambda mu : mu.charge)
bamboo.treefunctions.rng_count(rng, pred=None)[source]

Count the number of elements passing a selection

Parameters
  • rng – input range

  • pred – selection predicate (a callable that takes an element of the range and returns a boolean)

Example

>>> nCentralMu = op.rng_count(t.Muon, lambda mu : op.abs(mu.p4.Eta()) < 2.4)
bamboo.treefunctions.rng_product(rng, fun=<function <lambda>>)[source]

Calculate the product of a function over a range

Parameters
  • rng – input range

  • fun – function whose value should be used (a callable that takes an element of the range and returns a number)

Example

>>> overallMuChargeSign = op.rng_product(t.Muon, lambda mu : mu.charge)
bamboo.treefunctions.rng_max(rng, fun=<function <lambda>>)[source]

Find the highest value of a function in a range

Parameters
  • rng – input range

  • fun – function whose value should be used (a callable that takes an element of the range and returns a number)

Example

>>> mostForwardMuEta = op.rng_max(t.Muon, lambda mu : op.abs(mu.p4.Eta()))
bamboo.treefunctions.rng_min(rng, fun=<function <lambda>>)[source]

Find the lowest value of a function in a range

Parameters
  • rng – input range

  • fun – function whose value should be used (a callable that takes an element of the range and returns a number)

Example

>>> mostCentralMuEta = op.rng_min(t.Muon, lambda mu : op.abs(mu.p4.Eta()))
bamboo.treefunctions.rng_max_element_index(rng, fun=<function <lambda>>)[source]

Find the index of the element for which the value of a function is maximal

Parameters
  • rng – input range

  • fun – function whose value should be used (a callable that takes an element of the range and returns a number)

Returns

the index of the maximal element in the base collection if rng is a collection, otherwise (e.g. if rng is a vector or array proxy) the index of the maximal element in rng

Example

>>> i_mostForwardMu = op.rng_max_element_index(t.Muon, lambda mu : op.abs(mu.p4.Eta()))
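
The distinction between an index in the base collection and a position in the range can be pictured with plain lists. The sketch assumes, for illustration only, that a selected range is represented as a list of indices into the base container; the variable and function names are hypothetical.

```python
# Hypothetical eta values of the muons in one event (the base collection).
base_eta = [0.3, -2.1, 1.7, -0.9]

# A selected range, modelled here as the base indices that pass |eta| < 2.
selected = [i for i, eta in enumerate(base_eta) if abs(eta) < 2.0]


def max_element_index_reference(indices, values, fun):
    """Return the base index for which fun(value) is largest."""
    return max(indices, key=lambda i: fun(values[i]))


i_most_forward = max_element_index_reference(selected, base_eta, abs)
position_in_selection = selected.index(i_most_forward)

print(i_most_forward)         # 2
print(position_in_selection)  # 1
```

The returned value 2 points into the base list, although the element sits at position 1 of the selection.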
bamboo.treefunctions.rng_max_element_by(rng, fun=<function <lambda>>)[source]

Find the element for which the value of a function is maximal

Parameters
  • rng – input range

  • fun – function whose value should be used (a callable that takes an element of the range and returns a number)

Example

>>> mostForwardMu = op.rng_max_element_by(t.Muon, lambda mu : op.abs(mu.p4.Eta()))
bamboo.treefunctions.rng_min_element_index(rng, fun=<function <lambda>>)[source]

Find the index of the element for which the value of a function is minimal

Parameters
  • rng – input range

  • fun – function whose value should be used (a callable that takes an element of the range and returns a number)

Returns

the index of the minimal element in the base collection if rng is a collection, otherwise (e.g. if rng is a vector or array proxy) the index of the minimal element in rng

Example

>>> i_mostCentralMu = op.rng_min_element_index(t.Muon, lambda mu : op.abs(mu.p4.Eta()))
bamboo.treefunctions.rng_min_element_by(rng, fun=<function <lambda>>)[source]

Find the element for which the value of a function is minimal

Parameters
  • rng – input range

  • fun – function whose value should be used (a callable that takes an element of the range and returns a number)

Example

>>> mostCentralMu = op.rng_min_element_by(t.Muon, lambda mu : op.abs(mu.p4.Eta()))
bamboo.treefunctions.rng_mean(rng)[source]

Return the mean of a range

Parameters

rng – input range

Example

>>> pdf_mean = op.rng_mean(t.LHEPdfWeight)
bamboo.treefunctions.rng_stddev(rng)[source]

Return the (sample) standard deviation of a range

Parameters

rng – input range

Example

>>> pdf_uncertainty = op.rng_stddev(t.LHEPdfWeight)
bamboo.treefunctions.rng_any(rng, pred=<function <lambda>>)[source]

Test if any item in a range passes a selection

Parameters
  • rng – input range

  • pred – selection predicate (a callable that takes an element of the range and returns a boolean)

Example

>>> hasCentralMu = op.rng_any(t.Muon, lambda mu : op.abs(mu.p4.Eta()) < 2.4)
bamboo.treefunctions.rng_find(rng, pred=<function <lambda>>)[source]

Find the first item in a range that passes a selection

Parameters
  • rng – input range

  • pred – selection predicate (a callable that takes an element of the range and returns a boolean)

Example

>>> leadCentralMu = op.rng_find(t.Muon, lambda mu : op.abs(mu.p4.Eta()) < 2.4)
bamboo.treefunctions.select(rng, pred=<function <lambda>>)[source]

Select elements from the range that pass a cut

Parameters
  • rng – input range

  • pred – selection predicate (a callable that takes an element of the range and returns a boolean)

Example

>>> centralMuons = op.select(t.Muon, lambda mu : op.abs(mu.p4.Eta()) < 2.4)
bamboo.treefunctions.sort(rng, fun=<function <lambda>>)[source]

Sort the range (ascendingly) by the value of a function applied on each element

Parameters
  • rng – input range

  • fun – function by whose value the elements should be sorted

Example

>>> muonsByCentrality = op.sort(t.Muon, lambda mu : op.abs(mu.p4.Eta()))
bamboo.treefunctions.map(rng, fun, valueType=None)[source]

Create a list of derived values for a collection

This is useful for storing a derived quantity for each item of a collection on a skim, and also for filling a histogram for each entry in a collection.

Parameters
  • rng – input range

  • fun – function to calculate derived values

  • valueType – stored return type (optional, fun(rng[i]) should be convertible to this type)

Example

>>> muon_absEta = op.map(t.Muon, lambda mu : op.abs(mu.p4.Eta()))
bamboo.treefunctions.rng_pickRandom(rng, seed=0)[source]

Pick a random element from a range

Parameters
  • rng – range to pick an element from

  • seed – seed for the random generator

Caution

empty placeholder, to be implemented

bamboo.treefunctions.combine(rng, N=None, pred=<function <lambda>>, samePred=<function <lambda>>)[source]

Create N-particle combination from one or several ranges

Parameters
  • rng – range (or iterable of ranges) with basic objects to combine

  • N – number of objects to combine (at least 2), in case of multiple ranges it does not need to be given (len(rng) will be taken; if specified they should match)

  • pred – selection to apply to candidates (a callable that takes the constituents and returns a boolean)

  • samePred – additional selection for objects from the same base container (a callable that takes two objects and returns a boolean, it needs to be true for any sorted pair of objects from the same container in a candidate combination). The default avoids duplicates by keeping the indices (in the base container) sorted; None will not apply any selection, and consider all combinations, including those with the same object repeated.

Example

>>> osdimu = op.combine(t.Muon, N=2, pred=lambda mu1,mu2 : mu1.charge != mu2.charge)
>>> firstosdimu = osdimu[0]
>>> firstosdimu_Mll = op.invariant_mass(firstosdimu[0].p4, firstosdimu[1].p4)
>>> oselmu = op.combine((t.Electron, t.Muon), pred=lambda el,mu : el.charge != mu.charge)
>>> trijet = op.combine(t.Jet, N=3, samePred=lambda j1,j2 : j1.pt > j2.pt)
>>> trijet = op.combine(
>>>     t.Jet, N=3, pred=lambda j1,j2,j3 : op.AND(j1.pt > j2.pt, j2.pt > j3.pt), samePred=None)

Note

The default value for samePred undoes the sorting that may have been applied between the base container(s) and the argument(s) in rng. The third and fourth examples above are equivalent, and show how to get three-jet combinations, with the jets sorted by decreasing pT. The latter is more efficient since it avoids the unnecessary comparison j1.pt > j3.pt, which follows from the other two. In that case no other sorting should be done, otherwise combinations will only be retained if sorted by both criteria; this can be done by passing samePred=None.

samePred=(lambda o1,o2 : o1.idx != o2.idx) can be used to get all permutations.
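
The effect of the different samePred choices can be reproduced on a small list with itertools. In this plain-Python sketch the objects are represented by their indices in the base container, and combine_reference is a hypothetical helper that applies same_pred to every ordered pair of positions in a candidate, following the description of samePred above.

```python
from itertools import product


def combine_reference(items, n, same_pred):
    """Return all n-tuples of items for which same_pred(i, j) holds for every
    pair of positions a < b, where i and j are the base indices at a and b.
    With same_pred=None every tuple is kept, including repeated objects."""
    kept = []
    for indices in product(range(len(items)), repeat=n):
        pairs = [(indices[a], indices[b]) for a in range(n) for b in range(a + 1, n)]
        if same_pred is None or all(same_pred(i, j) for i, j in pairs):
            kept.append(tuple(items[i] for i in indices))
    return kept


jets = ["j0", "j1", "j2"]

# Like the default described above: keep base indices sorted, no duplicates.
unique_pairs = combine_reference(jets, 2, lambda i, j: i < j)
# Like samePred=(lambda o1,o2 : o1.idx != o2.idx): all permutations.
ordered_pairs = combine_reference(jets, 2, lambda i, j: i != j)
# Like samePred=None: everything, including the same object twice.
all_pairs = combine_reference(jets, 2, None)

print(len(unique_pairs), len(ordered_pairs), len(all_pairs))  # 3 6 9
```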

bamboo.treefunctions.systematic(nominal, name=None, **kwargs)[source]

Construct an expression that will change under some systematic variations

This is useful when e.g. changing weights for some systematics. The expressions for different variations are assumed (but not checked) to be of the same type, so this should only be used for simple types (typically a number or a range of numbers); containers etc. need to be taken into account in the decorators.

Example

>>> psWeight = op.systematic(tree.ps_nominal, name="pdf", up=tree.ps_up, down=tree.ps_down)
>>> addSys10percent = op.systematic(
>>>     op.c_float(1.), name="additionalSystematic1", up=op.c_float(1.1), down=op.c_float(0.9))
>>> importantSF = op.systematic(op.c_float(1.),
>>>     mySF_systup=op.c_float(1.1), mySF_systdown=op.c_float(0.9),
>>>     mySF_statup=1.04, mySF_statdown=.97)
Parameters
  • nominal – nominal expression

  • kwargs – alternative expressions. “up” and “down” (any capitalization) will be prefixed with name, if given

  • name – optional name of the systematic uncertainty source to prepend to “up” or “down”
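
The naming rule for the keyword arguments can be sketched as follows. This is one reading of the parameter description, not bamboo's code: the helper variation_names is hypothetical, and it assumes that the prefix is simply concatenated with the keyword as written (so name="pdf" and up=... gives "pdfup").

```python
def variation_names(name=None, **kwargs):
    """Map keyword arguments to variation names: 'up' and 'down' (in any
    capitalization) are prefixed with name, other keys are kept as given."""
    variations = {}
    for key, expression in kwargs.items():
        if name is not None and key.lower() in ("up", "down"):
            key = f"{name}{key}"
        variations[key] = expression
    return variations


named = variation_names(name="additionalSystematic1", up=1.1, down=0.9)
explicit = variation_names(mySF_systup=1.1, mySF_systdown=0.9)

print(sorted(named))     # ['additionalSystematic1down', 'additionalSystematic1up']
print(sorted(explicit))  # ['mySF_systdown', 'mySF_systup']
```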

bamboo.treefunctions.getSystematicVariations(expr)[source]

Get the list of systematic variations affecting an expression

bamboo.treefunctions.forSystematicVariation(expr, varName)[source]

Get the equivalent expression with a specific systematic uncertainty variation

Parameters
  • expr – an expression (or proxy)

  • varName – name of the variation (e.g. jesTotalup)

Returns

the expression for the chosen variation (frozen, so without variations)

class bamboo.treefunctions.MVAEvaluator(evaluate, returnType=None, toArray=False, toVector=True, useSlots=False)[source]

Small wrapper to make sure MVA evaluation is cached

bamboo.treefunctions.mvaEvaluator(fileName, mvaType=None, otherArgs=None, nameHint=None)[source]

Declare and initialize an MVA evaluator

The C++ object is defined (with bamboo.treefunctions.define()), and can be used as a callable to evaluate. The result of any evaluation will be cached automatically.

Currently the following formats are supported:

  • .xml (mvaType='TMVA') TMVA weights file, evaluated with a TMVA::Experimental::RReader

  • .pt (mvaType='Torch') pytorch script files (loaded with torch::jit::load).

  • .pb (mvaType='Tensorflow') tensorflow graph definition (loaded with Tensorflow-C).

    The otherArgs keyword argument should be (inputNodeNames, outputNodeNames), where each of the two can be a single string, or an iterable of them. In the case of multiple input nodes, the input values for each should also be passed as separate arguments when evaluating (see below). Input values for multi-dimensional nodes should be flattened (row-order per node, and then the different nodes). The output will be flattened in the same way if the output node has more than one dimension, or if there are multiple output nodes.

  • .json (mvaType='lwtnn') lwtnn json. The otherArgs keyword argument should be passed the lists of input and output nodes/values, as C++ initializer list strings, e.g. '{ { "node_0", "variable_0" }, { "node_0", "variable_1" } ... }' and '{ "out_0", "out_1" }'.

  • .onnx (mvaType='ONNXRuntime') ONNX file, evaluated with ONNX Runtime. The otherArgs keyword argument should be the name of the output node (or a list of those).

  • .hxx (mvaType='SOFIE') ROOT SOFIE generated header file. The otherArgs keyword argument should be the path to the .dat weights file (if not specified, it will be taken by replacing the weight file extension from .hxx to .dat). Note: only available in ROOT>=6.26.04.
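
The flattening convention mentioned for Tensorflow nodes (row order within a node, then node after node) can be illustrated with nested Python lists. The helper flatten_row_major and the node shapes are hypothetical; the sketch only shows the ordering of values, not how they are passed to an evaluator.

```python
def flatten_row_major(values):
    """Flatten nested lists so that the last index runs fastest."""
    if isinstance(values, (list, tuple)):
        return [x for inner in values for x in flatten_row_major(inner)]
    return [values]


# Hypothetical values for a 2x3 node and a one-dimensional node.
node_a = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
node_b = [7.0, 8.0]

flat_a = flatten_row_major(node_a)            # row after row
flat_b = flatten_row_major(node_b)            # already flat
flat_all = flat_a + flat_b                    # node after node

print(flat_all)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```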

Parameters
  • fileName – file with MVA weights and structure

  • mvaType – type of MVA, or library used to evaluate it (TMVA, Torch, Tensorflow, lwtnn, ONNXRuntime, or SOFIE). If absent, this is guessed from the fileName extension

  • otherArgs – other arguments to construct the MVA evaluator (either as a string (safest), or as an iterable)

  • nameHint – name hint, see bamboo.treefunctions.define()

Returns

a proxy to a method that takes the inputs as arguments, and returns a std::vector<float> of outputs

For passing the inputs to the evaluator, there are two options

  • if a list of numbers is passed, as in the example below, they will be converted to an array of float (with a static_cast). The rationale is that this is the most common simple case, which should be made as convenient as possible.

  • if the MVA takes inputs in a different type than float or has multiple input nodes (supported for Tensorflow and ONNX Runtime), an array-like object of the correct type should be passed for each of the input nodes. No other conversions will be automatically inserted, so these should be done when constructing the inputs (e.g. with array() and initList()). This is a bit more work, but gives maximal control over the generated code.

Example

>>> mu = tree.Muon[0]
>>> nn1 = op.mvaEvaluator("nn1.pt")
>>> Plot.make1D("mu_nn1", nn1(mu.pt, mu.eta, mu.phi), hasMu)

Warning

By default the MVA output will be added as a column (Define node in the RDataFrame graph) when used, because it is almost always more efficient. In some cases, e.g. if the MVA should only be evaluated if some condition is true, this can cause problems. To avoid this, defineOnFirstUse=False should be passed when calling the evaluation, e.g. nn1(mu.pt, mu.eta, mu.phi, defineOnFirstUse=False) in the example above.