
Workflow Description

[Figure: workflow example]

Requirements for stages

To fit the carpenter stages into the proposed workflow, they must satisfy the following requirements.

| Stage | Requirements |
| --- | --- |
| data import | Translate the data config into a data mapping; the mapping should be dictionary-like and provide n-D array access |
| data mapping | copy on write; accommodate new data; GPU compatible; aliases → Index; filtering → Mask |
| Index | Callable (order?); uproot path: `dir1/tree1/var1`; alias: `dir1.tree1.var1`; expression: `dir1__dot__tree1__dot__var1` (or use a simplified version) |
| Mask(expression) | Callable, Mergeable; merge via binary AND, OR: `(mask1 \| mask2)` → `mask3`, `(mask1 & mask2)` → `mask4` |
| Operations(config) | Callable, Mergeable; types: Define → new data; Cutflow → creates Masks + cutflow; Binning → creates Hists, tables; DataOut → creates ntuples/CSV/binary format |
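To make these requirements more concrete, here is a minimal Python sketch of a dictionary-like data mapping with alias resolution and a callable, mergeable Mask. All class and variable names are illustrative assumptions, not the actual fast-carpenter API:

```python
import numpy as np


class Mask:
    """Illustrative mask: callable on a data mapping, mergeable via & and |."""

    def __init__(self, func, name=""):
        self.func = func
        self.name = name

    def __call__(self, data):
        return self.func(data)

    def __and__(self, other):
        # (mask1 & mask2) -> new mask: merge via binary AND
        return Mask(lambda d: self(d) & other(d), f"({self.name} & {other.name})")

    def __or__(self, other):
        # (mask1 | mask2) -> new mask: merge via binary OR
        return Mask(lambda d: self(d) | other(d), f"({self.name} | {other.name})")


class DataMapping(dict):
    """Illustrative dictionary-like mapping: n-D array access plus alias -> path lookup."""

    def __init__(self, arrays, aliases=None):
        super().__init__(arrays)
        self.aliases = aliases or {}

    def __getitem__(self, key):
        # Resolve an alias (dir1.tree1.var1) to the stored uproot-style path (dir1/tree1/var1)
        return super().__getitem__(self.aliases.get(key, key))


data = DataMapping(
    {"dir1/tree1/var1": np.array([[1.0, -2.0], [3.0, 4.0]])},
    aliases={"dir1.tree1.var1": "dir1/tree1/var1"},
)
mask1 = Mask(lambda d: d["dir1.tree1.var1"] > 0, name="var1 > 0")
mask2 = Mask(lambda d: d["dir1.tree1.var1"] < 4, name="var1 < 4")
combined = mask1 & mask2   # merged via binary AND
print(combined(data))      # element-wise boolean selection
```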

Merging functionality

  • Define stage: independent new entries → data.update
  • Cutflow stage: merge counts across files and datasets
  • Binning stage: merge bin entries (preserve datasets?)

→ each type of merge needs its own rules
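A minimal sketch of what such per-type merge rules could look like, assuming cutflow counts are plain dictionaries and binned results are numpy arrays with identical binning (the function names below are hypothetical):

```python
import numpy as np


def merge_define(data, new_entries):
    # Define stage: independent new entries simply extend the data mapping
    data.update(new_entries)
    return data


def merge_cutflow(counts_a, counts_b):
    # Cutflow stage: add event counts per cut across files/datasets
    merged = dict(counts_a)
    for cut, count in counts_b.items():
        merged[cut] = merged.get(cut, 0) + count
    return merged


def merge_binning(hist_a, hist_b):
    # Binning stage: add bin contents (identical binning assumed)
    return hist_a + hist_b


print(merge_cutflow({"all": 100, "pt_cut": 60}, {"all": 80, "pt_cut": 50}))
# {'all': 180, 'pt_cut': 110}
print(merge_binning(np.array([1, 2, 3]), np.array([0, 4, 1])))
# [1 6 4]
```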

Multiplexing

For optimization, we need to be able to replicate stages across inputs where applicable. Stage multiplexing replicates a stage definition across inputs (previous stages, data import). Stages that merge data will typically have different rules for multiplexing (none, reduce by N).
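As an illustration only (the helper names and the reduce-by-N grouping below are assumptions, not the actual scheduler), multiplexing could replicate one stage definition per input, while merge-type stages reduce groups of partial results:

```python
from copy import deepcopy


def multiplex(stage_definition, inputs):
    # Replicate one stage definition so that each input gets its own instance
    return {name: deepcopy(stage_definition) for name in inputs}


def reduce_by_n(partial_results, n, merge):
    # Merge-type stages: instead of one instance per input, reduce every group of N results
    reduced = []
    for start in range(0, len(partial_results), n):
        group = partial_results[start:start + n]
        merged = group[0]
        for item in group[1:]:
            merged = merge(merged, item)
        reduced.append(merged)
    return reduced


stages = multiplex({"type": "Define", "expression": "var1 * 2"},
                   ["file1.root", "file2.root"])
counts = [{"all": 10}, {"all": 20}, {"all": 5}, {"all": 7}]
print(reduce_by_n(counts, 2, lambda a, b: {"all": a["all"] + b["all"]}))
# [{'all': 30}, {'all': 12}]
```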

Under construction