sysvar package#
Submodules#
sysvar.api module#
- sysvar.api.add_weights_to_dataframe(df: DataFrame, correction_source: str | Path | dict, MC_production: str | None = None, prefix: str | None = None, weightname: str | None = None, overwrite: bool = False, Nvar: int = 0)[source]#
Add correction weights (and optional toy variations) to a pandas DataFrame in-place.
This function computes per-row correction weights using a correction object created from correction_source and writes them into df as a new column. The rows that receive a given correction value are determined by boolean query strings produced by correction.build_queries(prefix) and evaluated against df via df.eval().
If Nvar > 0, additional columns containing toy variations of the weights are added as well (one column per toy). The toy weights are produced via a Variator constructed from the correction.
- Parameters:
df – The DataFrame to modify in-place.
correction_source –
- Defines how to construct the correction:
Path: treated as a path to a CSV/TSV correction table.
str: either a correction identifier (YAML-based, legacy) or a path-like string to a CSV.
dict: a fully specified custom correction configuration.
MC_production – MC production tag required for YAML-based corrections (legacy). Not used for CSV-based inputs. May be ignored depending on correction_source.
prefix – Optional prefix used when building the dependent-variable column names used in the query strings (e.g. “trk” -> “trk_pt”). Passed through to correction.build_queries(prefix).
weightname – Base name of the weight column. The final column name is built by correction._build_column_name(prefix, weightname).
overwrite – If True and the target weight column already exists, it will be overwritten. If False and the column exists, the function logs a warning and does not modify the DataFrame.
Nvar – Number of toy-variation columns to add. Must be a non-negative integer. If 0, only the central weight column is added. If > 0, columns named “{column_name}_var_{j}” for j in [0, Nvar-1] are added.
- Returns:
The DataFrame is modified in-place.
- Return type:
None
- Raises:
ValueError – If Nvar is negative, or if correction_source / MC_production do not form a valid combination for constructing a correction.
Exception – Propagates any exception raised by create_correction_object, df.eval, or by the correction/variator internals.
Notes
The queries produced by correction.build_queries(prefix) are evaluated using df.eval(), so the DataFrame must contain the referenced columns.
For correctness, the correction’s binning (number of central values / queries) must match the internal structure of the correction and the Variator.
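The query-based assignment described above can be illustrated with plain pandas. This is a minimal sketch, not the sysvar implementation: the queries, central values, per-bin errors, the column name `trk_weight`, and the Gaussian toy model are all hypothetical stand-ins for what a correction object and a Variator would provide.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: in sysvar these would come from the correction object.
queries = ["trk_pt < 1.0", "trk_pt >= 1.0"]   # one boolean query per bin
central_values = np.array([0.98, 1.02])        # one correction value per bin
errors = np.array([0.01, 0.02])                # assumed per-bin uncertainties

df = pd.DataFrame({"trk_pt": [0.5, 1.5, 0.7, 2.0]})
column_name = "trk_weight"
n_var = 3

# Toy variations: assumed Gaussian fluctuations around the central values.
rng = np.random.default_rng(0)
toys = rng.normal(central_values, errors, size=(n_var, len(central_values)))

# Central weight: rows matching query i receive central_values[i].
df[column_name] = np.nan
for query, value in zip(queries, central_values):
    mask = df.eval(query)
    df.loc[mask, column_name] = value

# One extra column per toy, named "{column_name}_var_{j}".
for j in range(n_var):
    toy_column = f"{column_name}_var_{j}"
    df[toy_column] = np.nan
    for query, value in zip(queries, toys[j]):
        df.loc[df.eval(query), toy_column] = value
```

Rows that match no query keep NaN in this sketch, which makes gaps in the binning easy to spot.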
- sysvar.api.calculate_covariance_matrix(df: DataFrame, settings: Dict, syst_effect: str | Dict, binning: Dict, channels: List, input_cov: np.ndarray = None, save_cov: bool = False)[source]#
Calculate the covariance matrix for a given dataset.
This function computes a covariance matrix based on the input data, configuration settings, and systematic effects. It provides support for pre-defined systematics or custom-defined ones and allows the user to specify binning and channels. Optionally, it can save the covariance matrix to a file.
- Parameters:
df (DataFrame) – The input data to calculate the covariance matrix from.
settings (Dict) – Configuration settings, same as for the EigenDecomposer.
syst_effect (str | Dict) – The name of the systematic effect to consider for the covariance matrix. For systematics from YAML files, the name is enough. For a custom systematic, a dictionary describing it is expected, similar to the dictionary required for the custom correction object in the eigendecomposition.
binning (Dict) – Binning information for the covariance matrix. Keys should be the variable names present in the df and values lists of bin edges.
channels (List) – List of channels to consider for the covariance matrix.
save_cov (bool, optional) – If True, saves the covariance matrix. The path should be read from the settings dictionary. Defaults to False.
- Returns:
the covariance matrix from the covariance matrix calculator
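One common way to obtain such a covariance matrix is to histogram the events once per toy-weight variation and take the sample covariance of the bin contents. The sketch below shows that idea with NumPy only. It is an assumption about the general technique, not a description of sysvar's covariance calculator, and all array names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_events, n_toys = 1000, 200

# Hypothetical event data: one observable, and one weight per toy and event.
x = rng.uniform(0.0, 3.0, size=n_events)
bin_edges = np.array([0.0, 1.0, 2.0, 3.0])
# Fully correlated toy model: every event in a toy shares one scale factor.
scale = rng.normal(1.0, 0.05, size=n_toys)
toy_weights = np.outer(scale, np.ones(n_events))   # shape (n_toys, n_events)

# Histogram the events once per toy.
histograms = np.array(
    [np.histogram(x, bins=bin_edges, weights=w)[0] for w in toy_weights]
)  # shape (n_toys, n_bins)

# Sample covariance between bins; rowvar=False treats columns as variables.
cov = np.cov(histograms, rowvar=False)
```

Because the toy model here scales all events together, the resulting bin-to-bin correlations are all equal to one.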
- sysvar.api.eigendecompose(df: DataFrame, settings: dict[str, Any], systematic_source: str | Path | dict, title: str | None = None, cov_matrix_path: str | Path | None = None, criterion: str = 'max_differences', prc: float = 0.0001, max_variations: int | None = None, save_variations: bool = False, save_channel_covariance_matrices: bool = False, verbose: bool = True, seed: int = 8311311)[source]#
Run an eigendecomposition workflow and return the configured EigenDecomposer.
- This is a convenience wrapper around EigenDecomposer that:
constructs an EigenDecomposer,
generates template variations,
applies the requested precision / variation limits,
selects the important eigendimensions using criterion,
optionally persists variations and/or per-channel covariance matrices.
- Parameters:
df – Input DataFrame used by the decomposer (templates / channels / yields, as expected by EigenDecomposer).
settings – Configuration dictionary consumed by EigenDecomposer (e.g. channel definitions, output paths, variables to use, etc.).
systematic_source –
Source used to build the underlying correction / systematic definition. Typically one of:
str: a correction/systematic identifier (e.g. YAML key; legacy),
Path or path-like str: a CSV file describing the correction,
dict: an in-memory correction configuration.
The exact interpretation is delegated to EigenDecomposer.
title (str | None, optional) –
Custom identifier for CSV-based corrections.
If not provided, defaults to the CSV file stem (e.g. “track_eff” for “…/track_eff.csv”). The identifier is used to match this correction to the corresponding systematic configuration in the settings dictionary.
- Example:
title=“track_eff” must match the key/name used in settings[“systematics”][…].
cov_matrix_path – Optional path to an explicit covariance matrix to use instead of building it from uncertainties. If provided, it is passed through to EigenDecomposer.
criterion – Criterion used to select “important” eigendimensions. Must be understood by EigenDecomposer.find_important_eigendimension_indices. Default is “max_differences”.
prc – Precision threshold used to determine how many eigendimensions to keep. Interpreted by the decomposer. Default is 1e-4.
max_variations – Optional hard cap on the number of variations/eigendimensions to consider (after applying the precision criterion). If None, no cap is applied.
save_variations – If True, calls EigenDecomposer.save_template_variations(). This performs file I/O to whatever output location the decomposer/settings define.
save_channel_covariance_matrices – If True, calls EigenDecomposer.save_channel_covariance_matrices(). This performs file I/O.
verbose – If True, enables verbose output/logging in EigenDecomposer.
seed – Random seed forwarded to EigenDecomposer for reproducibility.
- Returns:
The initialized decomposer instance containing the decomposition results and selected eigendimensions.
- Return type:
EigenDecomposer
Notes
This function has optional side effects (writing files) when save_variations and/or save_channel_covariance_matrices are enabled.
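For orientation, the core linear-algebra step behind an eigendecomposition of a covariance matrix can be sketched as follows. The truncation rule used here (a cut on the cumulative eigenvalue fraction) is an assumption for illustration only; the actual meaning of prc and criterion is defined by EigenDecomposer, and the helper name `eigenvariations` is hypothetical.

```python
import numpy as np

def eigenvariations(cov, prc=1e-4):
    """Return the leading eigenvariations of a symmetric covariance matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # eigh returns ascending eigenvalues; reorder to descending.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = np.clip(eigenvalues[order], 0.0, None)  # remove tiny negatives
    eigenvectors = eigenvectors[:, order]
    # Keep the smallest number of dimensions whose discarded eigenvalue
    # fraction is at most prc (assumed truncation rule).
    fraction_kept = np.cumsum(eigenvalues) / eigenvalues.sum()
    n_keep = int(np.searchsorted(fraction_kept, 1.0 - prc) + 1)
    n_keep = min(n_keep, len(eigenvalues))
    # Each column is one variation: sqrt(lambda_k) * v_k.
    return eigenvectors[:, :n_keep] * np.sqrt(eigenvalues[:n_keep])

errors = np.array([0.1, 0.2, 0.3])
cov = np.outer(errors, errors) + np.diag([1e-8, 1e-8, 1e-8])
variations = eigenvariations(cov)
reconstructed = variations @ variations.T
```

In this example the covariance is dominated by a single fully correlated component, so one eigenvariation is enough to reproduce it to within the small diagonal term.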
- sysvar.api.plot_analysis_corr_matrix(eigendecomposer_obj: EigenDecomposer, save: bool = False, filename: None | str = None) tuple[Figure, Axes][source]#
Plot the correlation matrix of an eigendecomposition analysis.
- Parameters:
eigendecomposer_obj (EigenDecomposer) – The decomposition object containing correlation data.
save (bool, optional) – If True, saves the plot to file. Defaults to False.
filename (str, optional) – Output file name if saving. Defaults to None.
- Returns:
tuple[Figure, Axes] – The figure and axes of the plot.
- sysvar.api.plot_correction_cov_and_corr(eigendecomposer_obj: EigenDecomposer, save: bool = False, filename: None | str = None) tuple[Figure, Axes][source]#
Plot correction covariance and correlation matrices.
- Parameters:
eigendecomposer_obj (EigenDecomposer) – The decomposition object containing correction data.
save (bool, optional) – If True, saves the plot. Defaults to False.
filename (str, optional) – Output file name if saving. Defaults to None.
- Returns:
tuple[Figure, Axes] – The figure and axes of the plot.
- sysvar.api.plot_correction_errors(eigendecomposer_obj: EigenDecomposer, save: bool = False, filename: None | str = None) tuple[Figure, Axes][source]#
Plot correction error comparisons.
- Parameters:
eigendecomposer_obj (EigenDecomposer) – The decomposition object containing correction information.
save (bool, optional) – If True, saves the plot. Defaults to False.
filename (str, optional) – Output file name if saving. Defaults to None.
- Returns:
tuple[Figure, Axes] – The figure and axes of the plot.
- sysvar.api.plot_correction_variations_in_grid(eigendecomposer_obj: EigenDecomposer, nbins=21, save: bool = False, filename: None | str = None) tuple[Figure, Axes][source]#
Plot correction variations in a grid layout.
- Parameters:
eigendecomposer_obj (EigenDecomposer) – The decomposition object containing correction variations.
nbins (int, optional) – Number of bins to use in the grid plot. Defaults to 21.
save (bool, optional) – If True, saves the plots. Defaults to False.
filename (str, optional) – Output file name if saving. Defaults to None.
- Returns:
tuple[Figure, Axes] – The figure and axes of the plot.
- sysvar.api.plot_cov_diff(eigendecomposer_obj: EigenDecomposer, save: bool = False, filename: None | str = None) tuple[Figure, Axes][source]#
Plot the normalized covariance difference between original and eigendecomposed covariance matrix for an initial truncation guess.
- Parameters:
eigendecomposer_obj (EigenDecomposer) – The decomposition object containing covariance data.
save (bool, optional) – If True, saves the plot. Defaults to False.
filename (str, optional) – Output file name if saving. Defaults to None.
- Returns:
tuple[Figure, Axes] – The figure and axes of the plot.
- sysvar.api.plot_templates_relative_variations_in_grid(eigendecomposer_obj: EigenDecomposer, save: bool = False, filename: None | str = None) List[tuple[Figure, Axes]][source]#
Plot relative template variations in a grid layout.
- Parameters:
eigendecomposer_obj (EigenDecomposer) – The decomposition object containing templates.
save (bool, optional) – If True, saves the plots. Defaults to False.
filename (str, optional) – Output file name if saving. Defaults to None.
- Returns:
List[tuple[Figure, Axes]] – One figure and axes pair per plot.
- sysvar.api.plot_up_and_down_variations(eigendecomposer_obj: EigenDecomposer, save: bool = False, filename: None | str = None) List[tuple[Figure, Axes]][source]#
Plot up/down variations for each template in the decomposition.
- Parameters:
eigendecomposer_obj (EigenDecomposer) – The decomposition object containing templates.
save (bool, optional) – If True, saves the plots. Defaults to False.
filename (str, optional) – Output file name if saving. Defaults to None.
- Returns:
List[tuple[Figure, Axes]] – One figure and axes pair per plot.
- sysvar.api.register_saving_info(eigendecomposer_obj: EigenDecomposer, saving_info: Dict)[source]#
Register saving information in the eigendecomposer object.
- Parameters:
eigendecomposer_obj (EigenDecomposer) – The decomposition object to update.
saving_info (Dict) – Dictionary containing saving parameters.
- Returns:
None
- sysvar.api.save_existing_eigenvariations(df: DataFrame, settings: Dict, syst_effect: str, verbose=True)[source]#
Save existing eigenvariations for a specified systematic effect.
This function complements the nominal-template saving step: nominal templates should already be present in the configured output file before calling this function.
The saver will read variations from df and write eigenvariation histograms into the same ROOT file structure expected by cabinetry/pyhf so that the model can later be built including these systematic eigenvariations. Instead of using the nominal weights for the histogram filling, the nominal weight for the syst_effect is replaced by the variations present in df. The number of variations that should be present in df is read from the settings dictionary in the systematics part.
- sysvar.api.save_nominal_templates(df: DataFrame, settings: Dict, data=None)[source]#
Save nominal templates for an MC dataset.
Write nominal templates for a Monte Carlo (MC) dataset (and optional experimental data) to the output file configured in settings. Only ROOT (.root) files are currently supported. Channels, templates, signal extraction variables, binning, and other required configuration are read from the settings dictionary. The produced file structure is compatible with cabinetry and can be used to build a pyhf model including systematic uncertainties.
The function creates or recreates the configured output file on disk and therefore will overwrite any existing file at that location. Because the file is recreated, eigenvariation histograms saved before calling this function would be lost; call this function before saving eigenvariations for systematics.
- Parameters:
df (DataFrame) – MC dataset containing template information used to build the nominal templates (e.g., event records or pre-binned contents). This object is read but not modified by the function.
settings (Dict) – Configuration dictionary. Must include the output filename (currently a .root path) and the definitions for channels, templates, signal-extraction variables and binning required to produce the histograms.
data (optional) – Experimental (observed) dataset to be histogrammed and included in the output using the same channels, variables and binning as the MC templates. If None, no observed-data histograms are written.
- Returns:
The handler object used to save the nominal templates. It can be kept for later use (e.g. to inspect the saved templates).
- Return type:
sysvar.channel_template_handler module#
sysvar.corrections module#
- class sysvar.corrections.BaseCorrection(uncertainties: 'dict' = <factory>)[source]#
Bases:
ABC, SavableAttributesObject
- add_uncertainty(unc_name, unc_values, unc_obj: Uncertainty, explicit_cov_matrix: ndarray | None = None) None[source]#
Add an uncertainty to the Correction.
- Parameters:
unc (Uncertainty) – The uncertainty to be added.
- Raises:
UncertaintyWithSameNameExists – If uncertainty with the same name has already been added to the variator.
- populate_uncertainties()[source]#
Populate the uncertainties attribute with uncertainty objects based on the provided information.
This method adds uncertainties either from a provided covariance matrix (explicitly correlated uncertainty), or from a dictionary of uncertainty types and values specified in the info attribute.
If a covariance matrix (cov_matrix) is available, it creates and adds a single ExplicitlyCorrelatedUncertainty. Otherwise, it looks up available uncertainty types, validates them, and populates the uncertainties accordingly.
- Raises:
UnknownUncertaintyType – If an unsupported uncertainty type is found in the input data.
Notes
The uncertainty type definitions are expected to be returned from get_uncertainty_types().
Each uncertainty must have a unique name.
- abstract property visual_labels#
- class sysvar.corrections.BaseCorrectionFromCSV(uncertainties: dict = <factory>, csv_path: str | None = None, title: str | None = None, cov_matrix_path: str | None = None)[source]#
Bases:
BaseCorrection
CSV-driven base class that standardizes reading corrections (central values and uncertainties) from a single, long-form CSV file.
- The CSV is expected to contain at least:
‘central_value’
- Optional uncertainty columns (if present they are auto-mapped):
‘stat_corr’, ‘sys_corr’ -> fully_correlated
‘stat_uncorr’, ‘sys_uncorr’-> uncorrelated
- Optional extra metadata columns that 1D/2D specializations will use:
‘dependant_variable’ or ‘dependant_variable_1’ and ‘dependant_variable_2’
‘{var}_unit’, ‘{var}_min’, ‘{var}_max’ for each dependant variable
- Optional extra cut columns for automatic query enhancement:
‘PDG’: PDG codes as strings in format “[521,-521]” or “[521]”
‘mcPDG’: MC truth PDG codes as strings in format “[521,-521]” or “[521]”
Note: Only string format like “[521,-521]” is supported.
- Optional explicit covariance matrix:
‘cov_matrix_path’: Path to file containing explicit covariance matrix
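To make the expected layout concrete, the sketch below parses a small long-form table with the standard library and builds one range query per row. The table contents, the helper name `build_row_query`, and the exact query syntax are hypothetical; only the column names follow the conventions listed above.

```python
import ast
import csv
import io

# Hypothetical long-form correction table with one row per bin.
CSV_TEXT = """central_value,stat_uncorr,sys_corr,dependant_variable,pt_unit,pt_min,pt_max,PDG
0.98,0.01,0.02,pt,GeV,0.0,1.0,"[521,-521]"
1.02,0.02,0.02,pt,GeV,1.0,2.0,"[521,-521]"
"""

def build_row_query(row):
    """Build a boolean query string selecting the events of one bin."""
    var = row["dependant_variable"]
    query = f"({var} >= {row[var + '_min']}) & ({var} < {row[var + '_max']})"
    if row.get("PDG"):
        # PDG codes are stored as a string such as "[521,-521]".
        pdg_codes = ast.literal_eval(row["PDG"])
        query += f" & (PDG in {pdg_codes})"
    return query

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
central_values = [float(row["central_value"]) for row in rows]
queries = [build_row_query(row) for row in rows]
```

Parsing the PDG string with `ast.literal_eval` avoids evaluating arbitrary code while still accepting the bracketed list format.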
- class sysvar.corrections.BaseCorrectionFromYaml(uncertainties: 'dict' = <factory>, systematic: 'str' = None, MC_production: 'str' = None)[source]#
Bases:
BaseCorrection
- build_table_path(suffix: None | str = None) str[source]#
Builds the path for a table file based on the given suffix.
- Parameters:
suffix (str) – The suffix to be added to the table name.
- Returns:
The full path of the table file.
- Return type:
str
- Raises:
ValueError – If the table extension is unknown.
- property table_dir: str#
Returns the directory path where the tables are stored.
- Returns:
The directory path where the tables are stored.
- Return type:
str
- property table_ext: str#
Returns the table extension by reading it from the config file.
- Returns:
The table extension.
- Return type:
str
- class sysvar.corrections.Correction1D(uncertainties: 'dict' = <factory>, systematic: 'str' = None, MC_production: 'str' = None, dependant_variable: 'str | None' = None, central_values: 'Iterable' = None, lower_bounds: 'Iterable' = None, upper_bounds: 'Iterable' = None)[source]#
Bases:
BaseCorrectionFromYaml
- central_values: Iterable = None#
- lower_bounds: Iterable = None#
- read_corrections() ndarray[source]#
Reads correction values either directly from the config or from a table file.
This method checks whether the ‘corrections’ entry in the config (self.info) is a list/array or a dictionary. If it is a list or NumPy-compatible array, the correction values are loaded directly. If it is a dictionary, the method builds the table path using self.build_table_path(), reads the table, and extracts the correction values using the key specified in self.table_key.
- Returns:
An array of correction values.
- Return type:
np.ndarray
- Raises:
KeyError – If ‘corrections’ or self.table_key is missing from the config.
FileNotFoundError – If the table file does not exist at the constructed path.
ValueError – If the table does not contain the specified key.
- upper_bounds: Iterable = None#
- class sysvar.corrections.Correction1DFromCSV(uncertainties: dict = <factory>, csv_path: str | None = None, title: str | None = None, cov_matrix_path: str | None = None, dependant_variable: str | None = None, central_values: Iterable = None, lower_bounds: Iterable = None, upper_bounds: Iterable = None, unit: str | None = None, use_equality_queries: bool = False, variable_values: Iterable = None)[source]#
Bases:
BaseCorrectionFromCSV
1D correction reader from a single CSV. Expects columns:
‘central_value’
‘dependant_variable’ or ‘dependant_variable_1’
‘{var}_unit’ (optional)
Either: ‘{var}_min’, ‘{var}_max’ for bin edges (continuous bins)
Or: column named ‘{var}’ with discrete integer values (uses equality queries)
- central_values: Iterable = None#
- get_unit() str[source]#
Return the unit associated with the dependent variable.
This method attempts to determine the unit column in the table using the following priority:
A column named “{dependent_variable}_unit”.
A column named “unit”.
The first column that ends with “_unit”.
If a matching unit column is found, the value from the first row of that column is returned. If no such column exists, an empty string is returned.
- Returns:
The unit string if found; otherwise an empty string.
- Return type:
str
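The lookup priority described for get_unit can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions, not the library code: the helper names and the dict-of-lists table are hypothetical.

```python
def find_unit_column(columns, dependent_variable):
    """Return the name of the unit column, or None if there is none."""
    preferred = f"{dependent_variable}_unit"
    if preferred in columns:          # 1. "{dependent_variable}_unit"
        return preferred
    if "unit" in columns:             # 2. a column named "unit"
        return "unit"
    for name in columns:              # 3. first column ending with "_unit"
        if name.endswith("_unit"):
            return name
    return None

def get_unit(table, dependent_variable):
    """table is a hypothetical dict mapping column name -> list of values."""
    column = find_unit_column(list(table), dependent_variable)
    return table[column][0] if column is not None else ""

table = {"central_value": [0.98, 1.02], "p_unit": ["GeV", "GeV"]}
unit = get_unit(table, "pt")        # falls back to the first "*_unit" column
no_unit = get_unit({"central_value": [1.0]}, "pt")
```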
- lower_bounds: Iterable = None#
- upper_bounds: Iterable = None#
- variable_values: Iterable = None#
- class sysvar.corrections.Correction2D(uncertainties: 'dict' = <factory>, systematic: 'str' = None, MC_production: 'str' = None)[source]#
Bases:
BaseCorrectionFromYaml
- property central_values_table: DataFrame#
- property iterator#
- populate_uncertainties()[source]#
Populate the uncertainties attribute with uncertainty objects based on the provided information.
This method adds uncertainties either from a provided covariance matrix (explicitly correlated uncertainty), or from a dictionary of uncertainty types and values specified in the info attribute.
If a covariance matrix (cov_matrix) is available, it creates and adds a single ExplicitlyCorrelatedUncertainty. Otherwise, it looks up available uncertainty types, validates them, and populates the uncertainties accordingly.
- Raises:
UnknownUncertaintyType – If an unsupported uncertainty type is found in the input data.
Notes
The uncertainty type definitions are expected to be returned from get_uncertainty_types().
Each uncertainty must have a unique name.
- property stat_error_table: DataFrame#
- property sys_error_table: DataFrame#
- class sysvar.corrections.Correction2DCategorical(uncertainties: 'dict' = <factory>, systematic: 'str' = None, MC_production: 'str' = None, categorical_variable: 'str | None' = None, continuus_variable: 'str | None' = None, central_values: 'Iterable' = None, continuus_edges: 'Iterable' = None, categorical_values: 'Iterable' = None, categorical_label: 'str' = None)[source]#
Bases:
BaseCorrectionFromYaml
- categorical_values: Iterable = None#
- central_values: Iterable = None#
- continuus_edges: Iterable = None#
- property iterator#
- property queries#
- class sysvar.corrections.Correction2DFromCSV(uncertainties: dict = <factory>, csv_path: str | None = None, title: str | None = None, cov_matrix_path: str | None = None, dependant_variable_1: str | None = None, dependant_variable_2: str | None = None, unit_1: str | None = None, unit_2: str | None = None, central_values: Iterable = None)[source]#
Bases:
BaseCorrectionFromCSV
2D correction reader from a single CSV. Expects columns:
‘central_value’
‘dependant_variable_1’, ‘dependant_variable_2’
‘{var1}_unit’, ‘{var2}_unit’
‘{var1}_min’,’{var1}_max’,’{var2}_min’,’{var2}_max’
- central_values: Iterable = None#
- property iterator#
- class sysvar.corrections.Correction3DFromCSV(uncertainties: dict = <factory>, csv_path: str | None = None, title: str | None = None, cov_matrix_path: str | None = None, dependant_variable_1: str | None = None, dependant_variable_2: str | None = None, dependant_variable_3: str | None = None, unit_1: str | None = None, unit_2: str | None = None, unit_3: str | None = None, central_values: Iterable = None)[source]#
Bases:
BaseCorrectionFromCSV
3D correction reader from a single CSV. Expects columns:
‘central_value’
‘dependant_variable_1’, ‘dependant_variable_2’, ‘dependant_variable_3’
‘{var1}_unit’, ‘{var2}_unit’, ‘{var3}_unit’
‘{var1}_min’,’{var1}_max’,’{var2}_min’,’{var2}_max’,’{var3}_min’,’{var3}_max’
- central_values: Iterable = None#
- property iterator#
- class sysvar.corrections.CorrectionBF(uncertainties: 'dict' = <factory>, systematic: 'str' = None, MC_production: 'str' = None, dependant_variable: 'str | None' = None, central_values: 'Iterable' = None, visual_labels: 'Iterable' = None)[source]#
Bases:
BaseCorrectionFromYaml
- central_values: Iterable = None#
- populate_uncertainties(error_amplitudes: list)[source]#
Overrides the base-class method, because the error amplitudes are calculated dynamically by the calculate_scaling_ratios method.
- visual_labels: Iterable = None#
- class sysvar.corrections.CorrectionPID(uncertainties: 'dict' = <factory>, systematic: 'str' = None, MC_production: 'str' = None)[source]#
Bases:
BaseCorrectionFromYaml
- property iterator#
- property queries#
- class sysvar.corrections.CorrectionTableFinder(particle_species, online_cut, base_table_path, variable, MC_production)[source]#
Bases:
object
Factory method class to get correction tables for kaons, pions, electrons and muons.
- build_table_name(table_ids: str) list[source]#
Builds the efficiency and fake rate tables path names
Args: table_ids: efficiency or fake table id
Returns: list with the efficiency or fake rate table file names
- get_cut_type() str[source]#
Reads the YAML configuration file and extracts the cut type that has been applied in the online reconstruction.
Args: species: Particle species, should be K+ or pi+
Returns: the cut type that has been applied online. Binary or global
- get_cut_value() str[source]#
Reads the YAML configuration file and extracts the cut type that has been applied in the online reconstruction.
Args: species: Particle species, should be K+ or pi+
Returns: the cut type that has been applied online. Binary or global
- static make_pidvar_compatible(MC_production: str, table: DataFrame, max_uncertainty: float | None = 100.0)[source]#
Convert the pandas dataframe obtained from the Hadron ID (HID) CSV tables into a format consistent with the lepton ID tables, which PIDvar understands.
In particular, convert the charge column from 1/-1 integer entries to +/- string entries and calculate the theta_min/theta_max columns.
- Parameters:
table – Pandas dataframe obtained from pandas.read_csv on the HID table.
inplace – Whether to modify the existing dataframe in place. Otherwise, a copy of the existing dataframe will be returned.
max_uncertainty – Drop rows in HID tables where any of the data-MC uncertainties (sys/stat up/down) exceed this value. Rationale: the HID tables contain rows with nonsense uncertainties > 10⁸, and this cut is meant to remove those entries, so the exact value is not important. Set to None to disable dropping any rows.
- Returns:
Modified dataframe that can be used by PIDvar.
- production_table_ids()[source]#
Updates hadron table IDs based on the MC production type.
- Raises:
ValueError – If an unsupported MC production type is provided.
- class sysvar.corrections.CustomCorrection(uncertainties: 'dict' = <factory>, info: 'InitVar[dict]' = None, dependant_variable: 'str | None' = None, central_values: 'Iterable' = None, query_targets: 'Iterable' = None)[source]#
Bases:
BaseCorrection
- central_values: Iterable = None#
- query_targets: Iterable = None#
- sysvar.corrections.create_correction_object(correction_source: str | Path | dict, MC_production: str | None = None, title: str | None = None, cov_matrix_path: str | None = None) BaseCorrection[source]#
Retrieves and creates the appropriate correction object based on the systematic effect and MC production type.
- Parameters:
correction_source (str | Path | dict) – The systematic effect identifier for YAML-based corrections, or CSV file path for CSV-based corrections.
MC_production (str, optional) – The Monte Carlo production type identifier. Required for YAML-based corrections.
title (str, optional) – Title for CSV-based corrections. If not provided, will use filename.
cov_matrix_path (str, optional) – Path to explicit covariance matrix file for CSV-based corrections.
- Returns:
The appropriate correction object based on the provided systematic effect and MC production type.
- Return type:
BaseCorrection
- Raises:
NotImplementedError – If the correction type specified in the configuration is not implemented.
ValueError – If invalid combination of arguments is provided.
Example
>>> # YAML-based correction
>>> correction = create_correction_object("syst1", "MC1")
>>> isinstance(correction, BaseCorrection)
True
>>> # CSV-based correction
>>> correction = create_correction_object("path/to/file.csv")
>>> isinstance(correction, BaseCorrection)
True
>>> # CSV-based correction with explicit covariance matrix
>>> correction = create_correction_object("path/to/file.csv", cov_matrix_path="path/to/cov.txt")
>>> isinstance(correction, BaseCorrection)
True
sysvar.covariance_calculator module#
sysvar.eigendecomposer module#
- class sysvar.eigendecomposer.EigenDecomposer(df: DataFrame, settings: dict, systematic_source: str | dict | Path | None = None, cov_matrix_path: str | None = None, title: str | None = None, verbose: bool = True, seed: int = 8311311)[source]#
Bases:
ChannelTemplateHandler
- class sysvar.eigendecomposer.ExistingEigenVariationsSaver(df: DataFrame, settings: dict, verbose: bool = True)[source]#
Bases:
ChannelTemplateHandler
sysvar.fit_setup module#
sysvar.saver module#
- class sysvar.saver.PlotSaver(object_to_save, filename, saving_info)[source]#
Bases:
Saver
Saves plot objects (e.g., figures) with additional handling for file extensions and logging.
Inherits from the Saver class and provides specific functionality for saving plot objects.
- Parameters:
- add_default_extensions()[source]#
Ensures default file extensions (png and pdf) are added if none are provided.
- static get_key_description(key: str) str[source]#
Provides a description of the specified saving info key.
- Parameters:
key (str) – The key for which the description is requested.
- Returns:
The description of the key.
- Return type:
str
Example
>>> PlotSaver.get_key_description('top_dir')
'The top directory that your objects should be saved in'
- log_missing_optional_field(key: str)[source]#
Logs a warning if an optional field is missing.
- Parameters:
key (str) – The missing optional field key.
- raise_missing_mandatory_field(key: str)[source]#
Raises an error if a mandatory field is missing.
- Parameters:
key (str) – The missing mandatory field key.
- Raises:
MissingMandatorySavingInfo – If the mandatory field is not found in saving_info.
- class sysvar.saver.Saver(object_to_save, filename, saving_info)[source]#
Bases:
ABC
Abstract base class for saving objects with mandatory and optional saving information.
- Parameters:
- object_to_save#
The object that needs to be saved.
- abstract log_missing_optional_field(key)[source]#
Abstract method for logging missing optional fields.
- populate_mandatory_fields(saving_info: dict)[source]#
Populates the mandatory fields from the saving information.
- Parameters:
saving_info (dict) – The dictionary containing saving information.
- Raises:
Whatever exception raise_missing_mandatory_field raises, if a mandatory field is missing.
- populate_optional_fields(saving_info: dict)[source]#
Populates the optional fields from the saving information.
- Parameters:
saving_info (dict) – The dictionary containing saving information.
Logs missing optional fields using the log_missing_optional_field method.
sysvar.templates module#
- class sysvar.templates.Template(df: DataFrame, binning: dict, total_weight: str, syst_weight: None | str = None, prefices: None | str | list = None, correction: None | BaseCorrection = None, variator: None | Variator = None, verbose: bool = True)[source]#
Bases:
ABC, SavableAttributesObject
- collect_weights(index)[source]#
Collects weights based on the provided index, handling different variations.
- Parameters:
index – Specifies the type of weight to collect. Can be None, a string such as “MC”, “up”, “down”, or variations like “up0”, “up1”, …, “up8”, “down0”, “down1”, …, “down8”, or an integer.
- Returns:
An array of weights based on the specified index.
- Return type:
- Raises:
NotImplementedError – If the provided index is not supported.
Notes
If index is None, returns the nominal total weight.
If index is “MC”, returns the square of the total weight.
For “up” and “down” variations, computes the weights with specific adjustments.
Handles both string and dictionary types for self.syst_weight.
Example
>>> self.collect_weights(None)
array([...])
>>> self.collect_weights("MC")
array([...])
>>> self.collect_weights("up1")
array([...])
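The accepted index values listed above can be told apart with a small parser. The following is only a sketch of such a dispatch step: the function name `parse_index` and the returned labels are hypothetical, and what collect_weights actually does with each case is not shown here.

```python
import re

def parse_index(index):
    """Split a collect_weights-style index into (kind, variation_number)."""
    if index is None:
        return ("nominal", None)
    if isinstance(index, int):
        return ("integer", index)
    if index == "MC":
        return ("MC", None)
    # "up"/"down", optionally followed by a variation number such as "up3".
    match = re.fullmatch(r"(up|down)(\d*)", index)
    if match is None:
        raise NotImplementedError(f"Unsupported index: {index!r}")
    direction, number = match.groups()
    return (direction, int(number) if number else None)

parsed = [parse_index(i) for i in (None, "MC", "up", "down8", 2)]
```

Raising NotImplementedError for anything else mirrors the documented behaviour for unsupported indices.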
- property correction: BaseCorrection#
- plot_relative_variations_in_grid(title: str = '', save: bool = False, filename: str = '') tuple[Figure, Axes][source]#
- class sysvar.templates.Template1D(df: DataFrame, binning: dict, total_weight: str, syst_weight: str = None, prefices: str | list = None, correction: None | BaseCorrection = None, variator: None | Variator = None, verbose: bool = True)[source]#
Bases:
Template
sysvar.uncertainties module#
Bases:
Uncertainty
Build the correlation matrix of the explicitly correlated uncertainty.
- Returns:
The correlation matrix of the uncertainty.
- Return type:
np.ndarray
Build the covariance matrix of the explicitly correlated uncertainty.
- Returns:
The covariance matrix of the uncertainty.
- Return type:
np.ndarray
Bases:
Uncertainty
Represents a fully correlated uncertainty with a name and error values.
This class inherits from the Uncertainty base class and implements a fully correlated uncertainty with a correlation matrix of ones.
- Parameters:
name (str) – The name of the uncertainty.
errors (Iterable) – An iterable containing the error values.
The name of the uncertainty.
- Type:
An array containing the error values.
- Type:
np.ndarray
The covariance matrix of the uncertainty.
- Type:
np.ndarray
The correlation matrix of the uncertainty.
- Type:
np.ndarray
Build the covariance matrix of the fully correlated uncertainty.
- Returns:
The covariance matrix of the uncertainty.
- Return type:
np.ndarray
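For a correlation matrix of ones, the covariance reduces to the outer product of the error vector with itself. A minimal NumPy sketch, assuming the error values are standard deviations (the numbers are illustrative and this is not sysvar's own code):

```python
import numpy as np

# Illustrative error values, assumed to be 1-sigma uncertainties.
errors = np.array([0.1, 0.2, 0.3])

# Fully correlated: every entry of the correlation matrix is one.
corr = np.ones((errors.size, errors.size))

# cov_ij = corr_ij * sigma_i * sigma_j
cov = corr * np.outer(errors, errors)
```

The diagonal of cov holds the squared errors, and every off-diagonal element is the product of the two corresponding errors.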
Bases:
Uncertainty
Represents an uncertainty that is fully correlated only in parts, with a name and error values.
This class inherits from the Uncertainty base class and implements an uncertainty whose correlation matrix is full of ones in some regions and completely uncorrelated in other regions.
- Parameters:
name (str) – The name of the uncertainty.
errors (Iterable) – An iterable containing the error values.
dimensions – The dimensions of the correlated/uncorrelated parts.
The name of the uncertainty.
- Type:
An array containing the error values.
- Type:
np.ndarray
The covariance matrix of the uncertainty.
- Type:
np.ndarray
The correlation matrix of the uncertainty.
- Type:
np.ndarray
Build the covariance matrix of the fully correlated uncertainty.
- Returns:
The covariance matrix of the uncertainty.
- Return type:
np.ndarray
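One possible reading of "fully correlated only in parts" is a block structure: ones inside each block and zeros between blocks. The sketch below follows that reading. The helper block_correlation and its block_sizes argument are hypothetical, and the way sysvar interprets its dimensions argument may differ.

```python
import numpy as np


def block_correlation(block_sizes):
    """Correlation matrix with ones inside each block, zeros elsewhere."""
    n = sum(block_sizes)
    corr = np.zeros((n, n))
    start = 0
    for size in block_sizes:
        stop = start + size
        corr[start:stop, start:stop] = 1.0
        start = stop
    return corr


# Illustrative errors, assumed to be standard deviations.
errors = np.array([0.1, 0.2, 0.3, 0.4])
corr = block_correlation([2, 2])  # two blocks of two bins each
cov = corr * np.outer(errors, errors)
```

Bins 0 and 1 are fully correlated with each other, as are bins 2 and 3, while the two groups share no covariance.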
- class sysvar.uncertainties.Uncertainty(name: str, errors: ndarray, string_boundaries: List)[source]#
Bases:
ABC, SavableAttributesObject
Abstract base class for representing uncertainties with a name and error values.
- Parameters:
name (str) – The name of the uncertainty.
errors (np.ndarray) – An array containing the error values.
- errors#
An array containing the error values.
- Type:
np.ndarray
- cov_matrix#
The covariance matrix of the uncertainty.
- Type:
np.ndarray
Bases:
Uncertainty
Represents an uncorrelated uncertainty with a name and error values.
This class inherits from the Uncertainty base class and implements an uncorrelated uncertainty whose correlation matrix is the identity matrix.
- Parameters:
name (str) – The name of the uncertainty.
errors (Iterable) – An iterable containing the error values.
The name of the uncertainty.
- Type:
An array containing the error values.
- Type:
np.ndarray
The covariance matrix of the uncertainty.
- Type:
np.ndarray
The correlation matrix of the uncertainty.
- Type:
np.ndarray
Build the covariance matrix of the uncorrelated uncertainty.
- Returns:
The covariance matrix of the uncertainty.
- Return type:
np.ndarray
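With an identity correlation matrix, the covariance is diagonal and carries only the squared errors. A short NumPy sketch, assuming the errors are standard deviations (illustrative values, not sysvar's own code):

```python
import numpy as np

# Illustrative error values, assumed to be 1-sigma uncertainties.
errors = np.array([0.1, 0.2, 0.3])

# Uncorrelated: identity correlation matrix, so only variances survive.
corr = np.eye(errors.size)
cov = corr * np.outer(errors, errors)  # same as np.diag(errors ** 2)
```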
sysvar.utils module#
- class sysvar.utils.SavableAttributesObject[source]#
Bases:
object
A class for managing objects with savable attributes.
This class provides a mechanism to register and store saving information for various attributes.
- register_saving_info(saving_info: dict)[source]#
Registers the saving information for the object’s attributes.
- Parameters:
saving_info (dict) – A dictionary containing information about how attributes should be saved.
Example
>>> obj = SavableAttributesObject()
>>> obj.register_saving_info({'attribute_name': 'save_path'})
- sysvar.utils.corr2cov(corr, var)[source]#
Calculates the covariance matrix from a given correlation matrix and a variance vector.
- Parameters:
corr (np.ndarray) – Correlation matrix of shape (n,n).
var (np.ndarray) – Variance vector of shape (n,).
- Returns:
out – Covariance matrix. Shape is (n,n).
- Return type:
np.ndarray
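The standard relation behind this conversion is cov_ij = corr_ij * sqrt(var_i) * sqrt(var_j). The sketch below illustrates that relation with NumPy. The function name corr_to_cov is hypothetical, and this is not necessarily how sysvar.utils.corr2cov is implemented.

```python
import numpy as np


def corr_to_cov(corr, var):
    """Scale a correlation matrix by the standard deviations sqrt(var)."""
    std = np.sqrt(np.asarray(var, dtype=float))
    return np.asarray(corr, dtype=float) * np.outer(std, std)


corr = np.array([[1.0, 0.5],
                 [0.5, 1.0]])
var = np.array([4.0, 9.0])   # variances, i.e. standard deviations 2 and 3
cov = corr_to_cov(corr, var)
```

Here the off-diagonal covariance is 0.5 * 2 * 3 = 3, and the diagonal reproduces the input variances.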
- sysvar.utils.cov2corr(covariance)[source]#
Compute the correlation matrix from the given covariance matrix.
- Parameters:
covariance (numpy.ndarray) – The covariance matrix.
- Returns:
The correlation matrix.
- Return type:
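A correlation matrix is usually obtained by dividing each covariance element by the product of the two corresponding standard deviations. A minimal sketch of that normalization follows. The name cov_to_corr is hypothetical, and sysvar.utils.cov2corr may handle edge cases such as zero variances differently.

```python
import numpy as np


def cov_to_corr(cov):
    """Normalize a covariance matrix to unit diagonal."""
    cov = np.asarray(cov, dtype=float)
    std = np.sqrt(np.diag(cov))
    # A zero variance would lead to a division by zero here.
    return cov / np.outer(std, std)


cov = np.array([[4.0, 3.0],
                [3.0, 9.0]])
corr = cov_to_corr(cov)
```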
- sysvar.utils.load_covariance_matrix(config: dict, *, key: str = 'cov_matrix') ndarray[source]#
Load a covariance matrix from a config dict.
- Contract:
config MUST contain key (default: ‘cov_matrix’)
If config[key] is None: raise ValueError
If config[key] is a path (str/Path): load from file
If config[key] is array-like (list/tuple/np.ndarray): return np.ndarray
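The contract above can be sketched as follows. This is an illustration only: load_cov_sketch is a hypothetical name, the KeyError raised for a missing key and the plain-text file format read with np.loadtxt are assumptions, and the actual function may support other formats.

```python
from pathlib import Path

import numpy as np


def load_cov_sketch(config, *, key="cov_matrix"):
    """Illustrative reading of the contract; not sysvar's implementation."""
    if key not in config:
        # Assumption: the exception type for a missing key is not documented.
        raise KeyError(f"config must contain '{key}'")
    value = config[key]
    if value is None:
        raise ValueError(f"config['{key}'] is None")
    if isinstance(value, (str, Path)):
        # Assumption: the file is a whitespace-delimited text matrix.
        return np.loadtxt(value)
    # Array-like input (list, tuple, np.ndarray) is converted directly.
    return np.asarray(value, dtype=float)


cov = load_cov_sketch({"cov_matrix": [[1.0, 0.0], [0.0, 4.0]]})
```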
- sysvar.utils.read_yaml(cfg_name: str, deeper_dir: str = '')[source]#
Reads a YAML file from the configuration directory of the repository and returns a dictionary with the YAML data.
- Parameters:
cfg_name (str) – Name of the YAML file located in the configs directory.
deeper_dir (str) – Optional subdirectory within the configs directory.
- Returns:
Dictionary read from the YAML file.
sysvar.variations module#
- class sysvar.variations.Variator(correction: BaseCorrection, Nvar: int = 20, seed: int = 8311311)[source]#
Bases:
ABC, SavableAttributesObject
Abstract base class for generating variations on a correction. The variator builds one large covariance matrix from all the uncertainties. The get_variations_from_uncertainty method can be used to examine the effect of the variations coming from a specific uncertainty.
- Parameters:
correction (BaseCorrection) – The correction object.
Nvar (int) – The number of variations to be generated. Defaults to 20.
- correction#
The correction object.
- Type:
- property cov_matrix: ndarray#
Build the total covariance matrix from all added uncertainties. The covariance matrices of all uncertainties are summed into one total covariance matrix. If only one uncertainty type is present, its covariance matrix is returned.
- Returns:
The total covariance matrix.
- Return type:
np.ndarray
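Summing covariance matrices of independent sources can be sketched with NumPy as below. The two sources and their values are purely illustrative and are not taken from sysvar.

```python
import numpy as np

# Uncorrelated source: diagonal covariance from per-bin errors.
stat = np.diag(np.array([0.1, 0.2]) ** 2)

# Fully correlated source: outer product of its per-bin errors.
syst = np.outer([0.05, 0.05], [0.05, 0.05])

# Element-wise sum over all sources gives the total covariance.
total = np.sum([stat, syst], axis=0)
```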
- generate_variations(Nvar: int, covariance: ndarray, seed: int = 8311311) ndarray[source]#
Generate variations based on a covariance matrix. The variations are drawn from a multivariate normal distribution.
- Parameters:
Nvar (int) – The number of variations to generate.
covariance (np.ndarray) – The covariance matrix.
- Returns:
An array of variations.
- Return type:
np.ndarray
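Drawing variations from a zero-mean multivariate normal can be sketched as follows. The function name draw_variations and the use of np.random.default_rng are assumptions made for illustration. The random number generator that sysvar uses internally is not stated here.

```python
import numpy as np


def draw_variations(nvar, covariance, seed=8311311):
    """Draw nvar zero-mean samples with the given covariance."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(covariance.shape[0])
    return rng.multivariate_normal(mean, covariance, size=nvar)


cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])
variations = draw_variations(5, cov)  # one row per variation
```

Fixing the seed makes repeated calls return identical variations, which keeps downstream comparisons reproducible.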
- get_correction_variations(seed: int = 8311311) ndarray[source]#
Get variations on the correction. This adds the generated variations from the total covariance matrix to the nominal values of the correction.
- Parameters:
Nvar (int) – The number of variations to generate.
- Returns:
An array of correction variations.
- Return type:
np.ndarray
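Adding generated variations to the nominal correction values relies on NumPy broadcasting. A minimal sketch with illustrative numbers (the variable names are hypothetical and not part of sysvar):

```python
import numpy as np

# Illustrative nominal correction values and their covariance.
nominal = np.array([1.00, 0.98, 1.02])
cov = np.diag(np.array([0.01, 0.02, 0.03]) ** 2)

# Zero-mean shifts, one row per variation.
rng = np.random.default_rng(8311311)
shifts = rng.multivariate_normal(np.zeros(nominal.size), cov, size=4)

# Broadcasting adds the nominal values to every row of shifts.
varied = nominal + shifts
```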
- get_variations_from_uncertainty(Nvar: int, name: str, seed: int = 8311311) ndarray[source]#
Helper function to inspect variations coming from a single source of uncertainty. This creates new variations, which is likely acceptable because the aim is to examine the sources qualitatively.
sysvar.visualize module#
- class sysvar.visualize.CorrectionVisualizer(instance: BaseCorrection)[source]#
Bases:
Visualizer
- class sysvar.visualize.EigenDecomposerVisualizer(instance: EigenDecomposer)[source]#
Bases:
Visualizer
- plot_corr_matrix(fig: plt.Figure | None = None, ax: plt.Axes | None = None, save: bool = False, filename: str = '') tuple[plt.Figure, plt.Axes][source]#
- class sysvar.visualize.TemplateVisualizer(instance: Template)[source]#
Bases:
Visualizer
- plot_corr_matrix(fig: plt.Figure | None = None, ax: plt.Axes | None = None, save: bool = False, filename: str = '') tuple[plt.Figure, plt.Axes][source]#
- plot_cov_matrix(fig: plt.Figure | None = None, ax: plt.Axes | None = None, save: bool = False, filename: str = '') tuple[plt.Figure, plt.Axes][source]#
- plot_eigenvalues(fig: plt.Figure | None = None, ax: plt.Axes | None = None, save: bool = False, filename: str = '') tuple[plt.Figure, plt.Axes][source]#
- plot_nominal_template(fig: plt.Figure | None = None, ax: plt.Axes | None = None, save: bool = False, filename: str = '') tuple[plt.Figure, plt.Axes][source]#
- plot_relative_variations_in_grid(fig: plt.Figure | None = None, ax: plt.Axes | None = None, title: str = '', nbins: int = 11, save: bool = False, filename: str = '') tuple[plt.Figure, plt.Axes][source]#
- class sysvar.visualize.UncertaintyVisualizer(instance: Uncertainty)[source]#
Bases:
Visualizer
- class sysvar.visualize.VariatorVisualizer(instance: Variator)[source]#
Bases:
Visualizer
- plot_corr_matrix(fig: plt.Figure | None = None, ax: plt.Axes | None = None, save: bool = False, filename: str = '') tuple[plt.Figure, plt.Axes][source]#
- plot_cov_matrix(fig: plt.Figure | None = None, ax: plt.Axes | None = None, save: bool = False, filename: str = '') tuple[plt.Figure, plt.Axes][source]#
- plot_gaussian_variations(save: bool = False, filename: str = '')[source]#
Plot Gaussian variations of the corrections.
- Returns:
None
- plot_relative_variations(value_edges: Iterable, Nvar: int = 5, save: bool = False, filename: str = '') tuple[Figure, Axes][source]#
Plots the relative variations of the templates. The Nvar argument specifies the number of variations that will be plotted. Defaults to 5.
- Parameters:
Nvar (int, optional) – The number of variations to visualize.
- Returns:
A tuple containing the figure and axis objects.
- Return type:
Tuple[plt.Figure, plt.Axes]
- class sysvar.visualize.Visualizer(instance: BaseCorrection)[source]#
Bases:
ABC
- plot_cov_and_corr(fig: plt.Figure | None = None, ax: plt.Axes | None = None, save: bool = False, filename: str = '')[source]#
- static plot_variation_on_axis(ax: plt.Axes, x: np.ndarray, variation: np.ndarray, index: None | int = None, plot_func: str = 'step', save: bool = False, filename: str = '')[source]#
Plots a variation on a given axis. If no value is given for the index argument, the variation is treated as the nominal template. The function chooses the labels, colors and linestyle based on the value of the index.
- Parameters:
ax (plt.Axes) – The axis to plot on.
x (np.ndarray) – The x values for the plot. These should be the bin midpoints, because the method uses matplotlib’s step plotting.
variation (np.ndarray) – The variation values to plot.
index (int, None) – The index of the variation (None for nominal).
plot_func (str) – Name of the matplotlib.pyplot function to use for the plot.
- Returns:
None