Postprocessing and reporting

Warning

message_ix.reporting is experimental in message_ix 1.2 and only supports Python 3. The API and functionality may change without advance notice or a deprecation period in subsequent releases.

The ix modeling platform provides powerful features to perform calculations and other postprocessing after a message_ix.Scenario has been solved by the associated model. The MESSAGEix framework uses these features to provide zero-configuration reporting of models built on the framework.

These features are accessible through Reporter, which can produce multiple reports from one or more Scenarios. A report is identified by a key (usually a string), and may…

  • perform arbitrarily complex calculations while intelligently handling units;
  • read and make use of data that is ‘exogenous’ to (not included in) a Scenario;
  • produce output as Python or R objects (in code), or to files or databases;
  • calculate only a requested subset of quantities; and
  • much, much more!

Terminology

ixmp.reporting handles numerical quantities, which are scalar (0-dimensional) or array (1 or more dimensions) data with optional associated units. ixmp parameters, scalars, equations, and time-series data all become quantities for the purpose of reporting.

Every quantity and report is identified by a key, which is a str or other hashable object. Special keys are used for multidimensional quantities. For instance: the MESSAGEix parameter resource_cost, defined with the dimensions (node n, commodity c, grade g, year y) is identified by the key 'resource_cost:n-c-g-y'. When summed across the grade/g dimension, it has dimensions n, c, y and is identified by the key 'resource_cost:n-c-y'.
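The relationship between the key of a full quantity and the key of a partial sum can be sketched with plain strings. The helper `drop_dims` below is hypothetical and not part of ixmp, which uses Key objects instead; it only illustrates the naming convention described above.

```python
def drop_dims(key, *dims):
    """Return the key string for `key` after summing over `dims`.

    Hypothetical helper for illustration only; not an ixmp function.
    """
    name, _, dim_str = key.partition(':')
    remaining = [d for d in dim_str.split('-') if d not in dims]
    return name + ':' + '-'.join(remaining)


full = 'resource_cost:n-c-g-y'
summed = drop_dims(full, 'g')
print(summed)  # resource_cost:n-c-y
```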

Non-model [1] quantities and reports are produced by computations, which are atomic tasks that build on other computations. The most basic computations—for instance, resource_cost:n-c-g-y—simply retrieve raw/unprocessed data from a message_ix.Scenario and return it as a Quantity. Advanced computations can depend on many quantities, and/or combine quantities together into a structure like a document or spreadsheet. Computations are defined in ixmp.reporting.computations and message_ix.reporting.computations, but most common computations can be added using the methods of Reporter.

[1] i.e. quantities that do not exist within the mathematical formulation of the model itself, and do not affect its solution.

Basic usage

A basic reporting workflow has the following steps:

  1. Obtain a message_ix.Scenario object from an ixmp.Platform.
  2. Use from_scenario() to create a Reporter object.
  3. (optionally) Use Reporter built-in methods or advanced features to add computations to the reporter.
  4. Use get() to retrieve the results (or trigger the effects) of one or more computations.

>>> from ixmp import Platform
>>> from message_ix import Scenario, Reporter
>>>
>>> mp = Platform()
>>> # 'model name' and 'scenario name' are placeholders for an existing, solved scenario
>>> scen = Scenario(mp, 'model name', 'scenario name')
>>> rep = Reporter.from_scenario(scen)
>>> rep.get('all')

Note

Reporter stores defined computations, but these are not executed until get() is called—or the results of one computation are required by another. This allows the Reporter to skip unneeded (and potentially slow) computations. A Reporter may contain computations for thousands of model quantities and derived quantities, but a call to get() may only execute a few of these.
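The deferred execution described in this note can be illustrated with a simplified, dask-style graph in pure Python. This is a sketch of the general idea only, not the Reporter implementation; the names `graph`, `make_task`, `get` and `executed` are invented for the example.

```python
executed = []  # names of the computations that actually ran


def make_task(name, func):
    """Wrap `func` so that each execution is recorded in `executed`."""
    def wrapper(*args):
        executed.append(name)
        return func(*args)
    return wrapper


# key -> (callable, *keys of inputs); a simplified dask-style graph
graph = {
    'a': (make_task('a', lambda: 2),),
    'b': (make_task('b', lambda: 3),),
    'slow': (make_task('slow', lambda: sum(range(10 ** 6))),),
    'a*b': (make_task('a*b', lambda x, y: x * y), 'a', 'b'),
}


def get(key):
    """Compute `key`, recursively computing only its dependencies."""
    func, *inputs = graph[key]
    return func(*[get(k) for k in inputs])


result = get('a*b')
print(result)    # 6
print(executed)  # ['a', 'b', 'a*b']
```

The 'slow' computation is stored in the graph but never runs, because nothing requested depends on it.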

Customization

A Reporter prepared with from_scenario() always contains a key scenario, referring to the Scenario to be reported.

The method Reporter.add() can be used to add arbitrary Python code that operates directly on the Scenario object:

>>> def my_custom_report(scenario):
...     """Function with custom code that manipulates the *scenario*."""
...     print('foo')
...
>>> rep.add('custom', (my_custom_report, 'scenario'))
>>> rep.get('custom')
foo

In this example, the function my_custom_report() could run to thousands of lines; read to and write from multiple files; invoke other programs or Python scripts; etc.

To take advantage of the performance-optimizing features of the Reporter, however, such calculations can instead be composed from atomic (i.e. small, indivisible) computations.

Reporters

class message_ix.reporting.Reporter(**kwargs)

Bases: ixmp.reporting.Reporter

MESSAGEix Reporter.

classmethod from_scenario(scenario, **kwargs)

Create a Reporter by introspecting scenario.

Returns: A reporter for scenario.
Return type: message_ix.reporting.Reporter

In addition to the keys automatically added by ixmp.reporting.Reporter.from_scenario(), keys are added for derived quantities specific to the MESSAGEix framework, as defined in PRODUCTS and DERIVED.

  • out: the product of output (output efficiency) and ACT (activity);
  • out_hist: output × ref_activity (historical reference activity);
  • in: input × ACT;
  • in_hist: input × ref_activity;
  • emi: emission_factor × ACT;
  • emi_hist: emission_factor × ref_activity;
  • inv: inv_cost × CAP_NEW;
  • inv_hist: inv_cost × ref_new_capacity;
  • fom: fix_cost × CAP;
  • fom_hist: fix_cost × ref_capacity;
  • vom: var_cost × ACT;
  • vom_hist: var_cost × ref_activity; and
  • tom: fom + vom.
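To illustrate what a product such as out = output × ACT involves, the following sketch multiplies two labelled quantities stored as plain dicts. It is not the ixmp implementation; the function `product`, the data values and the technology labels are all invented for the example, and real quantities carry more dimensions and units.

```python
def product(a_dims, a, b_dims, b):
    """Multiply two labelled quantities, aligning on shared dimensions.

    Each quantity is a dict mapping a tuple of labels (one per dimension)
    to a value. Returns (dims, data) for the result.
    """
    # Union of dimensions, ordered by first appearance
    dims = list(a_dims) + [d for d in b_dims if d not in a_dims]
    shared = [d for d in a_dims if d in b_dims]
    out = {}
    for a_idx, a_val in a.items():
        a_lab = dict(zip(a_dims, a_idx))
        for b_idx, b_val in b.items():
            b_lab = dict(zip(b_dims, b_idx))
            # Only combine entries whose labels agree on shared dimensions
            if all(a_lab[d] == b_lab[d] for d in shared):
                labels = {**a_lab, **b_lab}
                out[tuple(labels[d] for d in dims)] = a_val * b_val
    return dims, out


# Invented data: output efficiency by (t, c) and activity by (t, ya)
output = {('coal_ppl', 'electr'): 0.4, ('gas_ppl', 'electr'): 0.5}
ACT = {('coal_ppl', 2020): 10.0, ('gas_ppl', 2020): 20.0}

dims, out = product(['t', 'c'], output, ['t', 'ya'], ACT)
print(dims)  # ['t', 'c', 'ya']
print(out)
```

The result has the union of the dimensions of its inputs, which is why the derived quantities are usually requested through their full keys.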

Tip

Use full_key() to retrieve the full-dimensionality Key for these quantities.

Other added keys include:

  • <name>:pyam for the above quantities, plus:
    • cap:pyam (from CAP)
    • new_cap:pyam (from CAP_NEW)

…according to PYAM_CONVERT.

  • Standard reports according to REPORTS.
  • The report message:default, collecting all of the above reports.

as_pyam(quantities, year_time_dim, key=None, drop={}, collapse=None)

Add conversion of quantities to pyam.IamDataFrame.

Parameters:
  • quantities (str or Key or list of (str, Key)) – Quantities to transform to pyam format.
  • year_time_dim (str) – Label of the dimension used for the year or time column of the pyam.IamDataFrame. The column is labelled “Time” if year_time_dim is h, otherwise “Year”.
  • drop (iterable of str, optional) – Labels of additional dimensions to drop from the resulting data frame. Dimensions h, y, ya, yr, and yv (except for the one named by year_time_dim) are automatically dropped.
  • collapse (callable, optional) – Callback to handle additional dimensions of the data frame.
Returns:

Keys for the reporting targets that create the IamDataFrames corresponding to quantities. The keys have the added tag ‘iamc’.

Return type:

list of Key

The IAMC data format includes columns named ‘Model’, ‘Scenario’, ‘Region’, ‘Variable’, ‘Unit’; one of ‘Year’ or ‘Time’; and ‘value’.

Using as_pyam():

  • ‘Model’ and ‘Scenario’ are populated from the attributes of the Scenario identified by the key scenario;
  • ‘Variable’ contains the name(s) of the quantities;
  • ‘Unit’ contains the units associated with the quantities; and
  • ‘Year’ or ‘Time’ is created according to year_time_dim.

Additional dimensions of quantities pass through as_pyam() and appear as additional columns in the resulting IamDataFrame. While this is valid IAMC data, as_pyam() also supports dropping additional columns (with drop), and a custom callback (collapse) that can be used to manipulate values along other dimensions.

For example, here the values for the MESSAGEix technology and mode dimensions are appended to the ‘Variable’ column:

def m_t(df):
    """Callback for collapsing ACT columns."""
    # Build 'variable' from the technology (t) and mode (m) columns
    df['variable'] = 'Activity|' + df['t'] + '|' + df['m']
    return df

ACT = rep.full_key('ACT')
keys = rep.as_pyam(ACT, 'ya', collapse=m_t, drop=['t', 'm'])

write(key, path)

Write the report key to the file path.

In addition to the formats handled by ixmp.reporting.Reporter.write(), this version will write pyam.IamDataFrame to CSV or Excel files using built-in methods.

message_ix.reporting.PRODUCTS = (('out', ('output', 'ACT')), ('out_hist', ('output', 'ref_activity')), ('in', ('input', 'ACT')), ('in_hist', ('input', 'ref_activity')), ('emi', ('emission_factor', 'ACT')), ('emi_hist', ('emission_factor', 'ref_activity')), ('inv', ('inv_cost', 'CAP_NEW')), ('inv_hist', ('inv_cost', 'ref_new_capacity')), ('fom', ('fix_cost', 'CAP')), ('fom_hist', ('fix_cost', 'ref_capacity')), ('vom', ('var_cost', 'ACT')), ('vom_hist', ('var_cost', 'ref_activity')))

Basic derived quantities that are the product of two others.

message_ix.reporting.DERIVED = [('tom:nl-t-yv-ya', (<function add>, 'fom:nl-t-yv-ya', 'vom:nl-t-yv-ya')), ('tom:nl-t-ya', (<function sum>, 'tom:nl-t-yv-ya', None, ['yv']))]

Other standard derived quantities.

message_ix.reporting.PYAM_CONVERT = {'cap': ('CAP:nl-t-ya', 'ya', {'var': 'capacity'}), 'emis': ('emi:nl-t-ya-m-e', 'ya', {'kind': 'emi', 'var': 'emis'}), 'fom': ('fom:nl-t-ya', 'ya', {'var': 'fom cost'}), 'in': ('in:nl-t-ya-m-no-c-l', 'ya', {'kind': 'ene', 'var': 'in'}), 'inv': ('inv:nl-t-yv', 'yv', {'var': 'inv cost'}), 'new_cap': ('CAP_NEW:nl-t-yv', 'yv', {'var': 'new capacity'}), 'out': ('out:nl-t-ya-m-nd-c-l', 'ya', {'kind': 'ene', 'var': 'out'}), 'tom': ('tom:nl-t-ya', 'ya', {'var': 'total om cost'}), 'vom': ('vom:nl-t-ya', 'ya', {'var': 'vom cost'})}

Quantities to automatically convert to pyam format.

message_ix.reporting.REPORTS = {'message:costs': ['inv:pyam', 'fom:pyam', 'vom:pyam', 'tom:pyam'], 'message:emissions': ['emis:pyam'], 'message:system': ['out:pyam', 'in:pyam', 'cap:pyam', 'new_cap:pyam']}

Standard reports that collect quantities converted to pyam format.

reporting.configure(**config)

Configure reporting globally.

Valid configuration keys include:

  • units:
    • define: a string, passed to pint.UnitRegistry.define().
    • replace: a mapping from str to str, used to replace units before they are parsed by pint.

Warns: UserWarning – If config contains unrecognized keys.

class ixmp.reporting.Reporter(**kwargs)

Class for generating reports on ixmp.Scenario objects.

A Reporter is used to postprocess data from one or more ixmp.Scenario objects. The get() method can be used to:

  • Retrieve individual quantities. A quantity has zero or more dimensions and optional units. Quantities include the ‘parameters’, ‘variables’, ‘equations’, and ‘scalars’ available in an ixmp.Scenario.
  • Generate an entire report composed of multiple quantities. A report may:
    • Read in non-model or exogenous data,
    • Trigger output to file(s) or a database, or
    • Execute user-defined methods.

Every report and quantity (including the results of intermediate steps) is identified by a utils.Key; all the keys in a Reporter can be listed with keys().

Reporter uses a graph data structure to keep track of computations, the atomic steps in postprocessing: for example, a single calculation that multiplies two quantities to create a third. The graph allows get() to perform only the requested computations. Advanced users may manipulate the graph directly; but common reporting tasks can be handled by using Reporter methods:

add(key, computation[, strict]) Add computation to the Reporter under key.
add_file(path[, key]) Add exogenous quantities from path.
aggregate(qty, tag, dims_or_groups[, …]) Add a computation that aggregates qty.
apply(generator, *keys) Add computations from generator applied to keys.
configure([path]) Configure the Reporter.
describe([key]) Return a string describing the computations that produce key.
disaggregate(qty, new_dim[, method, args]) Add a computation that disaggregates qty using method.
finalize(scenario) Prepare the Reporter to act on scenario.
full_key(name) Return the full-dimensionality key for name.
get([key]) Execute and return the result of the computation key.
read_config(path) Configure the Reporter with information from a YAML file at path.
visualize(filename, **kwargs) Generate an image describing the reporting structure.
write(key, path) Write the report key to the file path.

graph = {'filters': None}

A dask-format graph.

add(key, computation, strict=False)

Add computation to the Reporter under key.

Parameters:
  • key (hashable) – A string, Key, or other value identifying the output of computation.
  • computation (object) –

    One of:

    1. any existing key in the Reporter;
    2. any other literal value or constant;
    3. a task, i.e. a tuple with a callable followed by one or more computations; or
    4. a list containing one or more of #1, #2, and/or #3.
  • strict (bool, optional) – If True, key must not already exist in the Reporter. Defaults to False.
Raises:

KeyError – If strict is True and key is already in the Reporter.

add() may be used to:

  • Provide an alias from one key to another:

    >>> r.add('aliased name', 'original name')
    
  • Define an arbitrarily complex computation in a Python function that operates directly on the ixmp.Scenario:

    >>> def my_report(scenario):
    ...     # many lines of code
    ...     return 'foo'
    >>> r.add('my report', (my_report, 'scenario'))
    >>> r.finalize(scenario)
    >>> r.get('my report')
    foo
    

Note

Use care when adding literal str values (2); these may conflict with keys that identify the results of other computations.
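The difference between an alias and a literal string, and the conflict this note warns about, can be sketched with a simplified resolver. This is an illustration under assumed semantics (a string that matches an existing key is treated as a reference to it), not the ixmp or dask code; `graph` and `get` are hypothetical names.

```python
# Simplified graph: each key maps to a task tuple, another key, or a literal
graph = {
    'original name': (lambda: 42,),
    'aliased name': 'original name',  # a str that matches an existing key
    'label': 'some text',             # a str that matches no key: a literal
}


def get(key):
    """Resolve `key` in `graph` (hypothetical resolver, not the ixmp code)."""
    comp = graph[key]
    if isinstance(comp, tuple):
        # Task: call the first element on the results of the others
        func, *inputs = comp
        return func(*[get(k) for k in inputs])
    if isinstance(comp, str) and comp in graph:
        # A string that is also a key is followed as a reference
        return get(comp)
    return comp


alias_result = get('aliased name')
before = get('label')

# Adding a key equal to the literal changes what 'label' returns
graph['some text'] = (lambda: 'surprise',)
after = get('label')
print(alias_result, before, after)  # 42 some text surprise
```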

add_file(path, key=None)

Add exogenous quantities from path.

A file at a path like ‘/path/to/foo.ext’ is added at the key 'file:foo.ext'.

add_product(name, quantities, sums=True)

Add a computation that takes the product of quantities.

Parameters:
  • name (str) – Name of the new quantity.
  • sums (bool, optional) – If True, all partial sums of the new quantity are also added.
Returns:

The full key of the new quantity.

Return type:

Key

aggregate(qty, tag, dims_or_groups, weights=None, keep=True)

Add a computation that aggregates qty.

Parameters:
  • qty (Key or str) – Key of the quantity to be aggregated.
  • tag (str) – Additional string to add to the end of the key for the aggregated quantity.
  • dims_or_groups (str or iterable of str or dict) – Name(s) of the dimension(s) to sum over, or nested dict.
  • weights (xr.DataArray) – Weights for weighted aggregation.
Returns:

The key of the newly-added node.

Return type:

Key
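As a rough illustration of group-wise aggregation, the sketch below sums labels along one dimension into named groups. The shape of the nested dict (dimension name, then group name, then list of member labels) and the meaning of keep (whether the original members are retained alongside the group sums) are assumptions made for this example; the function `aggregate_sketch` and the data are invented.

```python
def aggregate_sketch(dims, data, groups, keep=True):
    """Sum members of each group along a single dimension.

    `data` maps tuples of labels to values; `groups` is assumed to look
    like {dim: {group_name: [member labels]}}.
    """
    dim, = groups  # this sketch handles exactly one dimension
    pos = dims.index(dim)
    # Assumption: keep=True retains the original members in the result
    out = dict(data) if keep else {}
    for group, members in groups[dim].items():
        for idx, value in data.items():
            if idx[pos] in members:
                new_idx = idx[:pos] + (group,) + idx[pos + 1:]
                out[new_idx] = out.get(new_idx, 0.0) + value
    return out


data = {('coal_ppl', 2020): 1.0, ('gas_ppl', 2020): 2.0, ('wind_ppl', 2020): 4.0}
groups = {'t': {'fossil': ['coal_ppl', 'gas_ppl']}}

with_members = aggregate_sketch(['t', 'ya'], data, groups, keep=True)
groups_only = aggregate_sketch(['t', 'ya'], data, groups, keep=False)
print(groups_only)  # {('fossil', 2020): 3.0}
```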

apply(generator, *keys)

Add computations from generator applied to keys.

Parameters:
  • generator (callable) – Function to apply to keys. Must yield a sequence (0 or more) of (key, computation), which are added to the graph.
  • keys (hashable) – The starting key(s).

configure(path=None, **config)

Configure the Reporter.

Accepts a path to a configuration file and/or keyword arguments. Configuration keys loaded from file are replaced by keyword arguments.

Valid configuration keys include:

  • default: the default reporting key; sets default_key.
  • filters: a dict, passed to set_filters().
  • files: a dict mapping keys to file paths.
  • alias: a dict mapping aliases to original keys.
Warns:UserWarning – If config contains unrecognized keys.
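The precedence rule and the warning can be sketched as follows. This is a minimal illustration, not the Reporter.configure() implementation; the function `merge_config` and the example settings are hypothetical, and whether the real method keeps or discards unrecognized keys is not stated here.

```python
import warnings

# Configuration keys listed above
RECOGNIZED = {'default', 'filters', 'files', 'alias'}


def merge_config(from_file, **config):
    """Merge file-based settings with keyword arguments; keywords win."""
    merged = {**from_file, **config}
    unknown = sorted(set(merged) - RECOGNIZED)
    if unknown:
        warnings.warn('Unrecognized configuration keys: %s' % unknown,
                      UserWarning)
    return merged


# Settings as they might be loaded from a file (invented values)
file_settings = {'default': 'message:default', 'filters': {'t': ['coal_ppl']}}
merged = merge_config(file_settings, default='message:costs')
print(merged['default'])  # message:costs
```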
default_key = None

The default reporting key.

describe(key=None)

Return a string describing the computations that produce key.

If key is not provided, all keys in the Reporter are described.

disaggregate(qty, new_dim, method='shares', args=[])

Add a computation that disaggregates qty using method.

Parameters:
  • qty (hashable) – Key of the quantity to be disaggregated.
  • new_dim (str) – Name of the new dimension of the disaggregated quantity.
  • method (callable or str) – Disaggregation method. If a callable, then it is applied to qty with any extra args. If a str, then a method named ‘disaggregate_{method}’ is used.
  • args (list, optional) – Additional arguments to the method. The first element should be the key for a quantity giving shares for disaggregation.
Returns:

The key of the newly-added node.

Return type:

Key

finalize(scenario)

Prepare the Reporter to act on scenario.

The Scenario object scenario is associated with the key 'scenario'. All subsequent processing will act on data from this scenario.

classmethod from_scenario(scenario, **kwargs)

Create a Reporter by introspecting scenario.

Parameters:
  • scenario (ixmp.Scenario) – Scenario to introspect in creating the Reporter.
  • kwargs (optional) – Passed to Reporter.configure().
Returns:

A Reporter instance containing:

  • A ‘scenario’ key referring to the scenario object.
  • Each parameter, equation, and variable in the scenario.
  • All possible aggregations across different sets of dimensions.
  • Each set in the scenario.

Return type:

Reporter

full_key(name)

Return the full-dimensionality key for name.

An ixmp variable ‘foo’ indexed by a, c, n, q, and x is available in the Reporter at 'foo:a-c-n-q-x'. full_key('foo') retrieves this Key.

get(key=None)

Execute and return the result of the computation key.

Only key and its dependencies are computed.

Parameters: key (str, optional) – If not provided, default_key is used.
Raises: ValueError – If key and default_key are both None.

read_config(path)

Configure the Reporter with information from a YAML file at path.

See configure().

set_filters(**filters)

Apply filters ex ante (before computations occur).

filters has the same form as the argument of the same name to ixmp.Scenario.par() and analogous methods. A value of None will clear the filter for the named dimension.
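A minimal sketch of this behaviour, assuming filters are kept as a mapping from dimension name to allowed labels. The module-level `filters` dict and the helper `apply_filters` are hypothetical stand-ins and do not reflect how the Reporter stores or applies filters internally.

```python
filters = {}  # dimension name -> list of allowed labels


def set_filters(**new):
    """Update `filters`; a value of None clears the filter for a dimension."""
    for dim, values in new.items():
        if values is None:
            filters.pop(dim, None)
        else:
            filters[dim] = list(values)


def apply_filters(rows):
    """Keep only rows whose label is allowed on every filtered dimension."""
    return [r for r in rows
            if all(r[d] in allowed for d, allowed in filters.items())]


rows = [
    {'t': 'coal_ppl', 'ya': 2020},
    {'t': 'gas_ppl', 'ya': 2020},
    {'t': 'coal_ppl', 'ya': 2030},
]

set_filters(t=['coal_ppl'], ya=[2020])
both = apply_filters(rows)

set_filters(ya=None)  # clear the filter on ya only
only_t = apply_filters(rows)
print(len(both), len(only_t))  # 1 2
```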

visualize(filename, **kwargs)

Generate an image describing the reporting structure.

This is a shorthand for dask.visualize(). Requires graphviz.

write(key, path)

Write the report key to the file path.

Computations

message_ix.reporting.computations.add(a, b, fill_value=0.0)

Sum of a and b.

message_ix.reporting.computations.write_report(quantity, path)

Write the report quantity to the file at path.

If quantity is a pyam.IamDataFrame and path ends with ‘.csv’ or ‘.xlsx’, use pyam methods to write the file to CSV or Excel format, respectively. Otherwise, equivalent to ixmp.reporting.computations.write_report().

message_ix.reporting.pyam.as_pyam(scenario, year_time_dim, quantities, drop=[], collapse=None)

Return a pyam.IamDataFrame containing quantities.

See also

Reporter.as_pyam()

message_ix.reporting.pyam.collapse_message_cols(df, var, kind=None)

as_pyam() collapse=… callback for MESSAGE quantities.

Parameters:
  • var (str) – Name for ‘variable’ column.
  • kind (None or 'ene' or 'emi', optional) –

    Determines which other columns are combined into the ‘region’ and ‘variable’ columns:

    • 'ene': ‘variable’ is '<var>|<level>|<commodity>|<technology>|<mode>' and ‘region’ is '<region>|<node_dest>' (if var='out') or '<region>|<node_origin>' (if var='in').
    • 'emi': ‘variable’ is '<var>|<emission>|<technology>|<mode>'.
    • Otherwise: ‘variable’ is '<var>|<technology>'.

    The referenced columns are also dropped, so it is not necessary to provide the drop argument of as_pyam().
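The rules for the ‘variable’ column can be sketched for a single row as below. This is an illustration, not the collapse_message_cols() code, which operates on a whole data frame; the function `variable_label`, the row contents, and the use of the short dimension labels l, c, t, m and e as column names are assumptions for the example.

```python
def variable_label(row, var, kind=None):
    """Build the 'variable' string for one row, following the rules above.

    `row` is a dict keyed by short dimension labels: l (level),
    c (commodity), t (technology), m (mode), e (emission).
    """
    if kind == 'ene':
        parts = [var, row['l'], row['c'], row['t'], row['m']]
    elif kind == 'emi':
        parts = [var, row['e'], row['t'], row['m']]
    else:
        parts = [var, row['t']]
    return '|'.join(parts)


# Invented example row
row = {'l': 'secondary', 'c': 'electr', 't': 'coal_ppl',
       'm': 'standard', 'e': 'CO2'}

print(variable_label(row, 'out', kind='ene'))
# out|secondary|electr|coal_ppl|standard
print(variable_label(row, 'emis', kind='emi'))
# emis|CO2|coal_ppl|standard
print(variable_label(row, 'capacity'))
# capacity|coal_ppl
```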

Computations from ixmp

Elementary computations for reporting.

aggregate(quantity, groups, keep) Aggregate quantity by groups.
disaggregate_shares(quantity, shares) Disaggregate quantity by shares.
load_file(path) Read the file at path and return its contents.
make_dataframe(*quantities) Concatenate quantities into a single pd.DataFrame.
write_report(quantity, path) Write the report quantity to the file at path.

ixmp.reporting.computations.aggregate(quantity, groups, keep)

Aggregate quantity by groups.

ixmp.reporting.computations.disaggregate_shares(quantity, shares)

Disaggregate quantity by shares.

ixmp.reporting.computations.make_dataframe(*quantities)

Concatenate quantities into a single pd.DataFrame.

ixmp.reporting.computations.load_file(path)

Read the file at path and return its contents.

Some file formats are automatically converted into objects for direct use in reporting code:

  • csv: converted to xarray.DataArray. CSV files must have a ‘value’ column; all others are treated as indices.
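The CSV convention can be illustrated with the standard library alone. The sketch below returns a plain dict rather than an xarray.DataArray, so it shows only how the ‘value’ column is separated from the index columns; `read_csv_quantity` and the sample data are invented.

```python
import csv
import io


def read_csv_quantity(text):
    """Parse CSV text; 'value' holds the data, other columns are indices."""
    reader = csv.DictReader(io.StringIO(text))
    dims = [name for name in reader.fieldnames if name != 'value']
    data = {tuple(row[d] for d in dims): float(row['value'])
            for row in reader}
    return dims, data


# Invented file contents
text = "t,ya,value\ncoal_ppl,2020,1.5\ngas_ppl,2020,2.5\n"
dims, data = read_csv_quantity(text)
print(dims)  # ['t', 'ya']
print(data)  # {('coal_ppl', '2020'): 1.5, ('gas_ppl', '2020'): 2.5}
```

Note that index labels are read as strings in this sketch; no type conversion is attempted.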
ixmp.reporting.computations.sum(quantity, weights=None, dimensions=None)

Sum quantity over dimensions, with optional weights.

ixmp.reporting.computations.write_report(quantity, path)

Write the report quantity to the file at path.

Utilities

class ixmp.reporting.utils.Key(name, dims=[], tag=None)

A hashable key for a quantity that includes its dimensionality.

Quantities in a Scenario can be indexed by one or more dimensions. For example, a parameter with three dimensions can be initialized with:

>>> scenario.init_par('foo', ['a', 'b', 'c'], ['apple', 'bird', 'car'])

Computations for this scenario might use the quantity foo in different ways:

  1. in its full resolution, i.e. indexed by a, b, and c;
  2. aggregated (e.g. summed) over any one dimension, e.g. aggregated over c and thus indexed by a and b;
  3. aggregated over any two dimensions; etc.

A Key for (1) will hash, display, and evaluate as equal to 'foo:a-b-c'. A Key for (2) corresponds to 'foo:a-b', and so forth.
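One way to obtain this behaviour is to derive both the hash and the equality test from the string form. The class below is a simplified stand-in written for illustration; `SimpleKey` is a hypothetical name and the real Key class may be implemented differently.

```python
class SimpleKey:
    """Simplified stand-in for a dimension-aware key (illustration only)."""

    def __init__(self, name, dims=()):
        self.name = name
        self.dims = tuple(dims)

    def __str__(self):
        if not self.dims:
            return self.name
        return self.name + ':' + '-'.join(self.dims)

    __repr__ = __str__

    def __hash__(self):
        # Hash like the string form, so either can be used for dict lookup
        return hash(str(self))

    def __eq__(self, other):
        return str(self) == str(other)


k = SimpleKey('foo', ['a', 'b', 'c'])
graph = {'foo:a-b-c': 'full resolution', 'foo:a-b': 'summed over c'}

print(k)                                    # foo:a-b-c
print(graph[k])                             # full resolution
print(graph[SimpleKey('foo', ['a', 'b'])])  # summed over c
```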

Keys may be generated concisely by defining a convenience method:

>>> def foo(dims):
...     return Key('foo', dims.split())
>>> foo('a b')
foo:a-b

classmethod from_str_or_key(value, drop=None, append=None, tag=None)

Return a new Key from value.

iter_sums()

Yield (key, task) for all possible partial sums of the Key.
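The set of partial sums can be enumerated with itertools.combinations. The sketch below yields plain key strings and the dimensions summed over, rather than (key, task) pairs; it assumes that the total over all dimensions is included, which may differ from what iter_sums() actually yields. The function `partial_sums` is hypothetical.

```python
from itertools import combinations


def partial_sums(name, dims):
    """Yield (key, summed_dims) for each partial sum of a quantity."""
    # Keep progressively fewer dimensions, down to none (the total)
    for n_keep in range(len(dims) - 1, -1, -1):
        for kept in combinations(dims, n_keep):
            summed = [d for d in dims if d not in kept]
            key = name + (':' + '-'.join(kept) if kept else '')
            yield key, summed


sums = dict(partial_sums('foo', ['a', 'b', 'c']))
print(len(sums))        # 7
print(sums['foo:a-b'])  # ['c']
print(sums['foo:a'])    # ['b', 'c']
```

The number of partial sums grows as 2^N - 1 for N dimensions, which is one reason that computing only the requested keys matters.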

classmethod product(new_name, *keys)

Return a new Key that has the union of dimensions on keys.

Dimensions are ordered by their first appearance.

class ixmp.reporting.utils.AttrSeries(*args, **kwargs)

pandas.Series subclass imitating xarray.DataArray.

Future versions of ixmp.reporting will use xarray.DataArray as Quantity; however, because xarray currently lacks sparse matrix support, ixmp quantities may be too large for memory.

The AttrSeries class provides similar methods and behaviour to xarray.DataArray, such as an attrs dictionary for metadata, so that ixmp.reporting.computations methods can use xarray-like syntax.

ixmp.reporting.utils.Quantity

alias of ixmp.reporting.utils.AttrSeries

ixmp.reporting.utils.clean_units(input_string)

Tolerate messy strings for units.

Handles two specific cases found in MESSAGEix test cases:

  • Dimensions enclosed in ‘[]’ have these characters stripped.
  • The ‘%’ symbol cannot be supported by pint, because it is a Python operator; it is translated to ‘percent’.
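The two cases can be handled with basic string methods, as in this sketch. It is written from the description above and is not copied from ixmp; the name `clean_units_sketch` is hypothetical.

```python
def clean_units_sketch(input_string):
    """Strip enclosing '[' and ']' and spell out '%' as 'percent'."""
    return input_string.strip('[]').replace('%', 'percent')


print(clean_units_sketch('[kWa]'))  # kWa
print(clean_units_sketch('%'))      # percent
print(clean_units_sketch('USD'))    # USD
```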
ixmp.reporting.utils.collect_units(*args)

Return a list of ‘_unit’ attributes for args.

ixmp.reporting.utils.data_for_quantity(ix_type, name, column, scenario, filters=None)

Retrieve data from scenario.

Parameters:
  • ix_type ('equ' or 'par' or 'var') – Type of the ixmp object.
  • name (str) – Name of the ixmp object.
  • column ('mrg' or 'lvl' or 'value') – Data to retrieve. ‘mrg’ and ‘lvl’ are valid only for ix_type='equ', and ‘level’ otherwise.
  • scenario (ixmp.Scenario) – Scenario containing data to be retrieved.
  • filters (dict, optional) – Mapping from dimensions to iterables of allowed values along each dimension.
Returns:

Data for name.

Return type:

Quantity

ixmp.reporting.utils.keys_for_quantity(ix_type, name, scenario)

Iterate over keys for name in scenario.