VEDA Tags

This section describes the genesis and design purpose of the Veda syntax, which started with the Flexible Import tag - ~FI_T - in Excel VBA at the turn of the 21st century. Various tags and conventions have evolved over two decades to serve the core design philosophy of Veda.

Regions and Time

The system settings file (SysSettings) supports some tags that specify the basic structure of the model in terms of regions, time periods, time slices, currencies, and units.

~BookRegions_map

The concept of BookRegions serves a core Veda principle that structures that are common across regions should be declared only once. This applies to the BaseYear templates that exist in the root of the Veda model folder. They are named as VT (Veda Template) _ BookRegion _ Sector _ Version. For example, a model may have ‘VT_ElecReg_ELC_v01’ and ‘VT_DemReg_EndUse_v01’ in the root folder to describe electricity supply in the first file and the demand in the second one. ElecReg could map to Electricity grid regions, which is a reasonable way to represent electricity supply, and DemReg could map to states or provinces, which is a reasonable way to represent demands.

This tag maps the book region (also called super region) to model regions. All declarations in base year templates that do not have a region specification apply to all regions mapped to their book region.

~StartYear

The first year of the model horizon.

~TimePeriods

To specify period lengths. TIMES automatically computes the middle years as milestone years (along with the StartYear).

~MileStoneYears

This is an alternate way to specify the model periods - by directly specifying the milestone years for which the model will run. TIMES computes the period spans automatically when milestone years are specified. The last year of the model horizon can be specified (optional).

~DefaultYear

The default year to be used for any timeseries parameter.

~Currencies

List of currencies used in the model. The first entry in this table works as the default currency in Veda.

~DefUnits

This tag is used to declare the default process (activity and capacity) and commodity units by sector.

~RegionGroup_map

The RegionGroups feature enables efficient rule-based declarations by grouping multiple native regions under logical identifiers. This powerful feature eliminates repetitive region-specific declarations and enables consistent parameterization across related regions.

Purpose and Benefits:

RegionGroups solve a fundamental modeling challenge: applying the same parameters or characteristics to multiple regions without manual preprocessing. Instead of creating individual entries for each native region, modelers can define RegionGroups and make declarations at the group level, which automatically propagate to all constituent regions.

Key advantages include:

Elimination of repetitive declarations: Instead of 29 separate entries for each technology parameter, use 1 entry per RegionGroup
Rule-based parameterization: Technology characteristics available by external regional classifications (e.g., WEO regions, steel regions) can be directly applied
Flexible override capability: Native region declarations take precedence over RegionGroup declarations in the same table
Maintenance efficiency: Changes to RegionGroup parameters automatically propagate to all constituent regions

Implementation:

The RegionGroups feature is implemented through the regiongroup_map table in the SysSettings.xlsx file. This table establishes the mapping between native regions and their logical groupings.

Table Structure:

The regiongroup_map table allows each native region to belong to multiple RegionGroups, enabling different grouping schemes for different purposes. However, it is the user’s responsibility to ensure that only one grouping scheme is used within any single VEDA tag.

Example with Two Grouping Schemes:

regiongroup	Region
WEO Regional Classification
weopg_European Union	EU_north
weopg_European Union	EU_northeast
weopg_European Union	France
weopg_European Union	Germany
weopg_India	India
weopg_China	China
weopg_United States	USA
weopg_United States	Canada
Development Status Classification
Reg_Dev	EU_north
Reg_Dev	EU_northeast
Reg_Dev	France
Reg_Dev	Germany
Reg_Dev	USA
Reg_Dev	Canada
Reg_Eme	India
Reg_Eme	China
Reg_Eme	Africa_north
Reg_Eme	Brazil

Key Points:

Each native region (e.g., Germany) appears in multiple RegionGroups (weopg_European Union and Reg_Dev)
The WEO scheme provides detailed regional classifications (8+ groups)
The Development Status scheme simplifies to just 2 groups (Reg_Dev vs Reg_Eme), reducing 29 regions to 2 categories
User Responsibility: Never mix grouping schemes in the same table (e.g., don’t use both weopg_European Union and Reg_Dev in the same declaration)

Usage in Model Declarations:

Once defined, RegionGroups can be used directly in the region column of any VEDA table:

TechName	region	Comm-IN-A	INPUT	VAROM
Steel_BOF_CCS	weopg_European Union	INDCOA,INDNGA	0.3456	250
Steel_BOF_CCS	feIndia	INDCOA,INDNGA	0.3456	280

Override Behavior:

When both RegionGroup and native region declarations exist in the same table, the native region declaration takes precedence:

TechName	region	efficiency	cost
PowerPlant	weopg_European Union	0.85	1000
PowerPlant	Germany	0.90	1200

Result: Germany gets efficiency=0.90, cost=1200 (native region override), while all other weopg_European Union regions get efficiency=0.85, cost=1000 (RegionGroup default).

Technical Implementation:

RegionGroups are resolved during tag processing, which means they follow all standard VEDA overwriting rules within scenarios:

Tag Processing Phase: RegionGroups are expanded to their constituent native regions
Precedence Rules Apply: Standard VEDA precedence rules determine final parameter values
Override Hierarchy: Native region declarations override RegionGroup declarations in the same table
Scenario Integration: RegionGroup declarations can be overridden by subsequent scenario files

Common Use Cases:

WEO Regional Classifications: Apply World Energy Outlook technology characteristics directly to RegionGroups without preprocessing
Development Status Groupings: Group regions by economic development level (emerging vs. developed) for policy analysis
Sector-Specific Groupings: Map regions to major production centers (e.g., steel regions, oil regions) for industry-specific technology data
Geographic Classifications: Group regions by climate zones, resource availability, or trade relationships

Multiple Grouping Schemes:

VEDA allows each native region to belong to multiple RegionGroups simultaneously, enabling flexible modeling approaches. For example, Germany can be in both weopg_European Union (WEO classification) and Reg_Dev (development status classification).

Critical User Responsibility:

Warning

VEDA will not prevent you from mixing grouping schemes, but you must ensure consistency within each table. Never use RegionGroups from different schemes in the same VEDA tag or table.

Valid Usage (Consistent Scheme):

TechName	region	efficiency
PowerPlant	Reg_Dev	0.85
PowerPlant	Reg_Eme	0.75

Invalid Usage (Mixed Schemes):

TechName	region	efficiency
PowerPlant	weopg_European Union	0.85
PowerPlant	Reg_Eme	0.75

This mixes WEO classification (weopg_European Union) with development status classification (Reg_Eme) in the same table, which should be avoided.

Best Practices:

Scheme Consistency: Use only one grouping scheme per VEDA table/tag
Naming Conventions: Use clear prefixes (e.g., weopg_, Reg_, steel_) to distinguish grouping schemes
Documentation: Clearly document which grouping scheme is used in each model component
Efficiency Focus: Prefer simpler schemes when possible (e.g., Reg_Dev/Reg_Eme reduces 29 regions to 2 groups)
Override Capability: Leverage native region overrides for exceptions within RegionGroup defaults
Scenario Integration: Combine with scenario files to create flexible policy analysis frameworks

Getting started with the RES

These tags define the key elements - processes, commodities, topology, and core parameters. These tags don’t support wild cards.

Commodity Definition Table (~FI_COMM)

The Commodity Definition Table (~FI_COMM) is used to declare the non-numerical characteristics of commodities in the model. Each commodity must be declared only once within these tables to avoid conflicts, such as inconsistent attributes (e.g., different time slice levels).

The ~FI_COMM table is supported in B-Y templates, SubRES files, and the SysSettings template. For large and complex models, a best practice is to centralize all commodity declarations in a single template, such as the SysSettings template, to maintain consistency and avoid duplication.

Valid column headers for the ~FI_COMM table are described in Table 1 below. Their order in the table can be changed.

Best Practice: Declare commodities only once in a single template location to prevent errors or conflicting definitions.

Warning

Critical: Multiple duplicate declarations

Avoid declaring the same Commodity or Process with conflicting attributes across multiple tables or scenarios.

When you run the model with different scenario combinations, the same commodity (e.g., ELC) or process may have different attributes in each run if it is declared with conflicting attributes—such as different Time Slice Levels (TSLVL) in the ~FI_Comm or ~FI_Process tags—across multiple scenarios. This makes debugging and interpreting the output of GAMS runs difficult.

Best Practice:

Always declare each commodity and process exactly once with consistent attributes across all templates and scenarios. This ensures that all GAMS runs use the same fundamental parameter definitions, allowing you to confidently analyze and compare results across different scenario combinations.

Table Layout and Usage

The ~FI_COMM table is used to declare commodities with their associated attributes and properties. Each commodity is declared once with its characteristics.

Valid Column Headers

The valid column headers for a ~FI_COMM table are listed below (see the Example Table section for a complete example):

Header	Description
Csets	The sets to which commodities belong. Valid entries are: - `NRG` (energy) - `MAT` (material) - `DEM` (demand service) - `ENV` (emissions) - `FIN` (financial) Note: These declarations are inherited until the next entry is encountered.
Region	Specifies the region. By default, it applies to all regions unless explicitly declared. Note: This column is used only in B-Y templates and is not allowed in SubRES files.
CommName	The name of the commodity (e.g., `COA` for coal).
CommDesc	A description of the commodity (e.g., “Solid Fuels”).
Unit	The unit associated with the commodity throughout the model (e.g., `PJ`). User Responsibility: Ensure unit consistency throughout the model.
LimType	Defines the sense of the balance equation for the commodity. Valid entries: - `LO` (Production >= Consumption, default for all but MAT commodities) - `FX` (Production = Consumption, default for MAT commodities) - `UP` (Production <= Consumption)
CTSLvl	Specifies the commodity time-slice tracking level. Valid entries: - `ANNUAL` (default) - `SEASON` - `WEEKLY` - `DAYNITE`
PeakTS	Defines peak time slice monitoring. Valid entries: - `ANNUAL` (default) - Specific time slices defined in the SysSettings file (comma-separated).
CType	Indicates electricity and heat commodities. Valid entries: - `ELC` (electricity) - `HTHEAT` (high-temperature heat) - `LTHEAT` (low-temperature heat)

Note: Comma-separated elements are allowed in fields like Csets and PeakTS.

Example Table

Below is an example of a ~FI_COMM table for commodity definitions:

~FI_COMM	CommName	CommDesc	Csets	Unit	LimType	CTSLvl
	COA	Solid Fuels	NRG	PJ	LO	ANNUAL
	ELEC	Electricity	NRG	PJ	FX	SEASON

In this example: - COA is defined as a solid fuel energy commodity, measured in petajoules (PJ), with a default limit type of LO and time-slice tracking at the ANNUAL level. - ELEC is defined as an electricity commodity with a balance equation of FX and time-slice tracking at the SEASON level.

Best Practices

Declare each commodity only once to prevent conflicts. Tip: Centralize declarations in the SysSettings template for large models.
Ensure consistent use of units across the model for all commodities.
Verify attributes such as LimType and CTSLvl for correctness, particularly when working with complex time-slice structures.
Use comma-separated entries cautiously and only where appropriate, such as for time-slice monitoring (PeakTS).

By adhering to these practices, users can efficiently manage commodity definitions and avoid potential modeling errors.

Note

The following commodities (climate module) can be used without being defined: BEOHMOD,CH4-ATM,CH4-GTC,CH4-LO,CH4-MT,CH4-PPB,CH4-PPM,CH4-PREIND,CH4-UP,CO2-ATM,CO2-GTC,CO2-LO,CO2-PPM,CO2-PREIND,CO2-UP,CS,DELTA-ATM, DELTA-LO,EXT-EOH,FORCING,GAMMA,LAMBDA,N2O-ATM,N2O-GTC,N2O-LO,N2O-MT,N2O-PPB,N2O-PPM,N2O-PREIND,N2O-UP,PHI-AT-UP,PHI-CH4,PHI-LO-UP,PHI-N2O,PHI-UP-AT,PHI-UP-LO, SIGMA1,SIGMA2,SIGMA3,TOTCH4,TOTN2O.

Process Definition Table (~FI_PROCESS)

The Process Definition Table (~FI_PROCESS) is used to declare the non-numerical characteristics of processes in Veda. Each process must be defined only once in this table, and it serves as the foundational structure for assigning essential attributes like process name, description, activity unit, capacity unit, and more. These tables are supported in both Base-Year (B-Y) Templates and SubRES files.

Note

The ~FI_PROCESS table provides a flexible layout: the column order can be changed, and valid entries for each header are well-defined.

Warning

Each process must be declared exactly once with consistent attributes across all templates. Like FI_COMM tags, FI_PROCESS tags are processed in parallel, which can cause non-deterministic results if duplicate or inconsistent declarations exist. See the parallel processing warning in the FI_COMM section above for details.

Key Features

Process Declaration - Each process is declared only once using its name, description, and associated attributes. - Supported in B-Y Templates and SubRES files. However, region declarations are only valid in B-Y templates.
Non-Numerical Attributes - This table focuses on defining process characteristics rather than numerical data.
Flexible Layout - The order of columns is user-defined, as long as valid headers are used.
Region-Specific Data - Region declarations can be used in B-Y Templates but are not allowed in SubRES files.

Valid Column Headers

The following are valid column headers for the ~FI_PROCESS table:

Header	Description
Sets	Sets to which processes belong, indicating the process type. Valid entries include: `ELE`: Thermal or other power plant `CHP`: Combined heat and power `PRE`: Generic process `DMD`: Demand device `IMP`: Import process `EXP`: Export process `MIN`: Mining process `HPL`: Heating plant `IPS`: Inter-period storage `NST`: Night storage device `STG`: General timeslice storage `STS`: Simultaneous DayNite/Weekly/Seasonal storage `STK`: Combined DayNite/Weekly/Seasonal and inter-period storage.
Region	Specifies the region(s) where the process exists (comma-separated entries allowed). Default: Applied to all regions if not specified. Valid only in B-Y templates (regional data for SubRES processes must be provided in `SubRES_<sector>_Trans` files).
TechName	The name of the process (e.g., `MINCOA1`), up to 32 characters. Recommendation: Limit to 27 characters to account for potential VEDA2.0 additions (e.g., for vintaging or dummy imports).
ProcessDesc	A descriptive name for the process (e.g., `Domestic supply of Solid Fuels Step 1`), up to 255 characters.
Tact	The activity unit of the process (e.g., `PJ`). Users must ensure unit consistency.
Tcap	The capacity unit of the process. Users must ensure unit consistency.
Tslvl	The operational time-slice level of the process. Valid entries: `ANNUAL` `SEASON` `WEEKLY` `DAYNITE` Default behavior: `DAYNITE` for `ELE`, `STGTSS`, and `STGIPS` processes. `SEASON` for `CHP` and `HPL` processes. `ANNUAL` for all other process types.
PrimaryCG	The Primary Commodity Group (PCG) of the process. Normally, this is left unspecified as VEDA assigns a default PCG. Specify only if overriding the default or creating a new PCG.
Vintage	Indicates whether the process uses vintage tracking. Valid entries: `YES`: Vintage tracking enabled. `NO` (default): Vintage tracking disabled.

Note

Comma-separated entries are allowed for applicable columns (e.g., Region, Sets).

Example Layout

Below is an example of a ~FI_PROCESS table:

~FI_PROCESS	Region	TechName	ProcessDesc	Tact	Tcap	Tslvl
	US	MINCOA1	Domestic supply of coal	PJ	MW	ANNUAL
	US	EXPCOA1	Export process for coal	PJ	MW	DAYNITE

Best Practices

Consistency: Ensure consistency in units for activity (Tact) and capacity (Tcap).
Region-Specific Data: Use the Region column only in B-Y templates, and provide SubRES process regional data in appropriate SubRES transaction files.
Naming: Keep process names concise (maximum 27 characters recommended) to avoid issues with internal naming extensions in VEDA2.0.
Default Values: Allow defaults (e.g., Tslvl, PrimaryCG, Vintage) unless specific customizations are required.

By defining processes in the ~FI_PROCESS table, users create a robust framework for modeling non-numerical characteristics, ensuring clarity and consistency across the energy system model.

Flexible Import Table (~FI_T)

Preparing input data for models usually imposes a significant data processing burden on the modeler because the input is expected in a particular format, which is different from the format that is used to maintain the data.

The Flexible Import Table (~FI_T) is a versatile table used primarily to create the model topology, defining process inputs, outputs, and parameters in Base-Year (B-Y) templates and SubRES files. Its flexible structure allows users to specify parameters and their numerical values with minimal intervention. Data is imported as provided, without modification during the import process.

Key Features

Flexible Structure
- The table layout can be adapted to match source data, minimizing preprocessing efforts.
- Indexes for attributes such as region, year, and timeslice can be specified as either row identifiers or column headers.
Direct Data Import
- Data is not altered or expanded during import.
- This behavior is consistent with the UC tables (see Section 2.4.7), making it ideal for precise, user-defined parameter definitions.
Row and Column Organization
- Row identifiers and column headers define the dimensions for data rows.
- Numerical data is input directly into the corresponding cells.

Layout and Regions

The ~FI_T table consists of six distinct regions:

Row ID Column Headers These columns define the dimensions for data rows. Valid headers are listed below (see Table 3 for details):
- Region: Declares the region.
- TechName: Declares the technology name.
- Comm-IN / Comm-IN-A: Input commodities / Auxiliary input commodities.
- Comm-OUT / Comm-OUT-A: Output commodities / Auxiliary output commodities.
- Attribute: Defines the attribute (e.g., DEMAND, ACT_BND).
- Year: Specifies the year(s); comma-separated values are allowed.
- TimeSlice: Specifies time slices; comma-separated values are allowed.
- LimType: Specifies limit types (UP, LO, FX, N).
- CommGrp: User-defined commodity group.
- Curr: Currency declaration.
- Stage / SOW: Multi-stage decision points and states of the world for stochastic models.
- Other_Indexes: Special dimensions required by certain attributes (e.g., EnvLimit attributes).
Note: Comma-separated elements are allowed in these headers.
Row Identifiers
The specific elements for the dimensions defined in the row ID column headers.
Data Area Column Headers
Columns define additional dimensions for the data. These can include:
- Attribute
- Year
- TimeSlice
- LimType
- Commodity
- CommGrp (internal VEDA groups only: DEMO, DEMI, NRGO, etc.)
- Region
- Currency
Multiple dimensions can be combined in column headers, separated by a ``~``.
Data Numerical values that correspond to the row and column dimensions.
Table-Level Declarations Global declarations in the table header (following a colon :) apply to all data without an explicit index value. Example: ~FI_T: DEMAND assigns DEMAND as the attribute for all rows lacking a specific attribute.
Comments Comment rows can be identified by:
- A * character at the beginning of any cell in the row.
- A \I: prefix, which is safer and avoids confusion with wildcard or operation symbols.

Example Layout

~FI_T	Region	TechName	Comm-IN	Attribute	2020~UP
	US	PowerPlant1	Coal	ACT_BND	500
	US	PowerPlant1	NaturalGas	ACT_BND	200

In this example: - The table defines activity bounds (ACT_BND) for the PowerPlant1 process in the US region for the year 2020. - Coal has an upper bound of 500, and Natural Gas has an upper bound of 200.

Best Practices

Ensure row and column dimensions are clearly defined and consistent.
Use the ~FI_T placement correctly, preceding the first data column to allow for flexible row identifiers.
Use table-level declarations to simplify repetitive data entries.
Avoid using * for comments when it might conflict with wildcard usage; prefer \I: for clarity.

By leveraging the flexibility of the ~FI_T table, users can efficiently configure process inputs, outputs, and parameters, aligning the model structure with source data seamlessly.

Process and Commodity Filtering

The Foundation of VEDA’s Rule-Based Processing

Process and commodity filtering is the core mechanism that enables VEDA’s powerful rule-based data processing. This system allows users to apply parameters, transformations, and operations to groups of processes and commodities based on flexible criteria, eliminating the need for repetitive individual declarations.

Applications Across VEDA:

Transformation Tables (INS, UPD, MIG): Bulk parameter insertion and updates
Reports: Multi-dimensional classification and aggregation
Set Definitions: Dynamic process and commodity groupings

Five Filtering Dimensions

VEDA provides five complementary methods for identifying processes and commodities:

1. Set Membership (pset_set, cset_set)

Filter by predefined VEDA process or commodity sets
Most robust for administrative groupings
Example: pset_set: ELEGEN (all electricity generation processes)

2. Name Patterns (pset_pn, cset_cn)

Filter by process or commodity name patterns
Supports wildcards and exclusions
Example: pset_pn: *COAL* (processes with “COAL” in name)

3. Description Patterns (pset_pd, cset_cd)

Filter by process or commodity description text
Useful when names are cryptic but descriptions are clear
Example: pset_pd: *Combined Cycle*

4. Input Topology (pset_ci)

Filter processes by input commodities (most robust for functional classification)
Based on actual energy flows, not naming conventions
Example: pset_ci: TRDELC (all processes consuming electricity for transport)

5. Output Topology (pset_co)

Filter processes by output commodities
Functional classification based on what processes produce
Example: pset_co: ELC (all electricity-producing processes)

Pattern Syntax

Wildcards:

*: Multi-character wildcard (zero or more characters)
?: Single-character wildcard (exactly one character)
[_]: Literal underscore (when not used as wildcard)

Examples:

*COAL*: Contains “COAL” anywhere
COAL*: Starts with “COAL”
*COAL: Ends with “COAL”
PWR??01: “PWR” + exactly 2 characters + “01”

Comma-Separated Lists (OR Logic):

COA,GAS,OIL: Coal OR Gas OR Oil
*ELEC*,*GRID*: Contains “ELEC” OR contains “GRID”

Exclusion Syntax:

-*OLD*: Exclude processes with “OLD” in name
*,--RETIRED*: All processes EXCEPT those with “RETIRED”

Logic Control Architecture

VEDA provides sophisticated control over how filtering conditions are combined through six logic control columns:

Two-Block Architecture:

┌─────────────────┐    ┌──────────────────────────────────┐
│   Set Block     │    │  Name/Desc/Input/Output Block   │
│   pset_set      │ ←──┤  pset_pn, pset_pd,              │
│   cset_set      │    │  pset_ci, pset_co               │
└─────────────────┘    └──────────────────────────────────┘
        ↑                              ↑
   _forsets columns              _andor columns
(block integration)           (within-block logic)

Logic Control Columns:

Within Name/Desc/Input/Output Block:

t_pos_andor: Process positive conditions (AND/OR across pset_pn, pset_pd, pset_ci, pset_co)
c_pos_andor: Commodity positive conditions (AND/OR across cset_cn, cset_cd)
t_neg_andor: Process negative/exclusion conditions (AND/OR across exclusion fields)
c_neg_andor: Commodity negative/exclusion conditions (AND/OR across exclusion fields)

Set Block Integration:

t_pos_andor_forsets: How process sets join with other process conditions
c_pos_andor_forsets: How commodity sets join with other commodity conditions

Default Behavior (when columns omitted):

All logic = AND (most restrictive)
Within comma-separated values = OR (always)

Logic Control Examples

Example 1: Standard AND Logic (Default)

pset_set: ELEGEN        # Set membership
pset_pn: *COAL*         # Name pattern
pset_ci: COA            # Input commodity
# Result: Processes in ELEGEN set AND name contains COAL AND consumes COA

Example 2: OR Logic Within Block

pset_pn: *COAL*         # Name pattern
pset_ci: COA            # Input commodity
t_pos_andor: OR         # Name OR Input
# Result: Processes with COAL in name OR consuming COA

Example 3: Set Block OR Integration

pset_set: ELEGEN        # Set membership
pset_pn: *RENEW*        # Name pattern
t_pos_andor_forsets: OR # Set OR Name
# Result: Processes in ELEGEN set OR with RENEW in name

Example 4: Complex Mixed Logic

# Positive conditions
pset_set: PWRGEN        # Power generation set
pset_pn: *COAL*,*GAS*   # Coal or gas in name
pset_ci: COA,GAS        # Consumes coal or gas
t_pos_andor: OR         # Name patterns OR input commodities
t_pos_andor_forsets: AND # Set AND (name OR input)

# Negative conditions
pset_pn: -*OLD*         # Exclude old plants
pset_pd: -*RETIRED*     # Exclude retired plants
t_neg_andor: OR         # Exclude if old OR retired

# Result: (PWRGEN set) AND (coal/gas name OR coal/gas input)
#         AND NOT (old name OR retired description)

Example 5: Commodity Filtering

cset_set: ALLELC        # Electricity commodity set
cset_cn: *H2*           # Hydrogen commodities
c_pos_andor_forsets: OR # Set OR name pattern
# Result: All electricity commodities OR hydrogen commodities

Filtering Method Selection Guidelines

Use Topology-Based Filtering When:

Functional relationships are primary concern (pset_ci, pset_co)
Model has inconsistent naming conventions
New technologies added frequently
Cross-model compatibility required

Use Set-Based Filtering When:

Predefined VEDA process sets exist (pset_set, cset_set)
Administrative or organizational groupings needed
Consistent with existing model structure

Use Pattern-Based Filtering When:

Topology insufficient for distinction (pset_pn, pset_pd, cset_cn, cset_cd)
Regional, size, or vintage distinctions needed
Legacy compatibility required

Recommended Approach:

Start with topology (pset_ci, pset_co) for primary functional classification
Add sets (pset_set) for administrative groupings
Supplement with patterns (pset_pn, pset_pd) for secondary attributes
Use logic control to create sophisticated combination rules

Best Practices

Efficiency:

Topology first: Most robust and maintenance-free
Specific patterns last: Place most restrictive conditions at the end
Avoid over-complexity: Use simplest logic that achieves the goal

Maintainability:

Document logic choices: Explain why specific combinations are used
Test edge cases: Verify filtering captures intended processes/commodities
Plan for growth: Design filters that handle new technologies automatically

Performance:

Use sets when available: Faster than pattern matching
Minimize wildcards: More specific patterns process faster
Combine related conditions: Group related filters in single operations

The data workhorses

The TFM (Transformation) tags enable bulk insert or update of parameters in a rule-based manner - via technology/commodity filters that are based on set membership, shortname, description, and topology. It is also possible to include existing parameters (and their values) as filter criteria.

DINS, INS, and UPD Tables

Veda supports three main transformation table types for inputting data:DINS (Direct Insert), INS (Insert), and UPD (Update). Each serves a distinct purpose, with varying degrees of efficiency and complexity depending on the dataset’s structure and the modeling requirements.

Important

The ~TFM_DINS tag offers the highest processing efficiency, followed by ~FI_T and ~TFM_INS.

Tags ~TFM_UPD and ~TFM_MIG are the least efficient. Whenever possible, users are encouraged to use DINS or INS, provided the logic can be transferred.

1. ~TFM_DINS (Transformation Direct Insert Tables)

Purpose: ~TFM_DINS is the preferred table type when the dataset is fully enumerated, meaning all fields are explicitly defined without any wildcards or comma-separated lists.

Key Characteristics: - Processes are identified using only the pset_pn column. - Commodities (if applicable) are defined explicitly via the cset_cn column. - No wildcards (e.g., ?, *) or comma-separated values are allowed.

Advantages: - The most efficient tag.

Use Case: When all model elements are clearly defined in advance, such as a process-specific bound (ACT_BND) applied to individual processes without any rules.

2. ~TFM_INS (Transformation Insert Tables)

Purpose: INS is the general-purpose table for inserting new data into the database. It allows for greater flexibility in specifying model elements.

Key Characteristics: - Supports wildcards (e.g., ALL, *) and comma-separated values in fields like pset_pn and cset_cn. - Inserts absolute values directly into the database without referencing existing seed data.

Advantages: - Provides flexibility for users who work with less granular or generic data definitions. - Easy to use for scenarios where exact enumeration is not required.

Use Case:

In this example from DemoS_001, it is used to declare three new attributes (G_DYEAR, Discount, and YRFR) by row.

3. ~TFM_UPD (Transformation Update Tables)

Purpose: UPD is used when data modifications depend on the presence of existing seed values in the database.

Key Characteristics: - Performs numerical transformations on seed values (e.g., multiplying or dividing an existing value). - Supports conditional insertion, where new data is added only if a corresponding seed value exists. - Requires prior existence of seed data in an alphabetically inferior scenario in the database.

Advantages: - Ensures data integrity by operating conditionally on existing entries. - Enables dynamic adjustments of seed values without overwriting them.

Use Case:

In this figure it sets default prices (ACTCOST) for the backstop dummy processes for energy commodities (IMP*Z - dummy IMPort processes ending with “Z”) and demands (IMPDEMZ - a dummy IMPDEMZ process that can feed any demand). Note that the process and attribute MUST already have been specified for the qualifying process. Though not shown in the example above the data specification field may also contain operators (+, *, -, /) there the resulting value is applied to the existing value for the qualifying processes.

Note

UPDate and Replacing Data: UPDate is sometimes confused with replacing data. Any of these tags will replace data if they exist in BY_Trans or SubRES trans files and data for the same indexes has been declared in the BY or SubRES files. Otherwise, they will simply create new entries in the scenario where they exist. The “replacing” will happen if this scenario file appears after the scenario with the original data in the scenario group selected for the case.

Comparison of DINS, INS, and UPD

Feature	DINS	INS	UPD
Data Enumeration	Fully enumerated	Supports wildcards/lists	Relies on existing data
Wildcards / Comma-Separated Values	Not allowed	Allowed	Not applicable
Seed Data Requirement	Not required	Not required	Required
Primary Use Case	Explicit, enumerated data	Flexible data insertion	Conditional modifications
Performance	Fastest	Moderate	Slowest

Best Practices

Use DINS wherever possible for maximum efficiency, especially when handling large datasets that are fully enumerated.
Use INS for flexible data insertion when working with generic definitions or multiple entries defined using wildcards or lists.
Use UPD sparingly, only for cases where transformations or conditional insertions are explicitly required, as it involves additional computational overhead.

By understanding the distinct roles and advantages of each table type, users can optimize their data preparation workflows and improve overall model performance.

Tip

By default, DINS, INS, and UPD tables use regions (or Value/AllRegions) as the data value column headers. However, there are scenarios where it is beneficial to organize data differently, such as: 1. Improving Table Readability: Wider tables with alternative column headers can reduce data preprocessing and make data easier to interpret. 2. Enhancing Efficiency: Minimizing the number of rows in a table reduces the processing overhead for rule application.

To support these needs, Veda provides several variants of DINS, INS, and UPD tables. These variants allow the user to specify attributes, years, or timeslices as value column headers.

~TFM_INS Variants The ~TFM_INS variants offer flexible table layouts for inserting data. The following variants are available:

TFM_INS-AT: The value fields use attributes as column headers.
TFM_INS-TS: The value fields use years as column headers.
TFM_INS-TSL: The value fields use timeslices as column headers.

—

### ~TFM_DINS Variants The ~TFM_DINS variants allow fully enumerated data to use alternative column headers. The following variants are supported:

TFM_DINS-AT: The value fields use attributes as column headers.
TFM_DINS-TS: The value fields use years as column headers.
TFM_DINS-TSL: The value fields use timeslices as column headers.

—

### ~TFM_UPD Variants The ~TFM_UPD variants allow update tables to organize value fields differently. The supported variants include:

TFM_UPD-AT: The value fields use attributes as column headers.
TFM_UPD-TS: The value fields use years as column headers.

Example Table Layouts

TFM_INS-TS Example

~TFM_INS-TS	Region	TechName	Attribute	2020	2025
	US	PowerPlant1	ACT_BND	500	550
	US	PowerPlant2	ACT_BND	300	320

In this example:

The value fields use years (2020, 2025) as column headers.
Each row specifies the activity bounds (ACT_BND) for a technology in a region.

TFM_UPD-AT Example

~TFM_UPD-AT	Region	TechName	2020~UP	2025~UP
	US	PowerPlant1	ACT_BND=500	ACT_BND=550
	US	PowerPlant2	ACT_BND=300	ACT_BND=320

In this example:

The value fields use attributes (ACT_BND) as column headers, enabling a compact layout for multiple attributes.

Multiple regions or region groups (comma-separated) can be specified in table-level declarations for ~TFM_DINS-TS, ~TFM_INS-TS, and ~TFM_FILL-R tags. Example: ~TFM_INS-TS:Region=Reg_Dev,Reg_Eme; specifies Reg_Dev and Reg_Eme as table-level region declarations.

Best Practices

Choose Variants Wisely: Select a table variant that aligns with the structure of your source data to minimize preprocessing.
Keep Tables Wide: Wider tables (fewer rows) are more efficient, as they reduce the rule processing required for each row.
Simplify Preprocessing: Use the variant that closely matches your source data layout, reducing the need for manual restructuring.
Fully Enumerate Data for DINS Variants: Ensure all data is fully enumerated (no wildcards or lists) when using DINS variants for optimal performance.

By leveraging these variants, users can efficiently configure their tables for improved readability and reduced computational overhead, while ensuring that data aligns seamlessly with Veda’s processing structure.

~TFM_MIG

~TFM_FILL-R

To create sets

The following tags enable creation of named groups of processes and commodities.

~TFM_CommGrp

~TFM_PSets

~TFM_CSets

Other Tags

~Tradelinks

~Tradelinks_DINS

~Tradelinks_Desc

~UC_T

~TFM_INS-txt

This works exactly like the INS tag, but supports text values for the following Veda attributes that can be used to override values that come from the original process/ commodity definition tables: PRC_PCG, PRC_TSL, PRC_VINT, COM_LIM, COM_TSL, COM_TYPE.

~TFM_TOPINS

~TFM_TOPDINS

Legacy Tags

It is not recommended to use these tags anymore, but they are still supported for backward compatibility reasons.

~COMEMI

Use attribute VDA_EMCB via any regular Veda tag instead.

~PRCCOMEMI

Use attribute FLO_EMIS via any regular Veda tag instead.

~TFM_Fill

Use TFM_Fill-R instead.

Wildcard Support

The columns PSET_PN, PSET_PD, PSET_CO, PSET_CI (for process filters), and CSET_CN, CSET_CD (for commodity filters) support the use of comma-separated entries, with wild cards , in all TFM tables apart from DINS:

Comma-Separated Entries: You can specify multiple entries in these columns by separating them with commas (,).

Example: Process1,Process2,Process3
Wildcards: Wildcards allow flexible and broad pattern-matching for process or commodity names.

Wildcards Overview

Asterisk (`*`): - Acts as a multi-character wildcard, matching zero or more characters.
Examples:
Elec* matches Elec, Electricity, ElecGen, etc.

*Gen matches ElecGen, HeatGen, etc.
Question Mark (`?`) or Underscore (`_`): - Acts as a single-character wildcard, matching exactly one character.
Examples:
Tech_? matches Tech_A, Tech_B, etc.

Fuel_? matches Fuel_X, Fuel_Y, etc.
Square Brackets for Literal `_`: - If you want to refer to _ as an actual character (not a wildcard), enclose it in square brackets [ ].
Example:
Tech[_]_ matches Tech_A, Tech_B, etc.

Examples

Process Set Columns (PSET_…)

Entry: PSET_PN
- Value: Elec* Matches: Electricity_Generation, ElecStorage, etc.
- Value: Fuel?_Gen Matches: Fuel1_Gen, Fuel2_Gen, etc.
- Value: Tech_[_]X Matches: Tech_X.

Commodity Set Columns (CSET_…)

Entry: CSET_CN
- Value: Elec, Heat* Matches: Elec, Heat, HeatPump, etc.
- Value: Gas?_Supply Matches: Gas1_Supply, Gas2_Supply, etc.