Configuration loading and merging utilities.

Handles loading configuration from YAML files, environment variables, and command-line overrides.

load_yaml(file_path)

Load configuration from a YAML file.

Parameters:
  • file_path (Path) –

    Path to YAML configuration file

Returns:
  • dict[str, Any]

    Configuration dictionary

Raises:
  • FileNotFoundError

    If file doesn't exist

  • YAMLError

    If file is not valid YAML

merge_configs(base, override)

Recursively merge two configuration dictionaries.

Override values take precedence over base values. Nested dictionaries are merged recursively.

Parameters:
  • base (dict[str, Any]) –

    Base configuration dictionary

  • override (dict[str, Any]) –

    Override configuration dictionary

Returns:
  • dict[str, Any]

    Merged configuration
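
The merge rule described above can be sketched as follows. This is a minimal illustration, not the library's actual implementation: the name `merge_configs_sketch` and the `backend` and `verbose` keys are hypothetical, and the sketch assumes that non-dictionary values (including lists) are replaced outright rather than combined.

```python
from copy import deepcopy
from typing import Any


def merge_configs_sketch(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Illustrative recursive merge (hypothetical helper, not the library code)."""
    merged = deepcopy(base)  # leave the caller's base dictionary untouched
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides hold dictionaries: merge them key by key.
            merged[key] = merge_configs_sketch(merged[key], value)
        else:
            # Scalars, lists, and type mismatches: the override wins outright.
            merged[key] = deepcopy(value)
    return merged


# "backend" and "verbose" are made-up keys used only for illustration.
base = {"parallel": {"num_workers": 4, "backend": "threads"}, "verbose": False}
override = {"parallel": {"num_workers": 8}}
merged = merge_configs_sketch(base, override)
print(merged)
```

Only `parallel.num_workers` changes here; sibling keys in the nested dictionary and unrelated top-level keys are carried over from the base.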

load_env_overrides()

Load configuration overrides from environment variables.

Environment variables use the METAHQ_SETUP_ prefix. Nested keys are separated by double underscores, for example: METAHQ_SETUP_PARALLEL__NUM_WORKERS=8

Returns:
  • dict[str, Any]

    Configuration overrides from environment
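
One way the variable naming scheme could map onto a nested dictionary is sketched below. The helper name `env_overrides_sketch` is hypothetical, and two behaviours are assumptions rather than documented facts: key segments are lowercased, and values are kept as raw strings (the real function may coerce types).

```python
from collections.abc import Mapping
from typing import Any

PREFIX = "METAHQ_SETUP_"


def env_overrides_sketch(environ: Mapping[str, str]) -> dict[str, Any]:
    """Build nested overrides from prefixed variables (illustrative only)."""
    overrides: dict[str, Any] = {}
    for name, raw_value in environ.items():
        if not name.startswith(PREFIX):
            continue  # ignore unrelated environment variables
        # Assumption: segments are lowercased and "__" separates nesting levels.
        path = name[len(PREFIX):].lower().split("__")
        node = overrides
        for part in path[:-1]:
            node = node.setdefault(part, {})  # create intermediate levels on demand
        node[path[-1]] = raw_value  # kept as a string; type coercion is not shown
    return overrides


result = env_overrides_sketch(
    {"METAHQ_SETUP_PARALLEL__NUM_WORKERS": "8", "HOME": "/tmp"}
)
print(result)
```

In real use the mapping passed in would typically be `os.environ`; taking it as a parameter keeps the sketch easy to test.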

load_config(config_file=None, overrides=None, use_env=True)

Load pipeline configuration with priority handling.

Configuration is loaded and merged in this order (highest priority last):

  1. Default configuration (from Pydantic defaults)
  2. User configuration file (if provided)
  3. Environment variables (if use_env=True)
  4. Direct overrides (if provided)

Parameters:
  • config_file (Path | None, default: None ) –

    Path to user configuration file

  • overrides (dict[str, Any] | None, default: None ) –

    Direct configuration overrides

  • use_env (bool, default: True ) –

    Whether to load overrides from environment variables

Returns:
Raises:
  • ValidationError

    If configuration is invalid

Examples:

>>> # Load with defaults only
>>> config = load_config()
>>> # Load from file
>>> config = load_config(config_file=Path("my_config.yaml"))
>>> # Load with overrides
>>> config = load_config(
...     config_file=Path("my_config.yaml"),
...     overrides={"parallel": {"num_workers": 16}}
... )
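
The priority order can also be pictured as a fold over four layers, each merged on top of the previous one. The sketch below is a simplified stand-in that works on plain dictionaries: `deep_merge` and `layered_config_sketch` are hypothetical names, the default values are invented, and the Pydantic validation step that `load_config` performs is left out.

```python
from copy import deepcopy
from functools import reduce
from typing import Any, Optional


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Recursive merge in which override values win (illustrative helper)."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def layered_config_sketch(
    defaults: dict[str, Any],
    file_config: Optional[dict[str, Any]] = None,
    env_config: Optional[dict[str, Any]] = None,
    overrides: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Merge the four layers in order, lowest priority first."""
    layers = [defaults, file_config or {}, env_config or {}, overrides or {}]
    return reduce(deep_merge, layers, {})


# All values below are made up for illustration.
defaults = {"parallel": {"num_workers": 1}}
from_file = {"parallel": {"num_workers": 4}}
from_env = {"parallel": {"num_workers": 8}}
direct = {"parallel": {"num_workers": 16}}

with_overrides = layered_config_sketch(defaults, from_file, from_env, direct)
without_overrides = layered_config_sketch(defaults, from_file, from_env)
print(with_overrides, without_overrides)
```

Because later layers are merged last, a direct override beats an environment variable, which in turn beats the configuration file and the defaults.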

save_config(config, output_path)

Save configuration to a YAML file.

Parameters:
  • config (PipelineConfig) –

    Configuration to save

  • output_path (Path) –

    Path to save YAML file

get_default_config()

Get default pipeline configuration.

Returns: