Configuration
WaveDiff uses a set of YAML and INI configuration files to control each pipeline task. This section provides a high-level overview of the configuration system, followed by detailed explanations of each file.
Overview of Pipeline Tasks
WaveDiff consists of four main pipeline tasks:
Pipeline |
Purpose |
|---|---|
|
Trains a PSF model using the provided dataset and hyperparameters. |
|
Evaluates model performance using multiple metrics, optionally comparing against a ground-truth model. |
|
Generates figures summarizing the results from the metrics pipeline. |
|
Simulates stellar PSFs used as training/test datasets. |
You configure these tasks by passing a configuration file to the wavediff command (e.g., --config configs.yaml).
Configuration File Structure
WaveDiff expects the following configuration files under the config/ directory:
config
├── configs.yaml
├── data_config.yaml
├── logging.conf
├── metrics_config.yaml
├── plotting_config.yaml
└── training_config.yaml
All
.yamlfiles use standard YAML syntax and are loaded as nested dictionaries of key–value pairs.logging.confuses standard INI syntax and configures logging behavior.Users may modify values but should not rename keys or section names, as the software depends on them.
Each of the configuration files is described in detail below.
data_config.yaml — Data Configuration
1. Purpose
Specifies where WaveDiff loads (or later versions may generate) the training and test datasets.
All training, evaluation, and metrics pipelines depend on this file for consistent dataset paths.
2. Key Fields
data.training.data_dir(required) — directory containing training datadata.training.file(required) — filename of the training datasetdata.test.*— same structure astraining, for the test datasetSimulation-related fields — reserved for future releases
Notes
The simulation options are placeholders; WaveDiff v3.x does not yet auto-generate datasets.
The default dataset bundled with WaveDiff can be used by simply pointing to its directory.
Example (minimal)
data:
training:
data_dir: path/to/training/data
file: train.npy
test:
data_dir: path/to/test/data
file: test.npy
training_config.yaml — Training Pipeline Configuration
1. Purpose
Controls the training pipeline, including model selection, hyperparameters, optional metrics evaluation, and data loading behavior.
2. General Notes
Every field has an inline comment in the YAML file.
All required parameters must be specified. Missing values will prevent the model from being instantiated, as there is currently no default configuration provided.
Optional fields:
metrics_config(run metrics after training)param_hparams,nonparam_hparamsmulti_cycle_params.save_all_cycles
Some parameters are specific to physical or polychromatic PSF models.
Example training configuration file is provided in the top-level root directory of the repository (
training_config.yaml). Users can copy and adapt this template for their own runs.If any descriptions are unclear, or unexpected behaviour occurs, please open a GitHub issue.
Note: The values in the examples shown below correspond to a typical WaveDiff training run. Users should adapt parameters such as model_name, telescope dimensions, pixel/field coordinates, and SED settings to match their own instrument or dataset. All required fields must still be specified.
3. Top-Level Training Parameters
training:
# ID name for this run (used in output files)
id_name: run__001
# Path to Data Configuration file (required)
data_config: data_config.yaml
# Load dataset on initialization (True) or manually later (False)
load_data_on_init: True
# Optional: metrics configuration to run after training
metrics_config:
4. Model Parameters (model_params)
Controls PSF model type, geometry, oversampling, and preprocessing:
model_params:
# Model type. Options: 'poly' and 'physical_poly'
model_name: physical_poly
# Number of wavelength bins for polychromatic reconstruction
n_bins_lda: 8
# Downsampling rate to match telescope pixel sampling
output_Q: 3
# Oversampling rate of OPD/WFE PSF model
oversampling_rate: 3
# Pixel PSF postage stamp size
output_dim: 32
# OPD/Wavefront space dimensions
pupil_diameter: 256
# Physical correction switches
use_prior: False
correct_centroids: True
add_ccd_misalignments: False
# Centroid correction parameters
sigma_centroid_window: 2.5 # Std dev of centroiding window
reference_shifts: [-1/3, -1/3] # Euclid-like default shifts
# Obscuration / geometry
obscuration_rotation_angle: 0 # Degrees (multiple of 90); counterclockwise rotation.
# CCD misalignments input file path
ccd_misalignments_input_path: /path/to/ccd_misalignments_file.txt
# Boolean to use sample weights based on the noise standard deviation estimation
use_sample_weights: True
# Sample weight generalised sigmoid function
sample_weights_sigmoid:
# Boolean to define if we apply the sigmoid function to the sample weights
apply_sigmoid: False
# Maximum value of the sigmoid function and consequently the maximum value of the sample weights
sigmoid_max_val: 5.0
# Power of the sigmoid function. The higher the value the steeper the sigmoid function. In the limit
sigmoid_power_k: 1.0
# Interpolation settings for physical-poly model
interpolation_type: None
interpolation_args: None
# Spectral energy distribution (SED) parameters
sed_interp_pts_per_bin: 0
sed_extrapolate: True
sed_interp_kind: linear
sed_sigma: 0
# Field and pixel coordinates
x_lims: [0.0, 1.0e3]
y_lims: [0.0, 1.0e3]
pix_sampling: 12 # in [um]
# Telescope parameters
tel_diameter: 1.2 # [m]
tel_focal_length: 24.5 # [m]
euclid_obsc: True. # Use Euclid-specific obscuration mask. Set to False for other instruments or custom masks.
LP_filter_length: 3 # Low-pass filter for obscurations
5. Parametric Model Hyperparameters (param_hparams)
param_hparams:
random_seed: 3877572
l2_param: 0.0 # L2 loss for OPD/WFE
n_zernikes: 15
d_max: 2 # Max polynomial degree
save_optim_history_param: true
6. Non-Parametric Model Hyperparameters (nonparam_hparams)
nonparam_hparams:
d_max_nonparam: 5
num_graph_features: 0
l1_rate: 1.0e-8
project_dd_features: False
reset_dd_features: False
save_optim_history_nonparam: true
7. Training Hyperparameters (training_hparams)
Controls batches, loss, and multi-cycle learning:
training_hparams:
batch_size: 32 # Number of samples per batch
loss: 'mask_mse' # Options: 'mask_mse', 'mse'
multi_cycle_params:
total_cycles: 2
cycle_def: complete # Options: 'parametric', 'non-parametric', 'complete', etc.
save_all_cycles: False
saved_cycle: cycle2
learning_rate_params: [1.0e-2, 1.0e-2]
learning_rate_non_params: [1.0e-1, 1.0e-1]
n_epochs_params: [20, 20]
n_epochs_non_params: [100, 120]
metrics_config.yaml — Metrics Configuration
1. Purpose
Defines how a trained PSF model is evaluated. This configuration specifies which metrics to compute, which model weights to use, and how ground truth stars are obtained. It allows you to:
Select a fully trained PSF model or a checkpoint for evaluation.
Specify which training cycle’s weights to evaluate.
Compute Polychromatic, Monochromatic, OPD, and Weak Lensing Shape metrics.
Use precomputed ground truth stars from the dataset if available, or automatically generate them from the configured ground truth model.
Optionally produce plots of the computed metrics via a plotting configuration file.
2. General Notes
WaveDiff automatically searches the dataset used for training. If the dataset contains
stars,SR_stars, orsuper_res_starsfields, these are used as the ground truth for metrics evaluation.If precomputed ground truth stars are not found in the dataset, WaveDiff regenerates them using the
ground_truth_modelparameters. All required fields inmodel_paramsmust be specified; leaving them empty will prevent the metrics pipeline from running (see Ground Truth Model Parameters for details).The metrics evaluation can be run independently of training by specifying trained_model_path and
trained_model_config.Metrics defined in Metrics Overview table are selectively computed according to their boolean flags. The Polychromatic Pixel Reconstruction metric is always computed.
The
plotting_configparameter triggers plotting of the metrics results if a valid configuration file is provided. If left empty, metrics are computed without generating plots (see Plotting Configuration).Batch size and other evaluation hyperparameters can be set under
metrics_hparams(see Evaluation Hyperparameters)
3. Metrics Overview
Metric type |
Description |
Relevant v3.0 Flag |
|---|---|---|
Polychromatic Pixel Reconstruction |
Computes absolute and relative RMSE of pixel residuals between the trained polychromatic PSF model and test data at low- and super-pixel resolution. |
|
Monochromatic Pixel Reconstruction |
Computes absolute and relative RMSE of pixel residuals as a function of wavelength between a monochromatic PSF model and test data. |
|
Optical Path Differences (OPD) |
Computes absolute and relative RMSE of residuals between predicted OPD maps and ground truth OPD. |
|
Weak Lensing Shape Metrics |
Second-order moments-based metrics for PSF ellipticity and size at super-pixel resolution using GalSim HSM. |
|
Notes:
Metrics requiring ground truth (Monochromatic, OPD) are valid only on simulated datasets.
RMSE may be less reliable for noisy stars (e.g., real data). Alternative formulations are in development.
Super-resolution is required for Weak Lensing Shape metrics on undersampled PSFs (e.g., Euclid observations).
4. Top-Level Configuration Parameters
metrics:
model_save_path: <enter psf_model or checkpoint>
saved_training_cycle: 2
trained_model_path: </path/to/parent/directory/of/trained/model>
trained_model_config: <enter name of trained model config file>
eval_mono_metric: True
eval_opd_metric: True
eval_train_shape_results_dict: False
eval_test_shape_results_dict: False
plotting_config: <enter name of plotting_config.yaml or leave empty>
Notes:
model_save_path: Load final PSF model weights (psf_model) or checkpoint weights (checkpoint).saved_training_cycle: Choose which training cycle to evaluate (1, 2, …).trained_model_path: Absolute path to parent directory of previously trained model. Leave empty for training + metrics in serial.trained_model_config: Name of training config file intrained_model_path/config/.eval_mono_metric: If True, computes the monochromatic pixel reconstruction metric. Requires aground_truth_model(see).eval_opd_metric: If True, computes the optical path difference (OPD) metric. Requires aground_truth_model.eval_train_shape_results_dict/eval_test_shape_results_dict: Compute Weak Lensing Shape metrics on the training and/or test dataset.plotting_config:Optionally provide aplotting_config.yamlfile to generate plots after metrics evaluation.Behaviour notes:
Metrics controlled by flags (
eval_mono_metric,eval_opd_metric,eval_train_shape_results_dict,eval_test_shape_results_dict) are only computed if their respective flags are True.The Polychromatic Pixel Reconstruction metric is always computed, regardless of flags.
Future releases may allow optional
ground_truth_modelinstantiation if the dataset already contains precomputed stars.
5. Ground Truth Model Parameters
Mirrors training parameters for consistency:
ground_truth_model:
model_params:
model_name: <ground_truth_poly or ground_truth_physical_poly>
n_bins_lda: 20
output_Q: 3
oversampling_rate: 3
output_dim: 32
pupil_diameter: 256
LP_filter_length: 2
use_sample_weights: True
use_prior: False
correct_centroids: False
sigma_centroid_window: 2.5
reference_shifts: [-1/3, -1/3]
obscuration_rotation_angle: 0
add_ccd_misalignments: False
ccd_misalignments_input_path:
interpolation_type: None
sed_interp_pts_per_bin: 0
sed_extrapolate: True
sed_interp_kind: linear
sed_sigma: 0
x_lims: [0.0, 1.0e+3]
y_lims: [0.0, 1.0e+3]
param_hparams:
random_seed: 3877572
l2_param: 0.
n_zernikes: 45
d_max: 2
save_optim_history_param: True
nonparam_hparams:
d_max_nonparam: 5
num_graph_features: 10
l1_rate: 1.0e-8
project_dd_features: False
reset_dd_features: False
save_optim_history_nonparam: True
Notes:
All fields in
model_paramsare required. Do not leave them empty. Even if the dataset contains precomputed ground truth stars, omittingmodel_paramswill prevent the metrics pipeline from running.Parameters mirror
training_config.yamlfor consistency.Future releases may allow optional instantiation of
ground_truth_modelwhen precomputed stars are available in the dataset.
6. Evaluation Hyperparameters
metrics_hparams:
batch_size: 16
opt_stars_rel_pix_rmse: False
l2_param: 0.
output_Q: 1
output_dim: 64
Parameter explanations:
batch_size: Number of samples processed per batch during evaluation.opt_stars_rel_pix_rmse: IfTrue, saves RMSE for each individual star in addition to mean across FOV.l2_param: L2 loss weight for OPD.output_Q: Downsampling rate from high-resolution pixel modeling space.output_dim: Size of the PSF postage stamp for evaluation.
plotting_config.yaml — Plot Configuration
The plotting_config.yaml file defines how WaveDiff generates diagnostic plots from the metrics produced during model evaluation. While the plotting routines are mostly pre-configured internally, this file allows you to combine and compare metrics from multiple training runs, or simply visualize the results of the most recent metrics pipeline execution.
1. Purpose
This configuration controls how metric outputs from one or more WaveDiff runs are located and aggregated for plotting. It enables users to:
Specify where metrics outputs are stored,
Select which runs to include in joint plots,
Associate each run with its corresponding
metrics_config.yaml,Optionally display plots interactively during execution.
2. General Notes
All plotting styles and figure settings are hard-coded and do not require user modification.
If the plotting task is executed immediately after a metrics evaluation run, all fields except
plot_showmay be left empty—the pipeline will automatically locate the outputs of the active run.When plotting results from multiple runs, the entries in
metrics_dirandmetrics_configmust appear row-aligned, with each position referring to the same run.If any descriptions are unclear, or if you encounter unexpected behavior, please open a GitHub issue (https://github.com/CosmoStat/wf-psf/issues/new).
3. Basic Structure
An example plotting_config.yaml is shown below:
plotting_params:
# Path to the parent folder containing wf-psf output directories (e.g. $WORK/wf-outputs/)
metrics_output_path: <PATH>
# List of output directories (e.g. wf-outputs-xxxxxxxxxxx) whose metrics should be plotted
metrics_dir:
# - wf-outputs-xxxxxxxxxxx1
# - wf-outputs-xxxxxxxxxxx2
# List of the metric config filenames corresponding to each listed directory
metrics_config:
# - metrics_config_1.yaml
# - metrics_config_2.yaml
# If True, plots are shown interactively during execution
plot_show: False
### 4. Example Directory Structure Below is an example of three WaveDiff runs stored under a single parent directory:
wf-outputs/
├── wf-outputs-202305271829
│ ├── config
│ │ ├── data_config.yaml
│ │ └── metrics_config_200.yaml
│ ├── metrics
│ │ └── metrics-poly-coherent_euclid_200stars.npy
├── wf-outputs-202305271845
│ ├── config
│ │ ├── data_config.yaml
│ │ └── metrics_config_500.yaml
│ ├── metrics
│ │ └── metrics-poly-coherent_euclid_500stars.npy
├── wf-outputs-202305271918
│ ├── config
│ │ ├── data_config.yaml
│ │ └── metrics_config_1000.yaml
│ ├── metrics
│ │ └── metrics-poly-coherent_euclid_1000stars.npy
Example: Plotting Multiple Runs
To jointly plot metrics from the three runs shown above, the plotting_config.yaml would be:
plotting_params:
metrics_output_path: $WORK/wf-outputs/
metrics_dir:
- wf-outputs-202305271829
- wf-outputs-202305271845
- wf-outputs-202305271918
metrics_config:
- metrics_config_200.yaml
- metrics_config_500.yaml
- metrics_config_1000.yaml
plot_show: False
This configuration instructs the plotting pipeline to load the metrics from each listed run and include them together in summary plots.
Master Configuration
1. Purpose
The configs.yaml file is the master controller for WaveDiff.
It defines which pipeline tasks should be executed (training, metrics evaluation, plotting) and in which order.
Each task points to a dedicated YAML configuration file—allowing WaveDiff to run multiple jobs sequentially using a single entry point.
2. Example: Multiple Training Runs
To launch a sequence of training runs (models 1…n), list each task and its corresponding configuration file:
---
training_conf_1: training_config_1.yaml
training_conf_2: training_config_2.yaml
...
training_conf_n: training_config_n.yaml
Outputs will be organized as:
wf-outputs-20231119151932213823/
├── checkpoint/
│ ├── checkpoint_callback_poly-coherent_euclid_200stars_1_cycle1.*
│ ├── ...
│ ├── checkpoint_callback_poly-coherent_euclid_200stars_n_cycle1.*
├── config/
│ ├── configs.yaml
│ ├── data_config.yaml
│ ├── training_config_1.yaml
│ ├── ...
│ └── training_config_n.yaml
├── optim-hist/
├── plots/
└── psf_model/
├── psf_model_poly-coherent_euclid_200stars_1_cycle1.*
├── ...
└── psf_model_poly-coherent_euclid_200stars_n_cycle1.*
3 Example: Training + Metrics + Plotting
To evaluate metrics and generate plots for each trained model, include the corresponding configuration files:
config/
├── configs.yaml
├── data_config.yaml
├── metrics_config.yaml
├── plotting_config.yaml
├── training_config_1.yaml
├── ...
└── training_config_n.yaml
Note: current WaveDiff versions generate one plot per metric per model. Creating combined plots requires a separate run Plot Configuration. . A future update will support automatic combined plots.
4 General Notes
configs.yamlmay contain any combination of the three task types:trainingmetricsplotting
Tasks always execute in the order they appear in the file.
The current release runs all jobs on a single GPU, sequentially.
Parallel multi-GPU execution is planned for a future version.
For questions or feedback, please open a GitHub issue.