wf_psf.data.data_utils
Data utilities and lightweight runtime data structures.
Provides lightweight dataset containers, runtime conversion contexts, and helper utilities used throughout the dataset normalization and preprocessing pipeline.
This module includes:
Dictionary-like dataset container abstractions
Dataset inspection and normalization helpers
Runtime conversion context objects used during field-level processing
Domain-specific preprocessing contexts (e.g. SED processing)
These utilities support schema-driven dataset conversion workflows used by training, validation, and inference pipelines.
Notes
The conversion context system is intentionally extensible to support additional scientific domains and instrument pipelines beyond the current Euclid-specific workflows. Future extensions may include dedicated contexts for:
PSF modeling
Instrument calibration
Detector noise simulation
Functions
to_container(obj) | Convert an object to a DatasetContainer.
Classes
ConversionContext([seds]) | Global runtime context for dataset conversion operations.
DatasetContainer(data) | Lightweight container for structured dataset data.
SEDContext(simPSF, n_bins_lambda) | Context object containing parameters required for SED processing.
- class wf_psf.data.data_utils.ConversionContext(seds: SEDContext | None = None)
Bases: object
Global runtime context for dataset conversion operations.
This object aggregates optional domain-specific contexts required during dataset preprocessing and conversion. It is passed through the conversion pipeline and accessed by field-specific handlers.
Currently, it contains an optional SED context used for spectral energy distribution processing.
- Attributes:
- seds
- seds: SEDContext | None = None
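As a rough illustration of how a field-specific handler might consult the context, the sketch below uses stand-in dataclasses that mirror only the documented signatures. Whether the real classes are dataclasses is an assumption, and `convert_sed_field` is a hypothetical handler name that does not come from wf_psf.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SEDContext:
    # Stand-in mirroring the documented fields; the real class lives in wf_psf.
    simPSF: Any
    n_bins_lambda: int


@dataclass
class ConversionContext:
    # Optional domain-specific context; None means "no SED processing available".
    seds: Optional[SEDContext] = None


def convert_sed_field(raw_sed, ctx: ConversionContext):
    """Hypothetical field handler: it needs the SED context to do its work."""
    if ctx.seds is None:
        raise ValueError("SED field requires ConversionContext.seds to be set")
    # A real handler would call into ctx.seds.simPSF here; this sketch only
    # shows how the context is passed through and accessed.
    return {"sed": raw_sed, "n_bins_lambda": ctx.seds.n_bins_lambda}


ctx = ConversionContext(seds=SEDContext(simPSF=object(), n_bins_lambda=20))
result = convert_sed_field([1.0, 2.0], ctx)
```

Keeping each domain context optional means handlers that do not need it can ignore it, while handlers that do can fail early with a clear message.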
- class wf_psf.data.data_utils.DatasetContainer(data: dict[str, Any])
Bases: MutableMapping
Lightweight container for structured dataset data.
Stores data internally as a dictionary, while providing dictionary-style and attribute-style access for convenience.
Examples
>>> import numpy as np
>>> from wf_psf.data.data_utils import DatasetContainer
>>> container = DatasetContainer({'x': np.array([1, 2, 3]), 'y': np.array([4, 5, 6])})
>>> container['x']
array([1, 2, 3])
>>> container.x
array([1, 2, 3])
>>> container.to_dict()
{'x': array([1, 2, 3]), 'y': array([4, 5, 6])}
Methods
clear()
get(key[, default]) | Retrieve the corresponding layout by the string key.
items()
keys()
pop(k[,d]) | If key is not found, d is returned if given, otherwise KeyError is raised.
popitem() | Remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.
setdefault(k[,d])
to_dict() | Return data as dict.
update([E, ]**F) | If E present and has a .keys() method, does: for k in E: D[k] = E[k]. If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v.
values()
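The combination of dictionary-style and attribute-style access can be sketched with `collections.abc.MutableMapping`, which supplies `get`, `pop`, `update`, and the other mixin methods once five core methods are defined. This is a minimal sketch under that assumption, not the wf_psf implementation; `SketchContainer` is a hypothetical name.

```python
from collections.abc import MutableMapping
from typing import Any


class SketchContainer(MutableMapping):
    """Illustrative dict-backed container with attribute-style reads."""

    def __init__(self, data: dict[str, Any]):
        self._data = dict(data)  # shallow copy of the caller's mapping

    # The five methods MutableMapping requires:
    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails. Names starting
        # with "_" are rejected so a missing _data cannot cause recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

    def to_dict(self):
        return dict(self._data)


c = SketchContainer({"x": [1, 2, 3]})
c["y"] = 4
```

Here `c["x"]` and `c.x` return the same object, and inherited methods such as `c.pop("y")` behave as they do on a plain dict.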
- class wf_psf.data.data_utils.SEDContext(simPSF: PSFSimulator, n_bins_lambda: int)
Bases: object
Context object containing parameters required for SED processing.
This context encapsulates all runtime dependencies needed for spectral energy distribution (SED) transformations within the dataset conversion pipeline.
- Parameters:
simPSF (PSFSimulator) – PSF simulator instance used during SED processing. This object is responsible for modeling instrument response effects applied to spectral data.
n_bins_lambda (int) – Number of wavelength bins used for discretizing the SED during conversion.
- simPSF: PSFSimulator
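To give a feel for what a bin count such as n_bins_lambda controls, the sketch below averages flux samples into equal-width wavelength bins. It is purely illustrative: `bin_sed` is a hypothetical helper, the equal-width averaging scheme is an assumption, and the actual discretization in wf_psf is delegated to the PSF simulator and may differ.

```python
def bin_sed(wavelengths, flux, n_bins_lambda):
    """Average flux into n_bins_lambda equal-width wavelength bins (illustrative)."""
    lo, hi = min(wavelengths), max(wavelengths)
    width = (hi - lo) / n_bins_lambda
    sums = [0.0] * n_bins_lambda
    counts = [0] * n_bins_lambda
    for lam, f in zip(wavelengths, flux):
        # Clamp so the sample at the upper edge falls in the last bin.
        i = min(int((lam - lo) / width), n_bins_lambda - 1)
        sums[i] += f
        counts[i] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins_lambda)]
    # Empty bins are reported as 0.0 rather than dividing by zero.
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return centers, means


wavelengths = [550, 600, 650, 700, 750, 800, 850, 900]  # nm, made-up samples
flux = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
centers, means = bin_sed(wavelengths, flux, n_bins_lambda=4)
```

A larger bin count preserves more spectral detail at the cost of more work per source; a smaller one trades accuracy for speed.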
- wf_psf.data.data_utils.to_container(obj) → DatasetContainer | None
Convert an object to a DatasetContainer.
Transforms various dataset representations into a standardized DatasetContainer used by downstream processing.
Supported input types include dictionaries, dataclasses, objects with attributes, and existing DatasetContainer instances.
- Parameters:
obj (Any) – Object representing dataset data.
- Returns:
Structured container wrapping the dataset data.
- Return type:
DatasetContainer or None
- Raises:
TypeError – If the input type is not supported.
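The type dispatch described above might look roughly like the following sketch. It returns a plain dict as a stand-in for DatasetContainer, and `to_mapping` is a hypothetical name. Passing `None` through unchanged is an assumption made only to match the `| None` return annotation; the source does not say when `None` is actually returned.

```python
from dataclasses import dataclass, fields, is_dataclass
from types import SimpleNamespace


def to_mapping(obj):
    """Illustrative normalizer: dict, dataclass instance, or attribute object -> dict."""
    if obj is None:
        return None  # assumption: None passes through unchanged
    if isinstance(obj, dict):
        return dict(obj)
    if is_dataclass(obj) and not isinstance(obj, type):
        # Shallow conversion; dataclasses.asdict would recurse and deep-copy.
        return {f.name: getattr(obj, f.name) for f in fields(obj)}
    if hasattr(obj, "__dict__"):
        return dict(vars(obj))
    raise TypeError(f"Unsupported dataset type: {type(obj).__name__}")


@dataclass
class Sample:
    x: int
    y: int


from_dict = to_mapping({"x": 1})
from_dataclass = to_mapping(Sample(x=1, y=2))
from_object = to_mapping(SimpleNamespace(x=1))
```

Checking for dataclasses before the generic `__dict__` fallback keeps the field list explicit, which also works for dataclasses declared with `slots=True`.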