wf_psf.data.safe_batch
Safe batch processing utilities.
This module provides utilities for filtering batches of aligned arrays based on sample-wise validity criteria, typically derived from an anchor array (e.g. centroid coordinates).
The core functionality ensures that all dataset components (images, masks, metadata, etc.) remain aligned when invalid samples (NaNs, Infs) are removed. This is critical for preventing silent misalignment bugs in downstream processing.
It also provides lightweight logging helpers to track which samples were filtered, improving traceability and debugging.
These utilities are intended for use during data preparation stages, particularly after feature extraction steps such as centroid estimation.
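The alignment guarantee described above can be illustrated with plain NumPy. This is a minimal sketch rather than the module's implementation: it assumes a sample is valid when every component of its anchor row is finite, and the array names (`centroids`, `images`, `ids`) are hypothetical.

```python
import numpy as np

# Hypothetical data: 4 samples; the second has a NaN centroid, the fourth an Inf.
centroids = np.array([[1.0, 2.0], [np.nan, 0.5], [3.0, 4.0], [np.inf, 1.0]])
images = np.arange(4 * 2 * 2, dtype=float).reshape(4, 2, 2)
ids = np.array([10, 11, 12, 13])

# Assumption: a sample is valid when all of its anchor components are finite.
mask = np.isfinite(centroids).all(axis=1)

# Apply the same mask to every component so rows stay aligned.
centroids_ok = centroids[mask]
images_ok = images[mask]
ids_ok = ids[mask]

print(mask.tolist())    # [True, False, True, False]
print(ids_ok.tolist())  # [10, 12]
```

Filtering each array separately, for example dropping NaN rows from `centroids` alone, would leave `images` and `ids` with four entries and shift every later sample out of correspondence without raising an error.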
Author(s): Jennifer Pollack <jennifer.pollack@cea.fr>
Functions
| log_filtered_objects | Log identifiers of samples removed by a validity mask. |
| safe_batch_builder | Filter aligned arrays using a validity mask derived from an anchor array. |
- wf_psf.data.safe_batch.log_filtered_objects(mask, obj_ids, logger, context='')
Log identifiers of samples removed by a validity mask.
- Parameters:
mask (np.ndarray) – Boolean mask of shape (N,) indicating valid samples.
obj_ids (array-like) – Identifiers aligned with the samples (length N).
logger (logging.Logger) – Logger instance used for reporting.
context (str, optional) – Additional context string appended to log messages.
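The documented behavior of reporting the identifiers where the mask is False could look roughly like the following. This is a hedged sketch: `log_removed_sketch` is a hypothetical stand-in, and the log level, message format, length check, and return value are assumptions, not the behavior of `log_filtered_objects` itself.

```python
import logging

import numpy as np


def log_removed_sketch(mask, obj_ids, logger, context=""):
    """Hypothetical stand-in: log the identifiers of samples where mask is False."""
    mask = np.asarray(mask, dtype=bool)
    obj_ids = np.asarray(obj_ids)
    if mask.shape[0] != obj_ids.shape[0]:
        raise ValueError("mask and obj_ids must have the same length")

    # Identifiers of the samples the mask rejects.
    removed = obj_ids[~mask]
    if removed.size:
        # Assumption: the context string is appended to the message.
        suffix = f" ({context})" if context else ""
        logger.warning(
            "Filtered %d sample(s)%s: %s", removed.size, suffix, removed.tolist()
        )
    # Returned only to make the sketch easy to inspect.
    return removed


logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("safe_batch_demo")
removed = log_removed_sketch(
    np.array([True, False, True]), ["a", "b", "c"], logger, context="centroids"
)
```

Here only the identifier `"b"` is reported, since it is the single sample whose mask entry is False.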
- wf_psf.data.safe_batch.safe_batch_builder(anchor: ndarray, **arrays: dict[str, Any]) → tuple[ndarray, dict[str, Any]]
Filter aligned arrays using a validity mask derived from an anchor array.
This strict version requires every array-like input to have the same length as the anchor along its first dimension. Any mismatch raises a ValueError.
- Parameters:
anchor (np.ndarray) – Array used to compute validity (typically centroids), of shape (N,) or (N, D).
**arrays (dict of str to Any) – Arrays associated with each sample, passed as keyword arguments. All NumPy arrays must have length N along their first dimension.
- Returns:
mask (np.ndarray) – Boolean mask of shape (N,) indicating valid samples.
filtered (dict) – Dictionary containing filtered arrays.
- Raises:
ValueError – If any array has a length different from N.
TypeError – If an unsupported type is passed.
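The contract documented above (mask from the anchor, strict length check, filtered dictionary) can be sketched as follows. `safe_batch_builder_sketch` is a hypothetical stand-in, not the library's implementation. It assumes that validity means all anchor components are finite, that only NumPy arrays are accepted (the real function may support other types), and that the anchor itself is not added to the returned dictionary.

```python
from typing import Any

import numpy as np


def safe_batch_builder_sketch(
    anchor: np.ndarray, **arrays: Any
) -> tuple[np.ndarray, dict[str, Any]]:
    """Hypothetical stand-in mirroring the documented contract."""
    if not isinstance(anchor, np.ndarray):
        raise TypeError("anchor must be a NumPy array")
    if anchor.ndim == 0:
        raise ValueError("anchor must have at least one dimension")
    n = anchor.shape[0]

    # Assumption: a sample is valid when all of its components are finite.
    finite = np.isfinite(anchor)
    if anchor.ndim == 1:
        mask = finite
    else:
        # Reduce over every axis except the sample axis.
        mask = finite.all(axis=tuple(range(1, anchor.ndim)))

    filtered: dict[str, Any] = {}
    for name, arr in arrays.items():
        # Assumption: only NumPy arrays are supported in this sketch.
        if not isinstance(arr, np.ndarray):
            raise TypeError(f"{name!r} has unsupported type {type(arr).__name__}")
        if arr.ndim == 0 or arr.shape[0] != n:
            raise ValueError(f"{name!r} must have length {n} along its first axis")
        filtered[name] = arr[mask]
    return mask, filtered


anchor = np.array([[0.0, 1.0], [np.nan, 2.0], [3.0, 4.0]])
mask, filtered = safe_batch_builder_sketch(
    anchor, images=np.zeros((3, 8, 8)), ids=np.array([7, 8, 9])
)
print(mask.tolist())             # [True, False, True]
print(filtered["ids"].tolist())  # [7, 9]
```

Validating every input before returning means a mismatched array fails loudly at build time instead of producing a shorter, misaligned batch later.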