Run the entire pipeline on a single example CFIS image with tile ID 246.290:
ShapePipe and activate the
Run the job script
job_sp 246.290 -j 127
The ShapePipe pipeline processes single-exposure images and stacked images. Input images have to be calibrated beforehand for astrometry and photometry. This tutorial for a complete ShapePipe run specifically covers images from CFIS, the Canada-France Imaging Survey. CFIS stacks are so-called tiles, which are co-adds of, on average, three exposures in the r-band.
File types and names
The ShapePipe pipeline handles different image and file types, some of which
are created by the pipeline during the analysis. These file types are listed below.
All files follow a (configurable) naming and numbering convention, to facilitate bookkeeping for
tracking relevant image information. In general, the convention is <image_type>_
ID can be a combination of numbers and special characters such as
Naming and numbering of the input files can closely follow the original image names and (ID) numbers provided by the telescope and pre-processing software, with some required modifications as described below.
Single-exposure mosaic image.
Multi-HDU FITS file containing a mosaic from multiple CCDs of a single exposure (an exposure is also called an epoch). Each CCD is stored in a different HDU. These files are used as input by ShapePipe. The pixel data can contain the observed image, a weight map, or a flag map. These images are typically created by telescope analysis software (e.g. pitcairn). Examples from CFIS are
2214439p.flag.fits.fz. These names need to be modified to be correctly identified by
p needs to be removed, the image type needs to precede the ID, and the file name can only contain a single dot (.) delimiting the file extension. We create the extension fitsfz for compressed FITS files.
Default convention: <image_type>-<exposure_number>.fitsfz.
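The renaming rules for single-exposure mosaic files can be sketched in Python as follows. This is a minimal illustration, not ShapePipe code: the function name `shapepipe_exposure_name` is hypothetical, and the fallback image type `image` for names that carry no explicit type is an assumption.

```python
import re


def shapepipe_exposure_name(original, default_type="image"):
    """Build <image_type>-<exposure_number>.fitsfz from a CFIS exposure name.

    Hypothetical helper; the default image type is an assumption.
    """
    # Expected input: <number>p[.<image_type>].fits.fz
    match = re.fullmatch(r"(\d+)p(?:\.(\w+))?\.fits\.fz", original)
    if match is None:
        raise ValueError(f"Unexpected exposure file name: {original}")
    exposure_number, image_type = match.groups()
    # Drop the 'p', put the image type first, keep a single dot
    return f"{image_type or default_type}-{exposure_number}.fitsfz"


print(shapepipe_exposure_name("2214439p.flag.fits.fz"))  # flag-2214439.fitsfz
```

Raising an error on unexpected names, rather than passing them through, makes it easier to spot files that do not follow the telescope naming scheme.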
Single-exposure single-CCD image.
FITS file containing a single CCD from an individual exposure. The pixel data can contain the observed image, a weight map, or a flag map.
Default convention: <image_type>-<exposure_number>-<CCD_number>.fits
FITS file containing a stack obtained by co-adding different single exposures, created by software such as swarp. A stacked image is also called a tile. These files are used as input by ShapePipe. The pixel data can contain the observed image, a weight map, or a flag map. In the case of CFIS, tile images and weights are created by Stephen Gwyn using a combination of swarp and his own software. Examples of file names are
CFIS.205.267.r.weight.fits.fz, the latter is a compressed FITS file, see below. Tile flag files are created by the mask module of ShapePipe (see Mask images). The tile ID needs to be modified such that the . between the two tile numbers (RA and DEC indicator) is not mistaken for a file extension delimiter. For the same reason, the extension .fits.fz is changed to .fitsfz. In addition, for clarity, we include the string image for a tile image type.
Default convention: <image_type>-<tile_number>.fits
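A possible way to apply the tile renaming rules is sketched below in Python. The function name `shapepipe_tile_name` is hypothetical. The separator placed between the two tile numbers (`-` by default) is also an assumption, since the text only requires that the dot be replaced.

```python
import re


def shapepipe_tile_name(original, sep="-"):
    """Build <image_type>-<tile_number>.<ext> from a CFIS tile file name.

    Hypothetical helper; the tile-number separator `sep` is an assumption.
    """
    # Expected input: CFIS.<RA_number>.<DEC_number>.<band>[.<image_type>].fits[.fz]
    match = re.fullmatch(
        r"CFIS\.(\d+)\.(\d+)\.\w+(?:\.(\w+))?\.fits(\.fz)?", original
    )
    if match is None:
        raise ValueError(f"Unexpected tile file name: {original}")
    ra_number, dec_number, image_type, compressed = match.groups()
    # Names without an explicit type are tile images
    image_type = image_type or "image"
    # Compressed files get the single-dot extension 'fitsfz'
    extension = "fitsfz" if compressed else "fits"
    return f"{image_type}-{ra_number}{sep}{dec_number}.{extension}"


print(shapepipe_tile_name("CFIS.205.267.r.weight.fits.fz"))  # weight-205-267.fitsfz
```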
Database catalogue files
For very large files that combine information from multiple tiles or single exposures,
sqlite database catalogues.
log_exp_headers.sqlite, exposure header information
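The internal layout of these sqlite files is not described here. As a starting point for inspecting one, the following sketch lists the tables it contains, using only the Python standard library. The function name `list_tables` is hypothetical.

```python
import sqlite3


def list_tables(path):
    """Return the names of all tables in an sqlite file, sorted by name."""
    connection = sqlite3.connect(path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        # Always release the file handle, even if the query fails
        connection.close()
    return [name for (name,) in rows]
```

For example, `list_tables("log_exp_headers.sqlite")` would return the table names of the exposure header database, assuming that file exists in the current directory.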
Numpy array binary files
Some large files are stored as numpy arrays. These contain FITS header information. Example:
SExtractor produce FITS files with file extensions other than
.psf for files containing PSF model information for a single CCD, and
.cat for a PSF catalogue.
Final shape catalogue
The end product of ShapePipe is a final catalogue containing a large amount of information for each galaxy, including its shape parameters, the ellipticity components e_1 and e_2. This catalogue also contains shapes of artificially sheared images. This information is used in post-processing to compute calibrated shear estimates via metacalibration.
Summary statistic files
The SETools module, which creates samples of objects according to user-defined selection criteria (see Select stars), also outputs ASCII
files with user-defined summary statistics for each CCD, for example the number of selected stars, or the mean and standard deviation of their FWHM.
Tile ID list
ASCII file with a tile number on each line. Used by the get_image_runner module to download CFIS images (see Download tiles).
Single-exposure name list
ASCII file with a single-exposure name on each line. Produced by the find_exposure_runner module to identify single exposures that were used to create a given tile (see Find exposures).
The SETools module can also produce plots of the properties of the objects that were selected for a given CCD. The type of plot (histogram, scatter plot, …) and the quantities to plot, as well as plot decorations, can be specified in the selection criteria config file (see Select stars). Example:
The pipeline core and all called modules write ASCII log files to disk.
ShapePipe splits the processing of CFIS images into several parts:
the retrieval and preparation of input images, the processing of single exposures,
the processing of tile images, and the creation and (optional) upload of final shape catalogues.
The following flowchart visualises the processing parts and steps.
Below, the individual processing steps are described in detail.
Input and output paths
All required paths are automatically set in the job script
If an example config file is run outside this script, the following path variables might need to be defined.
$SP_RUN: Run directory of ShapePipe. In general this is just pwd, and can be set via
but on a cluster this directory might be different.
$SP_CONFIG: Path to configuration files. In our example this is
In addition, the output path
$SP_RUN/output needs to be created by the user before running
Job and pipeline scripts
The job script to run the pipeline in its entirety or in parts is
for all options.
This script creates the subdirectory
$SP_RUN/output to store all pipeline outputs
(log files, diagnostics, statistics, output images, catalogues, single-exposure headers with WCS information).
Optionally, the subdirectory output_star_cat is created to store the external star catalogues for masking. This is only necessary if the pipeline is run on a cluster without an internet connection to access star catalogues. In that case, the star catalogues need to be retrieved outside the pipeline, for example on a login node, and copied to
The job script automatically performs a number of successive calls to the
shapepipe_run -c $SP_CONFIG/<config>.ini
The config file <config>.ini contains the configuration for one or more modules.
See the main ShapePipe readme for more details.
The user specifies which steps are run with the command line option
-j JOB. The integer value
is bit-coded such that arbitrary combinations of steps can be run with a single call to
For example, to run steps #1 and #2, type
job_sp -j 3.
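The bit coding can be illustrated with a short Python sketch. The step bits (1, 2, 4, …, 128) are the `-j` values used in the sections of this tutorial; the dictionary `JOB_STEPS` and the two helper functions are illustrative only and are not part of ShapePipe.

```python
# Bit values as used with `job_sp -j` in this tutorial.
# The descriptions paraphrase the section titles; the mapping is illustrative.
JOB_STEPS = {
    1: "retrieve input images",
    2: "prepare input images",
    4: "mask images",
    8: "detect objects on tiles, process stars on single exposures",
    16: "select galaxies",
    32: "measure galaxy shapes",
    64: "merge results and create final shape catalogue",
    128: "upload results (optional)",
}


def encode_job(bits):
    """Combine individual step bits into a single JOB value."""
    job = 0
    for bit in bits:
        job |= bit
    return job


def decode_job(job):
    """Return the descriptions of all steps selected by a JOB value."""
    return [name for bit, name in JOB_STEPS.items() if job & bit]


print(encode_job([1, 2]))    # 3
print(len(decode_job(127)))  # 7
```

With this coding, `-j 127` selects every step except the optional upload, which is the value used in the example at the top of this page.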
To run the job script, one or more CFIS tiles need to be chosen. If the tile IDs are known, they are provided to
job_sp on the command line.
If the tile IDs are not known a priori, they can be selected via sky coordinates, with the script
For example, to find the tile number for a Planck cluster at R.A.=213.68 deg, dec=54.79 deg, run:
cfis_field_select -i /path/to/shapepipe/auxdir/CFIS/tiles_202007/tiles_all_order.txt --coord 213.68deg_54.79deg -t tile --input_format ID_only --out_name_only --out_ID_only -s
The input text file (provided via the flag
-i) contains a list of CFIS tiles; this can also be a directory containing the tile FITS files.
The following sections describe the different steps that are performed with
Run the pipeline
Retrieve input images
job_sp TILE_ID -j 1
retrieves the image and weight corresponding to TILE_ID using the module
It then identifies the exposures that were used to create the tile image via the
Finally, another call to get_images retrieves the exposure images, weights, and flag files.
For the retrieval method the user can choose between
download from VOspace (
create symbolic link to existing file on disk (
Note that internet access is required for this step if the download method is
An output directory
output) is created containing the results of
get_images for tiles (
get_images for exposures (
Prepare input images
job_sp TILE_ID -j 2
the compressed tile weight image is uncompressed via the uncompress_fits module. Then, the single-exposure images, weights, and flags are split into single-exposure single-CCD files (one FITS file per CCD) with
Finally, the headers of all single-exposure single-CCD files are merged into a single sqlite file, to store the WCS information of the input exposures.
Two output directories are created,
run_sp_exp_SpMh for the output of the modules
job_sp TILE_ID -j 4
to mask tile and single-exposure single-CCD images. Both tasks are performed by two calls to the
Note that internet access is required for this step, since a reference star catalogue is downloaded.
The outputs of both masking runs are stored in the output directory run_sp_MaMa, with run 1 (2) of mask corresponding to tiles (exposures).
Diagnostics: Open a single-exposure single-CCD image and the corresponding pipeline flag
ds9, and display both frames next to each other. Example
ds9 image-2113737-10.fits pipeline_flag-2113737-10.fits
zoom fit for both frames, click
scale zscale for the image, and
color aips0 for the flag, to display something like this:
By eye the correspondence between the different flag types and the image can be
seen. Note that the two frames might not match perfectly, since (a) WCS
information is not available in the flag file FITS headers; (b) the image can
have a zero-padded pixel border, which is not accounted for by
Detect objects on tiles and process stars on single exposures
job_sp TILE_ID -j 8
performs a number of steps. First, objects on the tiles are detected with the
Next, the following tasks are run on the single-exposure single-CCD images:
Objects are detected with
Star candidates are selected via
The PSF model is created, either with
psfex for PSFEx, or with
The PSF model is interpolated to star positions for validation. For the PSFEx model, this is done via a call to
psfex_interp. For MCCD, the modules
The output directory for both the
psfex options is
This stores the output of SExtractor on the tiles (
tile_Sx), on the exposures (
Se), and the PSF model (
The following plots show an example of a single CCD, in the center of the focal plane.
Star magnitude histogram
Stars in CCD (mag)
Stars in CCD (size)
The stellar locus is well-defined
Magnitude distribution looks reasonable
Stars are relatively homogeneously distributed over the CCD
The uniform and small seeing of CFHT is evident
In contrast to the last plot, here is the case of the CCD in the lower right corner, which shows a known (but as yet unexplained) lack of stars in its lower part:
The statistics output file for the center CCD #10:
cat star_stat-2113737-10.txt
# Statistics
Nb objects full cat = 1267
Nb stars = 160
stars/deg^2 = 6345.70450519073
Mean star fwhm selected (arcsec) = 0.7441293125152588
Standard deviation fwhm star selected (arcsec) = 0.014217643037438393
Mode fwhm used (arcsec) = 0.7345179691314697
Min fwhm cut (arcesec) = 0.7159179691314698
Max fwhm cut (arcsec) = 0.7531179691314697
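Such a statistics file is easy to read back for further checks. The following Python sketch assumes that the file holds one `key = value` pair per line and that lines starting with `#` are comments. The function name `parse_star_stats` is hypothetical.

```python
def parse_star_stats(text):
    """Parse 'key = value' lines into a dict of floats, skipping comments."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip empty lines, comment lines, and anything without an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split at the last '=' so that the full label is kept as the key
        key, value = line.rsplit("=", 1)
        stats[key.strip()] = float(value)
    return stats


# Excerpt of the statistics output shown above
sample = """# Statistics
Nb objects full cat = 1267
Nb stars = 160
Mean star fwhm selected (arcsec) = 0.7441293125152588
"""

stats = parse_star_stats(sample)
# Fraction of detected objects that were selected as stars
print(stats["Nb stars"] / stats["Nb objects full cat"])
```

To parse a file on disk, pass its content, for example `parse_star_stats(open("star_stat-2113737-10.txt").read())`.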
Global star sample statistics
The statistics on stars from all CCDs can be combined with the non-pipeline script
stats_global -o stats -v -c $SP_CONFIG/config_stats.ini
to create histograms (as .txt tables and .png plots) in the directory stats. Here are some example plots:
Non-masked objects per CCD
Stars per CCD
No CCD with a very large masked area
No CCD with insufficient stars
Rather broad seeing distribution
stats_global reads all SETools output stats files found in a given input directory tree. It can thus produce histograms combining
The focus of the next step,
job_sp TILE_ID -j 16
is the selection of galaxies as extended objects compared to the PSF.
First, the PSF model is interpolated to galaxy positions, according to the PSF model
mccd_interp. Next, postage stamps of the weight maps around galaxies are created via vignetmaker. Then, the spread model is computed by the spread_model module. Finally, postage stamps of single-exposure data around galaxies are extracted with another call
The output directory is
run_sp_MiViSmVi if the PSF model is
This corresponds to the MCCD/PSFEx interpolation (
Sm), and the
second call to
job_sp TILE_ID -j 32
computes galaxy shapes using the multi-epoch model-fitting method ngmix. At the same time,
shapes of artificially sheared galaxies are obtained for metacalibration.
Shape measurement is performed in parallel for each tile; the number of processes can be specified
by the user with the option --nsh_jobs NJOB. This creates NJOB output directories
X = 1 …
NJOB containing the result of
The last real processing step is
job_sp TILE_ID -j 64
This task first merges the ngmix output files from the previous step into
one output file. Then, previously obtained information is pasted into a final shape catalogue via
Included are galaxy detection and basic measurement parameters, the PSF model at
galaxy positions, the spread-model classification, and the shape measurement.
Two output directories are created.
The first one is
run_sp_Ms for the
The second is
run_sp_Mc for the
make_cat task; the name is the same for both the
PSFEx PSF model.
Optionally, after the pipeline is finished, results can be uploaded to VOspace via
job_sp TILE_ID -j 128