LPipe - Detailed usage manual


Table of Contents

  1. Goals and Scope
  2. General Workflow
  3. Display, Validation, and Logging
  4. Summary of All Steps
  5. Examples
  6. Additional Tools
  7. Installation
  8. Troubleshooting


Goals and Scope

LPipe is a package for fast and entirely automated reduction of longslit and imaging data acquired with LRIS. It is designed to function in a wide range of circumstances: for bright and faint objects and using all available gratings, grisms, and dichroics. (It does not yet support polarimetry or multi-slit modes.) It supports all four versions of the red CCD. It prioritizes robust reductions over obtaining theoretically optimized S/N, and is able to identify and ignore (or take steps to mitigate) most forms of bad data resulting from instrument problems or observer error.

It is well-suited for quicklook reductions at the telescope, for time-critical observations (e.g. ToOs), and for exploration of archival data. It is less well-suited to specific projects where, e.g., extremely accurate calibration is necessary. For spectra, the typical relative flux calibration accuracy is about 5-10 percent and the typical wavelength calibration accuracy is about 1 pixel. Imaging astrometry is good to 0.5 arcsec across most of the field (but worse at the edges, and not suitable for mosaicing); flat-fielding accuracy is entirely dependent on the data-gathering procedure. In rare cases, during periods of instrument/telescope problems, or on nights when poor observing/calibration procedures are used, the performance may be worse. The imaging pipeline was developed earlier than the spectroscopic pipeline, and comparatively little effort has been expended to modernize it or make it flexible/extensible.

In default mode the pipeline offers one-line reductions, and it is anticipated that this is the mode that will be employed by most users. In other words, typical usage requires nothing more sophisticated than this command to reduce an entire night of data:

IDL> lpipe

However, the pipeline also permits finer control by specifying additional options at the command line, and via a GUI interface. Additionally, a wide variety of quality-assurance (QA) check-plots are produced which may be helpful in diagnosing pipeline issues. Some information about these options (and general information on pipeline operations, to help demystify the procedures) is provided below. For more information, users can also consult the in-code documentation or the associated publication.

Note that you can always type:

IDL> lpipe, /help

for some basic information (including a list and descriptions of all processing steps and some additional tools).




General Workflow

The pipeline employs a sequence of self-contained steps. Some steps are shared between imaging and spectroscopy (although most have separate implementations internally). The steps, in the order they are normally carried out in pipeline processing, are:

  step          spec.  imag.  summary
  prepare         x      x    Combine amplifiers, bias-subtract, add header metadata
  makeflat        x      x    Produce flat fields
  flatten         x      x    Flat-field correct
  reflatten              x    Produce super-sky flat fields and flatten again
  makefringe*            x    Produce fringe frame
  rmfringe*              x    Correct for fringing
  split                  x    Separate left and right CCDs
  crclean         x      x    Remove cosmic ray pixels
  skysubtract     x           Model and subtract night-sky lines
  sum             x           Combine individual (2D) spectra
  trace           x           Determine trace functions and slit profiles; identify objects
  extract         x           Extract objects to produce 1D spectra
  wavcal          x           Produce wavelength solution from an arc
  wavapply        x           Apply wavelength solution and linear flexure correction
  response        x           Determine sensitivity function from standard stars
  fluxcal         x           Flux-calibrate data using sensitivity function
  combine*        x           Combine multiple 1D spectra of the same source
  connect         x           Pair red and blue spectra and connect at junction point
  astrometry             x    Download star catalogs and determine/apply WCS solution
  photometry             x    Solve for image zeropoints
  stack                  x    Combine multiple imaging exposures

Deprecated steps that will be restored in future pipeline versions are marked with an asterisk (*).

Within each step, data are processed in this order: blue imaging, then blue spectroscopy, then red imaging, then red spectroscopy. Within that sequence the order usually follows that of the associated filenames, even if the configuration changed back and forth during the night.

The steps are designed to be entirely self-contained: almost all information is stored in file headers and re-read with each new step. If the pipeline is interrupted it can thus continue at a later date without having to repeat any earlier steps. By default, if the pipeline is restarted it will retry any failed operations but will not redo any operation that would overwrite a file that already exists. This behavior can be altered using the options below.

Workflow control command-line options:

Step control:
Higher-level control:
File-based filtering:



Display, Validation, and Logging

Almost every message printed to the screen during pipeline operations is also saved to a permanent logfile (lpipelog.txt). If the pipeline is re-run, new messages are appended to the bottom of the logfile; it is cleared only if deleted by the user.

Additionally, most of the 1D spectral reduction steps will produce check plots for user quality assurance. These may be flashed up on the screen, written to disk as a postscript file, or both. These are generated by specific pipeline processing steps as they run (see the individual step sections for more details).

The most powerful validation option by far, though, is the lrisvalidate tool. This allows you to visually step through final spectra and some of their associated calibration files, such as sky line plots (to confirm wavelength calibration), response plots (to confirm flux calibration), and 2D spectra (to confirm tracing, object selection/extraction, and cosmic ray rejection).

Display/logging command-line options:



Summary of All Steps


prepare

This step does several things at once: the overscan (bias) is subtracted; the amplifiers and left/right chips are joined together into a simple extension-free FITS file; the array is transposed so that (for spectra) the wavelength dispersion axis is horizontal and the spatial axis is vertical; and a large amount of header information is added.

Bias subtraction and amplifier combination use a modified version of readmhdufits (written by Marc Kassis), with the /linebias option set. No gain correction is applied (here or at any other stage in the pipeline). Spectra or images taken in full-frame mode are subsequently cropped down to the standard cropping regions and transposed. Additionally, a bad pixel map is employed to flag pixels and columns that are known to be problematic (they are set to NaN, which all subsequent steps interpret as missing data).

The amount of header information added is substantial and includes solar/lunar ephemerides, unique identifiers corresponding to the first file in the same configuration sequence, and many other details. Notably, the pipeline adds information about binning and cropping via the LTM and LTV keywords, which allow the 'physical' coordinates as seen in DS9 to always self-consistently map to the same detector pixels regardless of binning or cropping. For existing header fields, interpretive comments are added. The header is also 'corrected' for missing information: LRIS frequently fails to write some critical keywords to the header, which the pipeline recovers by copying those keywords from the matching file in the opposite camera.
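
As an illustration of how the LTM/LTV convention works, here is a minimal Python sketch (the pipeline itself is written in IDL; the numbers below are made up) of mapping 'image' (binned/cropped) pixel coordinates back to 'physical' detector pixels:

  def image_to_physical(x_image, ltm, ltv):
      # Invert the linear transform x_image = ltm * x_physical + ltv
      return (x_image - ltv) / ltm

  # Example: 2x2 binning (ltm = 0.5) with 100 detector pixels cropped off
  # (ltv = -50 in binned units): binned pixel 200 maps to detector pixel 500.
  print(image_to_physical(200.0, ltm=0.5, ltv=-50.0))  # -> 500.0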

Associated options:

Output:

Diagnostic output:

None.


makeflat

Produces flat-fields by identifying and combining images of a uniform source. For imaging this is either the dome screen or the twilight sky; for spectroscopy it can also be an internal flat such as a halogen or deuterium flat. A median combination is used in all cases, but additional filtering is done as necessary: for twilight imaging flats bright sources are masked out, and for spectroscopy a variety of methods are used to filter out spectral lines or remove spatial banding/gradients associated with the lamps. These methods are not perfect and leave some residuals.
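
The core of the combination is simple; the following Python sketch (illustrative only, without the pipeline's additional filtering) shows the basic normalize-and-median-combine logic:

  import numpy as np

  def make_flat(frames):
      # Normalize each frame by its own median so that exposure-level
      # differences don't bias the combination, then median-combine so
      # that sources present in only a few frames are rejected.
      stack = np.array([f / np.nanmedian(f) for f in frames])
      flat = np.nanmedian(stack, axis=0)
      return flat / np.nanmedian(flat)  # renormalize to unity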

Associated options:

Output:

Diagnostic output:

None.


flatten

Uses processed flat-fields to correct the data. Ideally the flat-fields are in exactly the same configuration as the data, so this is straightforward, but there can be a variety of complications. For example, the wavelength solution for spectra might be slightly different, a dichroic might be different, an order-blocking filter might be present or absent, etc.

Even when the configuration is identical, the flat-field is unlikely to exactly match the data because of flexure and because components, once moved, may not always return to their original positions.

Output:

Diagnostic output:

None.


reflatten

Currently this step is for imaging only. It has been observed that when LRIS is switched in and out of imaging mode, the dust spots and overall vignetting pattern of the filter itself do not remain at a constant location but can move by many pixels. However, this pattern remains fixed within a block of imaging observations, even when the filter is changed. Thus the pipeline constructs super-sky flats, based on all data in each filter during a given imaging block. It then, as part of the same step, corrects the data using these flats. (This is performed only if there are a large number of frames covering different fields in a particular filter; otherwise, no super-sky flattening is performed.)

Associated options:

Output:

Diagnostic output:

None.


split

This is a straightforward step that breaks apart the joined frame into the right (and left, if specified via the chip option) chips. Some additional cropping is also applied.

Associated options:

Output:

Diagnostic output:

None.


crclean

Identifies and removes cosmic rays from the data. Given the challenging nature of cosmic rays in deep-depletion data, a custom algorithm is used. Similar algorithms are used for imaging and spectroscopy, although they are coded separately.
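
For illustration only, here is a much simpler neighbor-median rejection in Python (this is NOT the pipeline's custom deep-depletion algorithm, and the noise parameters are assumptions):

  import numpy as np
  from scipy.ndimage import median_filter

  def reject_cosmic_rays(img, gain=1.0, readnoise=4.0, nsigma=8.0):
      # Flag pixels that lie far above the local 5x5 median relative to
      # an assumed Poisson + read-noise model, and mark them as NaN
      # (which downstream steps treat as missing data).
      local = median_filter(img, size=5)
      noise = np.sqrt(np.maximum(local * gain, 0.0) + readnoise**2) / gain
      mask = (img - local) > nsigma * noise
      cleaned = img.copy()
      cleaned[mask] = np.nan
      return cleaned, mask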

Associated options:

Output:

Diagnostic output:

None.


skysubtract

Subtract an estimate of the night-sky emission flux from the 2D spectrum and place it in an extension. It is important to note that this procedure is not used directly in final spectrum construction, because the sky is 're-added' at the time of extraction; however, temporary removal of the confusing sky lines is needed for intermediate steps, including (in particular) source identification and tracing of faint objects.
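
Conceptually, the simplest version of this operation looks like the Python sketch below (the pipeline's actual sky model is more sophisticated; the row-exclusion input is an assumption):

  import numpy as np

  def subtract_sky(spec2d, exclude_rows=None):
      # Wavelength runs horizontally, so estimate the sky at each column
      # as the median over the spatial (vertical) direction, optionally
      # excluding rows near the object.
      sky_region = spec2d.copy()
      if exclude_rows is not None:
          sky_region[exclude_rows, :] = np.nan
      sky = np.nanmedian(sky_region, axis=0)
      return spec2d - sky[np.newaxis, :], sky  # sky is saved separately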

Associated options:

None.

Output:

Diagnostic output:

None.


sum

Coadd the individual 2D spectra of a common source (at a common slit orientation) to produce a single 2D spectrum. If the telescope was dithered along the slit, the dithered images (i.e. those after the first) are shifted before stacking using the header positional keywords. (If the telescope was shifted laterally, those exposures are not included and are instead used for a separate stack.)
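
A schematic Python version of the shift-and-stack (assuming integer-pixel offsets already derived from the header keywords; edge wrap-around is ignored for simplicity):

  import numpy as np

  def sum_dithered(frames, offsets_pix):
      # Shift each dithered frame along the (vertical) spatial axis back
      # to the position of the first frame, then coadd.
      shifted = [np.roll(f, -int(round(dy)), axis=0)
                 for f, dy in zip(frames, offsets_pix)]
      return np.nansum(shifted, axis=0)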

Associated options:

Output:

Diagnostic output:

None.


trace

A multi-stage step that performs several related tasks. First, each (summed) 2D spectrum is loaded. If a moderately bright object is located anywhere on the trace, a section of the 2D spectrum around that object is used to calculate a generic tracing function for use in all extractions involving that file. This is saved as a '.trace' file, which contains the polynomial fit terms. The polynomial order is generally low (3rd order by default, and even lower if the trace is lost for a significant fraction of the spectrum); as a result, small inaccuracies in the tracing - of order 1-2 pixels - are not uncommon, but large tracing errors are now very rare. (If there is no bright object on the trace, the pipeline instead selects another object observed in the same configuration and close in time and uses its trace function as a substitute.)
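
The essence of the trace-fitting stage can be sketched in Python as follows (illustrative only, for an isolated bright object; the pipeline's implementation is more careful):

  import numpy as np

  def fit_trace(spec2d, order=3):
      # Measure the object's spatial centroid in each wavelength column,
      # then fit a low-order polynomial, as in the '.trace' files.
      ny, nx = spec2d.shape
      cols = np.arange(nx)
      rows = np.arange(ny)
      img = np.clip(np.nan_to_num(spec2d, nan=0.0), 0.0, None)
      centroids = (img * rows[:, None]).sum(axis=0) \
                  / np.maximum(img.sum(axis=0), 1e-9)
      coeffs = np.polyfit(cols, centroids, order)
      return np.poly1d(coeffs)  # callable trace function y(x)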

Next, a median-filtered sum along the trace (i.e., over all wavelengths) is used to produce a 1D spatial profile along the slit. This is saved as a file ending in '.profile' and is used in source detection.

Next, two object detection procedures are run on the profiles described previously. The first run is only for profiles that contain very bright, single objects (generally, standard stars): it is used to determine where the primary target tends to fall on the detector. The second run is for all science observations: all profile peaks are measured and the algorithm selects the source that is closest to the position where the bright objects tend to be located. (The brightness of the source is used as a secondary criterion if there are several sources close to the nominal position.) All object positional data (for all files) is saved in a single text file, objectpos.txt.

Finally (after both cameras have been run), the red and blue trace center positions are compared for all spectra that overlap in time to see if the same source is being extracted on both sides and if the aperture diameters are the same (or at least similar). If not, one aperture is shifted to avoid the production of 'chimeric' spectra in later stages.

Associated options:

Output:

Diagnostic output:



extract

Following tracing the objects are extracted to produce 1D spectra. This uses the trace and positional data generated above and any user modifications to it (see below). The extraction is done using a custom procedure, which uses a basic "top hat" extraction with two parallel background bands on either side of the object to measure (and subtract) the sky background. (Note: the sky is 're-added' to the sky-subtracted spectrum because the sky subtraction procedure can sometimes remove some source flux.)

Note that this is not an optimal extraction. Indeed, to guard against tracing errors the default aperture radius is larger than is optimal even for a simple extraction.
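
A minimal Python sketch of such a top-hat extraction (the aperture and background-band radii here are illustrative, not the pipeline defaults):

  import numpy as np

  def tophat_extract(spec2d, trace_y, ap_rad=8, bg_in=12, bg_out=20):
      # Sum flux within ap_rad pixels of the trace at each wavelength,
      # subtracting a per-pixel sky level measured in two flanking bands.
      ny, nx = spec2d.shape
      rows = np.arange(ny)
      flux = np.zeros(nx)
      for x in range(nx):
          dy = np.abs(rows - trace_y(x))
          obj = dy <= ap_rad
          bg = (dy > bg_in) & (dy <= bg_out)
          sky = np.nanmedian(spec2d[bg, x])
          flux[x] = np.nansum(spec2d[obj, x] - sky)
      return flux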

Associated options:

Additionally, users can exert finer control over the extraction using the GUI system lrisapertures. This is a separate routine that must be run after the trace step is complete, and allows the user to change the positions and widths of the object extraction apertures. If any object apertures are changed, those targets will need to be re-reduced starting with this step. The lrisapertures routine will be separately documented.

Output:

Diagnostic output:

Note that the above plots are generated after extraction, not after tracing. This ensures that they match what is actually extracted (as opposed to what is going to be extracted the next time the pipeline is run).



wavcal

Wavelength-calibrate arc spectra. This does its own basic median extraction to produce a 1D spectrum, then runs a line detection routine on the result and uses a custom pattern-matching routine to match the resulting line list against a reference line list and use it to determine a wavelength solution.

The Cd and Zn arcs take about 5 minutes to fully warm up and produce lines with the expected strengths and line ratios, which is about 4 minutes longer than the available patience of a typical observer. Unwarmed arcs are missing expected lines and often confuse the pattern matcher, leading to a bad solution. Two steps are taken to mitigate this. First, if multiple identical-configuration arcs were taken in sequence, all except the last are ignored (under the assumption that earlier ones might not be fully warmed). Second, and more critically, all solutions are validated to make sure that all lines expected to be present actually are present - and if not, the solution is not written.

Solutions are stored as a list of polynomial fit terms. These are relative to the 'middle' of the array in array coordinates, which makes them vulnerable to changes in cropping or binning, though the pipeline does know how to translate binning changes; arcs can thus be taken in binning modes different from the science data. (Note: it is not a bad idea to always take 1x1 binning arcs to avoid line saturation and help ensure accurate centroid measurements, even if science observations are binned.)
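
Schematically, evaluating a stored solution looks like this in Python (the pixel-center and binning conventions shown are assumptions, not the pipeline's exact ones):

  def wavelength_of_pixel(x_binned, coeffs, nx_binned, xbin=1):
      # Translate a (0-indexed) binned pixel center to unbinned
      # coordinates, then evaluate the polynomial about the array middle.
      x_unbinned = (x_binned + 0.5) * xbin - 0.5
      xmid = (nx_binned * xbin) / 2.0
      dx = x_unbinned - xmid
      return sum(c * dx**i for i, c in enumerate(coeffs))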

Associated options:

Output:

Diagnostic output:


wavapply

Does two things: first, it matches arc solutions to science data to determine the expected wavelength of each pixel; second, it applies a linear flexure correction to these wavelengths using the sky lines.

For consistency, each base configuration (grating+dichroic) has its own associated master arc solution that is used for all observations with that configuration during a run, even if multiple arc solutions exist. If arcs were taken during the night (i.e., as well as during the afternoon/morning), the arc closest in time to an observation is used to correct the linear (central wavelength) term only; all higher-order terms originate from the master arc.

Flexure adjustment is usually performed by matching the wavelengths of detected sky lines in the sky-spectrum column. Failing this (for very short exposures or spectra taken in twilight), telluric absorption is used instead. This is much easier for the red camera than for the blue camera, since the latter generally has only one or even zero strong sky lines; as a result the blue wavelength solution is much less likely to be accurate.
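
A toy Python version of the flexure measurement (illustrative; the pipeline matches individual sky-line wavelengths rather than cross-correlating as below):

  import numpy as np

  def flexure_shift(sky_obs, sky_ref, max_lag=20):
      # Cross-correlate the observed sky spectrum against a reference
      # night-sky spectrum on the same pixel grid; the best-fit lag is
      # the linear (pixel) shift to apply to the wavelength solution.
      a = sky_obs - np.nanmean(sky_obs)
      b = sky_ref - np.nanmean(sky_ref)
      lags = list(range(-max_lag, max_lag + 1))
      cc = [np.nansum(a * np.roll(b, k)) for k in lags]
      return lags[int(np.argmax(cc))]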

Associated options:

Output:

Diagnostic output:

None.


response

Determine the response function - a translation from (spatially summed) DNs to physical f_lambda units as a function of wavelength - using observations of known standard stars.

Standard star reference spectra are taken from a variety of sources. Some of these (e.g. HST CALSPEC standards) are excellent, but others are quite poor; many are missing spectral features and some contain telluric lines. The program interpolates both these standard reference spectra and the observed counts spectra over known absorption and telluric lines, then determines an overall response function which is median-filtered (to remove outliers) and smoothed (to suppress noise). The telluric absorption profile is then measured separately by comparison of the actual counts spectrum to the interpolated spectrum. These are stored as separate columns in a .response file.
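
In schematic Python form (assuming all inputs are on a common wavelength grid and have already been interpolated over bad features; filter widths are illustrative):

  import numpy as np
  from scipy.ndimage import median_filter, gaussian_filter1d

  def response_function(counts_per_s, ref_flambda):
      # Ratio of the true f_lambda spectrum to the observed count rate,
      # median-filtered to remove outliers and smoothed to suppress noise.
      resp = ref_flambda / np.maximum(counts_per_s, 1e-30)
      resp = median_filter(resp, size=25)
      return gaussian_filter1d(resp, sigma=10.0)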

Associated options:

None.

Output:

Diagnostic output:


fluxcal

Uses the response function to flux-calibrate observations. The program tries to find two standard stars, one at lower airmass and one at higher airmass, and averages them (after adjusting each using a model Mauna Kea atmospheric attenuation function). 'Good' standards with few spectral features and very good reference spectra are chosen in preference to 'poor' standards. The overall response function is applied (again with an airmass adjustment based on a Mauna Kea atmospheric attenuation curve), and then the telluric absorption is corrected using observations of the same standard.

Telluric-specific standards are not specifically recognized or applied at this stage; the overall flux standard is always the same star that is used for telluric correction.
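
The calibration applied to each science spectrum amounts to the following Python sketch (the attenuation handling is schematic; k_lambda stands for a model Mauna Kea extinction curve in magnitudes per airmass):

  def flux_calibrate(counts_per_s, response, k_lambda, airmass, airmass_std):
      # The response was derived at the standard's airmass; correct for
      # the extra (or lesser) extinction at the science airmass.
      dext = 10.0 ** (0.4 * k_lambda * (airmass - airmass_std))
      return counts_per_s * response * dext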

Associated options:

Output:

Diagnostic output:

None.


connect

Combines the flux-calibrated red and blue files to produce a final output spectrum.

Blue and red spectra are first matched to identify which spectra to 'pair up'. Next, the pipeline determines the region of wavelength overlap with which to calculate a rescaling factor (to correct the relative flux calibrations for small offsets).

This rescaling is needed because the flux calibration procedure can introduce small (or, on nonphotometric nights, large) absolute errors in the flux calibration. The factor is based on the median flux ratio over the overlap region. If the S/N is low, or if there is little or no overlap, no rescaling is performed.

Additionally, the pipeline must decide at what wavelength to join the spectra. (A 'hard' junction is used: every individual row in the output spectrum comes either from the blue or the red camera, not a coaddition of both.) This is based on the maximum of the product of the red and blue response functions over the overlap. For some grisms (e.g. 600/4000) this ends up being the edge of the blue CCD.
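
A schematic Python version of the scaling-and-joining logic (variable names are illustrative; which side is rescaled, and the handling of non-overlapping grids, are simplified here):

  import numpy as np

  def connect_spectra(wb, fb, wr, fr, wjoin):
      # Rescale the blue side by the median flux ratio over the overlap
      # region, then form a 'hard' junction at wavelength wjoin.
      lo, hi = wr.min(), wb.max()           # assumes a genuine overlap
      in_b = (wb >= lo) & (wb <= hi)
      in_r = (wr >= lo) & (wr <= hi)
      scale = np.nanmedian(fr[in_r]) / np.nanmedian(fb[in_b])
      keep_b = wb < wjoin
      keep_r = wr >= wjoin
      wave = np.concatenate([wb[keep_b], wr[keep_r]])
      flux = np.concatenate([fb[keep_b] * scale, fr[keep_r]])
      return wave, flux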

Associated options:

Output:

Diagnostic output:


astrometry

Solve for the WCS of an image, using a custom python routine, autoastrometry.py. This is done in three stages. First, a set of reference star catalogs is downloaded from SDSS or Pan-STARRS for each field (to avoid having to repeatedly query the web for each individual image, or again if the pipeline is re-run). Second, each image is aligned to this master catalog. Finally, each image is aligned again: beginning with a reference image (ideally a short exposure), each subsequent image is aligned either to the reference image or to another twice-solved image. This final alignment increases (relative) astrometric accuracy by allowing a much deeper catalog than SDSS/PS1 to be used.

Associated options:

None. (See the split option, however.)

Output:

Diagnostic output:

None.


photometry

Solve for the zeropoint of each image. This uses a combined 'absolute and relative' algorithm similar to that used for astrometry (i.e., a short exposure is solved first, and then the relative zeropoints of longer exposures are solved against that short exposure). It generates a catalog file of star locations and magnitudes for each image; the file suffix indicates the nature of the absolute calibration of that image.

Note that solving absolute zeropoints with LRIS is difficult because the images are deep: it is challenging to find good PSF stars that are not saturated, except in short images (~1 minute), and the unsaturated stars are likely to have significant catalog uncertainties. (This problem is most acute in redder filters.) Color terms are not corrected for. Partial zeropoints in a standard aperture (e.g. 1" radius) are generally more reliable, because the risk of contamination by neighboring objects is much less. For precision work it is better to solve photometry directly against a catalog rather than to rely on a zeropoint solution.
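
The core of a zeropoint solution, in schematic Python (for stars already matched between the image and the catalog):

  import numpy as np

  def solve_zeropoint(instrumental_mags, catalog_mags):
      # zp satisfies catalog_mag = instrumental_mag + zp; a median and a
      # median-absolute-deviation scatter keep the solution robust to
      # mismatches and variable stars.
      diffs = np.asarray(catalog_mags) - np.asarray(instrumental_mags)
      zp = np.nanmedian(diffs)
      scatter = 1.4826 * np.nanmedian(np.abs(diffs - zp))
      return zp, scatter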

Associated options:

Output:

Additionally, the headers of the images produced in the astrometry step are edited to add the new values.

Diagnostic output:


stack

Combines individual images into a stacked mosaic. If both chips are being processed, all r images and all l images are combined separately, and then the two coadds are combined together. The zeropoints calculated from photometry are used to adjust the photometric scaling of each image prior to stacking.
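
The photometric scaling is the standard zeropoint-based flux ratio; in Python (illustrative):

  def flux_scale(zp_image, zp_ref):
      # Multiply image counts by this factor to place them on the
      # reference image's photometric scale: a lower zeropoint means a
      # less sensitive image, which must be scaled up.
      return 10.0 ** (0.4 * (zp_ref - zp_image))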

Associated options:

Output:

Diagnostic output:

None.



Examples

  1. Reduce data stored in the subdirectory raw/:
    IDL> lpipe, data='raw'

  2. Fully reduce only spectroscopy, only the red camera:
    IDL> lpipe, mode='s', camera='red'

  3. Run only the "prepare" step (bias subtraction/formatting), on raw data stored in a separate directory:
    IDL> lpipe, step='prepare', data='/scr3/user/lris/20120501/'

  4. Run all the 2D spectroscopy reduction steps, but don't do any later (1D) reductions.
    IDL> lpipe, mode='s', stop='skysubtract'

  5. Reduce all the imaging, including both left and right chips (full field):
    IDL> lpipe, mode='i', chips='rl'

  6. Reprocess the extraction and all subsequent steps of a target:
    IDL> lpipe, mode='s', start='extract', target='J1910+1234', /redo

  7. Display some information at the command line:
    IDL> lpipe, /help




Additional Tools

Beyond the pipeline command-line options, a number of tools exist for data acquisition and exploration, and for checking and changing the pipeline results. These are run separately from the usual "lpipe" command. A brief summary of these is provided below; the code headers themselves can be checked for further information.

Checking, editing, and validation tools:
Archive convenience tools:



Installation

Create the subdirectory 'lpipe' somewhere on your hard drive (probably in your IDL directory), and unpack the contents of the pipeline tarball there (e.g., tar -xvf lpipe.tar.gz). You will need to tell IDL about the existence of this new directory by editing the IDL_PATH system variable: add the string ":+/path/to/lpipe:+/path/to/lpipe/dependencies/" to whatever paths are stored there currently, replacing "/path/to/" with the actual path. (The variable will need to be edited in your .bashrc, .cshrc, or .idlenv file to be available for future use.) The GSFC IDLastro routines must also be installed (and visible within IDL_PATH); see http://idlastro.gsfc.nasa.gov/ for programs and instructions.

In order to fully process imaging observations, you will also need to have autoastrometry, SWarp, and SExtractor installed (this requirement will eventually be removed via a simplification of the astrometric solver method). If the latter two cannot simply be called via "swarp" and "sex", you may need to edit the file lpipe.par AND also edit the dependencies/autoastrometry.py file to indicate the actual commands for calling these routines in the global variables at the top of the code. The standard UNIX routine wget is also used to download star catalogs. None of these are necessary for spectroscopic reductions.

See the installation guide for more info.




Troubleshooting

While this pipeline is designed to deal with all possible observing circumstances (including many common mistakes), much more testing and development will be required before this ideal is fully reached. Despite best efforts, the program may crash if it encounters an unanticipated situation or has problems accomplishing its goals and is unable to proceed. If you encounter problems, try e-mailing Daniel Perley (d-a-perley[at]ljmu-ac-uk; replace the dashes with dots) for assistance, after checking the suggestions below.

If the pipeline does not crash, but does not process any files:

If the pipeline crashes, halts, or processes no files beyond a certain step:

If processing completes, but the results are problematic:

Users are encouraged to report all major bugs (especially crashes) by e-mailing Daniel Perley (d-a-perley[at]ljmu-ac-uk).