In the late 1980s and the early to mid 1990s, Palomar Observatory conducted a survey of the entire northern sky using its Samuel Oschin Telescope (the 48-inch Schmidt). This "second" Palomar sky survey (POSS-II) was the last of the major photographic sky surveys, which served as the fundamental atlases of the sky for many decades.
The sky was photographed in three different colors: blue-green, red, and far-red (Kodak J, F, and N emulsions). Covering the entire northern sky required approximately 900 partly overlapping fields, each 6.5 degrees on a side and spaced by 5 degrees (for comparison, the full Moon has a diameter of half a degree). However, to make the resulting sky atlas scientifically useful in the modern era, and to fully exploit its scientific potential, the photographs had to be converted into a digital image format suitable for computer analysis. This was done as a collaborative project by the Space Telescope Science Institute (STScI) and Caltech. STScI needed the scans as a source of guide stars for pointing the Hubble Space Telescope (HST) accurately. Images from these scans are available from various "Digital Sky Survey" (DSS) servers. The photographs were also digitized by the U.S. Naval Observatory and a couple of other institutions for their own special purposes.
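As a rough plausibility check on the field count quoted above, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate, not the survey's actual tiling scheme: it assumes that each field contributes one 5-by-5 degree cell of new sky and that the survey covers exactly one hemisphere. All variable names are illustrative.

```python
import math

# Area of one celestial hemisphere in square degrees (2*pi steradians).
SQ_DEG_PER_STERADIAN = (180.0 / math.pi) ** 2
hemisphere_sq_deg = 2.0 * math.pi * SQ_DEG_PER_STERADIAN

field_side_deg = 6.5      # plate size quoted in the text
field_spacing_deg = 5.0   # center-to-center spacing quoted in the text

# Assumption: each field adds roughly one spacing-by-spacing cell of new sky.
unique_area_per_field = field_spacing_deg ** 2
estimated_fields = hemisphere_sq_deg / unique_area_per_field

# Width of the strip shared with a neighboring field along one axis,
# and the fraction of each plate that is not shared with neighbors.
overlap_deg = field_side_deg - field_spacing_deg
unique_fraction = unique_area_per_field / field_side_deg ** 2

print(f"hemisphere area:  {hemisphere_sq_deg:.0f} sq deg")
print(f"estimated fields: {estimated_fields:.0f}")
print(f"overlap strip:    {overlap_deg} deg")
print(f"unique fraction:  {unique_fraction:.2f}")
```

The estimate comes out somewhat above 800 fields, the same order as the roughly 900 quoted. A simple square grid cannot tile a sphere perfectly, so some difference is to be expected. The 1.5-degree overlap between neighboring plates is about three full Moon diameters.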
The result of this effort was the Digital Palomar Observatory Sky Survey (DPOSS), containing about 3 terabytes of digital images, plus catalogs of extracted sources (stars, galaxies, quasars, etc.) and their properties. A terabyte is a thousand gigabytes or a million megabytes; a byte corresponds to a single letter of text. An average book contains about half a million bytes. Thus, the amount of information in this digital survey is equivalent to that in about 6 million books. By comparison, the human genome is less than a gigabyte in size.
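The comparison with books follows directly from the numbers above. A minimal sketch of the arithmetic, using decimal units (1 terabyte = 10^12 bytes) and treating the genome as an upper bound of 1 gigabyte, as stated in the text:

```python
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_TERABYTE = 10**12

survey_bytes = 3 * BYTES_PER_TERABYTE   # about 3 terabytes of images
bytes_per_book = 500_000                # about half a million bytes per book

# Number of average books holding the same amount of information.
books_equivalent = survey_bytes // bytes_per_book

# The genome is quoted as less than a gigabyte, so this is a lower bound
# on how many genome-sized data sets would fit in the survey.
genomes_lower_bound = survey_bytes // BYTES_PER_GIGABYTE

print(f"{books_equivalent:,} books")
print(f"more than {genomes_lower_bound:,} genome-sized data sets")
```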
Thus, DPOSS was the largest sky survey of its time in terms of data volume. It has since been surpassed by sky surveys an order of magnitude larger, but it served as a valuable testbed and precursor in many ways.
This was a vast amount of information, especially by the standards of the time. Converting such large amounts of raw data into useful scientific results was a challenging task, which required an entirely new generation of computing analysis tools. In a collaboration between Caltech and JPL's Machine Learning Systems group, we developed a powerful new software system called the SKy Image Cataloging and Analysis Tool, or SKICAT. This pioneering and award-winning system incorporated the latest artificial intelligence (AI) and machine learning (ML) technology, including databases, automated classification tools, expert systems, and machine-assisted discovery, in order to automatically catalog and measure sources detected in the sky survey images, to classify them as stars or galaxies, and to assist an astronomer in performing scientific analyses of the resulting object catalogs.
The really novel aspect of SKICAT was its machine learning ability. The computer system could be trained by an astronomer to perform the very tedious and repetitive, yet non-trivial, task of finding, measuring, and classifying the sky objects. The scientist could then concentrate on the interpretative and creative part of the work, again assisted by the new AI/ML tools.
In addition, a very substantial effort went into the calibration of the survey, using superior CCD images obtained at the Palomar 60-inch telescope. These images covered only a tiny portion of the DPOSS area, but they provided both a photometric calibration (a uniform assessment of the brightness of objects across the survey) and a training data set for the automated star/galaxy classification programs.
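The idea of training a classifier on a small, reliably labeled sample and then applying it to the full survey can be illustrated with a toy example. The sketch below is not SKICAT's algorithm and does not use its measured attributes. It assumes a single hypothetical feature (an image size relative to that of a point source, so that stars sit near 1 and extended galaxies lie above it), made-up training values, and the simplest possible learner, a one-threshold "decision stump".

```python
# Toy illustration only: the feature, the values, and the learner are
# assumptions made for this sketch, not SKICAT's attributes or method.

def train_stump(samples):
    """Find the threshold on one feature that best separates two labels.

    samples: list of (feature_value, label), label is "star" or "galaxy".
    Returns (threshold, label_below, label_at_or_above).
    """
    values = sorted({value for value, _ in samples})
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for threshold in candidates:
        for below, above in (("star", "galaxy"), ("galaxy", "star")):
            correct = sum(
                1 for value, label in samples
                if (below if value < threshold else above) == label
            )
            if best is None or correct > best[0]:
                best = (correct, threshold, below, above)
    _, threshold, below, above = best
    return threshold, below, above


def classify(value, stump):
    """Apply a trained stump to one feature value."""
    threshold, below, above = stump
    return below if value < threshold else above


# Hypothetical training set, standing in for objects whose true class
# is known from higher-quality (e.g. CCD) images.
training = [
    (0.9, "star"), (1.0, "star"), (1.1, "star"), (1.2, "star"),
    (1.8, "galaxy"), (2.1, "galaxy"), (2.5, "galaxy"), (3.0, "galaxy"),
]

stump = train_stump(training)

# Apply the learned rule to new, unlabeled measurements.
predictions = [classify(value, stump) for value in (1.05, 2.3)]
print(stump, predictions)
```

A real classifier would use many measured attributes per object and a more capable learner. The point here is only the workflow: learn from the small labeled subset, then classify everything else automatically.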
The processing of the sky survey resulted in a catalog of over 50 million galaxies and half a billion stars, including tens of thousands of quasars. It covered most of the northern sky, but avoided the regions near the Galactic Plane, where crowding of stellar images made analysis difficult while offering little scientific return for the survey team. The product is the Palomar-Norris Sky Catalog, or PNSC, named in acknowledgment of the generous sponsorship by the Norris Foundation.
This was a new type of astronomical catalog, living in computer memory and constantly upgraded as new data and better calibrations came in. Unlike the astronomical catalogs of the past, it will never be printed! Printing the galaxy catalog alone would require about a thousand thick volumes, and printing the whole catalog would take about 40,000 large volumes! Instead, users can download the parts of the catalog they need via a web interface, and analyze them using the powerful software tools developed for this purpose.
Effectively, DPOSS is a digital road map of the northern sky, and the means to navigate and explore it. Both the digital sky survey itself and the data analysis techniques we have developed will enable astronomers to perform a large variety of scientific studies, exceeding in scope most previous efforts by orders of magnitude, and acting as a testbed and a foundation for subsequent efforts. For example, a modified version of SKICAT was used by planetary scientists to discover up to a million small volcanoes on Venus in the Magellan radar images. Our star/galaxy classification and high-redshift quasar discovery techniques were also adopted by other groups.
The survey produced a number of scientific results, and it continues to do so, even as more modern data sets of superior quality are becoming available. Among the early results, we used the counts of galaxies at different brightness levels to test models of galaxy evolution; produced new and improved measurements of the large-scale structure in the universe, as probed by the clustering of faint galaxies; produced major new catalogs of galaxy clusters and compact groups, and characterized the properties of galaxies in them; explored the tidal disruption of globular star clusters; and discovered many low surface brightness galaxies, among other results. We also discovered about 100 of the most distant quasars known at the time, surpassing previous efforts in this field by an order of magnitude, and used them as valuable probes of the early universe and quasar evolution. In a data set this large, there is an exciting possibility of discovering entirely new kinds of astronomical objects or phenomena; some examples we found include very peculiar, extremely rare types of quasars, and transient sources seen only once in repeated exposures of the sky. Work on these and many other projects enabled by DPOSS continues.
DPOSS was surpassed by the superior, fully digital Sloan Digital Sky Survey, which covered only about 40% of the DPOSS area, but went deeper, in five filters, and with better image quality. Many other fully digital sky surveys are now under way or being planned, including our own Palomar-Quest survey. The DPOSS data set is also one of the initial data sets used in the foundations of the National Virtual Observatory.
The DPOSS team was led by Prof. S. George Djorgovski, and it included Drs. Reinaldo de Carvalho, Roy Gal, Ashish Mahabal, Steve Odewahn, Robert Brunner, Nick Weir, Julia Kennefick, Paulo Lopes, Eilat Glikman, Sandra Castro, and others, including a number of excellent Caltech undergraduates. On the computational technology side, we also collaborated with Drs. Usama Fayyad, Joe Roden, Rich Doyle, Paul Stolorz, Matthew Graham, Roy Williams, Joe Jacob, and others. Our international collaborators include Drs. Giuseppe Longo, Roberto Scaramella, and others. Generous funding for the work on DPOSS was provided by the Norris Foundation and by other private donors. Essential funding for the computational technology developments was provided by the NASA AISRP program. Some funding was also provided by the National Science Foundation.
For more info, please contact George Djorgovski: