The Purpose of the Meeting
Within the United States, there is now a major, community-driven push towards the National Virtual Observatory (NVO). The NVO will federate the existing and forthcoming digital sky archives, both ground-based and space-based.
The NVO will likely grow into a Global Virtual Observatory, serving as the fundamental information infrastructure for astronomy and astrophysics in the next century. We envision productive international cooperation in this rapidly developing new field.
The goal of this conference is to clearly define the scientific motivation and needs, and to focus on the technical problems and challenges related to the conception of the NVO and its global equivalents. Broad community input and feedback at this early stage are essential. We aspire to build a major new facility for all astronomers, a powerful new machine to explore the universe.
Motivation: The Ongoing Data Flood in Astronomy
We are at the start of a new era of information-rich astronomy. Several ongoing sky surveys over a range of wavelengths are now generating data sets measured in the tens of Terabytes. These surveys are creating catalogs of objects (stars, galaxies, quasars, etc.) numbering in the billions, with up to a hundred measured numbers for each object. Yet this is just a foretaste of the much larger data sets to come. Large digital sky surveys and data archives are becoming the principal sources of data in astronomy. The very style of observational astronomy is changing: systematic sky surveys are now used both to answer well-defined questions that require large samples of objects, and to discover and select interesting targets for follow-up studies with space-based or large ground-based telescopes.
This vast amount of new information about the universe, now measured in Terabytes, and soon in Petabytes, will enable and stimulate a new way of doing astronomy. We will be able to tackle some major problems with unprecedented accuracy, such as mapping the large-scale structure of the universe and the structure of our Galaxy. The unprecedented size of the data sets will enable searches for extremely rare types of astronomical objects (e.g., high-redshift quasars and brown dwarfs) and may well lead to surprising new discoveries of previously unknown types of objects or new astrophysical phenomena. Combining surveys done at different wavelengths, from radio and infrared, through visible light, ultraviolet, and X-rays, both from ground-based telescopes and from space observatories, would provide a new, panchromatic picture of our universe, and lead to a better understanding of the objects in it. These are the types of scientific investigations that were not feasible with the more limited data sets of the past.
For the first time in the history of astronomy, we will have data sets whose full information content greatly exceeds the original purposes for which the data were obtained. This opens the new field of data-mining of digital sky surveys, using the data for newly conceived projects and exploring the vast data parameter spaces. It is inevitable that the previously poorly explored parts of the observable parameter space will contain new discoveries and surprises.
The Technical Challenges
This great opportunity comes with a commensurate technological challenge: how to manage, combine, analyze, and explore these vast amounts of information, and how to do so quickly and efficiently. We know how to collect many bits of information, but can we effectively distill knowledge from this mass of bits?
The data volumes here are several orders of magnitude larger than what astronomers are used to dealing with, and the old methods simply do not work. There are open questions about how to optimally store and access such complex data, how to combine sky surveys done at different wavelengths, how to visualize them, how to search through them, and so on. Many powerful techniques already exist and can be used or tested in these new astronomical applications; others may be developed in collaboration with applied computer scientists.
The Need for a Virtual Observatory
Many individual digital sky survey archives, servers, and digital libraries already exist, and represent essential tools of modern astronomy. However, in order to join or federate these valuable resources, and to enable a smooth inclusion of even greater data sets to come, a more powerful infrastructure and a set of tools are needed. For example, it is now easy to obtain data on a given object or a small set of objects, or to obtain images of a given small patch on the sky. However, we do not have easy and generally available tools to quickly join several multi-Terabyte sky surveys and to perform sophisticated queries in the resulting complex data sets. In other words, we can do old astronomy with subsets of the new data, but we really want to do the new type of astronomy which these enormous data sets can support.
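As a toy illustration of what "joining" two surveys involves, the following Python sketch pairs sources from two small catalogs by their positions on the sky. It is not part of any NVO design: the catalog layout (name, right ascension, declination in degrees), the function names, the sample coordinates, and the 2-arcsecond matching radius are all assumptions made for this example. The brute-force loop compares every pair of sources, which is exactly the kind of approach that stops being practical at the multi-Terabyte scale and motivates the need for better tools.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(catalog_a, catalog_b, radius_arcsec=2.0):
    """Pair each source in catalog_a with its nearest neighbor in catalog_b.

    Catalogs are lists of (name, ra_deg, dec_deg) tuples (a hypothetical layout).
    Only neighbors within radius_arcsec are accepted. Cost is O(N * M).
    """
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for name_a, ra_a, dec_a in catalog_a:
        best_name, best_sep = None, radius_deg
        for name_b, ra_b, dec_b in catalog_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= best_sep:
                best_name, best_sep = name_b, sep
        if best_name is not None:
            # Report the separation in arcseconds.
            matches.append((name_a, best_name, best_sep * 3600.0))
    return matches


# Made-up sample data standing in for an optical and an infrared survey.
optical = [("opt-1", 10.0000, 20.0000), ("opt-2", 150.5000, -5.2500)]
infrared = [("ir-1", 10.0002, 20.0001), ("ir-2", 200.0000, 45.0000)]

matches = cross_match(optical, infrared)
for name_a, name_b, sep_arcsec in matches:
    print(f"{name_a} <-> {name_b}: {sep_arcsec:.2f} arcsec")
```

In this sketch only the first optical source finds an infrared counterpart within the radius. A real federation of surveys would need spatial indexing, treatment of positional uncertainties, and distributed execution, none of which is attempted here.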
The concept of a virtual observatory thus emerged. A virtual observatory would be a set of federated, geographically distributed, major digital sky archives, with the software tools and infrastructure to combine them in an efficient and user-friendly manner, and to explore the resulting data sets whose sheer size and complexity are beyond the reach of traditional approaches. It would help solve the technical problems common to most large digital sky surveys, and optimize the use of our resources.
This systematic, panchromatic approach would enable new science, in addition to what can be done with individual surveys. It would enable meaningful, effective experiments within these vast data parameter spaces. It would also facilitate the inclusion of new massive data sets, and optimize the design of future surveys and space missions. Most importantly, the NVO would provide access to powerful new resources to scientists and students everywhere, who could do first-rate observational astronomy regardless of their access to large ground-based telescopes. Finally, the NVO would be a powerful educational and public outreach tool.