About Us

Context

Chile is home to many of the world's largest and most advanced astronomical facilities, including the Atacama Large Millimeter/submillimeter Array (ALMA), the Gemini South 8m optical telescope, the Blanco 4m and other telescopes on Cerro Tololo, and the Atacama Cosmology Telescope, together with private telescopes such as Magellan. These telescopes are all funded in part by the National Science Foundation and enable the U.S. community to lead the world in astronomical research. The Chilean astronomical community has access to 10% of the available telescope time to conduct its own research, often in collaboration with U.S. astronomers. With the advent of new instruments like the Dark Energy Camera on the Blanco telescope at Cerro Tololo and the new Large Synoptic Survey Telescope (both funded in part by the NSF), the U.S. and Chilean astronomical research communities now face the challenge of "Big Data".

 
The field of Astronomy is leading the way in the area of “Big Data” in science, specifically in the production of large scientific datasets that have been carefully designed to fulfill the requirements of diverse groups of scientists. A single dataset can serve efforts to understand very different aspects of our universe, from how our solar system formed to the evolution and destiny of the universe. Current projects include: the Panoramic Survey Telescope & Rapid Response System (PanSTARRS), which uses the world’s largest digital camera to map the northern sky; the Visible and Infrared Survey Telescope for Astronomy (VISTA) facility, which has undertaken several parallel infrared surveys of the southern sky; and the Dark Energy Survey (DES), which has recently begun to map one quarter of the southern sky with a 570-megapixel imager on the CTIO Blanco 4m telescope. Furthermore, on the horizon we have the Large Synoptic Survey Telescope (LSST), which will produce 15 terabytes of data per night, covering the whole southern sky every three to four nights with its 3.2-gigapixel camera. These astronomical projects, and their resulting datasets, are more challenging than other projects that produce highly specialized, and in some cases larger, datasets. They are not targeted to answer a single question, but instead are designed to feed a full range of research topics. As such, they are often more complex, both in their development, including the necessary pipeline processing and storage, and in their use, from effective queries to effective analyses of the results.

 

Challenges

The volume and complexity of astronomical data continue to grow as the current generation of surveys comes online (PanSTARRS, DES, VISTA). Astronomers will need to work with gigabytes, terabytes, and even petabytes of data in real time (LSST). This poses the challenge of developing and using new tools for data discovery, access, and analysis. In addition, there are new opportunities for interdisciplinary research in applied mathematics, statistics, machine learning, crowd-sourcing, and other areas of active development. Astronomy provides a sandbox where scientists from diverse fields can come together to address common challenges within the "Big Data" paradigm. Our research leadership depends on whether the next generation of astronomers can be productive with the petabyte-sized data volumes generated by these new experiments.
 
The tools and techniques used in the analyses of these extremely large datasets must be both general and flexible, yet sophisticated enough to allow scientists to explore the full promise of these incredibly rich resources. We propose to meet the need for scientists experienced in these tools and techniques by starting to train advanced undergraduates and beginning graduate students today. These young scientists will be finishing their studies just in time to lead the exploration of the data coming from LSST and other large surveys in 2020 and beyond.
 

Goals
The “La Serena School for Data Science: Applied Tools for Astronomy” will provide an intensive week of interdisciplinary lectures focused on applied tools for handling big astronomical datasets. Participants will be instructed in how astronomical data are processed, accessed, and analyzed, including reduction pipelines, databases, and scientific programming. The School will be taught by an interdisciplinary group of international professors who will use real data and examples. Participants will work on team-based projects and will receive training on, and access to, the Chilean National Laboratory for High Performance Computing, located at the University of Chile's Center for Mathematical Modeling.
 
The one-week school will give a rapid, hands-on introduction to topics related to the analysis of “Big Data”, including:

Astronomical Data Acquisition,
Introductory Probability and Statistics,
Data Processing Pipelines and Their Output,
Astronomical Databases,
Tools of the Virtual Observatory,
Tools of High Performance Computing,
Advanced Statistical Tools Applied to Astronomy.

The school will begin with an introduction to astronomical data, followed by the concepts of probability and statistics needed to apply statistical analysis to large astronomical datasets. The coursework will then move into hands-on training in the use of existing tools, including those available through the International Virtual Observatory, as well as in the use of advanced databases and high performance computing (HPC) platforms for analyses of large datasets.

 
Principal Support
Primary support for the La Serena School for Data Science (LSSDS) has been provided by the U.S. National Science Foundation and the Chilean CONICYT.

 
Participating Institutions

AURA Observatory in Chile
LSST
NOAO
University of Chile/Center for Mathematical Modeling
Chilean National Laboratory for High Performance Computing (NLHPC)
University of La Serena
REUNA
