A method for finding anomalous astronomical light curves and their analogs


Our understanding of the Universe has profited from deliberate, targeted studies of known phenomena, as well as from serendipitous, unexpected discoveries, such as the complex variability pattern found in the direction of KIC 8462852 (Boyajian’s star). Upcoming surveys, such as the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST), will explore the parameter space of astrophysical transients at all time scales, and offer the opportunity to discover even more extreme examples of unexpected phenomena. We investigate strategies to identify novel objects and to contextualize them within large time-series data sets, in order to facilitate both the discovery of new classes of objects and the physical interpretation of their anomalous nature. We develop a method that combines tree-based and manifold-learning algorithms to perform two tasks: 1) identify and rank anomalous objects in a time-domain data set; and 2) group those anomalies according to their similarity in order to identify analogs. We achieve the latter by combining an anomaly score from a tree-based method with a manifold-learning dimensionality reduction strategy. Clustering in the reduced space allows for the successful identification of anomalies and their analogs. We also assess the impact of pre-processing and feature engineering schemes, and we investigate the astrophysical nature of the objects that our models identify as anomalous by augmenting the Kepler data with Gaia color and luminosity information. We find that multiple models, used in combination, are a promising strategy to identify novel light curves and families of light curves.
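
As an illustration of the two tasks described above, the following is a minimal sketch and not the implementation used in this work. It assumes that each light curve has already been reduced to a fixed-length feature vector. The text does not name specific algorithms, so scikit-learn's `IsolationForest` stands in for the tree-based anomaly score, `TSNE` for the manifold-learning step, and `AgglomerativeClustering` for the clustering in the reduced space. The function name `rank_and_group`, its parameters, and the synthetic feature matrix are all hypothetical.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.ensemble import IsolationForest
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def rank_and_group(features, n_top=10, n_groups=2, seed=0):
    """Rank objects by anomaly score, then group the top-ranked ones.

    Hypothetical helper: the algorithm choices are illustrative stand-ins.
    """
    X = StandardScaler().fit_transform(features)

    # Task 1: tree-based anomaly score. score_samples() is lower for more
    # abnormal points, so negate it to make higher mean more anomalous.
    forest = IsolationForest(n_estimators=200, random_state=seed).fit(X)
    score = -forest.score_samples(X)
    top = np.argsort(score)[::-1][:n_top]  # indices, most anomalous first

    # Task 2: embed all objects in two dimensions, then cluster only the
    # top-ranked anomalies in the reduced space to look for analogs.
    embedding = TSNE(
        n_components=2, perplexity=30.0, init="pca", random_state=seed
    ).fit_transform(X)
    groups = AgglomerativeClustering(n_clusters=n_groups).fit_predict(
        embedding[top]
    )
    return score, top, embedding, groups


# Synthetic stand-in for a light-curve feature matrix (assumption):
# 300 ordinary objects plus two small families of unusual ones.
rng = np.random.default_rng(0)
ordinary = rng.normal(0.0, 1.0, size=(300, 5))
family_a = rng.normal(8.0, 0.3, size=(5, 5))
family_b = rng.normal(-8.0, 0.3, size=(5, 5))
features = np.vstack([ordinary, family_a, family_b])

score, top, embedding, groups = rank_and_group(features)
for idx, group in zip(top, groups):
    print(f"object {idx}: score={score[idx]:.3f}, group={group}")
```

In this sketch the embedding is computed on all objects, so that the anomalies are placed in the context of the full data set, while the clustering is restricted to the top-ranked anomalies. The number of groups and the number of retained anomalies are free choices here and would need to be tuned for real data.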