Open Access BASE2021

The miniJPAS survey: star-galaxy classification using machine learning

Baqui, P. O; Hernández-Monteagudo, C; López-Sanjuan, C; Solano, E; Varela, J; Vílchez Medina, José Manuel; Benítez, Narciso; Cenarro, A. J; González Delgado, Rosa M; Marín-Franch, A; Moles, Mariano; Vázquez Ramió, H

Access (Open Access)

Abstract

Full list of authors: Baqui, P. O.; Marra, V.; Casarini, L.; Angulo, R.; Díaz-García, L. A.; Hernández-Monteagudo, C.; Lopes, P. A. A.; López-Sanjuan, C.; Muniesa, D.; Placco, V. M.; Quartin, M.; Queiroz, C.; Sobral, D.; Solano, E.; Tempel, E.; Varela, J.; Vílchez, J. M.; Abramo, R.; Alcaniz, J.; Benítez, N.; Bonoli, S.; Carneiro, S.; Cenarro, A. J.; Cristóbal-Hornillos, D.; de Amorim, A. L.; de Oliveira, C. M.; Dupke, R.; Ederoclite, A.; González Delgado, R. M.; Marín-Franch, A.; Moles, M.; Vázquez Ramió, H.; Sodré, L.; Taylor, K.
Context. Future astrophysical surveys such as J-PAS will produce very large datasets, the so-called "big data", which will require the deployment of accurate and efficient machine-learning (ML) methods. In this work, we analyze the miniJPAS survey, which observed about 1 deg² of the AEGIS field with 56 narrow-band filters and 4 ugri broad-band filters. The miniJPAS primary catalog contains approximately 64 000 objects in the r detection band (mag_AB ≲ 24), with forced photometry in all other filters.
Aims. We discuss the classification of miniJPAS sources into extended (galaxies) and point-like (e.g., stars) objects, which is a step required for the subsequent scientific analyses. We aim to develop an ML classifier that is complementary to traditional tools that are based on explicit modeling. In particular, our goal is to release a value-added catalog with our best classification.
Methods. In order to train and test our classifiers, we cross-matched the miniJPAS dataset with SDSS and HSC-SSP data, whose classification is trustworthy within the intervals 15 ≤ r ≤ 20 and 18.5 ≤ r ≤ 23.5, respectively. We trained and tested six different ML algorithms on the two cross-matched catalogs: K-nearest neighbors, decision trees, random forest (RF), artificial neural networks, extremely randomized trees (ERT), and an ensemble classifier.
The last of these is a hybrid algorithm that combines artificial neural networks and RF with the J-PAS stellar and galactic loci classifier. As input for the ML algorithms we used the magnitudes from the 60 filters together with their errors, with and without the morphological parameters. We also used the mean point spread function in the r detection band for each pointing.
Results. We find that the RF and ERT algorithms perform best in all scenarios. When the full magnitude range of 15 ≤ r ≤ 23.5 is analyzed, we find an area under the curve AUC = 0.957 with RF when photometric information alone is used, and AUC = 0.986 with ERT when photometric and morphological information is used together. When morphological parameters are used, the full width at half maximum is the most important feature. When photometric information is used alone, we observe that broad bands are not necessarily more important than narrow bands, and errors (the width of the distribution) are as important as the measurements (central value of the distribution). In other words, it is apparently important to fully characterize the measurement.
Conclusions. ML algorithms can compete with traditional star and galaxy classifiers; they outperform the latter at fainter magnitudes (r ≳ 21). We use our best classifiers, with and without morphology, in order to produce a value-added catalog.
© ESO 2021
POB thanks, for financial support, the Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior - Brasil (CAPES) - Finance Code 001. VM thanks CNPq (Brazil) and FAPES (Brazil) for partial financial support. This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 888258. LADG is supported by the Ministry of Science and Technology of Taiwan (grant MOST 106-2628-M-001-003-MY3), and by the Academia Sinica (grant AS-IA-107-M01).
ES has been partly supported by the Spanish State Research Agency (AEI) Projects AYA2017-84089 and MDM-2017-0737 at Centro de Astrobiologia (CSIC-INTA), Unidad de Excelencia Maria de Maeztu. MQ is supported by the Brazilian research agencies CNPq and FAPERJ. RGD acknowledges financial support from the State Agency for Research of the Spanish MCIU through the "Center of Excellence Severo Ochoa" award to the Instituto de Astrofisica de Andalucia (SEV-2017-0709) and through the projects AyA2016-77846-P and PID2019-109067GB-100. LS acknowledges support from Brazilian agencies CNPq (grant 304819/2017-4) and FAPESP (grant 2012/00800-4). This work made use of the Virgo Cluster at Cosmo-ufes/UFES, which is funded by FAPES and administrated by Renan Alves de Oliveira. Based on observations made with the JST/T250 telescope and PathFinder camera for the miniJPAS project at the Observatorio Astrofisico de Javalambre (OAJ), in Teruel, owned, managed, and operated by the Centro de Estudios de Fisica del Cosmos de Aragon (CEFCA). We acknowledge the OAJ Data Processing and Archiving Unit (UPAD) for reducing and calibrating the OAJ data used in this work. Funding for OAJ, UPAD, and CEFCA has been provided by the Governments of Spain and Aragon through the Fondo de Inversiones de Teruel; the Aragon Government through the Research Groups E96, E103, and E16_17R; the Spanish Ministry of Science, Innovation and Universities (MCIU/AEI/FEDER, UE) with grant PGC2018-097585-B-C21; the Spanish Ministry of Economy and Competitiveness (MINECO/FEDER, UE) under AYA2015-66211-C2-1-P, AYA2015-66211-C2-2, AYA2012-30789, and ICTS-2009-14; and European FEDER funding (FCDD10-4E-867, FCDD13-4E-2685). Based on data from the ALHAMBRA Data Access Service at CAB (CSIC-INTA). Funding for the DEEP2 Galaxy Redshift Survey has been provided by NSF grants AST-95-09298, AST-0071048, AST-0507428, and AST-0507483 as well as NASA LTSA grant NNG04GC89G.
The Hyper Suprime-Cam (HSC) collaboration includes the astronomical communities of Japan and Taiwan, and Princeton University. The HSC instrumentation and software were developed by the National Astronomical Observatory of Japan (NAOJ), the Kavli Institute for the Physics and Mathematics of the Universe (Kavli IPMU), the University of Tokyo, the High Energy Accelerator Research Organization (KEK), the Academia Sinica Institute for Astronomy and Astrophysics in Taiwan (ASIAA), and Princeton University. Funding was contributed by the FIRST program from the Japanese Cabinet Office, the Ministry of Education, Culture, Sports, Science and Technology (MEXT), the Japan Society for the Promotion of Science (JSPS), Japan Science and Technology Agency (JST), the Toray Science Foundation, NAOJ, Kavli IPMU, KEK, ASIAA, and Princeton University. This paper makes use of software developed for the Large Synoptic Survey Telescope. We thank the LSST Project for making their code available as free software at dm.lsst.org. This paper is based [in part] on data collected at the Subaru Telescope and retrieved from the HSC data archive system, which is operated by Subaru Telescope and Astronomy Data Center (ADC) at National Astronomical Observatory of Japan. Data analysis was in part carried out with the cooperation of Center for Computational Astrophysics (CfCA), National Astronomical Observatory of Japan. Funding for SDSS-III has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, and the U.S. Department of Energy Office of Science. The SDSS-III website is sdss3.org.
SDSS-III is managed by the Astrophysical Research Consortium for the Participating Institutions of the SDSS-III Collaboration including the University of Arizona, the Brazilian Participation Group, Brookhaven National Laboratory, Carnegie Mellon University, University of Florida, the French Participation Group, the German Participation Group, Harvard University, the Instituto de Astrofisica de Canarias, the Michigan State/Notre Dame/JINA Participation Group, Johns Hopkins University, Lawrence Berkeley National Laboratory, Max Planck Institute for Astrophysics, Max Planck Institute for Extraterrestrial Physics, New Mexico State University, New York University, Ohio State University, Pennsylvania State University, University of Portsmouth, Princeton University, the Spanish Participation Group, University of Tokyo, University of Utah, Vanderbilt University, University of Virginia, University of Washington, and Yale University. ; Peer reviewed
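
The Results in the abstract quote classifier performance as the area under the ROC curve (AUC). As a minimal sketch of what that number measures, and not the authors' code, the pure-Python function below computes AUC through the pair-counting (Mann-Whitney) identity: the fraction of positive/negative pairs that the classifier ranks in the right order, with ties counted as half. The function name roc_auc, the choice of label 1 for one class and 0 for the other, and the toy scores are all illustrative assumptions; none of them comes from the miniJPAS catalog.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney pair-counting identity.

    labels: 1 for the positive class (hypothetically, galaxies), 0 otherwise.
    scores: classifier output, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one object of each class")

    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy, made-up scores for six objects (not miniJPAS data).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")  # prints AUC = 0.889 (8 of 9 pairs ordered correctly)
```

Read this way, AUC = 0.957 means that a randomly chosen object of the positive class receives a higher score than a randomly chosen object of the other class about 95.7% of the time. The double loop is quadratic in catalog size, so for tens of thousands of sources a sort-based implementation would be the more practical choice.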