DoA Machine Learning Club

DATES

UPCOMING EVENTS

For the time being, we hold our ML sessions once every two weeks, on Mondays from 6:30 to 7:30 pm Beijing time. In each session we share our own work and/or review ML-Astro papers (in one or two slides, PDF only, etc.). Sessions are held offline in room 605/727 or online via Zoom (links will be given in advance).


We also co-organise joint machine learning talks with the SKAO and JBCA. We aim to meet ~5 times per year to learn from experts in the field about published work in their areas of interest.


UPDATE: Moved to another GitHub repository, THU-DoA-DATA-SCIENCE.

Timetable for the 2022-2023 Autumn Semester:
Session info:
Time: 11/21, 2022
Speaker: TBD
Location: Online via Zoom
Talk description: TBD

PAST EVENTS

LSST Cadence Strategy Evaluations for AGN Time-series Data in Wide-Fast-Deep Field

Speaker: Xinyue Sheng(盛馨月) (Birmingham)
Abstract: The variability in Active Galactic Nuclei (AGN) light curves signals a range of physics, including that of accretion onto compact objects and the potential to be associated with gravitational wave generation. As such, the upcoming Vera Rubin Observatory (VRO) Legacy Survey of Space and Time (LSST) could be a critical dataset.
We have applied, for the first time in astronomy, a novel Stochastic Recurrent Neural Network (SRNN) algorithm to reconstruct light curves of AGN from simulated LSST data. We use three Continuous Auto-Regressive Moving Average (CARMA) representations of AGN variability to simulate 10-year AGN light curves as they will appear in the upcoming LSST data stream.
We show the impact on AGN science of five proposed cadence strategies for LSST's primary Wide-Fast-Deep (WFD) survey, as determined by a metric to evaluate how well SRNN recovers the underlying CARMA parameters. We find that the light curve reconstruction is most sensitive to the duration of gaps between observing seasons, and that, of the proposed cadences, those that change the balance between filters or avoid long gaps in the g-band perform better.
Overall, SRNN is a promising means to reconstruct densely sampled AGN light curves, but for the tested LSST cadences, CARMA/SRNN models struggle to recover the decorrelation timescale (τ) if there are long gaps in survey observations. This may indicate a major limitation in using LSST WFD data for AGN variability science.
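
To make the role of seasonal gaps concrete, here is a minimal sketch that simulates a damped random walk, the simplest CARMA process (CARMA(1,0)), on a season-plus-gap sampling pattern. It is not the speaker's simulation setup: the cadence, the season length, and the values of tau, sigma and mean_mag are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

def seasonal_times(n_years=10, season_days=180.0, cadence_days=3.0):
    """Observation times in days: one observing season per year, then a gap.
    The season length and cadence are illustrative assumptions."""
    season = np.arange(0.0, season_days, cadence_days)
    return np.concatenate([365.25 * year + season for year in range(n_years)])

def simulate_drw(times, tau=100.0, sigma=0.2, mean_mag=20.0, seed=0):
    """Exact sampling of a damped random walk at irregular times.
    tau is the decorrelation timescale (days), sigma the asymptotic scatter (mag)."""
    rng = np.random.default_rng(seed)
    mags = np.empty(len(times))
    mags[0] = mean_mag + sigma * rng.standard_normal()
    for i in range(1, len(times)):
        # Correlation with the previous point decays exponentially with the time step.
        decay = np.exp(-(times[i] - times[i - 1]) / tau)
        noise = sigma * np.sqrt(1.0 - decay**2) * rng.standard_normal()
        mags[i] = mean_mag + (mags[i - 1] - mean_mag) * decay + noise
    return mags

times = seasonal_times()
mags = simulate_drw(times)
longest_gap = np.diff(times).max()
```

With these assumed values the longest gap (about 188 days) is longer than tau, so little correlation survives from one season to the next; this is the regime in which the abstract reports that τ is hard to recover.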

The fourth JBCA-SKAO-DoA joint machine learning session

Title: An Unsupervised Learning Approach for Quasar Continuum Prediction.
Speaker: Zechang Sun (Tsinghua)

Title: Efficiently Leveraging unlabelled astronomical data for image classification.
Speaker: Inigo Slijepcevic (JBCA)

Recording

Deep neural networks and GW signal recognition

Speaker: Haimeng Zhao (Tsinghua, visiting EPFL)
Abstract: The modeling of binary microlensing light curves via the standard sampling-based method can be challenging, because of the time-consuming light curve computation and the pathological likelihood landscape in the high-dimensional parameter space. Machine learning has the potential to accelerate this process. However, the real data are often sparsely sampled with large gaps. This irregularity poses challenges to conventional machine-learning techniques. In this talk, I will discuss our recent work MAGIC, which is a machine learning framework to efficiently and accurately infer the microlensing parameters of binary events with realistic data quality. The key feature of MAGIC is the introduction of neural controlled differential equation (neural CDE), which provides the capability to handle light curves with irregular sampling and large data gaps.
arXiv, Recording, Ref1, Ref2.

Deep neural networks and GW signal recognition

Speaker: He Wang (ITP-CAS)
Abstract: Deep learning is a neural-inspired pattern recognition technique that has been shown to be as effective as conventional signal processing. It has also been shown to have considerable potential to identify gravitational-wave (GW) signals. In this talk, I will first review some related works on the detection and characterization of GW signals. I will then present the effectiveness of the matched-filtering convolutional neural networks (MFCNN) [1] that we proposed, both on GW recognition and on identifying the generalization properties of gravitational waves. Finally, I will briefly cover some ongoing works and future prospects.
Ref1, Ref2, Ref3, Ref4.

21cm cosmology with ANN

Speaker: Hayato Shimabukuro (Yunnan University)
Abstract: The period from the cosmic dark ages to the cosmic reionization period is still veiled in the history of the universe. The 21cm line is an effective method to explore these periods. In this talk, I will show how to obtain information on the cosmic reionization period by applying machine learning to the 21cm line signal.

Normalizing Flows for Optimal and Robust Cosmological Analysis

Speaker: Biwei Dai (UC Berkeley)
Abstract: Normalizing Flows (NFs) are a powerful method for modeling complex probability distributions. By applying NFs directly to cosmological fields, we are able to learn the high-dimensional field-level data likelihood. This allows us to extract more cosmological information compared to traditional analysis based on summary statistics, which usually loses information due to data compression. In the first part of the talk, I will introduce Translation and Rotation Equivariant Normalizing Flow (TRENF), a generative NF model which explicitly incorporates symmetries. I will show that the TRENF likelihood agrees well with the analytical expression on Gaussian random fields. On nonlinear cosmological overdensity fields, TRENF leads to significant improvements in constraining power over the standard power spectrum analysis. TRENF is also a generative model of the data, and its samples agree well with the N-body simulations it was trained on. I will also talk about how to handle effects that break the symmetry of the data, such as the survey mask. In the second part of the talk, I will introduce multiscale flow, an NF model which is able to decompose the data likelihood into different scales. By comparing the posterior of different scales, we can identify the systematics in the data analysis and obtain robust constraints on cosmological parameters.
Recording

The 4th JBCA-SKAO-DoA joint machine learning club

Title: Deep Learning of DESI Mock Spectra to Find Damped Lyα Systems.
Speaker: Ben Wang (Tsinghua)
Abstract: We have updated and applied a convolutional neural network (CNN) machine learning model to discover and characterise damped Lyα systems (DLAs) based on Dark Energy Spectroscopic Instrument (DESI) mock spectra. We have optimized the training process and constructed a CNN model that yields a DLA classification accuracy above 99% for spectra which have signal-to-noise (S/N) above 5 per pixel. Classification accuracy is the rate of correct classifications. This accuracy remains above 97% for lower signal-to-noise (S/N) ≈1 spectra. This CNN model provides estimations for redshift and HI column density with standard deviations of 0.002 and 0.17 dex for spectra with S/N above 3 per pixel. Also, this DLA finder is able to identify overlapping DLAs and sub-DLAs. Further, the impact of different DLA catalogs on the measurement of Baryon Acoustic Oscillation (BAO) is investigated. The cosmological fitting parameter result for BAO has less than 0.61% difference compared to analysis of the mock results with perfect knowledge of DLAs. This difference is lower than the statistical error for the first year estimated from the mock spectra: above 1.7%. We also compared the performance of the CNN and a Gaussian Process (GP) model. For S/N > 3, our improved CNN model has moderately higher purity (by 14%) and completeness (by 7%) than an older version of the GP code. Both codes provide good DLA redshift estimates, but the GP produces a better column density estimate, with a 24% smaller standard deviation. A credible DLA catalog for the DESI main survey can be provided by combining these two algorithms.
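
As a reminder of how the quoted figures of merit relate to each other, here is a minimal sketch on made-up labels. The abstract defines accuracy only; the definitions of purity (fraction of predicted DLAs that are real) and completeness (fraction of real DLAs that are recovered) used below are the usual conventions and are an assumption here, as is the function name.

```python
def classification_metrics(true_labels, predicted_labels):
    """Return (accuracy, purity, completeness) for binary labels (1 = DLA, 0 = no DLA)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)          # real DLAs that were found
    fp = sum(1 for t, p in pairs if not t and p)      # false detections
    fn = sum(1 for t, p in pairs if t and not p)      # missed DLAs
    tn = sum(1 for t, p in pairs if not t and not p)  # correctly rejected spectra
    accuracy = (tp + tn) / len(pairs)
    purity = tp / (tp + fp) if (tp + fp) else float("nan")
    completeness = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, purity, completeness

# Toy labels, invented purely for illustration.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
accuracy, purity, completeness = classification_metrics(truth, predicted)
```

In this toy case there are 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so all three metrics come out at 0.75.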

Title: Deterministic Langevin Monte Carlo.
Speaker: Richard Grumitt (Tsinghua)
Abstract: I present Deterministic Langevin Monte Carlo (DLMC), a general purpose Bayesian inference algorithm that replaces the stochastic term in the traditional Langevin algorithm with the deterministic gradient of the particle density. Particles are evolved in the direction of increasing target density minus the particle density, or equivalently increasing importance weight. We utilise normalising flows (NF) for density estimation, which allow us to further leverage NF-based preconditioning and sampling to accelerate inference. By removing the stochastic term from the Langevin updates we are able to obtain accurate posterior estimators using orders of magnitude fewer likelihood evaluations than state-of-the-art algorithms such as Hamiltonian Monte Carlo.
Recording

"Chit-Chat" discussions

Speaker: ALL.
Abstract: We have decided to try a new form of machine learning session, "Chit-Chat" discussions, where everyone can present the current progress of their work and ask for advice. The main purpose of these discussions is to help us know what others are doing and get new ideas for our research. The discussions will be very casual and informal; we encourage everyone to participate and express their ideas freely.
Here are some examples of what you can present in this "Chit-Chat":
1. the current progress of your work
2. some problems you are trying to solve in your research
3. some interesting/unsolved data-analysis problems
4. some ideas for your new projects

Deterministic Langevin Monte Carlo

Speaker: Richard Grumitt (Tsinghua).
Abstract: I present Deterministic Langevin Monte Carlo (DLMC), a general purpose Bayesian inference algorithm that replaces the stochastic term in the traditional Langevin algorithm with the deterministic gradient of the particle density. Particles are evolved in the direction of increasing target density minus the particle density, or equivalently increasing importance weight. We utilise normalising flows (NF) for density estimation, which allow us to further leverage NF-based preconditioning and sampling to accelerate inference. I will also discuss a gradient-free implementation of DLMC, based on constructing surrogates of the target gradient.
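
The update described in the abstract can be sketched in one dimension as follows. This is a toy illustration, not the speaker's implementation: a Gaussian fitted to the particles stands in for the normalising-flow density estimate, and the target, step size, number of steps and function names are all assumptions.

```python
import numpy as np

TARGET_MEAN, TARGET_STD = 3.0, 0.5  # hypothetical 1D Gaussian target

def grad_log_target(x):
    """Gradient of the log target density (known analytically for this toy target)."""
    return -(x - TARGET_MEAN) / TARGET_STD**2

def grad_log_particle_density(x):
    """Gradient of the log particle density.
    Assumption: a Gaussian fit to the particles replaces the normalising flow."""
    return -(x - x.mean()) / x.var()

def dlmc_sketch(particles, step_size=0.05, n_steps=200):
    """Move particles along grad log(target) minus grad log(particle density).
    No random noise is added at any step."""
    x = particles.copy()
    for _ in range(n_steps):
        x = x + step_size * (grad_log_target(x) - grad_log_particle_density(x))
    return x

rng = np.random.default_rng(0)
initial = rng.standard_normal(500)  # particles start from a standard normal
particles = dlmc_sketch(initial)
```

The velocity vanishes once the particle density matches the target, so the particles settle at the target mean and spread without any stochastic kicks. With a density estimate this crude, the example only works because the target is itself Gaussian.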

How to quantify fields or textures? A guide to the scattering transform

Speaker: Sihao Cheng (JHU, IAS).
Abstract: Extracting information from stochastic fields or textures is a ubiquitous task in science, from exploratory data analysis to classification and parameter estimation. From physics to biology, it tends to be done either through a power spectrum analysis, which is often too limited, or the use of convolutional neural networks (CNNs), which require large training sets and lack interpretability.
I will present a new powerful tool called the “scattering transform”, which stands nicely between the two extremes. It borrows mathematical ideas from CNNs but does not require any training. I will show recent progress of its theoretical framework and use various examples (including generative models) to demonstrate its interpretability and its advantage over traditional statistics.

Summary of the AISRS22

Speaker: Ce Sui (Tsinghua).
Abstract: I will give a very brief summary of the AISRS22 (AI Super-Resolution Simulations) conference that took place last month and highlight some interesting talks. Then we can have a discussion about the ideas behind these works or some techniques they use.
AISRS22 is about applications of AI in the super-resolution enhancement of numerical simulations in Physics and Engineering. You can find the list of talks and their abstracts at the AISRS22 workshop page.

The third JBCA-SKAO-DoA joint machine learning session

Title: Machine Learning For Detecting FRBs.
Speaker: Ben Stappers (JBCA)
Abstract: I will discuss our use of convolutional neural networks for detecting Fast Radio Bursts (FRBs) with MeerKAT.

Title: MAHGIC: a Model Adapter for the Halo–Galaxy Inter-Connection.
Speaker: Yangyao Chen (Tsinghua)
Abstract: We develop a model to establish the interconnection between galaxies and their dark matter haloes. We use Principal Component Analysis (PCA) to reduce the dimensionality of both the mass assembly histories of haloes/subhaloes and the star formation histories of galaxies, and Gradient Boosted Decision Trees (GBDT) to transform halo/subhalo properties into galaxy properties. We use two sets of hydrodynamic simulations to motivate our model architecture and to train the transformation. We then apply the two sets of trained models to dark-matter-only (DMO) simulations to show that the transformation is reliable and statistically accurate. The model trained by a high-resolution hydrodynamic simulation, or by a set of such simulations implementing the same physics of galaxy formation, can thus be applied to large DMO simulations to make ‘mock’ copies of the hydrodynamic simulation. The model is both flexible and interpretable, which paves the way for future applications in which we will constrain the model using observations at different redshifts simultaneously and explore how galaxies form and evolve in dark matter haloes empirically.

How does AI help our Bayesian Inference: several methods with an example on the deblending problem

Speaker: Kangning Diao (Tsinghua) and Ce Sui (Tsinghua).
Abstract: We will introduce how to use deep generative models and Bayesian Inference to solve inverse problems in Astronomy. Specifically, we are going to demonstrate this method by using The Deblending Problem as an example. To help you better understand the process, we designed a Jupyter notebook that covers the main procedure and codes for solving this problem using AI-assisted Bayesian Inference.

The second JBCA-SKAO-DoA joint machine learning club

Opening:
Dandan Xu (Tsinghua)

Title: Unsupervised machine learning for more efficient repetitive modeling of large scale datasets.
Speaker: Eamonn Kerins (JBCA)
Abstract: In an era of large-scale surveys astrophysicists are often required to analyse vast datasets to identify and model signals of a known prior form. I will talk about how unsupervised machine learning methods can help accelerate such modelling. The Spearnet survey is using telescope networks to observe and model exoplanet atmospheres. We have developed a new hybrid method that uses machine learning to accelerate our atmospheric modelling. I'll describe the basic method and its potential for widespread application to other problems.

Title: Classifying Major Mergers in the CANDELS fields using a Deep Learning model trained with IllustrisTNG data.
Speaker: Leonardo Ferreira (JBCA)
Abstract: Merging is potentially the dominant process in galaxy formation, yet there is still debate about its history over cosmic time. To address this, we classify major mergers and measure galaxy merger rates up to z∼3 in all five CANDELS fields using convolutional neural networks (CNNs) trained with simulated galaxies from the IllustrisTNG cosmological simulations. We show that the model can achieve 90% accuracy when classifying mergers from the simulation and has the additional feature of separating mergers before the infall of stellar masses from post mergers. We compare the machine learning classifications on CANDELS galaxies with visual merger classifications from Kartaltepe et al. 2015, and show that they are broadly consistent. Finally, we demonstrate that galaxy merger rates measured by the model are consistent with results found for CANDELS galaxies using close pairs statistics.
video

(Journal club) AI Poincaré: Machine Learning Conservation Laws from Trajectories

Speaker: Jiani Chu (Tsinghua) and Richard Long (MU-NAOC).
Abstract: We present AI Poincaré, a machine learning algorithm for auto-discovering conserved quantities using trajectory data from unknown dynamical systems. We test it on five Hamiltonian systems, including the gravitational 3-body problem, and find that it discovers not only all exactly conserved quantities, but also periodic orbits, phase transitions and breakdown timescales for approximate conservation laws.
slides1
arxiv

(Journal club) Real-time detection of anomalies in large-scale transient surveys

Speaker: Xiaosheng Zhao (Tsinghua).
Abstract: New time-domain surveys, such as the Rubin Observatory Legacy Survey of Space and Time (LSST), will observe millions of transient alerts each night, making standard approaches of visually identifying new and interesting transients infeasible. We present two novel methods of automatically detecting anomalous transient light curves in real-time. Both methods are based on the simple idea that if the light curves from a known population of transients can be accurately modelled, any deviations from model predictions are likely anomalies. The first modelling approach is a probabilistic neural network built using Temporal Convolutional Networks (TCNs) and the second is an interpretable Bayesian parametric model of a transient. We demonstrate our methods' ability to provide anomaly scores as a function of time on light curves from the Zwicky Transient Facility. We show that the flexibility of neural networks, the attribute that makes them such a powerful tool for many regression tasks, is what makes them less suitable for anomaly detection when compared with our parametric model. The parametric model is able to identify anomalies with respect to common supernova classes with low false anomaly rates and high true anomaly rates achieving Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) scores above 0.8 for most rare classes such as kilonovae, tidal disruption events, intermediate luminosity transients, and pair-instability supernovae. Our ability to identify anomalies improves over the lifetime of the light curves. Our framework, used in conjunction with transient classifiers, will enable fast and prioritised follow-up of unusual transients from new large-scale surveys.
slides
arxiv

The first JBCA-SKAO-DoA joint machine learning club

Opening:
Richard Long
DoA (Chair: Yi Mao):
Yi Mao, Shude Mao, Ben Wang, Hongming Tang
SKAO (Chair: Philippa Hartley):
Philippa Hartley, Peter Wortmann
JBCA (Chair: Anna Scaife):
Anna Scaife, Ben Stappers, Albert Zijlstra
Closing:
Micah Bowles

AI-assisted super-resolution cosmological simulations

Speaker: Yueying Ni (CMU).
Abstract: In this work, we expand and test the capabilities of our recently developed super-resolution (SR) model to generate high-resolution (HR) realizations of the full phase-space matter distribution, including both displacement and velocity, from computationally cheap low-resolution (LR) cosmological N-body simulations. The SR model enhances the simulation resolution by generating 512 times more tracer particles, extending into the deeply non-linear regime where complex structure formation processes take place. We validate the SR model by deploying the model in 10 test simulations of box size $100\,h^{-1}\,\mathrm{Mpc}$, and examine the matter power spectra, bispectra and 2D power spectra in redshift space. We find the generated SR field matches the true HR result at percent level down to scales of $k \sim 10\,h\,\mathrm{Mpc}^{-1}$. We also identify and inspect dark matter halos and their substructures. Our SR model generates visually authentic small-scale structures that cannot be resolved by the LR input and are in good statistical agreement with the real HR results. The SR model performs satisfactorily on the halo occupation distribution, halo correlations in both real and redshift space, and the pairwise velocity distribution, matching the HR results with comparable scatter, thus demonstrating its potential in making mock halo catalogs. The SR technique can be a powerful and promising tool for modelling small-scale galaxy formation physics in large cosmological volumes.
pdf/video to download
arxiv

Radio Galaxy Zoo: how does citizen science prompt machine learning development?

Speaker: Hongming Tang (MU-Tsinghua).
pdf/video to download

RFI mitigation using machine learning

Speaker: Ce Sui (Tsinghua).

(Journal club) Capturing the physics of MaNGA galaxies with self-supervised Machine Learning

Speaker: Dandan Xu (Tsinghua).
Abstract: As available data sets grow in size and complexity, advanced visualization tools enabling their exploration and analysis become more important. In modern astronomy, integral field spectroscopic galaxy surveys are a clear example of increasing dimensionality and complexity of datasets, which challenge the traditional methods used to extract the physical information they contain. We present the use of a novel self-supervised Machine Learning method to visualize the multi-dimensional information on stellar population and kinematics in the MaNGA survey in a two dimensional plane. Our framework is insensitive to non-physical properties such as the size of integral field unit (IFU) and is therefore able to order galaxies according to their resolved physical properties. Using the extracted representations, we study how galaxies distribute based on their resolved and global physical properties. We show that even when using exclusively information about the internal structure, galaxies naturally cluster into two well-known categories from a purely data driven perspective: rotating main-sequence disks and massive slow rotators, hence confirming distinct assembly channels. Low-mass rotation-dominated quenched galaxies appear only as a third cluster if information about the integrated physical properties is preserved, suggesting a mixture of assembly processes for these galaxies without any particular signature in their internal kinematics that distinguishes them from the two main groups. The framework for data exploration is publicly released with this publication, ready to be used with the MaNGA or other integral field data sets.
arxiv

Machine Learns AGN

Speaker: Jiachen Jiang (Tsinghua).
Abstract: The optical and UV variability of the majority of AGN may be related to the reprocessing of rapidly-changing X-ray emission from a more compact region near the central black hole. Such a reprocessing model would be characterised by lags between X-ray and optical/UV emission due to differences in light travel time. Observationally however, such lag features have been difficult to detect due to gaps in the lightcurves introduced through factors such as source visibility or limited telescope time. In this work, Gaussian process regression is employed to interpolate the gaps in the Swift X-ray and UV lightcurves of the narrow-line Seyfert 1 galaxy Mrk 335. In a simulation study of five commonly-employed analytic Gaussian process kernels, we conclude that the Matern $\frac{1}{2}$ and rational quadratic kernels yield the most well-specified models for the X-ray and UVW2 bands of Mrk 335. In analysing the structure functions of the Gaussian process lightcurves, we obtain a broken power law with a break point at 125 days in the UVW2 band. In the X-ray band, the structure function of the Gaussian process lightcurve is consistent with a power law in the case of the rational quadratic kernel whilst a broken power law with a break point at 66 days is obtained from the Matern $\frac{1}{2}$ kernel. The subsequent cross-correlation analysis is consistent with previous studies and furthermore, shows tentative evidence for a broad X-ray-UV lag feature of up to 30 days in the lag-frequency spectrum where the significance of the lag depends on the choice of Gaussian process kernel.
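
For readers new to the technique, here is a minimal NumPy sketch of Gaussian process interpolation across a gap using a Matern $\frac{1}{2}$ (exponential) kernel. It is not the analysis from the paper: the light curve is synthetic, the hyperparameters (amplitude, length_scale, noise_std) are fixed by hand rather than fitted, and the function names are hypothetical.

```python
import numpy as np

def matern12_kernel(t1, t2, amplitude=1.0, length_scale=50.0):
    """Matern-1/2 covariance: amplitude^2 * exp(-|t1 - t2| / length_scale)."""
    lag = np.abs(t1[:, None] - t2[None, :])
    return amplitude**2 * np.exp(-lag / length_scale)

def gp_interpolate(t_obs, y_obs, t_new, amplitude=1.0, length_scale=50.0, noise_std=0.05):
    """Posterior mean and standard deviation of the GP at the times t_new."""
    mean = y_obs.mean()  # constant mean function, estimated from the data
    K = matern12_kernel(t_obs, t_obs, amplitude, length_scale)
    K += noise_std**2 * np.eye(len(t_obs))  # measurement noise on the diagonal
    K_star = matern12_kernel(t_new, t_obs, amplitude, length_scale)
    post_mean = mean + K_star @ np.linalg.solve(K, y_obs - mean)
    # Posterior variance: prior variance minus the part explained by the data.
    explained = np.sum(K_star * np.linalg.solve(K, K_star.T).T, axis=1)
    post_std = np.sqrt(np.clip(amplitude**2 - explained, 0.0, None))
    return post_mean, post_std

# Synthetic light curve with a 100-day gap (illustrative values only).
t_obs = np.concatenate([np.arange(0.0, 100.0, 5.0), np.arange(200.0, 300.0, 5.0)])
y_obs = np.sin(t_obs / 40.0)
t_new = np.linspace(0.0, 300.0, 61)  # 5-day grid; index 10 is t=50, index 30 is t=150
pred_mean, pred_std = gp_interpolate(t_obs, y_obs, t_new)
```

The predicted uncertainty is small where data exist and grows inside the gap. This is why lag estimates from interpolated light curves can depend on the kernel, as the abstract notes.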
pdf/video to download
arxiv

Reionization Parameter Estimation using Solid Harmonic Wavelet Scattering Transform with Likelihood-free Inference

Speaker: Xiaosheng Zhao (Tsinghua).

Discussion Section and Initiation of the ML Session

Speaker: All attendees.