Poster Titles and Abstracts

Poster Session 1: Wednesday, September 3 at 3:15–4:15 p.m. CT
Poster Session 2: Thursday, September 4 at 4:15–5:15 p.m. CT

Note: Posters in Poster Session 1 will be on display for the first two days of the conference (Tuesday and Wednesday), and posters in Poster Session 2 will be on display for the last two days of the conference (Thursday and Friday).

**Organized alphabetically by presenter’s last name within poster session**

Poster Session 1:


Julian Cuevas-Zepeda, The University of Chicago
Intelligent Systems for Automated Sensor Calibration
By nature, novel sensor development is expensive. During the testing and maturation phase of a new sensor, a significant time investment is required to find the optimal operating parameters before the sensor’s performance can be studied and characterized under various conditions. The total effort spent on parameter optimization can occupy a year or more of an expert’s time, representing a significant investment in routine work. In this work, we present a novel technique for automated sensor calibration. This technique leverages machine learning to autonomously identify optimal operating states. Through this technique, we achieved a Fano-limited energy resolution for our n-SiSeRO CCD and have standardized its use in our laboratory.


Ugur Demir, Northwestern University
AI-Accelerated Binary Star Track Interpolation Through Dynamic Time Warping
Binary stellar evolution simulations are computationally expensive, requiring hundreds of CPU hours per evolutionary sequence. This severely limits large-scale population synthesis studies. We present a novel AI-accelerated approach that reduces computation time from hours to seconds while maintaining astrophysical accuracy.

Our method leverages pre-simulated grids as memory banks for K-Nearest Neighbor interpolation, enhanced with Dynamic Time Warping (DTW) for optimal sequence alignment. Binary evolution tracks present unique challenges due to their irregularly sampled, variable-length nature with abrupt morphological changes during critical phases like mass transfer events. Traditional interpolation methods that compress tracks to uniform length lose essential physical information. Our DTW-based framework preserves all morphological features of the tracks by finding correspondences between evolutionary states. This enables us to align similar signals, allowing for more accurate interpolation. Evaluation across multiple binary configurations demonstrates consistent performance improvements, particularly for rapidly varying parameters such as mass transfer rates.
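As an editorial illustration only (not the authors’ implementation; all function names and the choice of k are hypothetical), the core idea of DTW-based nearest-neighbor lookup over variable-length tracks can be sketched as:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two 1-D
    sequences of possibly different lengths."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible alignments.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def knn_tracks(query, grid_tracks, k=3):
    """Indices of the k pre-simulated grid tracks closest to `query`
    under DTW; these neighbors would then be blended to interpolate."""
    dists = [dtw_distance(query, t) for t in grid_tracks]
    return np.argsort(dists)[:k]
```

Unlike resampling every track to a uniform length, the DTW alignment compares tracks state-to-state, which is what preserves abrupt features such as mass-transfer phases.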


Suyash Deshmukh, Vanderbilt University
Transformer-Based Analysis of Glitches and Signals in Gravitational Wave Data
Gravitational wave detectors, such as LIGO, are among the most sensitive instruments ever built. Alongside signals from astrophysical events, they record a wide range of non-astrophysical noise transients – glitches – caused by instrumental and environmental disturbances. These glitches can mimic or obscure real signals, posing a major challenge for confident detection and accurate interpretation. Matched filtering, the traditional signal detection method, is optimal but computationally intensive, relying on large banks of simulated templates. It struggles to identify signals outside known parameter spaces and is highly sensitive to the presence of glitches, which must be studied and characterized to mitigate their impact on the data.

In this work, we investigate the use of OpenAI’s Whisper model for gravitational wave data analysis. Whisper is a large-scale audio transformer that can be adapted with minimal fine-tuning for tasks such as glitch classification, signal detection, and parameter estimation. Leveraging a pre-trained model preserves the adaptability of large architectures without the high cost of training from scratch. We further analyze Whisper’s embedding space and apply dimensionality reduction to uncover label-agnostic clustering of glitches. This approach enables rapid glitch characterization beyond template-limited methods and offers the ability to recognize and adapt to previously unseen glitch types.


Jaden Downing, University of Illinois Urbana-Champaign
Physics Informed Neural Networks and Stability of Planets in Stellar Binaries
Exoplanets orbiting in stellar binary systems have been discovered with notable frequency in our galactic neighborhood. Much research has been done on predicting the dynamical evolution of triple systems through analytical and numerical methods and, most recently, by incorporating machine learning. Some authors have succeeded in using such methods to classify stable, mixed, and unstable dynamical zones for planets in multi-star-planet systems. However, previous methods have been limited in their representation of the dynamical behavior of planets in the mixed zones of stability. This research proposes refining the accuracy of the behavior modeled in these regions through the expansion of the parameter spaces covered and through the application of Physics Informed Neural Networks (PINNs). PINNs can encode the physical laws that govern orbital mechanics, making them useful for modeling complex and nonlinear dynamical systems with accuracy that is challenging to attain using classical machine learning methods. In particular, PINNs will allow us to more accurately predict and classify behavior in mixed regions near orbital resonances. Such advancements in the accuracy of stability classifications for planets in stellar binary systems could have impacts on the detection of new exoplanets, determining planet stability, and understanding planet formation and decay.


Marina Dunn, Institute of Astrophysics of the Canary Islands
Inferring the 3D Shapes of Galaxies: Bayesian and Simulation-Based Inference with Euclid Data
Understanding the intrinsic 3D shapes of galaxies is key to constraining their formation and evolution, yet direct measurements remain challenging due to projection effects and observational limitations. We present a statistical framework to infer 3D shape distributions for hundreds of thousands of star-forming galaxies from the Euclid Quick Data Release (Q1), leveraging 2D structural observables such as axis ratios and semi-major axes. Using a differentiable Bayesian model with Hamiltonian Monte Carlo (HMC), we aim to constrain the fractions of prolate, oblate, and spheroidal systems by sampling posterior distributions across stellar masses (log(M*/Msun) = 9–11.5) and redshifts (z ~ 0–3). In parallel, we implement a simulation-based inference (SBI) approach using the Learning the Universe Implicit Likelihood Inference (LtU-iLi) framework, trained on millions of mock 2D histograms, to validate and compare against the HMC results. This dual-method approach provides robust constraints on galaxy shape demographics, shedding light on how intrinsic geometry evolves with mass and cosmic time. These methods demonstrate the potential of combining Euclid’s high-resolution imaging with advanced statistical techniques to recover 3D galaxy structure from imaging data — and will scale to galaxy samples nearly 30 times larger with the upcoming Euclid Data Release 1 (DR1).


Anees Fatima, Chicago State University
AI-Driven Real-Time Cyber Threat Detection for Wi-Fi Networks
As smart cities and IoT ecosystems increasingly depend on Wi-Fi connectivity, the wireless spectrum has become a critical attack surface for cyber threats such as DDoS attacks, packet sniffing, and unauthorized access. We introduce WiFi Threat Detector, an AI-powered, real-time cybersecurity framework that passively monitors 802.11 traffic to detect wireless-layer attacks using commodity hardware.

The system is composed of four modular components: offline machine learning model training, live packet capture, intelligent inference, and interactive visual analytics. Our approach leverages AI techniques including RF fingerprinting, entropy-based feature extraction, and behavioral traffic analysis. It aligns with emerging paradigms in SDN-Edge-IoT security [3], Zero Trust architecture [1], and signal-level device identification [2].

Evaluated on real-world datasets, the framework achieves over 97% detection accuracy with minimal latency. This lightweight and scalable solution advances proactive, AI-enhanced defense mechanisms for pervasive wireless network security.

References
1. Pokhrel, S. R., et al. (2025). Toward Decentralized Operationalization of Zero Trust Architecture. IEEE JSAC.
2. Tang, Y., et al. (2024). UAV Detection and Identification Based on WiFi Signal and RF Fingerprint. MDPI Sensors.
3. Jin, X., et al. (2023). Robust DDoS Detection and Mitigation in SDN-Edge-IoT. IEEE Access.


Alex Garcia, University of Virginia
Using AI to Learn the Next Generation of Cosmological Simulations
Cosmological simulations have revolutionized our understanding of galaxy formation and evolution. Many models successfully reproduce a range of galaxy scaling relations; however, a critical — and often underappreciated — uncertainty remains in how robust these results are to choices in model physics and numerical resolution. Typically, simulations are calibrated “by eye,” often using lower-resolution models and varying one parameter at a time. Moreover, flagship simulations are usually released as a single realization, limiting our ability to assess model sensitivity and predictive uncertainty. In this talk, we introduce a new suite of galaxy simulations based on the SMUGGLE model, which features a high-resolution, multiphase interstellar medium (ISM). Our SMUGGLE suite includes over 1,000 simulations of Milky Way–mass galaxies with systematic variations in astrophysical parameters and resolution. We employ this dataset to train and validate an active machine learning framework designed to (i) identify optimal models and (ii) quantify uncertainties in simulation predictions. This approach offers a more rigorous path toward understanding the robustness and predictive power of galaxy formation models.


Nikhil Garuda, The University of Texas at Austin
Using HaloFlow with Domain Adaptation
We present an extension of HaloFlow (Hahn et al. 2023), a machine learning approach that infers the dark matter and stellar mass of galaxies from their photometry and morphology. HaloFlow uses simulation-based inference with normalizing flows to conduct rigorous Bayesian inference. In this study, we use state-of-the-art synthetic galaxy images from Bottrell et al. (2023) that are constructed from the IllustrisTNG, Simba and Eagle hydrodynamic simulations and include realistic effects of the Hyper Suprime-Cam Subaru Strategy Program (HSC-SSP) observations. While HaloFlow improves mass estimation, it isn’t robust on its own when applied to this dataset. In this talk, I’ll discuss the domain adaptation techniques we’ve explored to enhance performance. I’ll present the results from these methods, as well as future directions for improving model robustness and generalizing across different datasets.


Harley Katz, The University of Chicago
Machine-Learned Closures for Three-Moment Radiation Transport
Radiation transport (RT) is a key physical ingredient of most state-of-the-art cosmological simulations, and two-moment methods (e.g. M1) have emerged as among the most popular algorithms for solving the RT equation due to their computational efficiency. However, two-moment methods are known to fail catastrophically in particular situations, e.g. when radiation fronts attempt to cross. Extending to higher-order moment methods, such as a three-moment scheme, can remedy these failures; however, unlike the two-moment methods, there are no known analytic closures that generalize to arbitrary situations. Hence, three-moment methods are too computationally expensive to use in practice because they require the solution of a constrained optimization problem at every grid point and time step. To overcome the computational expense of three-moment methods while achieving the same physical fidelity, we propose a machine-learned closure for a three-moment system that directly maps low-order moments to higher-order ones, bypassing costly optimization steps. Our neural network offers orders-of-magnitude speedup and facilitates scalable radiation hydrodynamics simulations in multi-dimensional settings, overcoming the limitations of two-moment methods at a similar computational expense.


Chang Liu, Northwestern University
A Morphological Star-galaxy Classifier for DESI Legacy Surveys: Application to Time-domain Astronomy
Separating resolved and unresolved sources in large imaging surveys is essential for downstream science, such as identifying extragalactic transients in wide-field time-domain surveys. I will present a supervised machine-learning classifier developed to identify extended sources (i.e., galaxies) in DESI Legacy Survey (LS) DR10 imaging, using XGBoost. Rather than working with raw images, we rely on morphological features derived from LS data products. To address missing values, which are common in this data set, we build a “Hybrid” model: a weighted combination of two XGBoost classifiers, each containing features combining aperture flux measurements from the “blue” (gr) and “red” (iz) passbands observed by LS. The Hybrid model provides a good balance between sensitivity and robustness, and significantly improves upon the morphological types provided by the LS pipeline: it achieves near-perfect completeness for galaxies, critical for identifying transient hosts, while still recovering >70% of stars brighter than 24 mag. The resulting LS Point Source Catalog (LS-PSC) provides morphological scores for over 3 billion sources, making it the largest morphological catalog for resolved and unresolved sources. LS-PSC will be integrated into the alert stream of the upcoming La Silla Schmidt Southern Survey (LS4) to enable real-time filtering of extragalactic transients.
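The “Hybrid” weighting described above can be sketched as a convex combination of the two band-specific classifiers’ scores, with a fallback when one model cannot be evaluated. This is an illustrative sketch only; the weight value and fallback rule are hypothetical, not the catalog’s actual choices.

```python
import numpy as np

def hybrid_score(p_blue, p_red, w_blue=0.5):
    """Weighted combination of two classifier probabilities.
    p_blue / p_red: galaxy probabilities from the gr- and iz-band
    models; NaN marks a model that could not run (missing photometry)."""
    p_blue = np.asarray(p_blue, dtype=float)
    p_red = np.asarray(p_red, dtype=float)
    out = w_blue * p_blue + (1.0 - w_blue) * p_red
    # Fall back to whichever single model produced a valid score.
    out = np.where(np.isnan(p_blue), p_red, out)
    out = np.where(np.isnan(p_red), p_blue, out)
    return out
```

A final threshold on the combined score would then separate point sources from extended sources.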


Kate Overdeck, The University of Chicago
Searching for Milky Way Satellite Galaxies with Citizen Science and Machine Learning
Milky Way satellite galaxies are critical to our understanding of galaxy formation, dark matter, and the hierarchical structure of the universe, yet they are challenging to detect due to their low surface brightness, small angular size, and sparse stellar populations. In this project, we combine human classifications from the Zooniverse citizen science platform with machine learning approaches to image classification. Zooniverse volunteers visually inspect stellar density maps and Hertzsprung–Russell diagrams constructed from stars in the vicinity of each target field to flag potential dwarf galaxy candidates. We use labels from citizen scientists to train a machine learning model on both simulated data and real data. These inputs allow the model to learn key signatures of dwarf galaxies based on spatial and photometric patterns. We compare the outputs of the two methods to evaluate agreement, detection efficiency, and the potential for each to capture unusual or borderline cases. This comparison not only highlights the strengths and limitations of automated and human-driven discovery but also informs future efforts to combine the two approaches for improved performance.


Kedar Phadke, University of Illinois Urbana-Champaign
Bayesian Inference for Enhanced Cross-Matching of Multiwavelength Astrophysical Surveys
In the current era of astrophysical surveys, linking independent catalogs across the electromagnetic spectrum is crucial. However, challenges arise from significant variations in spatial resolution, especially at mm-wavelengths, where differences of up to a factor of ten exist across various surveys. Combining these resolution issues with potential anticorrelation among catalogs based on the spectral energy distribution (SED) characteristics of different source populations makes cross-matching sources problematic. Our approach will employ a generalized and flexible cross-matching technique based on Bayesian inference to connect survey catalogs effectively. We will present early results beginning with existing datasets, including SPT-3G, WISE, ASKAP, and DES, while ensuring applicability to next-generation surveys such as Rubin-LSST and CMB-S4. This powerful tool will be automated using forward models and differentiable probabilistic programming for fast and accurate associations across diverse wavelengths, ultimately aiding in the study of astrophysical phenomena. Additionally, we will present an initial classification of approximately 30,000 SPT-3G winter-field catalog sources, which will be the deepest source catalog at mm-wavelengths. This will help us identify exotic sources for potential follow-up, including very high-redshift dusty star-forming galaxies and a previously unexplored population of active galactic nuclei (AGN).


Nabeel Rehemtulla, Northwestern University
Zero-shot inference with time series foundation models for survey-agnostic periodic variable star classification
Alongside their counterparts in natural language processing and vision, time series foundation models have recently garnered significant attention for their claimed ability to model and forecast any time series. Chronos is one such model; it is trained on hundreds of gigabytes of time series from finance, traffic, healthcare, and other domains. We assess Chronos’s out-of-domain zero-shot inference capabilities by generating embeddings of periodic variable star light curves and evaluating them with unsupervised latent space clustering and classification as downstream tasks. We use an RNN and handcrafted features as baselines for comparison, and evaluate all embedding models on large, human-labeled datasets of periodic variable stars from ZTF, CSS, and OGLE. We find that Chronos produces information-rich embeddings, leading to state-of-the-art clustering and classification performance. While the RNN fails to generalize across surveys, we show that Chronos is survey-agnostic, yielding similar performance across real astronomical datasets. These results advocate for broad adoption of Chronos, or similar tools, for modeling the irregular time series produced by wide-field surveys like ZTF and soon LSST.


Snigdaa Sethuram, Argonne National Laboratory
AI-accelerated stellar feedback model
In line with the second pillar of the SkAI mission, this work explores the use of deep learning to emulate stellar feedback processes in high-resolution cosmological simulations, significantly reducing the computational burden imposed by traditional subgrid models. We train a spatiotemporally aware neural network based on a convolutional long short-term memory architecture to predict the effects of stellar feedback across time.

Using training data from a state-of-the-art simulation with a peak comoving spatial resolution of ~1 parsec, the model processes 3D gridded input volumes (200 pc per side) centered on star particles. Each input includes multiple physical channels: gas density, temperature, velocity, metallicity, and chemical species densities. The model ingests two consecutive timesteps of this data; for newly formed stars, the first timestep is zeroed. It then outputs a prediction of the third timestep, effectively learning a many-to-one mapping conditioned on local spatiotemporal structure.

The trained model achieves ~60% accuracy averaged across all physical channels when compared to ground-truth simulation outputs. These results establish a promising proof of concept for incorporating machine-learned feedback emulators into next-generation cosmological simulation frameworks, enabling real-time, in-situ prediction of baryonic physics with substantially lower computational cost.


Yixuan Sun, Argonne National Laboratory
Probabilistic Generative Modeling with Physics Constraints
Diffusion and flow-matching models achieve state-of-the-art results for scientific processes, enabling uncertainty quantification, probabilistic prediction, and data synthesis. However, their samples can exhibit nonphysical artifacts, limiting downstream use. We address this by enforcing physical constraints via energy composition, where we combine a learned generative energy with a constraint energy defined as an isotropic Gaussian on the constraint residual, with a variance that controls tolerance. The resulting constraint-conditioned posterior defines an unnormalized target density. We then draw asymptotically exact samples from this constrained distribution using Markov chain Monte Carlo. Across multiple benchmark scientific datasets, our method reduces constraint violations while preserving fidelity to the data distribution.
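A toy version of the energy-composition idea, with purely illustrative stand-ins: here the “learned” generative energy is a standard Gaussian, the constraint residual is r(x) = x − 1 penalized with an isotropic Gaussian of variance σ², and the composed density is sampled with random-walk Metropolis. None of these choices reflect the authors’ actual model.

```python
import numpy as np

def total_energy(x, sigma=0.1):
    """Composed energy: toy generative energy (standard Gaussian,
    E = x^2/2) plus an isotropic-Gaussian constraint energy on the
    residual r(x) = x - 1; sigma controls the tolerance."""
    e_gen = 0.5 * x ** 2
    e_con = 0.5 * (x - 1.0) ** 2 / sigma ** 2
    return e_gen + e_con

def metropolis(n_steps=5000, step=0.2, seed=0):
    """Random-walk Metropolis targeting exp(-total_energy)."""
    rng = np.random.default_rng(seed)
    x = 0.0
    samples = []
    for _ in range(n_steps):
        prop = x + step * rng.normal()
        # Log-space acceptance test avoids overflow in exp().
        if np.log(rng.random()) < total_energy(x) - total_energy(prop):
            x = prop
        samples.append(x)
    return np.array(samples)
```

For these toy energies the constrained posterior is itself Gaussian (mean ≈ 0.99 for σ = 0.1), so the chain’s post-burn-in samples concentrate near the constraint while still feeling the pull of the generative term.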


Josh Taylor, The University of Texas at Austin/CosmicAI
Time-Series Modeling of High-Resolution Radio Spectra
We present a modeling technique to characterize high-resolution radio spectra based on ARIMA (AutoRegressive Integrated Moving Average) modeling from statistical time series analysis. ARIMA isolates the dependence of a spectrum’s shape upon both its signal and structured noise components, making fewer assumptions about the shape of a spectrum’s velocity structure than standard Gaussian component fitting, and is intended to serve as a complement to the latter. Structural dependence modeling can 1) improve summary moment calculations; 2) provide alternative approaches to signal noise estimation, which can be modeled channel-wise if desired (via GARCH/ARCH); and 3) help characterize the provenance of any observed structure in a cube’s spectra (as signal, structured noise, or white noise). ARIMA modeling is computationally lightweight, backed by statistical theory and, as a first step to an analytical pipeline, can inform further downstream tasks (e.g., identifying when Gaussian component fitting is appropriate).
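As a schematic of the autoregressive component only (the I and MA parts of ARIMA are omitted, and this is not the authors’ pipeline), the channel-to-channel dependence of a spectrum can be captured by a least-squares AR(1) fit:

```python
import numpy as np

def fit_ar1(y):
    """Least-squares estimate of (c, phi) in the AR(1) model
    y[t] = c + phi * y[t-1] + noise, for a spectrum `y` sampled
    on uniform velocity channels."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    c, phi = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    return c, phi

def ar1_residuals(y):
    """Residuals after removing the AR(1) structure; their scatter
    gives a simple estimate of the unstructured channel noise."""
    y = np.asarray(y, dtype=float)
    c, phi = fit_ar1(y)
    return y[1:] - (c + phi * y[:-1])
```

A phi near zero suggests the spectrum is close to white noise, while substantial phi indicates structure (signal or correlated noise) worth modeling before, or instead of, Gaussian component fitting.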


Padma Venkatraman, University of Illinois Urbana-Champaign
Lens Modeling Accuracy in the Expected LSST Lensed AGN Sample
Strong gravitational lensing of quasars enables measurements of cosmological parameters through their time-delay distances (time-delay cosmography, TDC). With data from the upcoming LSST survey, we anticipate using a sample of O(1000) lensed quasars for TDC. To prepare for this dataset and enable this measurement, we construct and analyze a realistic mock sample of 1300 systems drawn from the OM10 (Oguri & Marshall 2010) catalog of simulated lenses with AGN at z < 3 in order to test a key aspect of the analysis pipeline – lens modeling. We realize the lenses as power-law elliptical mass distributions and generate simulated 5-year LSST i-band coadd images for all systems. From every image, we infer the lens mass model parameters using neural posterior estimation (NPE). Focusing on the key mass model parameters, the Einstein radius and the mass density slope, we find that, given a representative prior (sampled as the NPE training set) and consistent mass-light correlations in the test and training sets, we can recover the Einstein radius with less than 1% bias and 6.5% precision per lens, and the density slope with less than 4% bias and 8% precision per lens. We present the impact of distribution shift between the training and test distributions across learned and latent parameters. We also present the impact of different data optimization methods (such as perfect deconvolution, lens light subtraction, etc.) prior to modeling. Under every experiment, we combine the inferred lens mass models using Bayesian Hierarchical Inference to recover the properties of the underlying true sample and demonstrate the need for a more flexible and physically motivated population model.


Georgios Zacharegkas, Argonne National Laboratory
A generative model for lightcones of dark matter halo assembly histories
We present a differentiable model of the population of dark matter halos and subhalos occupying a cosmological lightcone. Our model can generate Monte Carlo samples of (sub)halos with an abundance that accurately approximates the mass functions in cosmological simulations across a wide range of halo mass and redshift, while capturing their dependence on cosmology, through a neural-network based halo mass function model. Each (sub)halo that is generated has a mass assembly history (MAH) captured by $\theta_{\rm MAH}$, the Diffmah parameterization of halo MAH, which is predicted by DiffmahNet; the probability distribution $P(\theta_{\rm MAH}\vert M_{\rm halo},z_{\rm obs})$ is also in close agreement with the distribution seen in simulations. The differentiability of our model makes it compatible with gradient-based parameter inference pipelines, which modern cosmological analyses increasingly rely on.


Roy Zhao, The University of Chicago
Comprehensive Neural Posterior Estimation for Galaxy-Galaxy Strong Lensing
We present a new deep learning model based on neural posterior estimation (NPE) for comprehensive extraction of astrophysical parameters from galaxy-scale strong gravitational lenses. The unprecedentedly large number of galaxy-scale strong lenses expected in future cosmological surveys (${\cal O}(10^5)$) promises to enable valuable statistical constraints in various studies ranging from galaxy formation to the nature of dark matter, but it also poses a significant challenge for traditional modelling pipelines. To this end, our automated model includes several new, state-of-the-art features and approaches leveraging the framework of simulation-based inference (SBI). We infer a total of 20 parameters describing the mass and light profiles of both lens and source galaxies, using simulated raw multi-band data modelled under noise and observing conditions expected for the Legacy Survey of Space and Time (LSST), with summary statistics generated by a residual network. We examine the efficacy of multi-band data in extracting nearly 20 model parameters simultaneously from strong lensing images including lens light. Finally, we perform a comprehensive set of diagnostics for SBI models, evaluating the model’s prediction accuracy, stability, and uncertainty quantification.


Poster Session 2:


Simone Astarita, University of Amsterdam and UniverseTBD
How We (Co-)Write Astronomy with LLMs
The ease of access to and the capabilities of large language models (LLMs) are changing scientific research, and astronomy is no exception. As space-specific tools proliferate (e.g., Pathfinder, AstroCoder, AstroAgents) and models expand their knowledge of the field, interest in studying the effects of AI on astronomy research and paper writing grows.

I briefly survey previous work on how LLMs might have spiked the use of certain words, highlighting the limitations of the methods and data used so far. I then introduce a dataset of over 300,000 astronomy abstracts compared with two AI versions, one produced by ChatGPT 3.5 and one by 4o, making it the largest AI-vs-human dataset on a specific subject and task.

First, I discuss how I crafted this dataset to reflect the actual usage of ChatGPT as best as possible. I then briefly illustrate its potential to differentiate between human- and AI-produced short text but explain why this problem remains difficult. Finally, I present some promising results on the effects of LLMs on the population level, showcasing statistical techniques to assess how LLMs modify our lexicon and writing in general.


Chayan Chatterjee, Vanderbilt University
Towards A Foundational AI Model for Gravitational Waves
As gravitational wave detectors become more advanced and sensitive, the number of signals recorded by Advanced LIGO and Virgo from merging compact objects is expected to rise dramatically. This surge in detection rates necessitates the development of adaptable, scalable, and efficient tools capable of addressing a wide range of tasks in gravitational wave astronomy. Foundational AI models present a transformative opportunity in this context by providing a unified framework that can be fine-tuned for diverse applications while leveraging the power of large-scale pre-training. In this work, we explore how advanced transformer models, specifically OpenAI’s Whisper, can be adapted as a foundational model for gravitational wave data analysis. By fine-tuning Whisper’s encoder model—originally trained on extensive audio data—and combining it with neural networks for specialized tasks, we achieve reliable results in detecting astrophysical signals and classifying transient noise artifacts or ‘glitches’. This represents the first application of open-source transformer models, pre-trained on unrelated tasks, for gravitational wave research, demonstrating their potential to enable versatile and efficient data analysis in the era of rapidly increasing detection rates.


Jennifer Coburn, Argonne National Laboratory
Dr. MACS: Domain-Specific Multi-Agent AI for Astronomical Research
Current AI systems lack the transparency and anti-fabrication safeguards required for astronomical research, often presenting confident results while mixing real and simulated astrophysical data without disclosure. We developed Dr. MACS (Multi-Agent Cosmology System) to achieve enhanced inference from cosmic survey data with scientific rigor, directly addressing SkAI’s first research pillar. Our LLM-agnostic multi-agent framework leverages existing foundation models as reasoning engines to tackle unique challenges astronomers face with large-scale survey data, implementing specialized agents for SDSS database queries, arXiv astrophysics literature searches, astronomical statistical analysis, and celestial object visualization. Dr. MACS enforces astronomical data integrity through complete provenance tracking of observational datasets, honest uncertainty quantification for astrophysical measurements, and clear limitation disclosure that current systems lack for astronomical research. The system incorporates human-in-the-loop feedback enabling astronomers to guide research workflows and includes interpretable AI methods with mandatory validation to prevent fabricated astrophysical results. We evaluated Dr. MACS on 14 SDSS-based astronomical research questions requiring authentic observational data analysis. Unlike general AI systems providing unreliable astronomical answers, Dr. MACS delivers transparent results with complete data sources, statistical measures, and research limitations specific to astronomical investigations. This work demonstrates purpose-built AI tools and methodologies meeting the reproducibility standards astronomers require for publishable science, showcasing innovations in interpretable and physics-informed AI for cosmic survey data analysis.


Steven Dillman, Stanford University
Representation Learning for Anomaly Detection and Unsupervised Classification of Variable X-ray Sources
We present a novel representation learning method for downstream tasks like anomaly detection, unsupervised classification, and similarity searches in high-energy data sets. This enabled the discovery of a new extragalactic fast X-ray transient (FXT) in Chandra archival data, XRT 200515, a needle-in-the-haystack event and the first Chandra FXT of its kind. Recent serendipitous discoveries in X-ray astronomy, including FXTs from binary neutron star mergers and an extragalactic planetary transit candidate, highlight the need for systematic transient searches in X-ray archives. We introduce new event file representations, E-t maps and E-t-dt cubes, that effectively encode both temporal and spectral information, enabling the seamless application of machine learning to variable-length event file time series. Our unsupervised learning approach employs PCA or sparse autoencoders to extract low-dimensional, informative features from these data representations, followed by clustering in the embedding space with DBSCAN. New transients are identified within transient-dominant clusters or through nearest-neighbour searches around known transients, producing a catalogue of 3559 candidates (3447 flares and 112 dips). XRT 200515 exhibits unique temporal and spectral variability, including an intense, hard <10 s initial burst, followed by spectral softening in an ~800 s oscillating tail. We interpret XRT 200515 as either the first giant magnetar flare observed at low X-ray energies or the first extragalactic Type I X-ray burst from a faint, previously unknown low-mass X-ray binary in the LMC. Our method extends to data sets from other observatories such as XMM–Newton, Swift-XRT, eROSITA, Einstein Probe, and upcoming missions like AXIS.
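The E–t map construction can be sketched as a 2-D histogram over event arrival times and energies; the bin counts, normalization, and log-energy choice below are arbitrary illustrative assumptions, not the values used in the work above.

```python
import numpy as np

def et_map(times, energies, n_t=24, n_e=16):
    """Bin an X-ray event list into a fixed-size (n_t, n_e) image:
    rows are time bins, columns are log-energy bins. This turns a
    variable-length event file into a fixed-shape ML input."""
    t = np.asarray(times, dtype=float)
    e = np.log10(np.asarray(energies, dtype=float))
    H, _, _ = np.histogram2d(t, e, bins=[n_t, n_e])
    return H / max(H.sum(), 1.0)  # normalize to unit total counts
```

Stacking maps like these for every event file yields a uniform array that feature extractors such as PCA or sparse autoencoders can consume directly.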


Andreas Filipp, University of Montreal, Ciela, Mila
Dark matter subhalos with strong gravitational lensing
Strong gravitational lensing provides a powerful tool to directly infer the dark matter (DM) subhalo mass function (SHMF) in lens galaxies. Traditional analyses infer the presence of individual subhalos by evaluating whether introducing localized perturbers to a smooth lens model leads to a statistically significant improvement in the fit to the observed lensed images. Recent simulation-based studies demonstrated that machine learning approaches could enable population-level inference of the SHMF by combining data across ensembles of lensing systems.

Connecting observational constraints to theoretical predictions remains challenging. Even within a fixed cosmology, the SHMF depends on properties of individual galaxies, including, for example, total mass, morphology, and merger history, leading to significant system-to-system variation.

I will cover the potential and challenges we found for two population-level methods, neural ratio estimators and sequential neural posterior estimators, as well as a method to predict theoretical SHMFs directly from galaxy images, conditioned on an assumed warm DM mass. We use the DREAMS simulation suite and Synthesizer to create realistic galaxy images. Our method accounts for inter-galaxy variability and enables scalable, image-based inference of theoretical predictions. This approach provides a new pathway to compare DM models with forthcoming lensing observations, enabling per-galaxy tests of DM’s small-scale gravitational effects.
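The idea behind neural ratio estimation can be shrunk to a toy: a classifier trained to separate joint pairs (x, θ) from shuffled pairs learns the likelihood-to-evidence ratio. In this sketch a logistic regression on quadratic features stands in for the neural network, and a one-dimensional Gaussian simulator stands in for the lensing pipeline; none of it is the presenters' actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 10_000
theta = rng.normal(size=n)
x = rng.normal(theta)                      # simulator: x | theta ~ N(theta, 1)
theta_shuf = rng.permutation(theta)        # breaks the (x, theta) pairing

def feats(th, xx):
    return np.stack([np.ones_like(th), th, xx, th * xx, th**2, xx**2], axis=1)

X = np.vstack([feats(theta, x), feats(theta_shuf, x)])
y = np.concatenate([np.ones(n), np.zeros(n)])

w = np.zeros(6)
for _ in range(8000):                      # plain gradient descent (convex)
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - y) / len(y)

# The classifier logit approximates log p(x|theta) - log p(x), which for
# this Gaussian model is theta*x - theta**2/2 - x**2/4 + log(2)/2.
logit = feats(theta[:500], x[:500]) @ w
truth = theta[:500] * x[:500] - theta[:500]**2 / 2 - x[:500]**2 / 4 + np.log(2) / 2
print(np.corrcoef(logit, truth)[0, 1])
```

Because the exact log-ratio lies in the span of the quadratic features, the learned logit should track it closely once the (convex) optimization converges.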


Amera Firdous, Chicago State University
AI Tools in Cybersecurity: Can Hackers Use ChatGPT Too?
Artificial Intelligence is rapidly reshaping cybersecurity, but not always in ways we expect. While defenders are using AI to detect threats and automate responses, attackers are doing the same. Tools like ChatGPT are now being leveraged to create convincing phishing emails, generate malicious code, and even simulate social engineering conversations with alarming ease.

This study explores the double-edged nature of AI in the cybersecurity arms race. Demonstrations include how generative AI can be weaponized to craft smarter, more personalized attacks with minimal technical skill. As part of this project, a set of phishing email examples was generated using ChatGPT, and preliminary testing of prompt injection scenarios was conducted to understand the risks and limitations of these attacks.

In parallel, the project evaluates how cybersecurity professionals can use AI defensively for threat hunting, vulnerability analysis, and employee training simulations. Technical walkthroughs include automation of phishing attacks, breakdowns of prompt manipulation techniques, and how generative tools may influence future cybersecurity strategies.

Finally, the study reflects on the ethical challenges of releasing powerful AI tools into the wild, with particular attention to open-access models and emerging calls for regulation.


Filomela Gerou, Argonne National Laboratory
Benchmarking for Physics Reasoning Models
Physicists and astronomers require efficient and reliable methods to achieve large-scale computations and high-complexity analysis. Large Language Models (LLMs), successful in various mathematical and analytical domains, offer potential for innovation in accelerating the research process by providing problem-solving assistance. However, current models exhibit significant limitations when applied to advanced physics and mathematics, often producing incorrect results or inconsistent inference trajectories. Here we present a comprehensive evaluation pipeline of chain-of-thought multiple-choice questions designed to benchmark LLMs on step-by-step reasoning tasks drawn from graduate-level physics courses, advanced mathematics textbooks, and cutting-edge research papers. Our benchmark includes a novel question-generation pipeline that automatically derives problems from arXiv papers and curated scientific topic lists, followed by a filtering process based on a question framework set by expert evaluators. We test LLMs by employing a multi-layered assessment protocol: a primary LLM-based judge and a jury of judges evaluate the logical soundness and correctness of model-generated answers across our reflective dataset according to a rigorous reward-penalty-based grading scheme. This framework enables fine-grained analysis of reasoning trajectories and highlights models’ strengths and failures in multi-step scientific problem solving. Our work establishes the first domain-specific benchmark for evaluating LLMs as physics reasoning assistants and provides a foundation for future progress in scientific AI.
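A reward-penalty grading scheme of this kind can be stated in a few lines. The weights, step categories, and bonus below are assumptions for illustration, not the benchmark's actual rubric:

```python
# Illustrative reward-penalty grader for a chain-of-thought multiple-choice
# answer: each logically sound step earns a reward, each flawed step incurs
# a penalty, and a correct final answer adds a bonus. All weights are
# hypothetical.
def grade(steps, final_correct, reward=1.0, penalty=0.5, bonus=2.0):
    score = sum(reward if ok else -penalty for ok in steps)
    score += bonus if final_correct else 0.0
    return score

# A 4-step solution with one flawed step and a correct final answer:
print(grade([True, True, False, True], final_correct=True))  # 4.5
```

In the actual protocol the per-step soundness judgments would come from the LLM judge and jury rather than being supplied by hand.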


Rachel Hur, The University of Chicago
Hierarchical neural posterior estimation has its place
Hierarchical Bayesian Modeling (HBM) combined with MCMC algorithms has been shown to provide more robust and accurate inference for real-world phenomena in which nature takes a nested form. However, MCMC-based inference can be computationally expensive, and its performance often suffers for complex posterior geometries. These costs are especially pertinent for HBM. Studies have recently demonstrated the potential for a flexible, expressive, and amortized hierarchical neural posterior estimator (HNPE) built on Normalizing Flows. These studies have mostly been performed on simple datasets, or they focus on a single parameter from each level of the hierarchy. A systematic study analyzing how both hierarchical methods compare for more complex and realistic datasets is necessary before applying HNPE for scientific measurements. Here, we re-explore the theory behind HNPE and conduct comparative numerical experiments of HNPE and MCMC-based HBM methods on real and synthetic data, including strong gravitational lensing simulations. In particular, we use a suite of diagnostics to show trade-offs in terms of accuracy, precision, time to train or sample, reproducibility, and the need for expert domain knowledge. Especially for higher dimensional and complex posteriors, HNPE is expected to drastically improve on time for inference, accuracy, and precision with an upfront training time cost.


Min Long, Boise State University
Exploring Charge Exchange Emission in Star-Forming Galaxies Through Automatic Spectroscopic Fitting Using Artificial Intelligence Algorithms
We developed Astro-Neo, an open-source framework for automated spectroscopic analysis based on neuroevolution algorithms (NEA), to analyze spectra with minimal human intervention but high accuracy, efficiency, and reproducibility. This is achieved by combining Artificial Intelligence (AI) techniques, namely evolutionary algorithms (EAs) and neural networks (NNs), and can address challenges in the era of big data, when upcoming detectors will acquire data at rates orders of magnitude greater than current collection rates.

The NEA trains NNs using EAs, a set of metaheuristic methods inspired by natural evolution, to solve the optimization problem of fitting and to obtain a global optimum of parameters for various physics models. The method has been applied to fit X-ray spectral data of the starburst galaxy NGC 253 from the XMM-Newton reflection grating spectrometer and compared with direct fitting using Xspec and Sherpa. The fitting models include thermal emission from the diffuse hot plasma, the charge exchange emission due to its interaction with the cold gas, and the normal components of the bright point sources and the foreground absorption. The results have been compared with both MCMC methods and our early studies using only genetic algorithms (GA) and show distinct advantages in performance and accuracy.
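The evolutionary ingredient can be illustrated with a toy spectral fit: a continuum plus one emission line, optimized by a simple elitist (mu + lambda) loop. The model, noise level, and EA settings below are illustrative stand-ins, not the Astro-Neo configuration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy spectrum: continuum + one Gaussian emission line + noise.
energy = np.linspace(0.3, 2.0, 200)
true = np.array([0.5, 1.0, 0.65, 0.03])      # continuum, amplitude, center, width

def model(p, e):
    cont, amp, cen, wid = p
    return cont + amp * np.exp(-0.5 * ((e - cen) / wid) ** 2)

flux = model(true, energy) + rng.normal(0, 0.02, energy.size)

def chi2(p):
    return np.sum((model(p, energy) - flux) ** 2)

# (mu + lambda) evolution: mutate, pool parents and children, keep fittest.
pop = rng.uniform([0, 0, 0.4, 0.01], [1, 2, 1.0, 0.1], size=(40, 4))
step = np.array([0.02, 0.05, 0.01, 0.005])   # per-parameter mutation scale
for _ in range(300):
    children = pop + rng.normal(0, step, pop.shape)
    both = np.vstack([pop, children])
    fitness = np.array([chi2(p) for p in both])
    pop = both[np.argsort(fitness)[:40]]

best = pop[0]
print(np.round(best, 2))                     # should approach `true`
```

A real NEA would evolve network weights or topologies rather than raw model parameters, but the selection pressure driving the fit toward a global optimum works the same way.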


Bhaskar Mondal, University of Illinois Urbana-Champaign
Accelerating Thermophysical Modeling of Asteroids Using Physics-Informed Neural Networks
Modern astronomical surveys are generating large amounts of asteroid observations, offering a unique opportunity to characterize small bodies in the Solar System at scale. However, key physical properties, such as size, albedo, thermal inertia, and emissivity, remain challenging to infer from observational data with traditional thermophysical modeling (TPM) approaches. Detailed models can help with statistical inference, but they are computationally intensive to scale over large populations. To address this, we are working on a Physics-Informed Neural Network (PINN) surrogate model trained on outputs from our GPU-accelerated TPM code, ThOASTER. A PINN can embed the governing physics of heat diffusion and radiative processes into its training framework, enabling accurate, near-instantaneous prediction of temperatures, and, from these, thermal fluxes and Yarkovsky-induced orbital drift for a given asteroid’s shape, spin, orbit, and material properties. By replacing expensive forward simulations with a fast, interpretable, and physically grounded model, this approach enables rapid inference of asteroid properties from survey data by allowing us to explore large parameter spaces.
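The "physics-informed" part of a PINN is the PDE residual it penalizes during training. A minimal numerical sketch, assuming only a 1-D heat-diffusion toy problem (not the full asteroid thermophysics): evaluate the residual of T_t = alpha * T_xx for a candidate temperature field and reduce it to a scalar physics loss.

```python
import numpy as np

# Residual of the 1-D heat equation T_t = alpha * T_xx, evaluated with
# finite differences (a PINN would use automatic differentiation on the
# network output instead of a grid).
alpha = 1.0
x = np.linspace(0, np.pi, 201)
t = np.linspace(0, 1, 201)
dx, dt = x[1] - x[0], t[1] - t[0]

# Candidate temperature field: exp(-t) * sin(x) solves the PDE exactly.
T = np.exp(-t)[:, None] * np.sin(x)[None, :]

T_t = (T[2:, 1:-1] - T[:-2, 1:-1]) / (2 * dt)
T_xx = (T[1:-1, 2:] - 2 * T[1:-1, 1:-1] + T[1:-1, :-2]) / dx**2
residual = T_t - alpha * T_xx
physics_loss = np.mean(residual**2)
print(physics_loss)  # near zero for an exact solution
```

During training this term is added to the data-fit loss, so the surrogate is pulled toward fields that satisfy the governing equations even where simulation data are sparse.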


Sneh Pandya, Northeastern University
Avenues for Model Robustness in Astrophysical Applications
Despite impressive in-distribution performance, neural networks typically fail to generalize to out-of-distribution samples. In astrophysics and cosmology, many models are trained on synthetic or simulated data with the goal of being deployed on observational data. We discuss several avenues for improving the generalization capabilities of neural networks by instilling symmetry priors at the architectural level (equivariance) and introducing a custom optimal transport-based domain adaptation algorithm, dubbed SIDDA. We demonstrate the efficacy of equivariance and SIDDA across various tasks in astrophysics and cosmology, including generalization between galaxy morphology observations from different instruments using the GalaxyZoo dataset, as well as neural posterior estimation of strong lensing parameters from simulations to observations. We lastly discuss implications for improved model calibration and uncertainty quantification with the inclusion of these methods.
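The optimal-transport ingredient behind a SIDDA-style alignment can be sketched with plain Sinkhorn iterations on toy features, where the "observation" domain is a shifted copy of the "simulation" domain. This is only the entropic-OT piece, not the SIDDA training loop, and the data and epsilon are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

source = rng.normal(size=(50, 2))            # "simulation" latent features
target = source + np.array([2.0, 0.0])       # "observations": pure covariate shift

cost = ((source[:, None, :] - target[None, :, :]) ** 2).sum(-1)
K = np.exp(-cost / 0.5)                      # Gibbs kernel, epsilon = 0.5
a = b = np.full(50, 1 / 50)                  # uniform sample weights

u = np.ones(50)
for _ in range(200):                         # Sinkhorn fixed-point updates
    u = a / (K @ (b / (K.T @ u)))
v = b / (K.T @ u)
plan = u[:, None] * K * v[None, :]

# The transport cost is bounded below by the squared mean shift, 4.0,
# and approaches it as epsilon shrinks.
ot_cost = float((plan * cost).sum())
print(round(ot_cost, 2))
```

Minimizing such a divergence between source and target embeddings during training is what pushes the network toward domain-invariant features.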


Anarya Ray, Northwestern University
Emulating compact binary population synthesis simulations with robust uncertainty quantification and model comparison: Bayesian normalizing flows
Population synthesis simulations of compact binary coalescences (CBCs) play a crucial role in extracting astrophysical insights from an ensemble of gravitational wave (GW) observations. However, realistic simulations can be costly to implement for a dense grid of initial conditions. Normalizing flows can emulate the distribution functions of a simulated population of binary parameters and thereby enable simulation-based astrophysical inference from growing GW catalogs. They can also be used for data amplification in sparse regions of the CBC parameter space to better guide the development of phenomenological population models for rarely synthesizable systems without having to simulate a prohibitively large number of binaries. However, flow predictions are fraught with uncertainties, especially for sparse training sets. In this work, I develop a method for quantifying and marginalizing uncertainties in the emulators by introducing the Bayesian normalizing flow, a conditional density estimator constructed from Bayesian neural networks. I demonstrate the accuracy, calibration, and data-amplification impacts of the estimated uncertainties for simulations of binary black hole populations formed through common envelope evolution. I outline the applications of the proposed methodology in the context of simulation-based inference from growing GW catalogs and feature prediction, with state-of-the-art binary evolution simulators, now marginalized over model and data uncertainties.
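The uncertainty-quantification idea can be shown in miniature: instead of a single density emulator, keep a distribution over emulators and report their spread. Below, a bootstrap ensemble of Gaussian fits stands in for a Bayesian normalizing flow trained on a sparse set of simulated black-hole masses; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

masses = rng.normal(30.0, 5.0, size=60)          # sparse "simulated" BH masses
grid = np.linspace(10, 50, 101)

densities = []
for _ in range(500):
    boot = rng.choice(masses, size=masses.size)  # resample the training set
    mu, sd = boot.mean(), boot.std()
    densities.append(np.exp(-0.5 * ((grid - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi)))
densities = np.array(densities)

frac = densities.std(axis=0) / densities.mean(axis=0)   # fractional spread
# Fractional uncertainty grows where the training set is sparse (the tails).
print(frac[np.abs(grid - 42).argmin()] > frac[np.abs(grid - 30).argmin()])  # True
```

A Bayesian flow replaces the bootstrap with a posterior over network weights, but the payoff is the same: population inferences can be marginalized over this emulator uncertainty rather than trusting a single fit.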


Nabeel Rehemtulla, Northwestern University
ImageNet and GalaxyZoo as pre-training datasets for real-time transient identification
Training custom architecture vision models from scratch is a practice which has fallen out of favor and has been replaced by adopting off-the-shelf pre-trained models, often transformers. The availability of large, labeled benchmark image datasets, most notably ImageNet, was a key driving factor for this change. The GalaxyZoo project has recently produced a large, labeled dataset (~900,000 entries) of color galaxy images. Still, the overwhelming majority of astronomy-computer vision studies have not adopted this practice. We aim to assess the impact of pre-training with ImageNet or GalaxyZoo on a vision transformer fine-tuned for real-time transient identification. Each input is a three channel image (new, template, difference) from a ZTF alert, i.e., the same schema used for LSST alerts. We find that GalaxyZoo and ImageNet pre-training yield similar levels of performance, but models trained from scratch uniformly outperform models with any pre-training irrespective of the fine-tuning dataset size or the pre-training dataset used. These results suggest that the images present in ZTF alerts, and likely LSST alerts, are distinct enough from standard three-channel color images (i.e. those that appear in ImageNet and GalaxyZoo) such that classifiers using them demand more tailored pre-training strategies.


Anirban Samaddar, Argonne National Laboratory
Matching-based Efficient and Interpretable Generative Models for Scientific Data
Flow Matching (FM) models have recently emerged as a powerful class of probabilistic generative models for image synthesis. While FM has been widely explored for natural images, its potential in scientific applications remains less examined. Additionally, most flow matching models in the literature do not explicitly model the underlying structure or manifold of the training data when learning the flow, which leads to inefficient learning, especially for high-dimensional real-world and scientific datasets that often reside on a low-dimensional manifold. To this end, we present Latent Conditional Flow Matching (Latent-CFM), which provides simplified training/inference strategies to incorporate latent data structures using pretrained deep latent variable models. In this study, we evaluate Latent-CFM and Independent Conditional Flow Matching (ICFM) for generating high-quality images in two domains: (1) simulated 2D Darcy flow fields and (2) 2D slices from evolving dark-matter simulations. Our results show that both models produce samples that preserve key structural characteristics of the training data. Notably, Latent-CFM’s learned latent variables not only improve performance on the Darcy flow dataset but also disentangle meaningful latent features, providing a more interpretable generative process. This interpretability has the potential to bridge the gap between black-box generative modeling and scientific insight.
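The ICFM training target is one line of algebra: sample noise x0, data x1, and a time t, form the straight path x_t = (1 - t) x0 + t x1, and regress a network v(x_t, t) onto the velocity x1 - x0. A numpy sketch with a Gaussian toy "dataset" (illustrative, not the Darcy or dark-matter setup):

```python
import numpy as np

rng = np.random.default_rng(0)

x0 = rng.normal(size=(4096, 2))                  # noise samples
x1 = rng.normal(loc=3.0, size=(4096, 2))         # stand-in "data" samples
t = rng.uniform(size=(4096, 1))

xt = (1 - t) * x0 + t * x1                       # point on the straight path
target = x1 - x0                                 # regression target for v(xt, t)

# For this toy, the best constant predictor of the velocity is its mean,
# E[x1 - x0] = (3, 3); a real model conditions on (xt, t).
print(np.round(target.mean(axis=0), 1))
```

Latent-CFM keeps this loss but additionally conditions the velocity field on a latent code from a pretrained latent variable model, which is how the data's low-dimensional structure enters the flow.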


Sammy Sharief, University of Montreal
Uncertainty Quantification for Astrophysics
As astronomy enters a new era of big data, novel generative models enable inference in high-dimensional spaces, tackling previously intractable problems. This makes defining and measuring the accuracy of inferred posteriors, especially in high-dimensional parameter spaces like images, increasingly pressing. Specifically, two questions need to be answered: For a given inference pipeline providing a posterior estimator for potentially high-dimensional variables, how can we assess the accuracy of this pipeline? And if generative models are used as components of such an inference pipeline, how do we quantify the accuracy with which these models represent their underlying training distribution? I will introduce PQMass and Pokie, two likelihood-free, sample-based statistical approaches designed to tackle these challenges directly. PQMass evaluates the quality of generative models and their ability to learn the underlying data distribution without assuming the distribution of the data or dimensionality reduction, making it highly effective for detecting subtle distributional shifts and validating generative models in cosmological data analyses. Pokie compares posterior distributions from Bayesian models solely through simulated joint samples, enabling direct model comparisons without evidence computation, as well as quantitative calibration validation. PQMass and Pokie provide new avenues for scalable accuracy assessment and improving the reliability of data-driven astronomical analysis.
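The flavor of test behind PQMass can be shown in miniature: compare how two sample sets populate random regions of the space, with no parametric density and no dimensionality reduction. The sketch below is an illustrative chi-squared variant of that idea, not the PQMass algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def region_chi2(x, y, n_regions=20):
    # Random Voronoi regions seeded by reference points drawn from x;
    # count how each sample set populates them and compare the counts.
    refs = x[rng.choice(len(x), n_regions, replace=False)]
    def counts(s):
        d = np.linalg.norm(s[:, None, :] - refs[None, :, :], axis=-1)
        return np.bincount(d.argmin(axis=1), minlength=n_regions)
    cx, cy = counts(x), counts(y)
    return float(((cx - cy) ** 2 / (cx + cy)).sum())

x = rng.normal(size=(2000, 3))
same = region_chi2(x, rng.normal(size=(2000, 3)))
shifted = region_chi2(x, rng.normal(1.0, 1.0, size=(2000, 3)))
print(same < shifted)   # a distribution shift inflates the statistic
```

For samples from the same distribution the statistic stays near its chi-squared expectation, while even a modest shift drives it far above, which is what makes such region-count tests sensitive to subtle distributional differences.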


Philipp Rajah Moura Srivastava, Northwestern University
Deep Learning-Based Irregularly Sampled Time Series Interpolation for Detailed Binary Evolution Simulations
Simulations of binary star systems are essential to studying important astronomical phenomena including, but not limited to, gamma-ray bursts, radio pulsars, gravitational-wave mergers, supernovae, and X-ray binaries. Modern-day simulation software enables large-scale binary population studies but, unfortunately, omits large parts of the simulations. This renders some studies of these phenomena infeasible, and while recent work has achieved impressively low error rates by approximating these omitted parts using traditional signal processing techniques, these error rates may not be sufficient. Typically, these methods work by creating an alignment between similar reference simulations, and assigning a weight to each simulation. In this paper, we present a novel neural network architecture that learns both the alignment between reference simulations and the weights of each reference simulation. This is achieved by using modern deep learning techniques, which use the learned alignments and weights to construct approximations in an auto-regressive manner. We also explore the role of physics-informed loss functions for binary simulation approximation; specifically, their impact on our learned representations, the convergence of our architecture weights, and the physical consistency of predicted simulations.
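The classical alignment these methods start from is dynamic time warping, which handles tracks of different lengths and samplings. A minimal numpy sketch with sinusoids standing in for evolutionary tracks (illustrative only; the learned alignment in the abstract replaces this fixed recursion):

```python
import numpy as np

def dtw(a, b):
    # Classic dynamic-programming DTW: D[i, j] is the cheapest alignment
    # of a[:i] with b[:j]; the two tracks may have different lengths.
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[-1, -1]

rng = np.random.default_rng(0)
t_reg = np.linspace(0, 1, 80)                  # regular sampling
t_irr = np.sort(rng.uniform(0, 1, 60))         # irregular, shorter sampling

# Two samplings of the same underlying track align cheaply; a track with
# a genuinely different morphology does not.
same = dtw(np.sin(2 * np.pi * t_reg), np.sin(2 * np.pi * t_irr))
diff = dtw(np.sin(2 * np.pi * t_reg), 2 * np.sin(2 * np.pi * t_irr))
print(same < diff)  # True
```

Replacing the hand-coded recursion with a learned, differentiable alignment is what lets the proposed architecture train the warping and the reference weights end to end.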


Ce Sui, Tsinghua University and Johns Hopkins University
Fisher Score Matching for Simulation-Based Forecasting and Inference
We propose a method for estimating the Fisher score (the gradient of the log-likelihood with respect to model parameters) using score matching. By introducing a latent parameter model, we show that the Fisher score can be learned by training a neural network to predict latent scores via a mean squared error loss. We validate our approach on a toy linear Gaussian model and a cosmological example using a differentiable simulator. In both cases, the learned scores closely match ground truth for plausible data-parameter pairs. This method extends the ability to perform Fisher forecasts and gradient-based Bayesian inference to simulation models, even when they are not differentiable; it therefore has broad potential for advancing cosmological analyses.
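A quick sanity check of the quantity being learned, on the linear Gaussian toy: for x ~ N(theta, sigma^2) the Fisher score is (x - theta)/sigma^2, and averaging its square Monte-Carlo-estimates the Fisher information 1/sigma^2 used in forecasts. In the proposed method a trained network replaces the analytic score when the simulator's likelihood is unknown; the numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

sigma, theta = 2.0, 1.5
x = rng.normal(theta, sigma, size=200_000)       # simulator draws at theta

score = (x - theta) / sigma**2                   # d log p(x|theta) / d theta
fisher = np.mean(score**2)                       # Fisher information estimate
print(round(fisher, 3))                          # close to 1/sigma^2 = 0.25
```

The same average-of-squared-scores construction gives a Fisher forecast for any simulator once the score network is trained, differentiable or not.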


Josh Taylor, The University of Texas at Austin/CosmicAI
Prototype Learning for Weakly Signaled Astronomical Sources
We demonstrate the variety of ways an older signal processing technique — vector quantization — can enhance machine learning of modern astronomy data, where observations are plentiful (high sample size), high resolution (high dimensionality) and typically measure properties of highly continuous and weakly-signaled processes. Specifically, vector quantization can directly address:

• Sample size reduction — making learning of large samples computationally feasible;
• Curse of dimensionality issues in high-dimensional spaces — discretization defines sensible neighborhoods to combat the high-d sparsity phenomenon;
• SNR Boosting — intelligent averaging helps reveal weakly signaled processes;
• Sensible binning — helps delineate crisper boundaries for identification of regime switching;
• Parameterization issues in common local and global learning tasks — via learned (vs. prescribed) nearest neighbor networks;
• Uncertainty quantification of more exotic quantities — by providing a natural framework for the M-out-of-N bootstrap.

These benefits are showcased in various unsupervised learning tasks emanating from both real and simulated data: a Spitzer YSO archive, different molecular line observations of known star-forming regions, and protostellar core evolution in STARFORGE simulations.
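The first two benefits, sample-size reduction and SNR boosting, can be seen with plain Lloyd's k-means as the quantizer. The data below are illustrative stand-ins for spectra, not the Spitzer or STARFORGE sets:

```python
import numpy as np

rng = np.random.default_rng(0)

# 10,000 noisy "spectra" drawn from 4 underlying shapes are summarized by
# 4 prototypes, shrinking the sample and boosting SNR by within-cell
# averaging.
templates = np.eye(4)
data = templates[rng.integers(0, 4, 10_000)] + rng.normal(0, 0.2, (10_000, 4))

# Farthest-point initialization, then Lloyd iterations.
centers = [data[0]]
for _ in range(3):
    d = np.min([np.linalg.norm(data - c, axis=1) for c in centers], axis=0)
    centers.append(data[d.argmax()])
centers = np.array(centers)
for _ in range(20):
    assign = np.linalg.norm(data[:, None] - centers[None], axis=-1).argmin(axis=1)
    centers = np.array([data[assign == k].mean(axis=0) for k in range(4)])

# Each prototype recovers a template far better than any single sample.
err = max(np.linalg.norm(centers - t, axis=1).min() for t in templates)
print(err < 0.1)  # True
```

Downstream learning then runs on the handful of prototypes (or on the discretized assignments) instead of the raw sample, which is also what makes the M-out-of-N bootstrap natural in this setting.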


Maggie Voetberg, Fermilab
DeepDiagnostics: A software package for streamlined posterior evaluation
Automated prediction techniques like simulation-based inference (SBI) are important tools for science experiments that produce large amounts of complex, raw data. However, their development remains in its early stages because the uncertainties of these techniques lack sufficient trustworthiness and interpretability. Packages for SBI provide a growing set of diagnostics; however, the software requirements are substantial, as they are tied to the inference technology itself, and the APIs lack adaptability. We introduce the DeepDiagnostics package for diagnosing posteriors from analytic likelihood-based methods and SBI methods, such as neural posterior estimation. DeepDiagnostics produces a comprehensive set of high-quality visualizations and metrics in a highly accessible, easy-to-use, and flexible package. We address all of these goals by providing a command-line inference tool and a Python API that is controlled through a configuration file. The package includes common diagnostics, such as parity plots, corner (covariance) plots, simulation-based calibration (SBC) diagnostics (including posterior coverage and rank histograms), Lemos et al.’s PQMass and TARP, Masserano et al.’s WALDO, Linhart et al.’s LC2ST, as well as credible region diagnostics developed by our group.
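One of the listed diagnostics, the SBC rank histogram, fits in a few lines when the posterior is known in closed form. This sketch uses a conjugate Gaussian toy (prior theta ~ N(0, 1), likelihood x ~ N(theta, 1), posterior N(x/2, 1/2)), not the DeepDiagnostics API:

```python
import numpy as np

rng = np.random.default_rng(0)

# For a correctly calibrated posterior, the rank of the true theta among
# posterior draws is uniform; a skewed or U-shaped histogram flags bias
# or mis-calibrated width.
n_sims, n_draws = 2000, 99
theta = rng.normal(size=n_sims)
x = rng.normal(theta)
post = rng.normal(x[:, None] / 2, np.sqrt(0.5), size=(n_sims, n_draws))

ranks = (post < theta[:, None]).sum(axis=1)      # each rank in 0..99
hist = np.bincount(ranks, minlength=n_draws + 1)
print(hist.min(), hist.max())                    # roughly flat around 20
```

Swapping the analytic posterior draws for samples from a trained neural posterior estimator turns the same few lines into a calibration check of that estimator.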


Bonnie Y. Wang, The University of Chicago
Set-based Implicit Likelihood Inference of Galaxy Cluster Mass
We present a set-based machine learning framework that infers posterior distributions of galaxy cluster masses from projected galaxy dynamics and cluster morphology. By combining deep sets and conditional normalizing flows, we are able to incorporate both positional and velocity information of member galaxies to predict residual corrections to the M–σ relation for improved interpretability. The model is trained on data from the Uchuu-UniverseMachine simulation suite, which provides a population of clusters across a broad mass range as well as realistic galaxy properties. Compared to traditional M–σ point estimates, our approach exhibits a substantial reduction in predictive scatter and yields well-calibrated uncertainties. Intriguingly, we find that the model yields tighter constraints for less concentrated clusters, which are typically considered to be dynamically unrelaxed. This reveals a trend not captured by standard equilibrium-based models, suggesting that the model captures additional structure in the phase-space distribution beyond what is encoded in equilibrium-based scaling laws. These findings underscore the potential of data-driven techniques to provide both accurate and physically interpretable estimates of cluster masses, while offering new insight into cluster assembly and dynamical evolution.
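The deep-sets half of such a framework is a permutation-invariant encoder: a shared map phi applied per member galaxy, a sum over the (orderless) set, then a map rho producing the cluster summary that conditions the flow. A minimal sketch with random weights standing in for trained networks:

```python
import numpy as np

rng = np.random.default_rng(0)

W1 = rng.normal(size=(3, 16))              # stand-in weights for phi
W2 = rng.normal(size=(16, 4))              # stand-in weights for rho

def encode(galaxies):                      # galaxies: (n_members, 3)
    phi = np.tanh(galaxies @ W1)           # shared per-member embedding
    pooled = phi.sum(axis=0)               # permutation-invariant pooling
    return np.tanh(pooled @ W2)            # set-level cluster summary

cluster = rng.normal(size=(57, 3))         # projected x, y, v_los per member
shuffled = cluster[rng.permutation(57)]

# The summary is identical under any reordering of member galaxies.
print(np.allclose(encode(cluster), encode(shuffled)))  # True
```

Because the pooling is a sum, the encoder also accepts clusters with any number of member galaxies, which is what lets one model handle the full population.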





The SkAI Institute is one of the National Artificial Intelligence Research Institutes funded by the U.S. National Science Foundation and Simons Foundation. Information on National AI Institutes is available at aiinstitutes.org.

