Seminars
Jun 15, 2026
FDR Calibration via Synthetic Null Data: Controlling False Discoveries While Maintaining Power in High-Throughput Biology
Speaker: Professor Jessica Li
Professor & Programme Head of Biostatistics, Fred Hutchinson Cancer Center
The School of Biomedical Sciences cordially invites you to join the following seminar:
Date: 15 June 2026 (Monday)
Time: 4:00 pm – 5:00 pm
Venue: Lecture Theatre 2, G/F, William M.W. Mong Block, 21 Sassoon Road
Host: Professor Yuanhua Huang
Biography
Jingyi Jessica Li is a statistician and computational biologist whose work bridges data science and biomedical research. She develops reliable and interpretable methods to analyze complex biological data, with a focus on understanding how genes function and are regulated in health and disease. Her research has helped uncover hidden patterns in gene expression and emphasized the importance of statistical rigor—ensuring that scientific discoveries are trustworthy, even when working with noisy or low-quality data. She strongly believes that statistics is not just a supporting tool but a core driver of progress in biology and medicine. Dr. Li is Professor and Program Head of Biostatistics at Fred Hutchinson Cancer Center, where she holds the Donald and Janet K. Guthrie Endowed Chair in Statistics, and Affiliate Professor of Biostatistics at the University of Washington. Previously, she was Professor of Statistics and Data Science at UCLA, with joint appointments in Biostatistics, Computational Medicine, and Human Genetics. She earned her Ph.D. in Biostatistics from the University of California, Berkeley, and her B.S. in Biological Sciences from Tsinghua University. Her contributions have been recognized with the NSF CAREER Award, Sloan Research Fellowship, ISCB Overton Prize, COPSS Emerging Leaders Award, Guggenheim Fellowship, and the Mortimer Spiegelman Award.
Abstract
False discovery rate (FDR) control is essential for reliable inference in high-throughput biology, yet it is increasingly compromised in modern analyses due to data reuse, selection bias, and model misspecification. Common remedies such as data splitting or knockoff constructions often achieve FDR control at the cost of power loss or changes to existing workflows. In this talk, I present a unified framework for calibrated inference via synthetic null data, which achieves FDR control while preserving power and leaving original data and analysis pipelines intact. The central idea is to generate data-driven synthetic null data as in silico negative controls, apply the same estimation or testing procedure to both observed and synthetic data, and use their parallel contrast to calibrate significance thresholds. This framework was motivated by a common “double-dipping” issue in single-cell RNA-seq analysis, where the same data are used both to identify cell clusters and to test for cluster-specific marker genes, leading to clustering-induced bias. This challenge led to ClusterDE, which mitigates post-clustering bias in marker discovery across single-cell, spatial, bulk, and microbiome data. Building on this idea, we developed Nullstrap, a general framework for FDR-controlled variable selection in high-dimensional models without data splitting or knockoffs. I then present Nullstrap-DE, an application of this framework to RNA-seq differential expression (DE) analysis, which calibrates popular tools such as DESeq2 and edgeR to improve FDR control under mild model violations while retaining high power. Together, these methods illustrate how synthetic null data provide a flexible and principled route to FDR calibration in high-throughput biological data analysis.
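The central idea in the abstract (score the observed data and a synthetic null copy with the same procedure, then contrast the two) can be illustrated with a minimal Python sketch. It is not the ClusterDE, Nullstrap, or Nullstrap-DE implementation. The cutoff rule (a Barber-Candès style threshold on contrast scores), the toy two-group score, and the Gaussian null generator are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import random
import statistics


def contrast_threshold(contrasts, q):
    """Smallest candidate t with (1 + #{c <= -t}) / max(#{c >= t}, 1) <= q.

    Assumed rule: large negative contrasts stand in for false positives,
    because a null feature is equally likely to score higher on either copy.
    Returns None when no candidate meets the target level q.
    """
    candidates = sorted({abs(c) for c in contrasts if c != 0})
    for t in candidates:
        est_false = 1 + sum(c <= -t for c in contrasts)
        found = max(sum(c >= t for c in contrasts), 1)
        if est_false / found <= q:
            return t
    return None


def calibrated_discoveries(observed_scores, null_scores, q=0.1):
    """Indices of features whose observed-minus-null contrast clears the cutoff."""
    contrasts = [o - n for o, n in zip(observed_scores, null_scores)]
    t = contrast_threshold(contrasts, q)
    if t is None:
        return []
    return [j for j, c in enumerate(contrasts) if c >= t]


def two_group_score(group_a, group_b):
    """Toy test statistic: absolute standardized difference in group means."""
    se = (statistics.variance(group_a) / len(group_a)
          + statistics.variance(group_b) / len(group_b)) ** 0.5
    return abs(statistics.mean(group_a) - statistics.mean(group_b)) / se


def demo(seed=0, n_features=200, n_signal=40, n_per_group=10, q=0.1):
    """Simulate features, build a synthetic null copy, and calibrate discoveries."""
    rng = random.Random(seed)
    observed, synthetic = [], []
    for j in range(n_features):
        shift = 3.0 if j < n_signal else 0.0  # only the first n_signal features differ
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
        observed.append(two_group_score(a, b))

        # Stand-in null generator (assumption): redraw both groups from one
        # Gaussian fitted to the pooled values, so no group difference exists.
        pooled = a + b
        mu, sd = statistics.mean(pooled), statistics.stdev(pooled)
        null_a = [rng.gauss(mu, sd) for _ in range(n_per_group)]
        null_b = [rng.gauss(mu, sd) for _ in range(n_per_group)]
        synthetic.append(two_group_score(null_a, null_b))

    hits = calibrated_discoveries(observed, synthetic, q)
    false_hits = sum(j >= n_signal for j in hits)
    return len(hits), false_hits


if __name__ == "__main__":
    n_hits, n_false = demo()
    print(f"discoveries: {n_hits}, of which null features: {n_false}")
```

The original data and the scoring procedure are left unchanged in this sketch. Only the significance cutoff is derived from the parallel run on synthetic null data, which mirrors the workflow-preserving property described in the abstract.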
All are welcome.