


Dec 19, 2022

Seminar (2022-12-19)

The School of Biomedical Sciences is pleased to invite you to join the following seminar:

Date: 19 December 2022 (Monday)
Time: 10:00 am - 11:30 am
Venue: Lecture Theatre 2, William M.W. Mong Block, 21 Sassoon Road

Speaker: Dr. Yang Zhang, Department of Computational Medicine and Bioinformatics, Department of Biological Chemistry, University of Michigan
Title: Progress and challenge in AI-based protein structure prediction



Dr. Yang Zhang is a professor in the Department of Computational Medicine and Bioinformatics and the Department of Biological Chemistry at the University of Michigan. The research interests of the Zhang Lab are protein folding and structure prediction, and protein design and engineering. The I-TASSER algorithm developed in his lab was ranked as the No. 1 method for automated protein structure prediction in the community-wide CASP experiments held since 2006. Dr. Zhang is the recipient of the Alfred P. Sloan Award, the ASBMB DeLano Award, the US National Science Foundation Career Award, and the UM Basic Science Research Award, and was selected as a Thomson Reuters/Clarivate Analytics Highly Cited Researcher.


Protein structure prediction aims to determine the spatial location of every atom in a protein molecule from its amino acid sequence by computational simulation. Depending on whether homologous structures are found in the Protein Data Bank (PDB), protein structure prediction has historically been categorized into template-based modeling (TBM) and template-free modeling (FM, or ab initio folding). In this talk, we first review the important milestones of the last decades in computer-based protein structure prediction and show that the problem can be solved in principle by TBM if fold-recognition algorithms could identify the best structural templates from the PDB. Next, we discuss protein structure prediction results in the recent community-wide blind CASP experiments, showing that new deep neural-network learning approaches, built on coevolution data from multiple sequence alignments, can result in consistent and successful folding of large proteins with complicated shapes and topologies. In particular, AlphaFold2, the end-to-end training system built by the DeepMind team and powered by self-attention neural networks, could fold nearly all protein domains in the CASP14 experiment, with two-thirds of them having an accuracy comparable to low-resolution experimental solutions. In the most recent CASP15 experiment, new and consistent advances over AlphaFold2 have been made by integrating end-to-end learning techniques with folding simulations. These achievements essentially break through the 50-year-old modeling border between TBM and FM and make the success of high-resolution structure prediction no longer dependent on the PDB library, marking a solution, at least at the fold level, to the single-domain protein structure prediction problem. Nevertheless, the construction of atomic-resolution models for multi-domain proteins and higher-order protein-protein complexes remains challenging for the community. Given the recent revolutionary breakthroughs, these problems are expected to be solved in the foreseeable future by combining deep machine learning techniques with the rapid advancement of genome sequencing databases, with the aid of advanced structure assembly simulation algorithms.


Should you have any enquiries, please feel free to contact Miss Angela Wong at 3917 9216.