Berzelius Research Showcase

Here is a glimpse of some of the pioneering research projects that are harnessing the high-performance computing resources of Berzelius.

AI in Life Sciences
Materials Engineering
Autonomous Systems
Machine Learning Research

AI in Life Sciences

Super-resolution Neuroimages from Widely Accessible Clinical Scans

Magnetic resonance imaging (MRI) is a widely used technique for acquiring 3D images of organs in living people. In neurology, MRI is used for diagnosis and treatment management of many brain disorders, and it can also be used in research settings to obtain information about the size, shape and integrity of finer regions of the brain.

The resolution of MRI is determined by magnet strength, measured in teslas (T). 3T MRI tends to be the highest resolution seen at hospitals, and while higher-resolution 7T images are available for research studies, the high magnet strength makes 7T scanners unwieldy. Using paired sets of 7T and 3T images from the same people, Jacob Vogel’s research team is training deep AI models to synthesize 7T-resolution images from much more common, lower-resolution 3T images. Improving the resolution of these common images will enhance their clinical usefulness, as well as their research potential.

Moving the project to the Berzelius compute cluster improved the results almost immediately, allowing the team to achieve state-of-the-art performance in this “super-resolution” task. The group was able to train generalized models. Besides outperforming previously published models, their synthetic 7T images were qualitatively judged as superior to real 7T and 3T images by neuroradiologists and MRI scientists, and the synthetic images led to superior segmentations.

The team members are currently writing a paper describing their findings. In parallel, they are amassing more pairs of 3T and 7T images from additional scanners to improve the generalizability of the models. They hope to use their model to enhance the resolution and quality of massive public biobanks of MR images. The primary focus of Jacob Vogel’s research group is aging and neurodegenerative disease research. They utilize large data resources to model disease progression and uncover factors contributing to disease pathogenesis.

(Top) High-resolution 7T image of the cerebellum (left) and frontal cortex (right) of an elderly individual.
(Middle) A more common, lower-resolution 3T image of the same region in the same person.
(Bottom) A synthetic 7T image generated from the 3T of this individual (who was not part of the training sample).

Modifying and Retraining AlphaFold2 to Integrate Experimental Information

Predicting the structure of proteins is one of the ultimate goals in biology, as accurate predictions can help us understand how proteins work and how genetic diseases develop. AlphaFold is a Nobel Prize-winning neural network developed by DeepMind to achieve protein structure prediction with an unprecedented level of accuracy. But predictions are not always useful on their own. At SciLifeLab, researchers are working on finding new ways to integrate and validate these predictions with experimental data coming from direct observation of proteins. Berzelius is enabling the development of next-generation tools to achieve such integration, such as AF_unmasked.

The team is now also training new versions of AlphaFold on Berzelius with the objective of making it more efficient in predicting large proteins. Ultimately, the goal is to develop a tool to directly incorporate raw experimental information from multiple methods (EM, X-ray crystallography, NMR) to predict better structures, faster.

Peptide Binders designed by Artificial Intelligence

Patrick Bryant’s research team applies Artificial Intelligence (AI) to protein structure prediction and interaction analysis. They are developing neural networks to design specific linear and cyclic peptide binders, including those that can bind two target proteins simultaneously.

Leveraging Berzelius’s computing power, the team has developed EvoBind2 for rapid peptide design targeting diverse receptors in diabetes and cancer research. They are also working on EvoBind-multimer for dual-protein targeting, potentially enabling targeted protein degradation. Overall, they are not just predicting protein structures and interactions; they are designing the future of targeted therapies.

The team plans to design peptide binders for 400 human cancer cell surface receptors, aiming to create a range of peptides for various cancer types. This work lays the foundation for personalized diagnostics and targeted therapies, pushing the boundaries of AI-driven drug design. Qiuzhen Li is leading this initiative.

(Left) Linear binder (receptor in green and peptide in blue).
(Right) Cyclic binder (receptor in red and peptide in yellow).

Artificial Intelligence can Accelerate Drug Discovery

Access to three-dimensional structures of proteins is crucial for efficient drug design, but experimentally determining these can be costly and time-consuming. Jens Carlsson’s research team has recently explored whether models generated by methods based on artificial intelligence (AI), such as AlphaFold, can replace experimental structures in drug discovery.

Berzelius was used to generate thousands of AlphaFold models of TAAR1, a protein target involved in neuropsychiatric diseases. The TAAR1 models were then used to screen libraries containing millions of compounds to identify potential molecules that could bind to the protein. Predicted molecules were subsequently tested in experiments, leading to the discovery of potent drug-like compounds targeting TAAR1 with in vivo effects. Experimental structures later confirmed that the AlphaFold model of TAAR1 was highly accurate.

The results demonstrate that AI-based models can be utilized to accelerate the drug discovery process. The study was led by Ph.D. students Alejandro Díaz-Holguín (Uppsala University) and Marcus Saarinen (Karolinska Institute).

Machine Learning for Protein Structure Prediction

Arne Elofsson’s research team uses the high-performance computational resources of the Berzelius cluster to gain complex biological insights. By utilizing AlphaFold inference, the team analyzes protein structures to deepen their biological understanding, as demonstrated in their research publications. The cluster’s computational power allows them to process extensive data efficiently, which is crucial for modelling intricate biomolecular interactions. Additionally, Berzelius supports the development of their flow-matching models, as described in their recent preprint, facilitating breakthroughs in multi-scale life science data. These resources significantly accelerate their progress in addressing biological challenges and model development, strengthening their contributions to computational biology and bioinformatics.

Training AI Models with Synthetic Images

Artificial Intelligence (AI) can help oncologists efficiently and precisely identify areas requiring radiation therapy for cancerous tumors. However, the scarcity of medical data for training AI models remains a challenge. To address this, researchers are turning to synthetic medical images. Leveraging the Berzelius supercomputer, this method has accelerated progress, avoiding years of potential waiting.

Anders Eklund’s research team at Linköping University utilizes the Berzelius supercomputer at the National Supercomputer Center (NSC) to generate and evaluate synthetic MRI images of brain tumors. Conducting this research on a standard powerful computer would have required more than five years. To evaluate these synthetic images, the team uses AI to compare them with actual MRI scans of brain tumors, employing models trained on both image types.

Improving AlphaFold2 Protein Structure Prediction

AlphaFold2 is a groundbreaking AI application for protein structure prediction, revolutionizing the field of protein folding. However, its predictions are sometimes insufficiently accurate. Our research focuses on enhancing these predictions by combining AlphaFold2 with RoseTTAFold2, another leading AI predictor.

The team aimed to improve prediction accuracy by leveraging the strengths of both models. This required running computationally expensive analyses on large datasets, a process enabled by Berzelius. The server provided the computational power to handle these resource-intensive tasks, allowing us to generate, compare and utilize predictions from both applications efficiently.

By combining the outputs of AlphaFold2 and RoseTTAFold2 in a weighted manner, we demonstrated improved accuracy, achieving lower errors than using either model alone. In the future, this approach could be applied to emerging methods that outperform current applications, driving further advancements in accurate protein structure prediction.
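As a rough illustration of what combining two predictors "in a weighted manner" can look like, the sketch below blends two lists of predicted values and picks the blending weight that gives the lowest mean absolute error against a reference. This is only a minimal sketch under stated assumptions: the text does not say which quantity is combined or how the weights are chosen, so the function names, the grid search, and the toy numbers are all hypothetical and are not taken from the project.

```python
def combine(pred_a, pred_b, weight_a):
    """Blend two equally long prediction lists; weight_a is the share given to pred_a."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def mean_abs_error(pred, reference):
    """Average absolute deviation between predictions and reference values."""
    return sum(abs(p - r) for p, r in zip(pred, reference)) / len(reference)


def best_weight(pred_a, pred_b, reference, steps=20):
    """Grid-search the blending weight in [0, 1] that minimizes the error.

    In practice the weight would be chosen on held-out data, not on the
    structures used for the final evaluation.
    """
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda w: mean_abs_error(combine(pred_a, pred_b, w), reference),
    )


# Toy values (invented for illustration): predictor A overestimates,
# predictor B underestimates, so a blend can beat both.
reference = [5.0, 8.0, 12.0, 6.0]
pred_a = [5.6, 8.4, 12.6, 6.4]
pred_b = [4.6, 7.4, 11.6, 5.4]

weight = best_weight(pred_a, pred_b, reference)
blended = combine(pred_a, pred_b, weight)
print("chosen weight for A:", weight)
print("error A:", mean_abs_error(pred_a, reference))
print("error B:", mean_abs_error(pred_b, reference))
print("error blend:", mean_abs_error(blended, reference))
```

A blend of this kind helps most when the two predictors make partly opposing errors, as in the toy data; if both err in the same direction, the best weight simply collapses toward the better single model.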

Self-pruning Networks through Prolonged Training

Interpretability is a challenge for deep learning models. This is particularly critical for biology and medicine, where understanding molecular mechanisms is almost as important as accurate predictions. To address this, Avlant Nilsson’s lab integrates molecular networks into deep learning models to ensure that they are mechanistically interpretable. However, these networks rely on prior knowledge, which can include false connections.

A potential solution lies in self-pruning networks through prolonged training. The phenomenon of grokking, where models improve generalization long after achieving perfect training accuracy, appears to involve a transition from dense to sparse networks, where only important interactions are retained. Preliminary tests of prolonged training on synthetic data show promising results, with networks eliminating false edges that had been introduced. Berzelius’ computational power enables the team to test these effects across many settings, to better understand what is required to improve network-based deep learning for biology.
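One way to check whether a trained network has "eliminated" false edges on synthetic data is to compare the learned edge weights against the known ground truth. The sketch below is a minimal illustration of that bookkeeping, assuming an edge counts as pruned when the magnitude of its weight falls below a small threshold. The threshold, the function names, and the toy weights are hypothetical and do not come from the lab's code.

```python
def surviving_edges(edge_weights, threshold=1e-3):
    """Return the edges whose learned weight magnitude is at or above the threshold."""
    return {edge for edge, w in edge_weights.items() if abs(w) >= threshold}


def pruning_report(edge_weights, true_edges, threshold=1e-3):
    """Count how many true and false edges were kept or removed after training."""
    kept = surviving_edges(edge_weights, threshold)
    false_edges = set(edge_weights) - true_edges
    return {
        "true_kept": len(kept & true_edges),
        "true_lost": len(true_edges - kept),
        "false_removed": len(false_edges - kept),
        "false_kept": len(false_edges & kept),
    }


# Toy example (invented values): three real interactions plus two
# false edges that were added to the prior-knowledge network.
true_edges = {("A", "B"), ("B", "C"), ("C", "D")}
learned_weights = {
    ("A", "B"): 0.82,
    ("B", "C"): -0.47,
    ("C", "D"): 0.31,
    ("A", "C"): 0.0004,    # false edge, weight has decayed
    ("D", "A"): -0.00002,  # false edge, weight has decayed
}

report = pruning_report(learned_weights, true_edges)
print(report)
```

Tracking these four counts at intervals during prolonged training would show whether false edges disappear while true ones survive.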

Materials Engineering

Machine Learning Models for Materials

Atomic-scale simulations offer valuable insights into material properties and support the design of new materials for future applications. A key limitation in such simulations is often the availability and accuracy of models that describe the energy surface of the material of interest. Recently, machine learning models—such as machine-learned interatomic potentials—have gained significant attention for their ability to model complex materials with high accuracy while remaining fast during inference.

Using the powerful GPUs available on Berzelius, these models can be trained for a wide variety of materials, enabling the study of properties like vibrational spectra, diffusion coefficients, chemical reactions, glassy behavior, and phase transitions in complex materials at the atomic scale.

Below is an example of a phase transition observed in a molecular dynamics simulation of a 2D perovskite based on such models by the research team of Paul Erhart.

Autonomous Systems

Perception for Autonomous Vehicles

The research has two branches: one is situation awareness of autonomous trucks and buses, working closely with Scania; the other is sonar perception and navigation for autonomous underwater vehicles as part of the Swedish Maritime Robotics Centre, SMaRC.

The team is trying to achieve increased autonomy through the use of the latest neural network methods. The researchers have benefited greatly from the Berzelius GPU resources, which have reduced the training times by orders of magnitude. SMaRC has now been made a permanent Centre. The stakeholders have specifically asked the team to continue the sonar modeling using neural nets.

Neural Rendering for Autonomous Driving

Neural rendering for Autonomous Driving (AD) is a research project by Zenseact, in collaboration with the universities Chalmers, Lund and Linköping. The overarching goal is to modify existing collected sensor data (images, lidar, and radar data) to explore safety-critical “what if?” scenarios, e.g., what if the car ahead of us suddenly brakes? In practice, the researchers learn a digital clone of the observed scene, where they can alter the behavior of other actors and observe the scene from new viewpoints. Such a rendering method can then be used for virtual testing and verification of AD systems. Berzelius has enabled the team to increase their iteration speed, as they can learn hundreds of digital clones in parallel and verify their methods at scale. For future work, the team aims to integrate generative models, both to alter scene appearance and to improve rendering quality at large viewpoint changes.

Neural rendering lets us explore new scenarios (right) given collected data (left).

Machine Learning Research

Self-Supervised Understanding of Dynamic Scenes: Let Data Be the Teacher

This research project focuses on dynamic scene understanding for autonomous systems, emphasizing a self-supervised learning approach that eliminates the need for manual annotations, particularly in 3D point cloud environments. The members of Patric Jensfelt’s research team develop models that can estimate 3D motion in scenes. This enables the system to continuously refine its understanding of dynamic objects without relying on human-annotated data. More information about the work and the code can be found here.

Using Berzelius has significantly accelerated the team’s computational processes, allowing them to train on large-scale 3D datasets more efficiently. With Berzelius’ powerful GPU cluster, they have been able to test and refine their self-supervised models faster. The team plans to extend their research by incorporating long-term memory blocks into their models, which will allow for a more stable and consistent understanding of dynamic environments over time.

Machine Learning assisted Quantum Computing

The prospect of building computers based on the principles of quantum mechanics, using quantum bits (qubits), has sparked a massive research effort involving both academic institutions and private enterprises. In Sweden, the Wallenberg Centre for Quantum Technology (WACQT) leads this effort.

In contrast to bits in a classical computer, qubits are highly susceptible to noise, which at present severely limits their practical use. Quantum error correction (QEC), distributing the protected information in an entangled logical qubit state over many physical qubits, will therefore be a necessary component of future quantum computers. QEC requires a decoder, a classical algorithm that interprets a set of measurements to predict the most likely set of errors. The group of Mats Granath at the University of Gothenburg works on using deep learning, such as graph neural networks, to train decoders using both simulated and real experimental data. For this continued effort the resources at Berzelius are instrumental, allowing for both training the deep networks and generating the massive amounts of simulated data required.
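To make the role of a decoder concrete, the sketch below implements the simplest textbook case: a lookup-table decoder for the three-qubit repetition code protecting against bit flips. It is not the group's method, which uses deep learning such as graph neural networks; it only illustrates the mapping from syndrome measurements to the most likely error. Classical bits stand in for the qubit states here as a simplifying assumption (on a real device only the parities are measured, never the data qubits themselves), and all names are hypothetical.

```python
# Syndrome (parity of qubits 0,1 and parity of qubits 1,2) mapped to the
# single qubit most likely to have flipped, assuming independent bit flips
# that each occur with probability below one half.
SYNDROME_TO_FLIP = {
    (0, 0): None,  # no error detected
    (1, 0): 0,     # qubit 0 flipped
    (1, 1): 1,     # qubit 1 flipped
    (0, 1): 2,     # qubit 2 flipped
}


def measure_syndrome(bits):
    """Return the two parity checks of the three-bit repetition code."""
    return (bits[0] ^ bits[1], bits[1] ^ bits[2])


def decode(bits):
    """Apply the most likely correction for the measured syndrome."""
    flip = SYNDROME_TO_FLIP[measure_syndrome(bits)]
    corrected = list(bits)
    if flip is not None:
        corrected[flip] ^= 1
    return corrected


# Logical 0 is encoded as [0, 0, 0]; a single flip is corrected.
print(decode([0, 1, 0]))
# Two flips on logical 0 look like one flip on logical 1, so the
# decoder "corrects" to the wrong codeword: a logical error.
print(decode([1, 1, 0]))
```

For larger codes and realistic noise, the table of syndromes grows far too large to enumerate, which is why learned decoders trained on simulated and experimental data are of interest.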

Identifying Key Elements in Self-Supervised Representation Learning

Representation learning is a process in machine learning where algorithms extract meaningful patterns from raw data to form representations that contain higher-level semantic concepts like objects. This approach is crucial for tasks such as classification, retrieval, and clustering, enabling deeper insights into complex data.

The research team’s focus is to develop a better understanding of current representation learning methods, particularly self-supervised learning techniques in the visual domain. They aim to analyze the effects of various inductive biases, such as object-centric bias or other architectural choices, and leverage these insights to develop better and more generalizable representation learning approaches.

The Berzelius supercomputer at the National Supercomputer Center (NSC) provides the computational power necessary for the large-scale experiments. It enables the researchers to efficiently train and compare object-centric models with large pre-trained vision foundation models. The cluster’s reliability and excellent user support have ensured substantial and seamless progress throughout the project.

Machine Learning for Computer Vision

Michael Felsberg’s research team has been working on learning-based approaches to computer vision for more than 20 years, but since the Deep Learning revolution this has become a mainstream topic with increasing relevance to many application areas such as autonomous driving, remote sensing, and human-machine interaction.

The team addresses several scientific challenges within the areas of simulating quantum machine learning, human motion analysis from videos, learning for large scale remote sensing scene analysis, probabilistic 3D computation from time-of-flight measurements, spatio-temporal networks for scene flow estimation, injection of geometry into Deep Learning, using model knowledge in hybrid machine learning, and large multi-modal models for biodiversity.

The parallelism enabled by Berzelius has facilitated the rapid training, validation, and testing of on the order of 50 different deep networks for the addressed visual tasks, including their comparisons to baseline and state-of-the-art methods as well as comprehensive ablation studies over architecture variations and parameter choices. Without these experiments, publications in top-tier venues such as CVPR, NeurIPS, ICML, and ECCV would be impossible.

The method-oriented research in the team will continue to explore new application areas, such as the analysis of newly developed materials and the use of remote sensing data in climate models.