The annual International Conference for High Performance Computing, Networking, Storage, and Analysis (SC) was held at the end of 2025 in St. Louis, Missouri. It brings together researchers, engineers, systems professionals, and industry leaders to explore advances in computing, networking, storage, data analysis, and related fields.
For the past couple of years Research IT has sent representatives to the event, but this was the first time in many years that we had a presence in the vast exhibition hall. Robert Haines, Director of Research IT, was joined by Abhijit Ghosh, Research Platform Engineer, and Rebecca Tyson from Campus Technology Services, who is leading our Data Centre work on campus.
Abhijit attended the event for the first time and provided a detailed account of his experiences, including key insights from the workshops.
Workshops and Tutorials
The conference features a wide-ranging technical program that includes workshops, hands-on tutorials, panel discussions, doctoral showcases, and paper and poster presentations covering cutting-edge topics such as artificial intelligence, machine learning, quantum computing, and exascale systems and storage. I attended the following workshops and tutorials to represent the Research Platforms group.
Modern HPC systems are increasingly complex, requiring advanced expertise in system administration, configuration, deployment, and engineering. HPC systems professionals—including system engineers, system administrators, network and storage administrators, and operations staff—face challenges that are unique to large-scale, high-performance environments.
Sponsored by the ACM SIGHPC SYSPROS Virtual Chapter, this workshop focused on the specific needs of HPC systems practitioners and aimed to foster a supportive community for sharing knowledge and experience. The sessions emphasized best practices in HPC system deployment and maintenance, discussed emerging and upcoming technologies, and presented state-of-the-practice techniques for improving system performance, reliability, and operational efficiency. Overall, the workshop highlighted approaches that directly contribute to increased productivity for researchers and analysts relying on HPC infrastructure.
Of particular interest to me was an emerging technology called bootc, which packages operating systems as OCI containers to simplify cluster management. It can be used to manage operating system deployments across an entire HPC cluster, as an alternative to the traditional image-based provisioning of clusters, and it makes atomic OS updates, rollbacks, and version control very simple. This container-based technology could help modernize and simplify HPC infrastructure management by addressing cluster management needs and improving operational workflows.
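To make that concrete, here is a minimal sketch of the workflow, with the image name invented for illustration and the JSON layout treated as an assumption (it varies between bootc releases): each node tracks an OS image published in a registry, and updates are staged atomically and applied at the next reboot.

```python
import json
import subprocess

# Hypothetical image name; a site would publish its own OS image to a registry.
TARGET_IMAGE = "registry.example.org/hpc/compute-node:latest"


def booted_image() -> dict:
    """Ask bootc which image this node is currently booted from."""
    out = subprocess.run(
        ["bootc", "status", "--json"],
        check=True, capture_output=True, text=True,
    ).stdout
    status = json.loads(out)
    # Assumption: the booted image reference lives under
    # status -> booted -> image; the exact layout differs by release.
    return status.get("status", {}).get("booted", {}).get("image", {})


def sync_node() -> None:
    # `bootc switch` points the node at an image; `bootc upgrade` stages
    # any newer version atomically, to take effect at the next reboot.
    # `bootc rollback` reverts to the previous image if something breaks.
    subprocess.run(["bootc", "switch", TARGET_IMAGE], check=True)
    subprocess.run(["bootc", "upgrade"], check=True)


if __name__ == "__main__":
    print("currently booted from:", booted_image())
    sync_node()
```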
I also learnt about an easy method of creating and deploying containerized software, together with a way of presenting those containers that is transparent to the user and very easy to use. We intend to use this method on the University’s HPC system (the CSF).
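One common way to achieve that kind of transparency, sketched here under our own assumptions rather than as the exact method presented, is a thin wrapper on the user’s PATH that quietly forwards the command into a container. The tool name and image path below are invented, and Apptainer is assumed as the runtime.

```python
#!/usr/bin/env python3
"""Installed on PATH under the tool's own name (e.g. `samtools`):
users run the command as normal, unaware it executes in a container."""
import os
import sys

# Hypothetical site-specific image location.
IMAGE = "/opt/apps/containers/samtools-1.19.sif"

# Forward the user's command line into the container unchanged.
# `apptainer exec` runs a single command inside the image; exec'ing
# replaces this process, so exit codes and signals behave natively.
tool = os.path.basename(sys.argv[0])
os.execvp("apptainer", ["apptainer", "exec", IMAGE, tool, *sys.argv[1:]])
```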
Supercomputing centres play a critical role in enabling scientific discovery by supporting researchers across a wide range of computational disciplines. To help users navigate the complexity of HPC environments, these centres rely on dedicated user support teams.
The HUST-25 workshop provided a forum for discussing the challenges faced by HPC user support teams, particularly in environments characterized by multi-user systems, heterogeneous hardware, and rapidly evolving research software.
There is an increasing need to reduce barriers to entry and improve accessibility for researchers with varying levels of computational expertise. To make interaction with HPC resources more efficient, many HPC centres around the world are providing responsive, user-friendly web-based dashboards that are easier to use than the traditional command line. Most of these are built on the Open OnDemand framework.
Open Composer and the Drona Workflow Engine are two examples of tools built on the Open OnDemand framework. They generate the necessary job scripts and submit them to the cluster’s batch system, so users no longer have to painstakingly learn to write complex job scripts, submit them, and monitor them from the command line. They also display useful information such as announcements, system status, job details, and resource utilization. Open OnDemand can also give users an alternative way to access HPC systems and run interactive jobs on them.
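As a rough illustration of what such tools do behind the scenes (a minimal sketch: the form fields, and Slurm as the batch system, are assumptions for illustration rather than details from the talks), a handful of values from a web form become a job script and a submission call:

```python
import subprocess
from pathlib import Path


def build_jobscript(name: str, cores: int, hours: int, command: str) -> str:
    """Render a batch job script from a few dashboard form fields.
    Slurm directives are used here purely as an example."""
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name={name}",
        f"#SBATCH --ntasks={cores}",
        f"#SBATCH --time={hours}:00:00",
        command,
        "",
    ])


def submit(script: str, path: Path = Path("job.sh")) -> str:
    path.write_text(script)
    # `sbatch` prints e.g. "Submitted batch job 12345"; return the job id
    # so the dashboard can poll and display its status afterwards.
    out = subprocess.run(["sbatch", str(path)],
                         check=True, capture_output=True, text=True).stdout
    return out.strip().split()[-1]


if __name__ == "__main__":
    job = build_jobscript("align", cores=8, hours=2,
                          command="./run_alignment.sh")
    print("submitted job", submit(job))
```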
This session was very useful as we are looking into providing similar services to users of our HPC systems.
The rapid adoption of containerization, virtualization, and modern orchestration models has significantly transformed how applications and services are developed, deployed, and managed across the broader computing landscape. This transformation is increasingly influencing the HPC community, driven by the emergence of HPC-optimized container runtimes and orchestration platforms such as Kubernetes.
This workshop explored both the opportunities and challenges associated with adopting container-based workflows in HPC environments. Topics included best practices, foundational concepts, tooling, and standards, as well as real-world experiences with deploying and optimizing containerized applications for HPC use cases. Users of CSF3 make heavy use of containers these days, so the workshop was very helpful.
At the workshop I heard more about the use of Generative Artificial Intelligence (GenAI) to help users containerize software and software pipelines and run them on HPC clusters under various schedulers. GenAI applications are often built from specialized components, such as inference servers, object storage, vector and graph databases, and user interfaces, interconnected via web-based APIs, containerized, and deployed in cloud environments. I also learnt about deploying such GenAI workloads and other cloud-native workloads within traditional on-premises HPC systems.
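As a sketch of that component style (every endpoint and payload below is hypothetical; real deployments use whatever APIs their inference server and vector database expose), the glue between the pieces is plain HTTP, which is exactly what makes each component independently containerizable and schedulable:

```python
import json
import urllib.request

# Hypothetical in-cluster service endpoints; in a containerized
# deployment each would be a separate container behind its own port.
VECTOR_DB = "http://vector-db:8001/search"
INFERENCE = "http://inference:8000/generate"


def post_json(url: str, payload: dict) -> dict:
    """Send a JSON request to one of the services and decode the reply."""
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def answer(question: str) -> str:
    # 1. Fetch supporting passages from the vector database.
    docs = post_json(VECTOR_DB, {"query": question, "top_k": 3})["hits"]
    # 2. Ask the inference server to answer using those passages.
    prompt = "Context:\n" + "\n".join(docs) + f"\n\nQuestion: {question}"
    return post_json(INFERENCE, {"prompt": prompt})["text"]
```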
Interactive HPC enables users to remain actively involved during job execution, allowing them to monitor progress, steer experiments, or visualize results in real time in order to make immediate decisions. This paradigm opens up new and innovative ways of exploiting HPC resources. Urgent computing builds on this concept by combining interactive modelling with near-real-time data ingestion to support rapid decision-making during unfolding events such as natural disasters.
Supporting interactive and urgent workloads presents significant technical and organizational challenges, requiring expertise across scheduling, networking, visualization, system architecture, and user workflows. This workshop brought together researchers, practitioners, and stakeholders from the interactive and urgent computing communities to share tools, implementation strategies, and lessons learned. Presentations focused on practical solutions, challenges encountered during deployment, approaches to overcoming those challenges, and ongoing and future work aimed at improving the interactive and urgent HPC experience for users.
The workshop discussed how oversubscribing resources can reduce queue waiting times while maintaining the overall performance of an HPC cluster. We intend to run such tests of our own and, if they are successful, modify our scheduler’s queuing policy accordingly. I also learnt about various other scheduling strategies for supporting mixed-urgency workloads, which we intend to explore for our own HPC systems.
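Before touching a production queuing policy, the effect can be estimated with a toy model. The sketch below is deliberately crude (all numbers are invented, and it ignores backfill, priorities, and memory): it compares the mean queue wait of a FIFO cluster at its normal slot count against an oversubscribed one, charging each oversubscribed job a small runtime penalty for sharing resources.

```python
import heapq
import random


def simulate(jobs, slots, slowdown=1.0):
    """FIFO simulation: `jobs` is a list of (arrival, runtime) pairs,
    `slots` is how many jobs may run at once, and `slowdown` crudely
    models the per-job penalty of sharing oversubscribed resources.
    Returns the mean queue wait."""
    free_at = [0.0] * slots          # earliest time each slot is free
    heapq.heapify(free_at)
    waits = []
    for arrival, runtime in sorted(jobs):
        start = max(arrival, heapq.heappop(free_at))
        waits.append(start - arrival)
        heapq.heappush(free_at, start + runtime * slowdown)
    return sum(waits) / len(waits)


random.seed(1)
# Toy workload: ~2000 jobs, Poisson-ish arrivals, variable runtimes.
jobs, t = [], 0.0
for _ in range(2000):
    t += random.expovariate(1 / 2.0)                 # mean gap 2 min
    jobs.append((t, random.expovariate(1 / 60.0)))   # mean runtime 60 min

# Baseline: 32 slots. Oversubscribed: 48 slots, 15% per-job slowdown.
print("baseline mean wait (min):      ", simulate(jobs, 32))
print("oversubscribed mean wait (min):", simulate(jobs, 48, slowdown=1.15))
```

In this toy setup the extra slots comfortably outweigh the per-job slowdown; whether that trade-off holds for our real workload is exactly what our own tests will need to measure.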
SC25 Exhibition
The exhibition halls hosted a record 559 exhibitors showcasing a wide range of hardware and software solutions. One notable trend was the strong presence of quantum computing and related technologies. Given the growing interest in this emerging field, the organizers dedicated a specific area—Quantum Village—where companies could present their quantum products and solutions. Vendors across the HPC ecosystem showcased GPUs, processors, memory, storage, servers, networking equipment, cooling technologies, power and backup systems, and HPC software solutions.
Many universities from around the world participated in SC25 and hosted exhibition booths, with strong representation from institutions in the United States, Japan, South Korea, Singapore, and elsewhere. The Consortium of Research Computing at UK Universities, represented by Durham University, Imperial College London, the University of Leeds, UCL, the University of Warwick, and The University of Manchester, had a dedicated booth at the exhibition, staffed by representatives from each participating institution.
Robert Haines, Director of Research IT, explained: “Our stand was well located at a junction between sections of the exhibition hall, so we had lots of people passing through, and over 430 people left their contact details with us for follow-up conversations about our research.
As an extra bit of fun we ran a competition where people who visited our stand could get a “UK @ SC25” passport stamp – anyone who got the stamps from all the participating UK stands was entered into a raffle to win a fancy Lego set.
Teaming up with our fellow UK universities to run the exhibition stand allowed us all to have a greater impact at the event than if we had all been working separately. We were able to support a larger stand (both in size and visually), and sharing the booth rota out amongst more people meant that we could maximise availability while ensuring people could take advantage of the rest of the conference programme”.

Vendor Meetings
During the conference, I also assisted our Director in meetings with various vendors, where we discussed our current infrastructure, upcoming requirements, and long-term vision. These discussions provided valuable insight into vendor offerings, emerging technologies, and future roadmaps, and were an important part of understanding how external solutions could align with our strategic goals.
We had a packed programme of meetings, which included Dell, NVIDIA, Intel, Hitachi and AWS. The main advantage of meeting these companies at an event such as Supercomputing is that they use the conference to make important announcements and reveal updates to their internal roadmaps, so you can have up-to-date conversations about the latest products and technology, often before it is even available to buy. They also have all of their world experts in one place for the week to facilitate these conversations, so we get access to people and information that we wouldn’t get at home for some considerable time.
The trip was a very productive one for both Rob and me. The research community at the University will benefit from the information and knowledge we gathered, as it will be used to improve our services from both a technical and a customer service point of view. We look forward to seeing what comes next!