Research IT recently worked with Dave Topping from the School of Natural Sciences to help him and his research group use Julia and the JlBox package on the University’s HPC resource, the Computational Shared Facility (CSF).
Julia is a relatively new programming language designed to offer the performance that HPC users demand from lower-level languages, such as C++ and Fortran, without sacrificing the flexibility and comparative ease of use of higher-level languages like Python. JlBox is inspired by the PyBox Python package, originally developed by Dave. PyBox, however, suffered from computational performance issues that started to restrict the ability to include increasingly complex chemical and physical process descriptions.
JlBox was developed by Langwen Huang, a former Earth and Environmental Sciences student who worked with Dave and is now at ETH Zurich. It is roughly 10 to 100 times faster than PyBox whilst retaining the flexibility and readability of Python frameworks. Julia also offers native parallel computing capability, without the user having to learn and use complicated frameworks such as MPI. Thanks to this piece of work, Julia is available on the CSF3 for all researchers to use.
Dave’s group have used Julia on the CSF to simulate the detailed chemistry and physical processes that lead to the formation of particulate matter, a key determinant of air quality. Dave and colleagues across atmospheric science develop mechanistic models of atmospheric chemistry, with the aim of quantifying the impact of chemical processes and complexity of air quality on our climate. With ever-improving atmospheric monitoring technologies, the research community are continually hypothesising and identifying new chemical processes and molecular species that are deemed important to improving our understanding of their impacts.
This continually expanding knowledge base presents both numerical and computational challenges for the development of the next generation of mechanistic models. It also poses important questions regarding the design of community driven process models that can not only adapt to these increases in complexity, but also exploit emerging computational paradigms where beneficial.
For example, the increasing use of data-driven approaches across most scientific domains means that the next generation of earth system models is likely to merge machine learning (ML) with traditional process-driven models, in an effort to solve these challenges in complexity whilst exploiting the rich and growing datasets of global observations.
However, researchers are not necessarily experienced software developers, which is where the relatively intuitive Julia comes into its own. Julia is being used in the development of machine learning frameworks, with libraries such as Flux-ML enabling researchers to embed process-driven models within the so-called backpropagation pipeline that trains the artificial neural networks used in ML.
Dave explains: "By enabling me to use Julia on the CSF, the Research IT team are once again giving me a platform for myself and research community to be at the front of new developments in tackling complexity. This allows me to rapidly prototype models on new and emerging hardware platforms, whilst taking advantage of the scalability the CSF has to offer."
If you are interested in using the CSF or would like to chat to one of our Research Infrastructure Engineers, please get in touch.