An online, continuous learning framework for training surrogates of ocean models

Andrew Shao
Hewlett Packard Enterprise
Talk
Artificial intelligence and machine learning methods are rapidly becoming an additional tool for scientific research. The intersection of these data-driven techniques with traditional, numerically based models largely centers on training compact models for sub-gridscale parameterizations or full-model surrogates. Within the oceanographic literature in particular, most of these applications train on static data, i.e., model output from a previously run simulation. For practical reasons, these data must be coarsened in time and/or space and thus may alias some of the fundamental dynamics of the underlying system. In this presentation, we introduce a completely in-memory framework based on the open-source library SmartSim that addresses problems with data volume and model training. Using this framework, we demonstrate the practicality of training a neural network surrogate using data streamed from a global, 1/4-degree MOM6 simulation. By intelligently sampling the data and continuously training a neural network, we demonstrate how the ability to train on timestep-level data can lead to a more compact network that mimics the solution space with better fidelity. We conclude with examples of hybrid AI/simulation techniques from other scientific domains that also benefit from this framework.
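
To make the idea of sampling a data stream and training continuously more concrete, the following is a minimal sketch and not the framework presented in the talk. It does not use the SmartSim API; a toy random "simulation" stands in for MOM6, a linear model stands in for the neural network, and reservoir sampling is assumed as one simple sampling strategy (the abstract does not say which strategy is used). All names here (`ReservoirBuffer`, `run_online_training`) are hypothetical.

```python
import numpy as np


class ReservoirBuffer:
    """Fixed-size uniform sample of a stream (reservoir sampling, Algorithm R)."""

    def __init__(self, capacity, n_features, rng):
        self.x = np.zeros((capacity, n_features))
        self.y = np.zeros(capacity)
        self.capacity = capacity
        self.seen = 0  # number of stream items offered so far
        self.rng = rng

    def add(self, x, y):
        if self.seen < self.capacity:
            slot = self.seen  # buffer not yet full: always keep the item
        else:
            # Keep the new item with probability capacity / (seen + 1).
            slot = self.rng.integers(0, self.seen + 1)
        if slot < self.capacity:
            self.x[slot] = x
            self.y[slot] = y
        self.seen += 1

    def sample(self, batch_size):
        n_stored = min(self.seen, self.capacity)
        idx = self.rng.integers(0, n_stored, size=batch_size)
        return self.x[idx], self.y[idx]


def run_online_training(n_steps=2000, n_features=4, seed=0):
    """Train a linear surrogate while a toy 'simulation' is still running."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=n_features)  # hidden relation the surrogate must learn
    w = np.zeros(n_features)              # surrogate weights
    buffer = ReservoirBuffer(256, n_features, rng)
    learning_rate = 0.05

    for _ in range(n_steps):
        # Stand-in for one simulation timestep producing an (input, target) pair.
        x = rng.normal(size=n_features)
        y = x @ w_true + 0.01 * rng.normal()
        buffer.add(x, y)  # data stay in memory; nothing is written to disk

        # One gradient step on a mini-batch drawn from the in-memory buffer.
        xb, yb = buffer.sample(32)
        grad = 2.0 * xb.T @ (xb @ w - yb) / len(yb)
        w -= learning_rate * grad

    return w, w_true
```

In this sketch the buffer holds a bounded, uniformly drawn subset of everything the simulation has produced so far, so memory use stays fixed no matter how long the run is, and the training step interleaves with the timestep loop instead of waiting for archived output.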