The E4S (Ecosystem for Scientific Software) distribution integrates a growing set of AI and machine learning (AI/ML) libraries and tools that complement traditional scientific computing workflows. These capabilities help users harness the power of machine learning for data analysis, surrogate modeling, uncertainty quantification, and hybrid AI–HPC workflows that accelerate discovery.

AI/ML tools in E4S are designed to interoperate with established programming models (such as MPI, Kokkos, and OpenMP) and numerical libraries (such as PETSc and Trilinos), enabling scientific applications to adopt AI methods without compromising portability, performance, or maintainability across diverse architectures.

The Value of Using AI/ML Libraries and Tools

AI/ML libraries and tools provide new ways to accelerate scientific computing and data-driven discovery. They help researchers create predictive models that learn from data, automate analysis of large-scale simulations, and bridge the gap between traditional numerical modeling and emerging data-centric approaches.

By leveraging AI/ML libraries that are part of E4S, users benefit from:

  • Portability and performance: E4S ensures compatibility with DOE leadership-class systems and major hardware platforms (NVIDIA, AMD, Intel, ARM).
  • Interoperability: AI/ML frameworks integrate with high-performance data and math libraries, facilitating hybrid workflows that combine simulation and learning.
  • Reproducibility: E4S provides well-tested, versioned, and build-cached environments, simplifying deployment across HPC facilities.
  • Extensibility: The ecosystem supports emerging AI models, scientific surrogates, and domain-aware learning frameworks built by national labs and research partners.

E4S thus bridges the growing intersection between traditional simulation-based computing and modern AI-driven workflows.


Industry-Standard AI/ML Products Supported in E4S

Product Description Website
TensorFlow Widely used deep learning framework supporting CPUs, GPUs, and TPUs, integrated with Python and C++ APIs https://www.tensorflow.org
PyTorch Popular deep learning framework with dynamic computation graphs and strong GPU support https://pytorch.org
ONNX Open Neural Network Exchange format enabling model interoperability across frameworks https://onnx.ai
Horovod Distributed deep learning training framework for TensorFlow, PyTorch, and MXNet https://github.com/horovod/horovod
scikit-learn Classical machine learning library for Python, featuring regression, classification, and clustering https://scikit-learn.org
XGBoost Gradient boosting framework for scalable supervised learning https://xgboost.ai
LightGBM Gradient boosting framework optimized for performance and memory efficiency https://lightgbm.readthedocs.io

Emerging Scientific AI/ML Products

Product Description Website
TorchSWE Torch-based shallow water equation solver demonstrating physics-informed AI for PDEs https://github.com/torchswe/torchswe
Modulus NVIDIA-supported framework for physics-informed neural networks (PINNs) and operator learning https://developer.nvidia.com/modulus
Colmena Toolkit for steering AI-driven campaigns and adaptive sampling in HPC environments https://github.com/exalearn/colmena
DeepHyper Scalable hyperparameter optimization and neural architecture search framework for HPC systems https://deephyper.readthedocs.io
SmartSim Orchestration framework for coupling AI/ML models with HPC simulations https://www.craylabs.org/docs/smartsim
AI4S (AI for Science) Tools DOE-developed toolkits for applying machine learning to scientific simulations and experiments https://ai4science.energy.gov
Maestro Workflow Workflow manager enabling AI-driven orchestration and resource coordination https://github.com/LLNL/maestrowf

Vendor-Provided AI/ML Tools from NVIDIA, AMD, Intel, and ARM

Vendor Product Description Website
NVIDIA cuDNN GPU-accelerated primitives for deep neural networks, used by TensorFlow and PyTorch https://developer.nvidia.com/cudnn
NVIDIA TensorRT Inference optimizer and runtime for high-performance deep learning inference https://developer.nvidia.com/tensorrt
AMD MIOpen Deep learning primitives library optimized for ROCm GPUs https://rocmdocs.amd.com/projects/miopen
AMD ROCm AI Stack Integrated AI software stack including PyTorch and TensorFlow with ROCm backend https://rocmdocs.amd.com/en/latest/
Intel oneDNN Deep Neural Network Library optimized for Intel CPUs and GPUs https://github.com/oneapi-src/oneDNN
Intel OpenVINO Toolkit for optimizing and deploying AI inference across Intel devices https://www.intel.com/openvino
ARM Compute Library Optimized AI and computer vision functions for ARM-based architectures https://developer.arm.com/solutions/machine-learning-on-arm/developer-material/compute-library

Summary

E4S’s integration of AI/ML tools complements its strong foundation in numerical, performance, and I/O libraries. By combining domain-tested HPC libraries with industry-standard and emerging AI frameworks, E4S enables the next generation of hybrid simulation–AI workflows that are portable, scalable, and sustainable across future exascale and AI-accelerated systems.