Construct your prompt from the instructions below then use the E4S Guide Bot
Chat with the E4S Guide Bot
Introduction
Selecting the right data and visualization libraries or tools is an important step in building efficient, scalable, and maintainable workflows for scientific computing and high-performance data analysis. The E4S 25.06 release includes a rich collection of data management, I/O, and visualization libraries designed for parallel performance, portability, and ecosystem integration.
A newcomer should think about attributes related to their data structures, access patterns, parallelism model, visualization scale, supported hardware, and compatibility with programming environments. Understanding these attributes helps identify tools that meet the technical and scientific needs of a given project—whether the goal is to efficiently store and move simulation output, visualize multi-terabyte datasets, or integrate AI-assisted analytics into scientific workflows.
When defining selection attributes, it helps to consider both broadly meaningful characteristics (like portability or API support) and situation-specific attributes (like in-situ visualization, streaming data, or exascale readiness).
These attributes can then be described in a structured form that allows a chatbot or decision assistant to guide the selection of appropriate E4S products such as ADIOS2, HDF5, ParaView, Ascent, VisIt, VTK-m, or Catalyst2.
Example Prompt
I need a parallel data I/O library that can handle structured mesh output from a multi-GPU simulation on Frontier. The workflow runs in MPI with C++ and uses Kokkos for parallelism. The output should be compatible with ParaView Catalyst for in-situ visualization. I prefer an open-source library with strong E4S support and minimal code changes for integration.
Broadly Meaningful Attributes
| Attribute |
Description |
| Parallel I/O support |
Whether the library supports scalable I/O for distributed applications using MPI or similar models |
| Data model |
Type of data supported (structured, unstructured, tabular, hierarchical, key-value, etc.) |
| Supported APIs |
Available language bindings such as C, C++, Fortran, Python |
| Portability |
Extent to which the library is portable across CPU and GPU systems and HPC platforms |
| Metadata management |
Ability to manage and query metadata efficiently at scale |
| Compression |
Support for data compression and decompression, including lossy or lossless modes |
| Checkpoint/restart capabilities |
Whether the library supports state checkpointing and restart for resilience |
| File format compatibility |
Interoperability with common file formats like HDF5, NetCDF, or BP5 |
| Data streaming |
Ability to handle real-time or asynchronous data streams |
| E4S integration |
Availability as part of E4S releases with verified Spack package support |
| License |
Software license type and compatibility with project policies |
| Community support |
Availability of documentation, tutorials, and active maintenance |
| Performance tuning |
Options for optimizing data layout, buffering, and I/O scheduling |
For Simulation Output
| Attribute |
Description |
| In-situ I/O |
Direct output to visualization or analysis frameworks during simulation runtime |
| Temporal data management |
Support for time series and multi-timestep data |
| Large-file scalability |
Ability to handle multi-terabyte output efficiently |
| Domain decomposition mapping |
Handling of partitioned data in structured or unstructured domains |
For Machine Learning Data Pipelines
| Attribute |
Description |
| Tensor data support |
Ability to read/write multi-dimensional tensor data efficiently |
| Streaming ingestion |
Support for high-frequency or batched input data |
| Integration with AI frameworks |
Compatibility with TensorFlow, PyTorch, or ONNX data formats |
For Exascale and Heterogeneous Systems
| Attribute |
Description |
| GPU-direct I/O |
Capability to move data between GPU memory and storage without CPU involvement |
| Burst buffer support |
Awareness of fast intermediate storage tiers on supercomputers |
| Resilience features |
Fault-tolerant I/O and restart mechanisms at extreme scale |
Broadly Meaningful Attributes
| Attribute |
Description |
| Rendering model |
Type of rendering (rasterization, ray tracing, volume rendering) supported |
| Parallel rendering |
Ability to distribute rendering across multiple nodes or GPUs |
| Supported data formats |
Input formats supported (e.g., VTK, HDF5, ADIOS2, Exodus) |
| Integration with simulation frameworks |
Ability to couple visualization with simulation codes for in-situ workflows |
| Language bindings |
Availability of Python, C++, or scripting interfaces |
| Interactivity |
Support for interactive exploration versus batch rendering |
| Scalability |
Suitability for large data visualization on clusters or supercomputers |
| Extensibility |
Ease of adding custom filters, readers, or visualization modules |
| E4S integration |
Inclusion in E4S releases and maintained Spack recipes |
| Portability |
Availability across Linux, macOS, Windows, and major HPC systems |
| Automation |
Support for scripting and reproducible visualization workflows |
For In-Situ Visualization
| Attribute |
Description |
| Catalyst or Ascent compatibility |
Support for in-situ data coupling through Catalyst2 or Ascent |
| Data coupling mechanism |
Method used to connect simulation data streams to visualization components |
| Memory sharing |
Ability to operate without full data copies |
| Runtime overhead |
Impact on simulation performance during coupled visualization |
For Post-Processing and Analysis
| Attribute |
Description |
| File import range |
Ability to load data from multiple simulation tools or formats |
| Parallel batch rendering |
Capability to render large datasets without interactivity |
| Derived field generation |
Tools for computing secondary quantities from raw data |
| Scripting interface |
Availability of programmable interfaces for automation (e.g., Python scripting in ParaView) |
For Interactive or Immersive Visualization
| Attribute |
Description |
| Interactive mode |
Support for real-time user manipulation and exploration |
| Remote visualization |
Ability to interact with large data sets remotely via client-server models |
| VR/AR readiness |
Compatibility with immersive visualization systems |
| Web interface |
Support for web-based exploration using WebGL or similar technologies |