API

async.h

Contains the async Writer loop and the functions that Reader processes use to communicate with it.

Reader side

type varid_t

Remote handle to a variable in the output file on the Writer process

varid_t open_variable_async(const char *varname, size_t len, int async_writer_rank)

Open a variable in the async writer.

Return
a handle to the variable in the output file
Parameters
  • varname: Variable name
  • len: Length of varname, including the terminating ‘\0’
  • async_writer_rank: MPI rank of writer process

void variable_info_async(varid_t varid, size_t ndims, size_t chunk[], int *deflate, int *deflate_level, int *shuffle, int async_writer_rank)

Get info about a variable in the output file.

For direct chunk writes to work the compression parameters must match between the input and output files. Use this function to query the parameters of the output file, then compare the results against the values returned by nc_inq_var_deflate() on the input file.

Parameters
  • varid: Variable handle obtained with open_variable_async
  • ndims: Number of dimensions
  • chunk[ndims]: (out): Chunk shape
  • deflate: (out): Compression enabled
  • deflate_level: (out): Compression level
  • shuffle: (out): Shuffle filter enabled
  • async_writer_rank: MPI rank of writer process
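
To illustrate the comparison described above, the following sketch collects the three compression settings into a struct and tests two sets for equality. The names filter_settings and filters_match are hypothetical and not part of async.h; in real code the input-side values would come from nc_inq_var_deflate() and the output-side values from variable_info_async(). Ignoring the level when compression is disabled is an assumption of this sketch.

```c
#include <stdbool.h>

/* Hypothetical container for the settings reported by
 * nc_inq_var_deflate() (input file) or variable_info_async() (output file). */
struct filter_settings {
    int shuffle;       /* non-zero if the shuffle filter is enabled */
    int deflate;       /* non-zero if compression is enabled */
    int deflate_level; /* compression level, only meaningful if deflate is set */
};

/* Returns true when both files use the same filters. */
bool filters_match(const struct filter_settings *in,
                   const struct filter_settings *out)
{
    /* Normalise the flags so that any non-zero value counts as "enabled". */
    if ((in->shuffle != 0) != (out->shuffle != 0)) return false;
    if ((in->deflate != 0) != (out->deflate != 0)) return false;
    /* Assumption: the level only matters when compression is switched on. */
    if (in->deflate != 0 && in->deflate_level != out->deflate_level) return false;
    return true;
}
```

A Reader could fall back to write_uncompressed_async() whenever a check like this fails.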

void write_uncompressed_async(varid_t varid, size_t ndims, const size_t offset[], const size_t shape[], const void *buffer, nc_type type, int async_writer_rank, MPI_Request *request)

Write data to the file, using the dataset filters.

This writes uncompressed data, as obtained by nc_get_vara() on the input file. It is slower than direct chunk writes, as the data must be put through a compression filter, but is more flexible as it can be used to write partial or unaligned chunks.

Pass request to MPI_Wait() to complete the message.

Parameters
  • varid: Variable handle obtained with open_variable_async
  • ndims: Number of dimensions
  • offset[ndims]: Offset of this data’s origin in the collated dataset
  • shape[ndims]: Shape of this data array
  • buffer: Uncompressed data
  • type: NetCDF type of the data
  • async_writer_rank: MPI rank of writer process
  • request: (out): MPI request for the communication
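
As a small aid for sizing buffer, the sketch below computes the number of elements described by shape[ndims]. It assumes the buffer holds one element per point of the block, which is how nc_get_vara() fills its output; element_count is a hypothetical helper, not part of async.h.

```c
#include <stddef.h>

/* Hypothetical helper: number of elements in a block of the given shape.
 * A zero-dimensional (scalar) block has exactly one element. */
size_t element_count(size_t ndims, const size_t shape[])
{
    size_t n = 1;
    for (size_t d = 0; d < ndims; ++d) {
        n *= shape[d];
    }
    return n;
}
```

Multiplying the result by the size of the NetCDF type gives the buffer size in bytes.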

void write_chunk_async(varid_t varid, size_t ndims, uint32_t filter_mask, const hsize_t offset[], size_t data_size, const void *buffer, int async_writer_rank, MPI_Request *request)

Write a compressed chunk directly to the file.

This writes compressed chunks, as obtained by opening the input file in HDF5 and calling H5DOread_chunk(). This is faster than copying uncompressed data, but the chunking and compression parameters must be identical on the input and output files, and the chunk must lie on a chunk boundary of the output file. variable_info_async() can be used to determine the chunk layout and compression settings of the variable in the output file.

Pass request to MPI_Wait() to complete the message.

Parameters
  • varid: Variable handle obtained with open_variable_async
  • ndims: Number of dimensions
  • filter_mask: HDF5 filter information (must match the output file)
  • offset[ndims]: Offset of this chunk’s origin in the collated dataset (must be on a chunk boundary of the output file)
  • data_size: Size of the compressed chunk in bytes
  • buffer: Compressed chunk data
  • async_writer_rank: MPI rank of writer process
  • request: (out): MPI request for the communication
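
The chunk-boundary requirement on offset can be checked before choosing this function. The sketch below is a minimal illustration: on_chunk_boundary is a hypothetical helper, and it uses size_t for the offsets (rather than hsize_t) so that it compiles without the HDF5 headers. The chunk shape would come from variable_info_async().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if offset[] sits on a chunk boundary,
 * i.e. every offset is a whole multiple of the chunk length
 * in the same dimension. */
bool on_chunk_boundary(size_t ndims, const size_t offset[], const size_t chunk[])
{
    for (size_t d = 0; d < ndims; ++d) {
        if (chunk[d] == 0) return false;             /* invalid chunk shape */
        if (offset[d] % chunk[d] != 0) return false; /* misaligned in dimension d */
    }
    return true;
}
```

If the check fails, the data can still be sent through write_uncompressed_async(), which accepts unaligned blocks.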

void close_variable_async(varid_t varid, int async_writer_rank)

Close a variable in the async writer.

varid is no longer a valid handle after this call.

Parameters
  • varid: Variable handle obtained with open_variable_async
  • async_writer_rank: MPI rank of writer process

void close_async(int async_writer_rank)

Close the async writer.

Parameters
  • async_writer_rank: MPI rank of writer process

Writer side

size_t run_async_writer(const char *filename)

Async runner to accept writes.

Called by the Writer to accept async messages sent by the Readers. Returns once all Readers have sent close_async() messages.

Return
total size written
Parameters
  • filename: Output filename

error.h

Functions for reporting errors from the various libraries used.

NCERR(x)

NetCDF error handler.

If a NetCDF call returned an error, reports the error and exits.

Parameters
  • x: The return code of a NetCDF library call

H5ERR(x)

HDF5 error handler.

If an HDF5 call returned an error, reports the error and exits.

Parameters
  • x: The return code of a HDF5 library call

CERR(x, message)

C error handler.

If a C library call returned an error, reports the error and exits.

Parameters
  • x: The return code of a C library call
  • message: Error message
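
The report-and-exit pattern shared by these handlers can be sketched as a macro wrapped around a checking function. This is only an illustration under stated assumptions: demo_check and DEMO_CERR are hypothetical names, the output format is invented, and treating 0 as success is an assumption rather than the documented behaviour of CERR.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical handler: returns the code unchanged on success (assumed
 * to be 0); otherwise reports where the failure happened and terminates. */
static int demo_check(int code, const char *message, const char *file, int line)
{
    if (code == 0) return code;
    fprintf(stderr, "%s:%d: %s (return code %d)\n", file, line, message, code);
    exit(EXIT_FAILURE);
}

/* Wrapping the handler in a macro captures the call site automatically. */
#define DEMO_CERR(x, message) demo_check((x), (message), __FILE__, __LINE__)
```

A call such as DEMO_CERR(fclose(fp), "closing input") then needs no explicit error branch at the call site.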

void set_log_level(int level)

Set the output level for log messages.

Parameters
  • level: Messages with a level less than or equal to this will be output

void log_message(int level, const char *message, ...)

Send a message to the log.

Parameters
  • level: Message log level
  • message: Message (printf-like format string)
  • ...: Message arguments

Available log levels are:

LOG_DEBUG
LOG_INFO
LOG_WARNING
LOG_ERROR
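
The threshold rule of set_log_level() can be illustrated with a small stand-alone sketch. Everything prefixed demo_ or DEMO_ is hypothetical, and the numeric values are an assumption (more severe levels get smaller numbers); the actual values of LOG_DEBUG through LOG_ERROR are not stated here.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumed ordering: more severe messages get smaller numbers, so
 * raising the threshold lets more detail through. */
enum demo_level { DEMO_ERROR = 1, DEMO_WARNING = 2, DEMO_INFO = 3, DEMO_DEBUG = 4 };

static int demo_threshold = DEMO_WARNING;

void demo_set_log_level(int level) { demo_threshold = level; }

/* Returns true if the message was printed, false if it was filtered out. */
bool demo_log_message(int level, const char *message, ...)
{
    if (level > demo_threshold) return false; /* above the threshold: drop */
    va_list args;
    va_start(args, message);
    vfprintf(stderr, message, args);          /* printf-like formatting */
    va_end(args);
    fputc('\n', stderr);
    return true;
}
```

With the threshold set to DEMO_INFO, error, warning and info messages are printed while debug messages are dropped.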

read_chunked.h

Functions the Readers use to read chunks from the input files and send them to the Writer.

bool is_collated(int ncid, int varid)

Check if any of the dimensions of a variable are collated.

Return
true if any dimension of varid is collated, false otherwise
Parameters
  • ncid: NetCDF4 file handle
  • varid: NetCDF4 variable handle

bool get_collation_info(int ncid, int varid, size_t out_offset[], size_t local_size[], size_t total_size[], int ndims)

Get collation info from a variable.

Return
true if any of the dimensions are collated
Parameters
  • ncid: NetCDF4 file handle
  • varid: NetCDF4 variable handle
  • out_offset[ndims]: (out): The offset in the collated array of this file’s data
  • local_size[ndims]: (out): This file’s data size
  • total_size[ndims]: (out): The total collated size of this variable
  • ndims: Variable dimensions
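
The three output arrays describe where this file's block sits inside the collated variable. As a sanity check, a caller could verify that the block fits inside the collated extent in every dimension. The sketch below is illustrative only; block_fits is a hypothetical helper and not part of read_chunked.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: the local block
 * [out_offset, out_offset + local_size) must lie inside
 * [0, total_size) in every dimension. */
bool block_fits(int ndims, const size_t out_offset[],
                const size_t local_size[], const size_t total_size[])
{
    for (int d = 0; d < ndims; ++d) {
        if (out_offset[d] > total_size[d]) return false;
        /* Written as a subtraction to avoid overflow in offset + size. */
        if (local_size[d] > total_size[d] - out_offset[d]) return false;
    }
    return true;
}
```

For example, a 50 x 360 block at offset (50, 0) fits a 100 x 360 collated variable, while a 51 x 360 block at the same offset does not.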

bool get_collated_dim_decomp(int ncid, const char *varname, int decomposition[4])

size_t get_collated_dim_len(int ncid, const char *varname)

Get the global length of a collated variable.

Return
the collated variable length
Parameters
  • ncid: NetCDF4 file handle
  • varname: Variable name

void copy_chunked(const char *filename, int async_writer_rank)

Copy all collated variables to the Writer.

The main function for Reader processes. Iterates over all collated variables in the file, choosing to send each variable in compressed (HDF5) or uncompressed (NetCDF4) mode to the Writer.

Parameters
  • filename: Input filename
  • async_writer_rank: MPI rank of the writer process