API
async.h
Contains the async Writer loop and functions that the Reader process can use to communicate with it
Reader side
-
type varid_t
Remote handle to a variable in the output file on the Writer process
-
varid_t open_variable_async(const char *varname, size_t len, int async_writer_rank)
Open a variable in the async writer.
- Parameters:
varname – Variable name
len – Length of varname, including the terminating '\0'
async_writer_rank – MPI rank of writer process
- Returns:
a handle to the variable in the output file
-
void variable_info_async(varid_t varid, size_t ndims, size_t chunk[], int *deflate, int *deflate_level, int *shuffle, int async_writer_rank)
Get info about a variable in the output file.
For direct chunk writes to work, the compression parameters must match between the input and output files. Use this function to query the parameters of the output file, then compare the results against the values returned by nc_inq_var_deflate() on the input file.
- Parameters:
varid – Variable handle obtained with open_variable_async
ndims – Number of dimensions
chunk[ndims] – (out): Chunk shape
deflate – (out): Compression enabled
deflate_level – (out): Compression level
shuffle – (out): Shuffle filter enabled
async_writer_rank – MPI rank of writer process
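The comparison described above can be sketched in plain C. This is a minimal illustration, not part of the library: var_filters_t and filters_match() are hypothetical names, and the struct is assumed to be filled from variable_info_async() for the output file and from the NetCDF inquiry calls for the input file. Treating the compression level as relevant only when compression is enabled is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the settings reported for one variable. */
typedef struct {
    size_t ndims;         /* number of dimensions */
    const size_t *chunk;  /* chunk shape, ndims entries */
    int deflate;          /* non-zero if compression is enabled */
    int deflate_level;    /* compression level */
    int shuffle;          /* non-zero if the shuffle filter is enabled */
} var_filters_t;

/* Return true when the chunk shape and filter settings of the input
 * and output variables agree. */
static bool filters_match(const var_filters_t *in, const var_filters_t *out)
{
    assert(in != NULL && out != NULL);

    if (in->ndims != out->ndims) return false;
    if ((in->deflate != 0) != (out->deflate != 0)) return false;
    if ((in->shuffle != 0) != (out->shuffle != 0)) return false;

    /* Assumption: the level is only compared when compression is on. */
    if (in->deflate != 0 && in->deflate_level != out->deflate_level) {
        return false;
    }

    for (size_t i = 0; i < in->ndims; ++i) {
        if (in->chunk[i] != out->chunk[i]) return false;
    }
    return true;
}
```

If a check of this kind fails, the data can still be sent with write_uncompressed_async(), which applies the output file's own filters.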
-
void write_uncompressed_async(varid_t varid, size_t ndims, const size_t offset[], const size_t shape[], const void *buffer, nc_type type, int async_writer_rank, MPI_Request *request)
Write data to the file, using the dataset filters.
This writes uncompressed data, as obtained by nc_get_vara() on the input file. It is slower than direct chunk writes, as the data must be put through a compression filter, but is more flexible as it can be used to write partial or unaligned chunks.
request must be passed to MPI_Wait for the message to complete.
- Parameters:
varid – Variable handle obtained with open_variable_async
ndims – Number of dimensions
offset[ndims] – Offset of this data’s origin in the collated dataset
shape[ndims] – Shape of this data array
buffer – Uncompressed data to write
type – NetCDF type of the data
async_writer_rank – MPI rank of writer process
request – (out): MPI request for the communication
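Because shape describes the data array held in buffer, the buffer is expected to contain one element per point of that shape. A minimal sketch of the element count follows; block_elements() is a hypothetical helper, not a library function, and it does not guard against overflow for very large shapes.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a block of the given shape.
 * A zero-dimensional (scalar) block holds exactly one element. */
static size_t block_elements(size_t ndims, const size_t shape[])
{
    assert(ndims == 0 || shape != NULL);

    size_t count = 1;
    for (size_t i = 0; i < ndims; ++i) {
        count *= shape[i];
    }
    return count;
}
```

Multiplying the result by the size in bytes of one element of type gives the expected length of buffer.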
-
void write_chunk_async(varid_t varid, size_t ndims, uint32_t filter_mask, const hsize_t offset[], size_t data_size, const void *buffer, int async_writer_rank, MPI_Request *request)
Write a compressed chunk directly to the file.
This writes compressed chunks, as obtained by opening the input file in HDF5 and calling H5DOread_chunk(). This is faster than copying uncompressed data, but the chunking and compression parameters must be identical on the input and output files, and the chunk must lie on a chunk boundary of the output file. variable_info_async() can be used to determine the chunk layout and compression settings of the variable in the output file.
request must be passed to MPI_Wait for the message to complete.
- Parameters:
varid – Variable handle obtained with open_variable_async
ndims – Number of dimensions
filter_mask – HDF5 filter information (must match the output file)
offset[ndims] – Offset of this chunk’s origin in the collated dataset (must be on a chunk boundary of the output file)
data_size – Size of the compressed chunk in bytes
buffer – Compressed chunk data
async_writer_rank – MPI rank of writer process
request – (out): MPI request for the communication
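The chunk-boundary requirement on offset can be checked before choosing a direct chunk write. The sketch below is an illustration only: on_chunk_boundary() is a hypothetical helper, and it uses size_t for the offsets so that it compiles without the HDF5 headers (the real function takes hsize_t offsets). The chunk shape is assumed to come from variable_info_async().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true when every component of offset is a whole multiple of
 * the output file's chunk size along that dimension. */
static bool on_chunk_boundary(size_t ndims, const size_t offset[],
                              const size_t chunk[])
{
    for (size_t i = 0; i < ndims; ++i) {
        assert(chunk[i] > 0); /* a chunk dimension of zero is invalid */
        if (offset[i] % chunk[i] != 0) {
            return false;
        }
    }
    return true;
}
```

A block that fails this check can still be written with write_uncompressed_async(), which accepts partial or unaligned chunks.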
-
void close_variable_async(varid_t varid, int async_writer_rank)
Close a variable in the async writer.
varid is no longer a valid handle after this call.
- Parameters:
varid – Variable handle obtained with open_variable_async
async_writer_rank – MPI rank of writer process
-
void close_async(int async_writer_rank)
Close the async writer.
- Parameters:
async_writer_rank – MPI rank of writer process
Writer side
-
size_t run_async_writer(const char *filename)
Async runner to accept writes.
Called by the Writer to accept async messages sent by the Readers. Returns once all Readers have sent close_async() messages.
- Parameters:
filename – Output filename
- Returns:
total size written
error.h
Functions for reporting errors from the various libraries used
-
NCERR(x)
NetCDF error handler.
If a NetCDF call has failed, reports the error and exits.
- Parameters:
x – The return code of a NetCDF library call
-
H5ERR(x)
HDF5 error handler.
If an HDF5 call has failed, reports the error and exits.
- Parameters:
x – The return code of an HDF5 library call
-
CERR(x, message)
C error handler.
If a C library call has failed, reports the error and exits.
- Parameters:
x – The return code of a C library call
message – Error message
-
void set_log_level(int level)
Set the output level for log messages.
- Parameters:
level – Messages with a level less than or equal to this will be output
-
void log_message(int level, const char *message, ...)
Send a message to the log.
- Parameters:
level – Message log level
message – Message (printf-like format string)
... – Message arguments
Available log levels are:
-
LOG_DEBUG
-
LOG_INFO
-
LOG_WARNING
-
LOG_ERROR
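The "less than or equal" rule documented for set_log_level() can be illustrated with a small stand-in. Everything here is hypothetical: the SKETCH_ names, the numeric values, and the default threshold are assumptions chosen so that more severe levels have smaller numbers. The real values of the LOG_ constants are not given in this reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed numbering: more severe levels are numerically smaller. */
enum {
    SKETCH_LOG_ERROR = 1,
    SKETCH_LOG_WARNING = 2,
    SKETCH_LOG_INFO = 3,
    SKETCH_LOG_DEBUG = 4
};

/* Current threshold; the default chosen here is arbitrary. */
static int sketch_threshold = SKETCH_LOG_WARNING;

/* Stand-in for set_log_level(): store the new threshold. */
static void sketch_set_log_level(int level)
{
    assert(level >= SKETCH_LOG_ERROR && level <= SKETCH_LOG_DEBUG);
    sketch_threshold = level;
}

/* A message is emitted when its level is less than or equal to the
 * threshold, matching the rule documented for set_log_level(). */
static bool sketch_should_log(int level)
{
    return level <= sketch_threshold;
}
```

Under these assumed values, a threshold of SKETCH_LOG_INFO lets errors, warnings and info messages through and suppresses debug messages.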
read_chunked.h
Functions the Readers use to read chunks from the input files and send them to the Writer
-
bool is_collated(int ncid, int varid)
Check if any of the dimensions of a variable are collated.
- Parameters:
ncid – NetCDF4 file handle
varid – NetCDF4 variable handle
- Returns:
true if any dimension of varid is collated, false otherwise
-
bool get_collation_info(int ncid, int varid, size_t out_offset[], size_t local_size[], size_t total_size[], int ndims)
Get collation info from a variable.
- Parameters:
ncid – NetCDF4 file handle
varid – NetCDF4 variable handle
out_offset[ndims] – (out): The offset in the collated array of this file’s data
local_size[ndims] – (out): This file’s data size
total_size[ndims] – (out): The total collated size of this variable
ndims – Number of dimensions of the variable
- Returns:
true if any of the dimensions are collated
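The three output arrays describe where this file's block sits inside the collated variable, so a caller may want to sanity-check them before sending data. The following is a minimal sketch under that reading; block_fits() is a hypothetical helper and not part of read_chunked.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true when the local block, placed at out_offset, lies
 * entirely inside the collated array along every dimension. */
static bool block_fits(size_t ndims, const size_t out_offset[],
                       const size_t local_size[],
                       const size_t total_size[])
{
    assert(ndims == 0 || (out_offset && local_size && total_size));

    for (size_t i = 0; i < ndims; ++i) {
        /* Compare by subtraction so that out_offset[i] + local_size[i]
         * cannot overflow. */
        if (out_offset[i] > total_size[i]) return false;
        if (local_size[i] > total_size[i] - out_offset[i]) return false;
    }
    return true;
}
```

For example, a file holding rows 50 to 99 of a 100-row variable would have an offset of 50, a local size of 50 and a total size of 100 along that dimension.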
-
bool get_collated_dim_decomp(int ncid, const char *varname, int decomposition[4])
-
size_t get_collated_dim_len(int ncid, const char *varname)
Get the global length of a collated variable.
- Parameters:
ncid – NetCDF4 file handle
varname – Variable name
- Returns:
Collated variable length
-
void copy_chunked(const char *filename, int async_writer_rank)
Copy all collated variables to the Writer.
The main function for Reader processes. Iterates over all collated variables in the file, choosing to send each variable in compressed (HDF5) or uncompressed (NetCDF4) mode to the Writer.
- Parameters:
filename – Input filename
async_writer_rank – MPI rank of the writer process