Details of our parallelization strategy are given in [8],
so we only describe it briefly here.
For JTPACK90, PGSLIB provides global reduction (e.g., dot products)
and gather/scatter functionality. The latter is used for the
indirect addressing inherent in forming matrix-vector products
using sparse storage for matrices.
Recall the matrix-vector kernel shown previously
for a matrix stored in ELL format. Using
PGSLIB, this kernel becomes:
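
    ! y is a temporary that receives the gathered entries of x
    y = zero
    ! entries with ja_pe == 0 are masked out of the gather
    call PGSLib_gather (y, x_pe, ja_pe, ja, trace, mask=(ja_pe /= 0))
    ! row-wise sum of products gives the local segment of the result
    y_pe = SUM(a_pe*y, dim=2)
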
where the suffix _pe in a variable name denotes the segment of the
array local to a particular processor, and trace is a variable of a
PGSLIB-defined type holding information such as how the data is distributed.
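
To make the data flow of this kernel concrete, the following is a minimal serial sketch of the same three steps on a single processor, where each _pe segment is simply the whole array. It does not call PGSLIB: the gather is emulated with explicit loops, so the ja and trace arguments do not appear. The program name, the sizes n and maxnz, and the 3-by-3 test matrix are illustrative assumptions, as is the convention that a column index of 0 marks a padding entry (suggested by the mask in the kernel above).

```fortran
program ell_gather_sketch
  implicit none
  ! Hypothetical sizes: n rows, at most maxnz nonzeros per row.
  integer, parameter :: n = 3, maxnz = 2
  real    :: a_pe(n, maxnz), x_pe(n), y(n, maxnz), y_pe(n)
  integer :: ja_pe(n, maxnz)
  integer :: i, k

  ! Illustrative matrix [[2,1,0],[0,3,0],[4,0,5]] in ELL format.
  ! Assumption: a column index of 0 marks a padding entry.
  a_pe(1,:) = (/ 2.0, 1.0 /);  ja_pe(1,:) = (/ 1, 2 /)
  a_pe(2,:) = (/ 3.0, 0.0 /);  ja_pe(2,:) = (/ 2, 0 /)
  a_pe(3,:) = (/ 4.0, 5.0 /);  ja_pe(3,:) = (/ 1, 3 /)
  x_pe = (/ 1.0, 2.0, 3.0 /)

  ! Serial stand-in for the gather: y(i,k) = x_pe(ja_pe(i,k)),
  ! skipping the masked (padding) entries, which stay at zero.
  y = 0.0
  do k = 1, maxnz
    do i = 1, n
      if (ja_pe(i,k) /= 0) y(i,k) = x_pe(ja_pe(i,k))
    end do
  end do

  ! Row-wise sum of elementwise products gives the matrix-vector product.
  y_pe = sum(a_pe*y, dim=2)

  ! Self-check against the product worked out by hand: (4, 6, 19).
  if (any(abs(y_pe - (/ 4.0, 6.0, 19.0 /)) > 1.0e-6)) stop 1
  print *, y_pe
end program ell_gather_sketch
```

In the parallel setting, some indices in ja_pe refer to entries of x held by other processors, which is why the loop above is replaced by a library gather driven by the distribution information in trace.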