4 changes: 4 additions & 0 deletions CHANGELOG
@@ -3,6 +3,10 @@ If not stated, FINUFFT is assumed (old cuFINUFFT <=1.3 is listed separately).

v2.6.0-dev

* Added C-level simple API to cuFINUFFT
(`cufinufft{,f}{1,2,3}d{1,2,3}[many]`) mirroring the CPU `finufft1d1`-style
one-shot wrappers. 36 new entry points that allocate a plan, set points,
execute, and destroy in a single call; inputs are device pointers. (Barbone)
* Added `threadsafe_execute` regression test verifying concurrent `execute()`
calls on the same plan produce correct results. Added sanitizer mode selection
via `FINUFFT_USE_SANITIZERS=OFF|ON|MEMSAN|TSAN`, and extended the sanitizer
61 changes: 61 additions & 0 deletions docs/c_gpu.rst
@@ -19,6 +19,67 @@ You will also want to read the examples in ``examples/cuda`` and ``test/cuda/cuf
*Note*: The interface to cuFINUFFT has changed between versions 1.3 and 2.2.
Please see :ref:`Migration to cuFINUFFT v2.2<cufinufft_migration>` for details.

Simple interface
----------------

For users who only need to perform a single transform per plan, cuFINUFFT
provides one-shot wrappers that combine all four plan steps (make, setpts,
execute, destroy) into a single call. These mirror the CPU ``finufft1d1``
family (see :ref:`c`) and follow the same naming and argument order; the
only difference is that all input/output array pointers refer to memory on
the GPU. They have double (``cufinufft``) and single (``cufinufftf``)
precision versions, which we document together.

For each dimension ``d`` (1, 2, or 3) and transform type ``t`` (1, 2, or 3),
there are four entry points::

    cufinufft{d}d{t}(...);                 // double precision, single transform
    cufinufftf{d}d{t}(...);                // single precision, single transform
    cufinufft{d}d{t}many(n_transf, ...);   // double precision, many transforms
    cufinufftf{d}d{t}many(n_transf, ...);  // single precision, many transforms

This gives 36 entry points in total. The full prototypes are declared in
``include/cufinufft.h``. As an example, the 1D type-1 simple call in single
precision is:

.. code-block:: c

    int cufinufftf1d1(int64_t nj, const float *xj, const cuFloatComplex *cj,
                      int iflag, float eps, int64_t ms, cuFloatComplex *fk,
                      const cufinufft_opts *opts);

Inputs:

* ``nj`` — number of nonuniform points
* ``xj`` — length-``nj`` device array of NU point coordinates (in :math:`[-\pi,\pi)`,
values outside are folded)
* ``cj`` — length-``nj`` device array of NU strengths
* ``iflag`` — sign in the complex exponential (>=0 for +, <0 for -)
* ``eps`` — requested relative tolerance
* ``ms`` — number of Fourier modes
* ``fk`` — length-``ms`` device array, output
* ``opts`` — optional options pointer (NULL for defaults)

Returns ``0`` on success, otherwise an error code (see ``finufft_errors.h``).

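Using the simple interface, the four plan calls in the
:ref:`Getting started <c_gpu>` walkthrough below collapse into one line. Here
``M`` is the number of NU points, ``N`` the number of modes, and ``d_x``,
``d_c``, ``d_f`` are device arrays: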

.. code-block:: c

    cufinufftf1d1(M, d_x, d_c, 1, 1e-6f, N, d_f, NULL);

This single call replaces the ``makeplan``/``setpts``/``execute``/``destroy``
sequence. Type-3 calls also take frequency-target arrays (``s`` for 1D,
``s,t`` for 2D, ``s,t,u`` for 3D), exactly as in the CPU API. The ``many``
variants take an ``n_transf`` parameter as the first argument and use the
same set of NU points for ``n_transf`` consecutive transforms (with input
and output arrays sized accordingly).
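
As a rough guide to "sized accordingly" for the type-1 and type-2 ``many``
variants, the element counts can be computed as in the sketch below. It
assumes the stacked layout of the CPU ``many`` API, in which the data for
transform ``i`` follows the data for transform ``i-1`` contiguously. The
struct and function names are hypothetical and do not exist in cuFINUFFT.

```c
#include <stdint.h>

/* Hypothetical helper: element counts (not bytes) for the cj and fk device
   arrays passed to a type-1 or type-2 "many" call.
   ASSUMPTION: transforms are stacked contiguously, as in the CPU many API. */
typedef struct {
  int64_t n_strengths; /* complex elements in cj: n_transf * nj */
  int64_t n_modes;     /* complex elements in fk: n_transf * (modes per transform) */
} many_sizes;

many_sizes many_sizes_type12(int n_transf, int64_t nj, int dim,
                             int64_t ms, int64_t mt, int64_t mu)
{
  int64_t modes = ms;        /* 1D: ms modes per transform */
  if (dim >= 2) modes *= mt; /* 2D: ms * mt */
  if (dim >= 3) modes *= mu; /* 3D: ms * mt * mu */
  many_sizes s;
  s.n_strengths = (int64_t)n_transf * nj;
  s.n_modes = (int64_t)n_transf * modes;
  return s;
}
```

For example, under this assumption a call to ``cufinufftf2d1many`` with
``n_transf = 4``, ``nj = 1000``, ``ms = 32`` and ``mt = 16`` would need
``cj`` to hold 4000 and ``fk`` to hold 2048 ``cuFloatComplex`` elements. The
coordinate arrays ``xj`` and ``yj`` stay at length ``nj`` because the NU
points are shared by all transforms.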

The simple interface is intended for cases where the plan would otherwise be
used exactly once. When the same NU points are reused across several
transforms, use the 4-step plan interface below for better performance.

Getting started
---------------

148 changes: 148 additions & 0 deletions include/cufinufft.h
@@ -36,6 +36,154 @@ FINUFFT_EXPORT int cufinufftf_execute(cufinufftf_plan d_plan, cuFloatComplex *d_

FINUFFT_EXPORT int cufinufft_destroy(cufinufft_plan d_plan);
FINUFFT_EXPORT int cufinufftf_destroy(cufinufftf_plan d_plan);

// Simple (one-shot) interfaces. Pointers are device pointers. Behavior matches
// the 4-step plan API above. 36 entry points (3 dims x 3 types x {single,many}
// x {double,float}).

// Dimension 1111111111111111111111111111111111111111111111111111111111111111
FINUFFT_EXPORT int cufinufft1d1many(
int n_transf, int64_t nj, const double *xj, const cuDoubleComplex *cj, int iflag,
double eps, int64_t ms, cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf1d1many(
int n_transf, int64_t nj, const float *xj, const cuFloatComplex *cj, int iflag,
float eps, int64_t ms, cuFloatComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufft1d1(int64_t nj, const double *xj, const cuDoubleComplex *cj,
int iflag, double eps, int64_t ms, cuDoubleComplex *fk,
const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf1d1(int64_t nj, const float *xj, const cuFloatComplex *cj,
int iflag, float eps, int64_t ms, cuFloatComplex *fk,
const cufinufft_opts *opts);

FINUFFT_EXPORT int cufinufft1d2many(
int n_transf, int64_t nj, const double *xj, cuDoubleComplex *cj, int iflag,
double eps, int64_t ms, const cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf1d2many(
int n_transf, int64_t nj, const float *xj, cuFloatComplex *cj, int iflag, float eps,
int64_t ms, const cuFloatComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufft1d2(int64_t nj, const double *xj, cuDoubleComplex *cj,
int iflag, double eps, int64_t ms,
const cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf1d2(int64_t nj, const float *xj, cuFloatComplex *cj,
int iflag, float eps, int64_t ms,
const cuFloatComplex *fk, const cufinufft_opts *opts);

FINUFFT_EXPORT int cufinufft1d3many(int n_transf, int64_t nj, const double *xj,
const cuDoubleComplex *cj, int iflag, double eps,
int64_t nk, const double *s, cuDoubleComplex *fk,
const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf1d3many(int n_transf, int64_t nj, const float *xj,
const cuFloatComplex *cj, int iflag, float eps,
int64_t nk, const float *s, cuFloatComplex *fk,
const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufft1d3(int64_t nj, const double *xj, const cuDoubleComplex *cj,
int iflag, double eps, int64_t nk, const double *s,
cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf1d3(int64_t nj, const float *xj, const cuFloatComplex *cj,
int iflag, float eps, int64_t nk, const float *s,
cuFloatComplex *fk, const cufinufft_opts *opts);

// Dimension 22222222222222222222222222222222222222222222222222222222222222222
FINUFFT_EXPORT int cufinufft2d1many(int n_transf, int64_t nj, const double *xj,
const double *yj, const cuDoubleComplex *cj,
int iflag, double eps, int64_t ms, int64_t mt,
cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf2d1many(int n_transf, int64_t nj, const float *xj,
const float *yj, const cuFloatComplex *cj, int iflag,
float eps, int64_t ms, int64_t mt,
cuFloatComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufft2d1(
int64_t nj, const double *xj, const double *yj, const cuDoubleComplex *cj, int iflag,
double eps, int64_t ms, int64_t mt, cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf2d1(
int64_t nj, const float *xj, const float *yj, const cuFloatComplex *cj, int iflag,
float eps, int64_t ms, int64_t mt, cuFloatComplex *fk, const cufinufft_opts *opts);

FINUFFT_EXPORT int cufinufft2d2many(
int n_transf, int64_t nj, const double *xj, const double *yj, cuDoubleComplex *cj,
int iflag, double eps, int64_t ms, int64_t mt, const cuDoubleComplex *fk,
const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf2d2many(
int n_transf, int64_t nj, const float *xj, const float *yj, cuFloatComplex *cj,
int iflag, float eps, int64_t ms, int64_t mt, const cuFloatComplex *fk,
const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufft2d2(int64_t nj, const double *xj, const double *yj,
cuDoubleComplex *cj, int iflag, double eps, int64_t ms,
int64_t mt, const cuDoubleComplex *fk,
const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf2d2(int64_t nj, const float *xj, const float *yj,
cuFloatComplex *cj, int iflag, float eps, int64_t ms,
int64_t mt, const cuFloatComplex *fk,
const cufinufft_opts *opts);

FINUFFT_EXPORT int cufinufft2d3many(
int n_transf, int64_t nj, const double *xj, const double *yj,
const cuDoubleComplex *cj, int iflag, double eps, int64_t nk, const double *s,
const double *t, cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf2d3many(
int n_transf, int64_t nj, const float *xj, const float *yj, const cuFloatComplex *cj,
int iflag, float eps, int64_t nk, const float *s, const float *t, cuFloatComplex *fk,
const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufft2d3(int64_t nj, const double *xj, const double *yj,
const cuDoubleComplex *cj, int iflag, double eps,
int64_t nk, const double *s, const double *t,
cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf2d3(int64_t nj, const float *xj, const float *yj,
const cuFloatComplex *cj, int iflag, float eps,
int64_t nk, const float *s, const float *t,
cuFloatComplex *fk, const cufinufft_opts *opts);

// Dimension 3333333333333333333333333333333333333333333333333333333333333333
FINUFFT_EXPORT int cufinufft3d1many(
int n_transf, int64_t nj, const double *xj, const double *yj, const double *zj,
const cuDoubleComplex *cj, int iflag, double eps, int64_t ms, int64_t mt, int64_t mu,
cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf3d1many(
int n_transf, int64_t nj, const float *xj, const float *yj, const float *zj,
const cuFloatComplex *cj, int iflag, float eps, int64_t ms, int64_t mt, int64_t mu,
cuFloatComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufft3d1(int64_t nj, const double *xj, const double *yj,
const double *zj, const cuDoubleComplex *cj, int iflag,
double eps, int64_t ms, int64_t mt, int64_t mu,
cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf3d1(int64_t nj, const float *xj, const float *yj,
const float *zj, const cuFloatComplex *cj, int iflag,
float eps, int64_t ms, int64_t mt, int64_t mu,
cuFloatComplex *fk, const cufinufft_opts *opts);

FINUFFT_EXPORT int cufinufft3d2many(
int n_transf, int64_t nj, const double *xj, const double *yj, const double *zj,
cuDoubleComplex *cj, int iflag, double eps, int64_t ms, int64_t mt, int64_t mu,
const cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf3d2many(
int n_transf, int64_t nj, const float *xj, const float *yj, const float *zj,
cuFloatComplex *cj, int iflag, float eps, int64_t ms, int64_t mt, int64_t mu,
const cuFloatComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufft3d2(int64_t nj, const double *xj, const double *yj,
const double *zj, cuDoubleComplex *cj, int iflag,
double eps, int64_t ms, int64_t mt, int64_t mu,
const cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf3d2(int64_t nj, const float *xj, const float *yj,
const float *zj, cuFloatComplex *cj, int iflag,
float eps, int64_t ms, int64_t mt, int64_t mu,
const cuFloatComplex *fk, const cufinufft_opts *opts);

FINUFFT_EXPORT int cufinufft3d3many(
int n_transf, int64_t nj, const double *xj, const double *yj, const double *zj,
const cuDoubleComplex *cj, int iflag, double eps, int64_t nk, const double *s,
const double *t, const double *u, cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf3d3many(
int n_transf, int64_t nj, const float *xj, const float *yj, const float *zj,
const cuFloatComplex *cj, int iflag, float eps, int64_t nk, const float *s,
const float *t, const float *u, cuFloatComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufft3d3(
int64_t nj, const double *xj, const double *yj, const double *zj,
const cuDoubleComplex *cj, int iflag, double eps, int64_t nk, const double *s,
const double *t, const double *u, cuDoubleComplex *fk, const cufinufft_opts *opts);
FINUFFT_EXPORT int cufinufftf3d3(
int64_t nj, const float *xj, const float *yj, const float *zj,
const cuFloatComplex *cj, int iflag, float eps, int64_t nk, const float *s,
const float *t, const float *u, cuFloatComplex *fk, const cufinufft_opts *opts);
#ifdef __cplusplus
}
#endif