125 changes: 120 additions & 5 deletions doc/source/reference/c-api/generalized-ufuncs.rst
@@ -17,7 +17,7 @@ what the "core" dimensionality of the inputs is, as well as the
corresponding dimensionality of the outputs (the element-wise ufuncs
have zero core dimensions). The list of the core dimensions for all
arguments is called the "signature" of a ufunc. For example, the
ufunc numpy.add has signature ``(),()->()`` defining two scalar inputs
ufunc ``numpy.add`` has signature ``(),()->()`` defining two scalar inputs
and one scalar output.

Another example is the function ``inner1d(a, b)`` with a signature of
@@ -57,10 +57,12 @@ taken when calling such a function. An example would be the function
``euclidean_pdist(a)``, with signature ``(n,d)->(p)``, that given an array of
``n`` ``d``-dimensional vectors, computes all unique pairwise Euclidean
distances among them. The output dimension ``p`` must therefore be equal to
``n * (n - 1) / 2``, but it is the caller's responsibility to pass in an
output array of the right size. If the size of a core dimension of an output
``n * (n - 1) / 2``, but by default, it is the caller's responsibility to pass
in an output array of the right size. If the size of a core dimension of an output
cannot be determined from a passed in input or output array, an error will be
raised.
raised. This can be changed by defining a ``PyUFunc_ProcessCoreDimsFunc`` function
and assigning it to the ``process_core_dims_func`` field of the ``PyUFuncObject``
structure. See below for more details.

Note: Prior to NumPy 1.10.0, less strict checks were in place: missing core
dimensions were created by prepending 1's to the shape as necessary, core
@@ -77,7 +79,7 @@ Elementary Function
(e.g. adding two numbers is the most basic operation in adding two
arrays). The ufunc applies the elementary function multiple times
on different parts of the arrays. The input/output of elementary
functions can be vectors; e.g., the elementary function of inner1d
functions can be vectors; e.g., the elementary function of ``inner1d``
takes two vectors as input.

Signature
@@ -214,3 +216,116 @@ input/output arrays ``a``, ``b``, ``c``. Furthermore, ``dimensions`` will be
``[N, I, J]`` to define the size of ``N`` of the loop and the sizes ``I`` and ``J``
for the core dimensions ``i`` and ``j``. Finally, ``steps`` will be
``[a_N, b_N, c_N, a_i, a_j, b_i]``, containing all necessary strides.
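
To make the layout of ``dimensions`` and ``steps`` concrete, here is a minimal sketch of an elementary loop that consumes them. It assumes the signature under discussion is ``(i,j),(i)->()``, which is consistent with the stride list above but is not shown in this excerpt. The function name ``weighted_sum_loop``, the operation it computes, and the ``intp_t`` typedef (a stand-in for ``npy_intp`` so the sketch compiles without the NumPy headers) are all hypothetical.

```c
#include <stddef.h>

/* Stand-in for npy_intp so this sketch builds without NumPy headers. */
typedef ptrdiff_t intp_t;

/*
 * Hypothetical elementary loop for an assumed signature (i,j),(i)->():
 * for each outer iteration, c = sum over i and j of a[i, j] * b[i].
 */
static void
weighted_sum_loop(char **args, const intp_t *dimensions, const intp_t *steps)
{
    char *a = args[0];
    char *b = args[1];
    char *c = args[2];
    /* dimensions = [N, I, J] */
    intp_t N = dimensions[0];
    intp_t I = dimensions[1];
    intp_t J = dimensions[2];
    /* steps = [a_N, b_N, c_N, a_i, a_j, b_i] */
    intp_t a_N = steps[0], b_N = steps[1], c_N = steps[2];
    intp_t a_i = steps[3], a_j = steps[4], b_i = steps[5];

    for (intp_t n = 0; n < N; ++n, a += a_N, b += b_N, c += c_N) {
        double total = 0.0;
        for (intp_t i = 0; i < I; ++i) {
            double b_val = *(double *)(b + i * b_i);
            for (intp_t j = 0; j < J; ++j) {
                total += *(double *)(a + i * a_i + j * a_j) * b_val;
            }
        }
        *(double *)c = total;
    }
}
```

All strides are byte offsets, which is why the pointers are kept as ``char *`` and cast only at the point of access.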

Customizing core dimension size processing
------------------------------------------

The optional function of type ``PyUFunc_ProcessCoreDimsFunc``, stored
on the ``process_core_dims_func`` attribute of the ufunc, provides the
author of the ufunc a "hook" into the processing of the core dimensions
of the arrays that were passed to the ufunc. The two primary uses of
this "hook" are:

* Check that constraints on the core dimensions required
  by the ufunc are satisfied (and set an exception if they are not).
* Compute output shapes for any output core dimensions that were not
  determined by the input arrays.

As an example of the first use, consider the generalized ufunc ``minmax``
with signature ``(n)->(2)`` that simultaneously computes the minimum and
maximum of a sequence. It should require that ``n > 0``, because
the minimum and maximum of a sequence with length 0 are not meaningful.
In this case, the ufunc author might define the function like this:

.. code-block:: c

    int minmax_process_core_dims(PyUFuncObject *ufunc,
                                 npy_intp *core_dim_sizes)
    {
        npy_intp n = core_dim_sizes[0];
        if (n == 0) {
            PyErr_SetString(PyExc_ValueError,
                            "minmax requires the core dimension "
                            "to be at least 1.");
            return -1;
        }
        return 0;
    }

In this case, the length of the array ``core_dim_sizes`` will be 2.
The second value in the array will always be 2, so there is no need
for the function to inspect it. The core dimension ``n`` is stored
in the first element. The function sets an exception and returns -1
if it finds that ``n`` is 0.
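
The reason for rejecting ``n == 0`` is easier to see next to an elementary loop. The following is only a sketch of what such a loop might look like for ``(n)->(2)``, not code from NumPy: the name ``minmax_double_loop`` and the ``intp_t`` typedef (standing in for ``npy_intp``) are assumptions. The loop seeds both results from the first element, so it relies on the "hook" having already guaranteed that at least one element exists.

```c
#include <stddef.h>

/* Stand-in for npy_intp so this sketch builds without NumPy headers. */
typedef ptrdiff_t intp_t;

/*
 * Hypothetical elementary loop for signature (n)->(2).
 * dimensions = [N, n, 2]; steps = [in_N, out_N, in_n, out_2].
 * Assumes n >= 1, which the core-dims hook is expected to enforce.
 */
static void
minmax_double_loop(char **args, const intp_t *dimensions, const intp_t *steps)
{
    char *in = args[0];
    char *out = args[1];
    intp_t nloops = dimensions[0];
    intp_t n = dimensions[1];
    intp_t in_step = steps[0], out_step = steps[1];
    intp_t in_core = steps[2], out_core = steps[3];

    for (intp_t loop = 0; loop < nloops;
         ++loop, in += in_step, out += out_step) {
        /* Seed from element 0; this read is invalid when n == 0. */
        double lo = *(double *)in;
        double hi = lo;
        for (intp_t i = 1; i < n; ++i) {
            double v = *(double *)(in + i * in_core);
            if (v < lo) {
                lo = v;
            }
            if (v > hi) {
                hi = v;
            }
        }
        *(double *)out = lo;
        *(double *)(out + out_core) = hi;
    }
}
```

Doing the check once in the hook keeps the inner loop free of per-iteration validation.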

The second use for the "hook" is to compute the size of output arrays
when the output arrays are not provided by the caller and one or more
core dimensions of the output are not also input core dimensions.
If the ufunc does not have a function defined on the
``process_core_dims_func`` attribute, an unspecified output core
dimension size will result in an exception being raised. With the
"hook" provided by ``process_core_dims_func``, the author of the ufunc
can set the output size to whatever is appropriate for the ufunc.

In the array passed to the "hook" function, core dimensions that
were not determined by the input are indicated by the value -1
in the ``core_dim_sizes`` array. The function can replace the -1 with
whatever value is appropriate for the ufunc, based on the core dimensions
that occurred in the input arrays.

.. warning::
   The function must never change a value in ``core_dim_sizes`` that
   is not -1 on input. Changing a value that was not -1 will generally
   result in incorrect output from the ufunc, and could result in the
   Python interpreter crashing.
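
The -1 convention can be illustrated with the ``euclidean_pdist`` example mentioned earlier, whose signature ``(n,d)->(p)`` requires ``p == n * (n - 1) / 2``. The sketch below models only the decision logic and is deliberately independent of the Python and NumPy headers: the function name, the ``intp_t`` typedef (standing in for ``npy_intp``), and the ``error_message`` out-parameter are assumptions made for illustration. A real "hook" would take the ``PyUFuncObject *`` argument and set a Python exception instead of writing to ``error_message``.

```c
#include <stddef.h>

/* Stand-in for npy_intp so this sketch builds without NumPy headers. */
typedef ptrdiff_t intp_t;

/*
 * Hypothetical model of a core-dims hook for signature (n,d)->(p).
 * core_dim_sizes holds [n, d, p]; p is -1 if no output was supplied.
 * Returns 0 on success, -1 on failure (with *error_message set).
 */
static int
pdist_process_core_dims_sketch(intp_t *core_dim_sizes,
                               const char **error_message)
{
    intp_t n = core_dim_sizes[0];
    intp_t p = core_dim_sizes[2];
    intp_t required_p = n * (n - 1) / 2;

    if (p == -1) {
        /* Only a -1 entry may be overwritten. */
        core_dim_sizes[2] = required_p;
        return 0;
    }
    if (p != required_p) {
        /* A caller-supplied size is validated, never modified. */
        *error_message = "core dimension p must equal n * (n - 1) / 2";
        return -1;
    }
    return 0;
}
```

Note that the entries for ``n`` and ``d`` are only read, and ``p`` is written only when it arrives as -1.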

For example, consider the generalized ufunc ``conv1d`` for which
the elementary function computes the "full" convolution of two
one-dimensional arrays ``x`` and ``y`` with lengths ``m`` and ``n``,
respectively. The output of this convolution has length ``m + n - 1``.
To implement this as a generalized ufunc, the signature is set to
``(m),(n)->(p)``, and in the "hook" function, if the core dimension
``p`` is found to be -1, it is replaced with ``m + n - 1``. If ``p``
is *not* -1, it must be verified that the given value equals ``m + n - 1``.
If it does not, the function must set an exception and return -1.
For a meaningful result, the operation also requires that ``m + n``
is at least 1, i.e., the two inputs cannot both have length 0.

Here's how that might look in code:

.. code-block:: c

    int conv1d_process_core_dims(PyUFuncObject *ufunc,
                                 npy_intp *core_dim_sizes)
    {
        // core_dim_sizes will hold the core dimensions [m, n, p].
        // p will be -1 if the caller did not provide the out argument.
        npy_intp m = core_dim_sizes[0];
        npy_intp n = core_dim_sizes[1];
        npy_intp p = core_dim_sizes[2];
        npy_intp required_p = m + n - 1;

        if (m == 0 && n == 0) {
            // Disallow both inputs having length 0.
            PyErr_SetString(PyExc_ValueError,
                "conv1d: both inputs have core dimension 0; the function "
                "requires that at least one input has size greater than 0.");
            return -1;
        }
        if (p == -1) {
            // Output array was not given in the call of the ufunc.
            // Set the correct output size here.
            core_dim_sizes[2] = required_p;
            return 0;
        }
        // An output array *was* given.  Validate its core dimension.
        if (p != required_p) {
            PyErr_Format(PyExc_ValueError,
                "conv1d: the core dimension p of the out parameter "
                "does not equal m + n - 1, where m and n are the "
                "core dimensions of the inputs x and y; got m=%zd "
                "and n=%zd so p must be %zd, but got p=%zd.",
                m, n, required_p, p);
            return -1;
        }
        return 0;
    }
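
To show why the output length is ``m + n - 1``, here is a small sketch of the "full" convolution itself on plain contiguous arrays. It is an illustration under simplifying assumptions (no strides, no outer loop), and the name ``conv1d_full_sketch`` is hypothetical. For each output index ``k``, the inner loop visits only the indices ``i`` for which both ``x[i]`` and ``y[k - i]`` are in range.

```c
#include <stddef.h>

/*
 * Hypothetical sketch: "full" convolution of x (length m) and
 * y (length n) into out, which must have room for m + n - 1 values.
 */
static void
conv1d_full_sketch(const double *x, ptrdiff_t m,
                   const double *y, ptrdiff_t n,
                   double *out)
{
    ptrdiff_t p = m + n - 1;
    for (ptrdiff_t k = 0; k < p; ++k) {
        /* Valid i satisfies 0 <= i < m and 0 <= k - i < n. */
        ptrdiff_t start = (k - n + 1 > 0) ? (k - n + 1) : 0;
        ptrdiff_t stop = (k + 1 < m) ? (k + 1) : m;
        double sum = 0.0;
        for (ptrdiff_t i = start; i < stop; ++i) {
            sum += x[i] * y[k - i];
        }
        out[k] = sum;
    }
}
```

The last output index with any contributing term is ``k = (m - 1) + (n - 1)``, which gives ``m + n - 1`` outputs in total.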

What a really nice & clear addition to the documentation!

3 changes: 2 additions & 1 deletion numpy/_core/code_generators/cversions.txt
@@ -74,5 +74,6 @@
0x00000011 = ca1aebdad799358149567d9d93cbca09

# Version 18 (NumPy 2.0.0)
# Version 18 (NumPy 2.1.0) No change
0x00000012 = 2b8f1f4da822491ff030b2b37dff07e3
# Version 19 (NumPy 2.1.0) Only header additions
0x00000013 = 2b8f1f4da822491ff030b2b37dff07e3
3 changes: 3 additions & 0 deletions numpy/_core/include/numpy/numpyconfig.h
@@ -81,6 +81,7 @@
#define NPY_1_24_API_VERSION 0x00000010
#define NPY_1_25_API_VERSION 0x00000011
#define NPY_2_0_API_VERSION 0x00000012
#define NPY_2_1_API_VERSION 0x00000013


/*
@@ -160,6 +161,8 @@
#define NPY_FEATURE_VERSION_STRING "1.25"
#elif NPY_FEATURE_VERSION == NPY_2_0_API_VERSION
#define NPY_FEATURE_VERSION_STRING "2.0"
#elif NPY_FEATURE_VERSION == NPY_2_1_API_VERSION
#define NPY_FEATURE_VERSION_STRING "2.1"
#else
#error "Missing version string define for new NumPy version."
#endif
39 changes: 39 additions & 0 deletions numpy/_core/include/numpy/ufuncobject.h
@@ -65,6 +65,39 @@ typedef int (PyUFunc_TypeResolutionFunc)(
PyObject *type_tup,
PyArray_Descr **out_dtypes);

/*
* This is the signature for the functions that may be assigned to the
* `process_core_dims_func` field of the PyUFuncObject structure.
* Implementation of this function is optional. This function is only used
* by generalized ufuncs (i.e. those with the field `core_enabled` set to 1).
* The function is called by the ufunc during the processing of the arguments
* of a call of the ufunc. The function can check the core dimensions of the
* input and output arrays and return -1 with an exception set if any
* requirements are not satisfied. If the caller of the ufunc didn't provide
* output arrays, the core dimensions associated with the output arrays (i.e.
* those that are not also used in input arrays) will have the value -1 in
* `core_dim_sizes`. This function can replace any output core dimensions
* that are -1 with a value that is appropriate for the ufunc.
*
* Parameter Description
* --------------- ------------------------------------------------------
* ufunc The ufunc object
* core_dim_sizes An array with length `ufunc->core_num_dim_ix`.
* The core dimensions of the arrays passed to the ufunc
* will have been set. If the caller of the ufunc didn't
* provide the output array(s), the output-only core
* dimensions will have the value -1.
*
* The function must not change any element in `core_dim_sizes` that is
* not -1 on input. Doing so will result in incorrect output from the
* ufunc, and could result in a crash of the Python interpreter.
*
* The function must return 0 on success, -1 on failure (with an exception
* set).
*/
typedef int (PyUFunc_ProcessCoreDimsFunc)(
struct _tagPyUFuncObject *ufunc,
npy_intp *core_dim_sizes);

typedef struct _tagPyUFuncObject {
PyObject_HEAD
@@ -191,6 +224,12 @@ typedef struct _tagPyUFuncObject {
/* A PyListObject of `(tuple of DTypes, ArrayMethod/Promoter)` */
PyObject *_loops;
#endif
#if NPY_FEATURE_VERSION >= NPY_2_1_API_VERSION

Just noticed, and noting it so we don't forget: NPY_2_1_API_VERSION isn't defined yet, I think. But don't worry about it, I can just do it (maybe even later today) since everything else should be good.

/*
* Optional function to process core dimensions of a gufunc.
*/
PyUFunc_ProcessCoreDimsFunc *process_core_dims_func;
#endif
} PyUFuncObject;

#include "arrayobject.h"
3 changes: 2 additions & 1 deletion numpy/_core/meson.build
@@ -47,7 +47,8 @@ C_ABI_VERSION = '0x02000000'
# 0x00000010 - 1.24.x
# 0x00000011 - 1.25.x
# 0x00000012 - 2.0.x
C_API_VERSION = '0x00000012'
# 0x00000013 - 2.1.x
C_API_VERSION = '0x00000013'

# Check whether we have a mismatch between the set C API VERSION and the
# actual C API VERSION. Will raise a MismatchCAPIError if so.
118 changes: 117 additions & 1 deletion numpy/_core/src/umath/_umath_tests.c.src
@@ -13,7 +13,7 @@
#undef NPY_INTERNAL_BUILD
#endif
// for add_INT32_negative_indexed
#define NPY_TARGET_VERSION NPY_2_0_API_VERSION
#define NPY_TARGET_VERSION NPY_2_1_API_VERSION
#include "numpy/arrayobject.h"
#include "numpy/ufuncobject.h"
#include "numpy/ndarrayobject.h"
@@ -761,6 +761,95 @@ add_INT32_negative_indexed(PyObject *module, PyObject *dict) {
return 0;
}

// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// Define the gufunc 'conv1d_full'
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#define MAX(a, b) (((a) < (b)) ? (b) : (a))

int conv1d_full_process_core_dims(PyUFuncObject *ufunc,
npy_intp *core_dim_sizes)
{
//
// core_dim_sizes will hold the core dimensions [m, n, p].
// p will be -1 if the caller did not provide the out argument.
//
npy_intp m = core_dim_sizes[0];
npy_intp n = core_dim_sizes[1];
npy_intp p = core_dim_sizes[2];
npy_intp required_p = m + n - 1;

if (m == 0 && n == 0) {
PyErr_SetString(PyExc_ValueError,
"conv1d_full: both inputs have core dimension 0; the function "
"requires that at least one input has positive size.");
return -1;
}
if (p == -1) {
core_dim_sizes[2] = required_p;
return 0;
}
if (p != required_p) {
PyErr_Format(PyExc_ValueError,
"conv1d_full: the core dimension p of the out parameter "
"does not equal m + n - 1, where m and n are the core "
"dimensions of the inputs x and y; got m=%zd and n=%zd so "
"p must be %zd, but got p=%zd.",
m, n, required_p, p);
return -1;
}
return 0;
}

static void
conv1d_full_double_loop(char **args,
npy_intp const *dimensions,
npy_intp const *steps,
void *NPY_UNUSED(func))
{
// Input and output arrays
char *p_x = args[0];
char *p_y = args[1];
char *p_out = args[2];
// Number of loops of conv1d calculations to execute.
npy_intp nloops = dimensions[0];
// Core dimensions
npy_intp m = dimensions[1];
npy_intp n = dimensions[2];
npy_intp p = dimensions[3]; // Must be m + n - 1.
// Outer loop strides (one per argument)
npy_intp x_stride = steps[0];
npy_intp y_stride = steps[1];
npy_intp out_stride = steps[2];
// Core dimension strides (within each 1-d array)
npy_intp x_inner_stride = steps[3];
npy_intp y_inner_stride = steps[4];
npy_intp out_inner_stride = steps[5];

for (npy_intp loop = 0; loop < nloops; ++loop, p_x += x_stride,
p_y += y_stride,
p_out += out_stride) {
// Basic implementation of 1d convolution
for (npy_intp k = 0; k < p; ++k) {
double sum = 0.0;
for (npy_intp i = MAX(0, k - n + 1); i < MIN(m, k + 1); ++i) {
double x_i = *(double *)(p_x + i*x_inner_stride);
double y_k_minus_i = *(double *)(p_y + (k - i)*y_inner_stride);
sum += x_i * y_k_minus_i;
}
*(double *)(p_out + k*out_inner_stride) = sum;
}
}
}

static PyUFuncGenericFunction conv1d_full_functions[] = {
(PyUFuncGenericFunction) &conv1d_full_double_loop
};
static void *const conv1d_full_data[] = {NULL};
static const char conv1d_full_typecodes[] = {NPY_DOUBLE, NPY_DOUBLE, NPY_DOUBLE};


static PyMethodDef UMath_TestsMethods[] = {
{"test_signature", UMath_Tests_test_signature, METH_VARARGS,
"Test signature parsing of ufunc. \n"
@@ -830,6 +919,33 @@ PyMODINIT_FUNC PyInit__umath_tests(void) {
return NULL;
}

// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// Define the gufunc 'conv1d_full'
// Shape signature is (m),(n)->(p) where p must be m + n - 1.
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

PyUFuncObject *gufunc = (PyUFuncObject *) PyUFunc_FromFuncAndDataAndSignature(
conv1d_full_functions,
conv1d_full_data,
conv1d_full_typecodes,
1, 2, 1, PyUFunc_None, "conv1d_full",
"convolution of x and y ('full' mode)",
0, "(m),(n)->(p)");
if (gufunc == NULL) {
Py_DECREF(m);
return NULL;
}
gufunc->process_core_dims_func = &conv1d_full_process_core_dims;

int status = PyModule_AddObject(m, "conv1d_full", (PyObject *) gufunc);
if (status == -1) {
Py_DECREF(gufunc);
Py_DECREF(m);
return NULL;
}

// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

#if Py_GIL_DISABLED
// signal this module supports running with the GIL disabled
PyUnstable_Module_SetGIL(m, Py_MOD_GIL_NOT_USED);