[FEAT] Adding vecdot implementation #86

Open
SwayamInSync wants to merge 1 commit into numpy:main from SwayamInSync:vecdot

@SwayamInSync
Member

Implementing NumPy's vecdot for quaddtype

AI declaration:
Claude was used to extend test_dot.py with tests for vecdot and possible edge cases.

@SwayamInSync
Member Author

This PR also adds some helpers and a new ufunc-registration helper function, which will be used to register matvec and vecmat, so I will open those PRs after this one is merged.

@SwayamInSync requested a review from ngoldbaum May 12, 2026 15:48
Member

@ngoldbaum left a comment

Overall looks good, a few comments below.

Comment thread src/csrc/umath/matmul.cpp
#ifndef DISABLE_QUADBLAS

PyType_Slot slots[] = {
{NPY_METH_resolve_descriptors, (void *)&quad_matmul_resolve_descriptors},
Member

Claude points out that quad_matmul_resolve_descriptors includes an error message that references matmul. It needs to be generalized to use the ufunc name rather than hardcoding it:

"QBLAS-accelerated matmul only supports SLEEF backend. "
"Please raise the issue at SwayamInSync/QBLAS for longdouble support");

Also, because resolve_descriptors errors out for non-SLEEF backends, longdouble will never actually reach the naive loops. This is a pre-existing issue, but it's copied here.
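
A minimal Python model of the control flow described above, not the actual C++ code: `resolve_descriptors`, `naive_loop`, `call_ufunc`, and `error_message` here are hypothetical stand-ins, and the backend is a plain string. It sketches how passing the ufunc name into the message avoids the hardcoded `matmul`, and why a non-SLEEF backend never reaches the naive loop as long as the resolver rejects it first.

```python
def resolve_descriptors(ufunc_name, backend):
    """Hypothetical stand-in for the shared resolver: rejects non-SLEEF backends."""
    if backend != "sleef":
        # The ufunc name is interpolated instead of hardcoding "matmul".
        raise NotImplementedError(
            f"QBLAS-accelerated {ufunc_name} only supports SLEEF backend. "
            "Please raise the issue at SwayamInSync/QBLAS for longdouble support"
        )
    return backend


def naive_loop(x, y):
    """Hypothetical fallback loop: plain multiply-accumulate."""
    total = 0.0
    for a, b in zip(x, y):
        total += a * b
    return total


def call_ufunc(ufunc_name, backend, x, y):
    """Resolution runs first, so the loop only ever sees an accepted backend."""
    resolve_descriptors(ufunc_name, backend)
    return naive_loop(x, y)


def error_message(ufunc_name, backend):
    """Return the error text for a rejected call, or None if the call succeeds."""
    try:
        call_ufunc(ufunc_name, backend, [1.0, 2.0], [3.0, 4.0])
    except NotImplementedError as exc:
        return str(exc)
    return None
```

In the C++ source the same effect would presumably come from formatting the message with the ufunc's name (for example via `PyErr_Format`), but how the name is made available to the resolver is left open in this thread.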

Comment thread src/csrc/umath/matmul.cpp
if (descr->backend != BACKEND_SLEEF) {
    PyErr_SetString(PyExc_NotImplementedError,
                    "QBLAS-accelerated vecdot only supports SLEEF backend.");
    return -1;
Member

this check is unnecessary because resolve_descriptors would have already triggered the same error.

Comment thread src/csrc/umath/matmul.cpp
Sleef_quad a_val, b_val;
memcpy(&a_val, x + k * x_n_stride, sizeof(Sleef_quad));
memcpy(&b_val, y + k * y_n_stride, sizeof(Sleef_quad));
sum = Sleef_fmaq1_u05(a_val, b_val, sum);
Member

unaligned matmul does a different thing and calls into qblas_dot:

    Sleef_quad *temp_A_buffer = new Sleef_quad[n];
    Sleef_quad *temp_B_buffer = new Sleef_quad[n];
    memcpy(temp_A_buffer, A_ptr, n * sizeof(Sleef_quad));
    memcpy(temp_B_buffer, B_ptr, n * sizeof(Sleef_quad));
    size_t incx = 1;
    size_t incy = 1;
    result = qblas_dot(n, temp_A_buffer, incx, temp_B_buffer, incy, C_ptr);
    delete[] temp_A_buffer;
    delete[] temp_B_buffer;
    break;
}

Why the difference? Claude seems to think the matmul approach has better numerical accuracy.
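
To illustrate why the two code paths might round differently, here is a small Python sketch using ordinary floats as a stand-in for `Sleef_quad`. The thread does not say how `qblas_dot` accumulates internally, so the pairwise variant below is only an assumed example of an alternative summation order. All function names are hypothetical. The sequential version mirrors the `sum = fma(a, b, sum)` loop from the diff, with the fused multiply-add emulated through exact rational arithmetic and a single final rounding.

```python
from fractions import Fraction


def fma_emulated(a, b, c):
    """Compute a*b + c exactly, then round once to the nearest float."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def dot_fma_sequential(x, y):
    """Left-to-right accumulation: one FMA per element, as in the diff."""
    total = 0.0
    for a, b in zip(x, y):
        total = fma_emulated(a, b, total)
    return total


def dot_pairwise(x, y):
    """Assumed alternative: split the range in half and add the partial sums."""
    n = len(x)
    if n == 0:
        return 0.0
    if n == 1:
        return x[0] * y[0]
    mid = n // 2
    return dot_pairwise(x[:mid], y[:mid]) + dot_pairwise(x[mid:], y[mid:])


def dot_exact(x, y):
    """Exact rational reference used to measure rounding error."""
    return sum((Fraction(a) * Fraction(b) for a, b in zip(x, y)), Fraction(0))


def abs_error(approx, exact):
    """Absolute error of a float result against the exact rational value."""
    return abs(Fraction(approx) - exact)
```

Comparing `abs_error(dot_fma_sequential(x, y), dot_exact(x, y))` with the pairwise figure on representative inputs is one way to check whether the accuracy claim holds for a given accumulation strategy.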

Comment thread tests/test_dot.py
x = create_quad_array([1, 2])
y = create_quad_array([1, 2, 3])
with pytest.raises(ValueError):
    np.vecdot(x, y)
Member

might be worth adding a test that does vecdot on an empty array, e.g. np.vecdot(create_quad_array([]), create_quad_array([])).
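
As a sketch of what such a test would presumably check, the pure-Python reference below models the expected behaviour on plain lists rather than quad arrays: an empty pair yields the additive identity, and mismatched lengths raise `ValueError` as in the existing test. `vecdot_reference` and `raises_value_error` are hypothetical helpers, and the assumption that the quad implementation returns zero for empty input is not confirmed in this thread.

```python
def vecdot_reference(x, y):
    """Hypothetical reference: dot product of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError(f"length mismatch: {len(x)} != {len(y)}")
    total = 0.0  # an empty sum is the additive identity
    for a, b in zip(x, y):
        total += a * b
    return total


def raises_value_error(x, y):
    """Return True if vecdot_reference rejects the pair with ValueError."""
    try:
        vecdot_reference(x, y)
    except ValueError:
        return True
    return False
```

A real test in tests/test_dot.py would compare the result of `np.vecdot` on two empty quad arrays against a quad zero instead of calling this reference.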

@SwayamInSync
Member Author

Thanks @ngoldbaum, I was thinking of re-patching this and #88 after #95.
So, if possible, could you check out #95 next? Then I can easily wire the remaining ones in.
