BUG: Fix new DTypes and new string promotion when signature is involved #26744
Changes from all commits
a4596d7
0b723bc
6347369
817c9e4
1d1c0c0
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1028,6 +1028,25 @@ all_strings_promoter(PyObject *NPY_UNUSED(ufunc), | |
| PyArray_DTypeMeta *const signature[], | ||
| PyArray_DTypeMeta *new_op_dtypes[]) | ||
| { | ||
| if ((op_dtypes[0] != &PyArray_StringDType && | ||
| op_dtypes[1] != &PyArray_StringDType && | ||
| op_dtypes[2] != &PyArray_StringDType)) { | ||
| /* | ||
| * This promoter was triggered with only unicode arguments, so use | ||
|
Contributor
This seems confusing - should
Member
Author
Things would just fail the operation, I am very sure I added it for a reason. The problem is that you would have to decide that
The other thing you might not like is that it matches at all, but that would need one of two new features (which is fine):
The other solution for the particular case is that if there wasn't legacy promotion involved, I would like a default promoter that ensures that
Contributor
It was probably good to get this in, but it still feels weird to have a promoter for a given dtype return a result that does not involve that type at all - how can it decide for another type what is acceptable? Two of your solutions sound reasonable: returning the equivalent of
||
| * unicode. This can happen due to `dtype=` support which sets the | ||
| * output DType/signature. | ||
| */ | ||
| new_op_dtypes[0] = NPY_DT_NewRef(&PyArray_UnicodeDType); | ||
| new_op_dtypes[1] = NPY_DT_NewRef(&PyArray_UnicodeDType); | ||
| new_op_dtypes[2] = NPY_DT_NewRef(&PyArray_UnicodeDType); | ||
| return 0; | ||
| } | ||
| if ((signature[0] == &PyArray_UnicodeDType && | ||
| signature[1] == &PyArray_UnicodeDType && | ||
| signature[2] == &PyArray_UnicodeDType)) { | ||
| /* Unicode forced, but didn't override a string input: invalid */ | ||
| return -1; | ||
|
Member
Author
This part makes me wonder if I should just check it after the promoter is done and invalidate the result if this is violated. But it is OK here also.
Member
I agree it would be better to enforce that there, if only because IMO DType authors shouldn't have to worry about that case or add code to account for it to write a correct DType.
Contributor
Indeed, it seems strange one would even get here if the signature is already clear that
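For reference, the alternative floated in this thread (validating the promoter's result afterwards instead of inside each promoter) could look roughly like the sketch below. This is only a pure-Python illustration, not NumPy's dispatch code: `result_honors_signature` and the two placeholder classes are hypothetical names, and the rule "every fixed signature entry must be returned unchanged" is my reading of the comments above.

```python
# Hypothetical stand-ins for the two DType classes discussed in this PR.
class UnicodeDType:
    pass


class StringDType:
    pass


def result_honors_signature(signature, new_op_dtypes):
    """Return True if every fixed (non-None) signature entry is unchanged.

    Assumption: a `None` entry means "not fixed by the user", mirroring
    how the tests in this PR pass `None` inside `signature=`.
    """
    return all(
        fixed is None or fixed is chosen
        for fixed, chosen in zip(signature, new_op_dtypes)
    )


# A promoter that answers "all StringDType" although the user forced
# unicode everywhere would be rejected by the generic check.
forced_unicode = (UnicodeDType, UnicodeDType, UnicodeDType)
all_string = (StringDType, StringDType, StringDType)
print(result_honors_signature(forced_unicode, all_string))  # False
```

With a check like this in the dispatcher, the explicit `return -1` branch in the promoter would become redundant, which is the point made above about DType authors not having to handle that case themselves.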
||
| } | ||
| new_op_dtypes[0] = NPY_DT_NewRef(&PyArray_StringDType); | ||
| new_op_dtypes[1] = NPY_DT_NewRef(&PyArray_StringDType); | ||
| new_op_dtypes[2] = NPY_DT_NewRef(&PyArray_StringDType); | ||
|
|
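To make the three branches of the updated `all_strings_promoter` easier to follow, here is a small pure-Python model of the decision logic in this hunk. It is a sketch only: the placeholder classes stand in for `PyArray_UnicodeDType` and `PyArray_StringDType`, `None` stands in for an unset (NULL) entry, and returning `None` models the C `return -1`.

```python
# Hypothetical stand-ins for the C DType objects in the diff.
class UnicodeDType:
    pass


class StringDType:
    pass


def all_strings_promoter(op_dtypes, signature):
    """Model of the promoter: return the three promoted DTypes or None."""
    # Branch 1: no operand is StringDType, so resolve to unicode.
    if all(dt is not StringDType for dt in op_dtypes):
        return (UnicodeDType, UnicodeDType, UnicodeDType)
    # Branch 2: unicode was forced everywhere, but a StringDType operand
    # is present: reject (the C code returns -1 here).
    if all(sig is UnicodeDType for sig in signature):
        return None
    # Branch 3: default, promote everything to StringDType.
    return (StringDType, StringDType, StringDType)


no_signature = (None, None, None)
print(all_strings_promoter((StringDType, UnicodeDType, None), no_signature))
```

The order of the checks matters in this model: the unicode-only case is handled first, so the rejection branch is only reached when at least one operand really is a `StringDType`.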
@@ -2532,6 +2551,17 @@ init_stringdtype_ufuncs(PyObject *umath) | |
| return -1; | ||
| } | ||
|
|
||
| PyArray_DTypeMeta *out_strings_promoter_dtypes[] = { | ||
| &PyArray_UnicodeDType, | ||
| &PyArray_UnicodeDType, | ||
| &PyArray_StringDType, | ||
| }; | ||
|
|
||
| if (add_promoter(umath, "add", out_strings_promoter_dtypes, 3, | ||
| all_strings_promoter) < 0) { | ||
| return -1; | ||
| } | ||
|
Member
Good catch!
||
|
|
||
| INIT_MULTIPLY(Int64, int64); | ||
| INIT_MULTIPLY(UInt64, uint64); | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -828,6 +828,31 @@ def test_add_promoter(string_list): | |
| assert_array_equal(op + arr, lresult) | ||
| assert_array_equal(arr + op, rresult) | ||
|
|
||
| # The promoter should be able to handle things if users pass `dtype=` | ||
| res = np.add("hello", string_list, dtype=StringDType) | ||
|
Member
Probably not worth using the
Member
Author
That doesn't work. Signatures are DType classes. It should work at least for
Member
Ah thanks for explaining. I saw that error before but thought this change made dtype instances OK.
Member
Author
Yeah, no, that happens much earlier. I explicitly allowed the singleton instances of legacy dtypes (or maybe all singleton instances, not sure), because otherwise things would be tricky. But, we have the "give me the DType" now also, which maybe (not sure!) makes
||
| assert res.dtype == StringDType() | ||
|
|
||
| # The promoter should not kick in if users override the input, | ||
| # which means arr is cast, this fails because of the unknown length. | ||
| with pytest.raises(TypeError, match="cannot cast dtype"): | ||
| np.add(arr, "add", signature=("U", "U", None), casting="unsafe") | ||
|
|
||
| # But it must simply reject the following: | ||
| with pytest.raises(TypeError, match=".*did not contain a loop"): | ||
| np.add(arr, "add", signature=(None, "U", None)) | ||
|
|
||
| with pytest.raises(TypeError, match=".*did not contain a loop"): | ||
| np.add("a", "b", signature=("U", "U", StringDType)) | ||
|
|
||
|
|
||
| def test_add_no_legacy_promote_with_signature(): | ||
| # Possibly misplaced, but useful to test with string DType. We check that | ||
| # if there is clearly no loop found, a stray `dtype=` doesn't break things | ||
| # Regression test for the bad error in gh-26735 | ||
| # (If legacy promotion is gone, this can be deleted...) | ||
| with pytest.raises(TypeError, match=".*did not contain a loop"): | ||
| np.add("3", 6, dtype=StringDType) | ||
|
|
||
|
|
||
| def test_add_promoter_reduce(): | ||
| # Exact TypeError could change, but ensure StringDtype doesn't match | ||
|
|
||
Is there an issue for this? Otherwise, we'll probably find this for numpy 2.13 or so...
Created a specific one that is milestoned.