
TYP: np.char.array overloads not totally accurate for unicode arg #29376

@MarcoGorelli

Describe the issue:

import numpy as np
from typing import reveal_type

reveal_type(np.char.array('foo', unicode=False))

outputs

t.py:4: note: Revealed type is "numpy._core.defchararray.chararray[builtins.tuple[Any, ...], numpy.dtype[numpy.str_]]"

I'd have expected

t.py:4: note: Revealed type is "numpy._core.defchararray.chararray[builtins.tuple[Any, ...], numpy.dtype[numpy.bytes_]]"

I think the issue is

@overload
def array(
    obj: U_co,
    itemsize: int | None = ...,
    copy: bool = ...,
    unicode: L[False] = ...,
    order: _OrderKACF = ...,
) -> _CharArray[str_]: ...

The default for unicode is None, not Literal[False]. It seems the idea is:

  • unicode=True: return np.str_ type
  • unicode=False: return np.bytes_ type
  • unicode=None (default): return either of the above, depending on the input
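
To make the three cases above concrete, here is a minimal, self-contained sketch of how the overloads could be arranged. It is not NumPy's actual stub: `StrCharArray` and `BytesCharArray` are hypothetical stand-ins for `_CharArray[str_]` and `_CharArray[bytes_]`, the `itemsize`, `copy` and `order` parameters are left out, and the runtime body only mimics the behaviour described in the list.

```python
from __future__ import annotations

from typing import Literal, overload


# Hypothetical stand-ins for _CharArray[str_] and _CharArray[bytes_].
class StrCharArray: ...
class BytesCharArray: ...


# str input with unicode omitted (None) or True -> str_ result
@overload
def array(obj: str, unicode: Literal[True] | None = None) -> StrCharArray: ...
# bytes input with unicode=True -> str_ result
@overload
def array(obj: bytes, unicode: Literal[True]) -> StrCharArray: ...
# any input with an explicit unicode=False -> bytes_ result
@overload
def array(obj: str | bytes, unicode: Literal[False]) -> BytesCharArray: ...
# bytes input with unicode omitted (None) -> bytes_ result
@overload
def array(obj: bytes, unicode: None = None) -> BytesCharArray: ...
def array(obj, unicode=None):
    # Assumed runtime behaviour: None means "decide from the input type".
    if unicode is None:
        unicode = isinstance(obj, str)
    return StrCharArray() if unicode else BytesCharArray()
```

With overloads shaped like this, a type checker would be expected to resolve `array('foo', unicode=False)` to the bytes variant, because `Literal[False]` no longer doubles as the default.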

Spotted this while trying out https://github.com/MarcoGorelli/fix-overload-defaults

Reproduce the code example:

see above

Error message:

Python and NumPy Versions:

2.4.0.dev0+git20250714.cc92651
3.12.11 | packaged by conda-forge | (main, Jun 4 2025, 14:45:31) [GCC 13.3.0]

Type-checker version and settings:

mypy 1.16

Additional typing packages.

No response
