Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Surprising list overallocation from .split() #91146

Closed
tim-one opened this issue Mar 11, 2022 · 4 comments
Closed

Surprising list overallocation from .split() #91146

tim-one opened this issue Mar 11, 2022 · 4 comments
Labels
3.12 interpreter-core performance

Comments

@tim-one
Copy link
Member

@tim-one tim-one commented Mar 11, 2022

BPO 46990
Nosy @tim-one, @JelleZijlstra

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2022-03-11.22:37:20.039>
labels = ['interpreter-core', '3.11', 'performance']
title = 'Surprising list overallocation from .split()'
updated_at = <Date 2022-03-12.03:24:49.892>
user = 'https://github.com/tim-one'

bugs.python.org fields:

activity = <Date 2022-03-12.03:24:49.892>
actor = 'tim.peters'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation = <Date 2022-03-11.22:37:20.039>
creator = 'tim.peters'
dependencies = []
files = []
hgrepos = []
issue_num = 46990
keywords = []
message_count = 3.0
messages = ['414942', '414958', '414968']
nosy_count = 2.0
nosy_names = ['tim.peters', 'JelleZijlstra']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'resource usage'
url = 'https://bugs.python.org/issue46990'
versions = ['Python 3.11']

@tim-one
Copy link
Member Author

@tim-one tim-one commented Mar 11, 2022

When looking into a StackOverflow question about surprisingly high memory use, I stumbled into this (under 3.10.1, Win64):

>>> import sys
>>> s = "1 2 3 4 5".split()
>>> s
['1', '2', '3', '4', '5']
>>> sys.getsizeof(s)
152
>>> _ - sys.getsizeof([])
96
>>> 96 / 8
12.0

That is, we allocated enough space in the list to store 12(!) elements, despite that only 5 are used. Other ways of building a 5-element list I've tried overallocate by at most 3 slots:

>>> sys.getsizeof([ch for ch in "12345"])
120
>>> sys.getsizeof([1, 2, 3, 4, 5])
120

(and 120 - 56 = 64, room for 8 pointers)

Then there's this curiosity, which allocates space for exactly the 5 needed:

>>> sys.getsizeof(list(tuple("1 2 3 4 5".split())))
96

(and 96 - 56 = 40, room for the 5 pointers needed)

I don't expect this to be consistent, but allocating space for 12 when only 5 are needed is unreasonable. Even allocating space for 8 is pushing it ;-)

@tim-one tim-one added type-bug 3.11 interpreter-core performance and removed type-bug labels Mar 11, 2022
@JelleZijlstra
Copy link
Member

@JelleZijlstra JelleZijlstra commented Mar 12, 2022

The value 12 is hardcoded here:

#define MAX_PREALLOC 12

The comment there says that this is because most .split() calls are on lines of human-readable text, which has about 11 words per line. I don't know if I believe that.

@tim-one
Copy link
Member Author

@tim-one tim-one commented Mar 12, 2022

Well, that's annoying ;-) In context, the OP was saving a list of 10 million splits. So each overallocation by a single element burned 80 million bytes of RAM. Overallocating by 7 burned 560 million bytes.

Which is unusual. Usually a split result is short-lived, consumed once then thrown away.

OTOH, the overwhelming motivation for overallocating at all is to acheive O(1) amortized time after a long _sequence_ of appends, and split results typically aren't appended to at all. split() appears to be using it as a timing micro-optimization for tiny lists instead.

So, like I said, it's annoying ;-) For "small" lists, split() really shouldn't overallocate at all (because, as before, split results are rarely appended to). A compromise could be to save pointers to the first N (12, whatever) instances of the splitting string in a stack ("auto") vector, before any list object (or result string object) is created. If it's out of stuff to do before reaching N, fine, build a result out of exactly what was found. If there's more to do, build a result from the first N, and go on as currently (letting PyList_Append deal with it - overallocation is huge in percentage terms when the list is short, but not so much as the list gets longer).

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
corona10 added a commit to corona10/cpython that referenced this issue Jul 30, 2022
corona10 added a commit to corona10/cpython that referenced this issue Jul 30, 2022
@corona10 corona10 added 3.12 and removed 3.11 labels Jul 31, 2022
corona10 added a commit to corona10/cpython that referenced this issue Jul 31, 2022
…plit

Co-authored-by: Inada Naoki <songofacandy@gmail.com>
corona10 added a commit to corona10/cpython that referenced this issue Jul 31, 2022
…plit

Co-authored-by: Inada Naoki <songofacandy@gmail.com>
corona10 added a commit that referenced this issue Aug 1, 2022
@corona10
Copy link
Member

@corona10 corona10 commented Aug 1, 2022

@tim-one cc @methane

Now the size is a lot reduced. I would like to close the issue until we need to squeeze more.

Python 3.12.0a0 (heads/main:fb75d015f4, Aug  1 2022, 22:16:13) [Clang 13.1.6 (clang-1316.0.21.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> s = "1 2 3 4 5".split()
>>> sys.getsizeof(s)
104

Feel free to reopen the issue if we need :)

@corona10 corona10 closed this as completed Aug 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 interpreter-core performance
Projects
None yet
Development

No branches or pull requests

3 participants