GH-95861: Add support for Spearman's rank correlation coefficient #95863

Merged
rhettinger merged 9 commits into python:main on Aug 18, 2022

Conversation

rhettinger
Contributor

@rhettinger rhettinger commented Aug 10, 2022

Still need to think about whether by_rank is the best name for the flag.

@rhettinger rhettinger added the type-feature (A feature request or enhancement), stdlib (Python modules in the Lib dir), and 3.12 labels Aug 10, 2022
@rhettinger rhettinger requested a review from stevendaprano Aug 10, 2022
@stevendaprano
Member

stevendaprano commented Aug 13, 2022

See my early comment on #95861

I think we should make _rank public, and remove the "by_rank" parameter to correlation, moving that functionality into a thin wrapper function spearman. Otherwise this is good work, thank you.

Lib/statistics.py (outdated, resolved)
Member

@stevendaprano stevendaprano left a comment

If you categorically disagree with my suggestion to make rank public and move spearman into its own named function, I will accept this (you've done the hard work writing the code and I do like the feature). But I do strongly request that we keep to the design of separate named functions rather than overloading the one function with a flag.

If you categorically want to keep the single correlation function, then we could future-proof it and make the type of correlation more explicit with an enumerated flag rather than a bool:

def correlation(x, y, /, kind='pearson'):
    if kind == 'spearman':
        x = _rank(x)
        y = _rank(y)
    elif kind != 'pearson':
        raise ValueError('unsupported correlation kind')
    # ... continue with the existing Pearson computation on x and y

If we ever add another sort of correlation (say, Kendall tau), it would be awkward to handle a second bool flag.
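
For readers comparing the two designs raised in this review, here is a minimal, self-contained sketch of both: a thin spearman wrapper over a public rank helper, and a single correlation function dispatching on a kind string. It is an illustration only, not the code merged in this PR. The helper names rank and pearson, the average-rank handling of ties, and the error messages are assumptions made for the example.

```python
from math import sqrt
from statistics import fmean


def rank(data):
    """Return 1-based ranks; tied values share the average of their positions.

    Hypothetical public counterpart of the private _rank helper discussed above.
    """
    order = sorted(range(len(data)), key=lambda i: data[i])
    ranks = [0.0] * len(data)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with data[order[i]].
        j = i
        while j + 1 < len(order) and data[order[j + 1]] == data[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's r for two equal-length sequences (illustrative helper)."""
    if len(x) != len(y):
        raise ValueError('inputs must have the same length')
    if len(x) < 2:
        raise ValueError('at least two data points are required')
    mean_x, mean_y = fmean(x), fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    denominator = sqrt(sxx * syy)
    if denominator == 0:
        raise ValueError('at least one of the inputs is constant')
    return sxy / denominator


# Design 1: separate named functions, with spearman as a thin wrapper.
def spearman(x, y):
    return pearson(rank(x), rank(y))


# Design 2: one function with a string-valued kind instead of a bool flag.
def correlation(x, y, /, kind='pearson'):
    if kind == 'spearman':
        x, y = rank(x), rank(y)
    elif kind != 'pearson':
        raise ValueError('unsupported correlation kind')
    return pearson(x, y)
```

With either design, a monotonic but non-linear relationship such as x = [1, 2, 3, 4, 5] and y = [1, 8, 27, 64, 125] gives a rank correlation of 1.0, while the plain Pearson coefficient stays below 1. In the second design, supporting another coefficient would mean adding another kind value rather than another keyword flag.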

@bedevere-bot

bedevere-bot commented Aug 13, 2022

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@rhettinger
Contributor Author

rhettinger commented Aug 13, 2022

Thank you for responding quickly and thoughtfully.

I added some comments to the issue to help us get to a good meeting of the minds on the API.

@rhettinger rhettinger changed the title GH-95861: Add support for Pearson's correlation coefficient GH-95861: Add support for Spearman's rank correlation coefficient Aug 14, 2022
@rhettinger rhettinger merged commit 29c8f80 into python:main Aug 18, 2022
14 checks passed
Labels
3.12, stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)
3 participants