
Python: Fix points-to for unrelated modules with the same name. #3628

Closed
tausbn wants to merge 4 commits into github:main from tausbn:python-fix-overlapping-module-resolution

Contributor

tausbn commented Jun 5, 2020

The setup for the test case is as follows:

copy1/
    test1.py
    module/
        __init__.py
        test.py
copy2/
    test2.py
    module/
        __init__.py
        test.py

In both cases, module/test.py defines a variable (i.e. a module attribute) named value, with the values 1 and 2 respectively. The __init__.py files are empty; they are needed for module to be treated as a package in Python 2.

test1.py and test2.py each import value from their local copy of module.test and print it.

As the test shows, we are getting two potential values for value in each of these imports. One is the correct one, and one is from the other copy.

The trouble here is that we implicitly assume that modules are uniquely determined by their names. This means even something like getPackage may pick the wrong module, as it only tests whether the right relationship holds between the getName() results of the relevant modules.

I think the best way of addressing this is to introduce a notion of "fully qualified name". In the above test example this would result in the full names copy1.module.test and copy2.module.test, and these would then be sufficiently disambiguated.
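To make the idea concrete, here is a minimal Python sketch of how such a fully qualified name could be derived from a file path. The helper `qualified_name` and the choice of the directory containing `copy1/` and `copy2/` as the root are assumptions for illustration only; they are not part of the QL library or of this PR.

```python
from pathlib import PurePosixPath


def qualified_name(relative_path: str) -> str:
    """Dotted name of a module file, relative to a hypothetical source root."""
    path = PurePosixPath(relative_path)
    parts = list(path.with_suffix("").parts)
    # A package's __init__.py is named after the package directory itself.
    if parts and parts[-1] == "__init__":
        parts.pop()
    return ".".join(parts)


# Both files are called module.test locally, but their full names differ.
print(qualified_name("copy1/module/test.py"))
print(qualified_name("copy2/module/test.py"))
print(qualified_name("copy1/module/__init__.py"))
```

With names like these, the two `module.test` modules no longer collide, which is the disambiguation described above.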

@tausbn tausbn added the Python label Jun 5, 2020
@tausbn tausbn requested a review from a team as a code owner June 5, 2020 10:07
Member

RasmusWL commented Jun 5, 2020

You can make test1.py import copy2/module/test.py if you try hard enough:

$ PYTHONPATH=copy2 python -m copy1.test1
2
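The behaviour can be reproduced with a small script. The sketch below rebuilds the test layout from the PR description in a temporary directory and runs the files in three ways. The helper names (`build_layout`, `run`, `demo`) are made up for this example, and it assumes Python 3, where `copy1` can be imported as a namespace package without an `__init__.py`.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def build_layout(root: Path) -> None:
    """Create copy1/ and copy2/ as described in the PR description."""
    for copy, script, value in (("copy1", "test1.py", 1), ("copy2", "test2.py", 2)):
        package = root / copy / "module"
        package.mkdir(parents=True)
        (package / "__init__.py").write_text("")
        (package / "test.py").write_text(f"value = {value}\n")
        (root / copy / script).write_text(
            "from module.test import value\nprint(value)\n"
        )


def run(root: Path, args: list, pythonpath: str = "") -> str:
    """Run the current interpreter inside `root` and return its stdout."""
    env = dict(os.environ)
    # Start from a clean slate so only the requested PYTHONPATH applies.
    env.pop("PYTHONPATH", None)
    env.pop("PYTHONSAFEPATH", None)
    if pythonpath:
        env["PYTHONPATH"] = pythonpath
    result = subprocess.run(
        [sys.executable, *args],
        cwd=root, env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_layout(root)
        return {
            "test1 as script": run(root, ["copy1/test1.py"]),
            "test2 as script": run(root, ["copy2/test2.py"]),
            "test1 via -m with PYTHONPATH=copy2": run(
                root, ["-m", "copy1.test1"], pythonpath="copy2"
            ),
        }


if __name__ == "__main__":
    for label, output in demo().items():
        print(f"{label}: {output}")
```

Run as plain scripts, each file sees its own directory first on `sys.path` and should pick up its local `module`. With `-m`, the first entry is the working directory instead, which contains no `module`, so the `copy2` entry from `PYTHONPATH` wins.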

tausbn added 3 commits June 7, 2020 17:30
Git was getting a bit confused, and thinking an init file in `copy1`
was being renamed into its counterpart in `copy2`. This should avoid
that problem.

In general, we cannot expect overlapping modules to be disambiguated
based on the _packages_ they reside in (as they may be inside
directories that cannot be interpreted as packages). To better model
this, I have changed the names from `copy1` to `copy-1` etc.

This also cleans up the test case a bit. Note that it still contains
twice as many results as we want.

Basically, this predicate looks for a common `Container` that contains
both the module itself and the module being imported. In general there
will be many such containers (since if a given container has this
property, then so does its parent), so we require that the container
itself is "close enough" to properly disambiguate the imported module
based solely on its name.
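
The following Python sketch models that idea outside of QL. It is an approximation under stated assumptions, not the predicate from this PR: modules are given as a hypothetical mapping from file path to module name, a "container" is the file itself or any enclosing directory, and a container counts as "close enough" when exactly one module with the imported name lives inside it. The function names are invented for illustration.

```python
from pathlib import PurePosixPath


def containers(path: str) -> set:
    """The path itself plus every enclosing directory (a reflexive parent closure)."""
    p = PurePosixPath(path)
    return {p, *p.parents}


def resolve_import(importer: str, name: str, modules: dict) -> set:
    """Paths of modules called `name` that share a disambiguating container with `importer`."""
    candidates = [path for path, mod_name in modules.items() if mod_name == name]
    resolved = set()
    for candidate in candidates:
        # Only containers that hold both the importer and the candidate qualify.
        for container in containers(importer) & containers(candidate):
            same_name_inside = [c for c in candidates if container in containers(c)]
            # "Close enough": the name is unambiguous within this container.
            if len(same_name_inside) == 1:
                resolved.add(candidate)
    return resolved


# Hypothetical module table mirroring the renamed test layout.
modules = {
    "copy-1/test1.py": "test1",
    "copy-1/module/test.py": "module.test",
    "copy-2/test2.py": "test2",
    "copy-2/module/test.py": "module.test",
}
print(resolve_import("copy-1/test1.py", "module.test", modules))
print(resolve_import("copy-2/test2.py", "module.test", modules))
```

The shared root directory contains both copies of `module.test`, so it cannot disambiguate; only `copy-1/` (or `copy-2/`) holds the importer together with a single module of that name.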
@tausbn tausbn added the "Awaiting evaluation" (Do not merge yet, this PR is waiting for an evaluation to finish) and "WIP" (This is a work-in-progress, do not merge yet!) labels Jun 8, 2020
Member

@RasmusWL RasmusWL left a comment

I started reviewing before I noticed the WIP label. If it's still enough of a WIP that you don't want a review, you can just convert it to a draft PR 😉

Comment on lines +695 to +697
/* Holds if `import name` will import the module `m`. */
cached
- predicate module_imported_as(ModuleObjectInternal m, string name) {
+ deprecated predicate module_imported_as(ModuleObjectInternal m, string name) {

The comment should say what predicate should be used instead.

- predicate module_imported_as(ModuleObjectInternal m, string name) {
+ deprecated predicate module_imported_as(ModuleObjectInternal m, string name) {
/* Normal imports */
m.getName() = name

nitpick: Can the body of this predicate be replaced with module_imported_as_with_context(m, name, _)? (That would be a bit cleaner in my opinion.)

c = mod.getPath().getParent*() and
imp.getEnclosingModule() = this and
imp.getAnImportedModuleName() = mod.getName() and
count(Module other | other.getName() = mod.getName() and other.getPath().getParent*() = c) = 1

I have a slight recollection of strictcount being preferred to count whenever possible. I don't quite remember why, so we'll have to dig that up tomorrow.

Suggested change
- count(Module other | other.getName() = mod.getName() and other.getPath().getParent*() = c) = 1
+ strictcount(Module other | other.getName() = mod.getName() and other.getPath().getParent*() = c) = 1

@adityasharad adityasharad changed the base branch from master to main August 14, 2020 18:34
@tausbn tausbn marked this pull request as draft May 10, 2022 21:32
@tausbn tausbn closed this Jul 19, 2022