C++: Ensure only one Variable exists for every global variable
#9700
Note that this also maps the following declarations of This is consistent with linker behaviour, but might not be desirable from a CodeQL perspective? Note that there will still be two
Assuming these are linked together, that's an ODR violation. I don't think we need to go to great lengths to analyse code with ODR violations considering it's UB and we don't even have enough information to guess at the symptoms (e.g. if If we do want to "handle" such cases, I think the best thing to do would be to detect ODR violations in a dedicated query and assume they're not present in all other queries. Since as you say there are two However what happens if we tweak the example to be valid C++ by giving a.cpp's
I think I agree with @sashabu that we shouldn't try to mitigate ODR issues like this on the extractor side. However, I do have a comment with regards to this:
ODR violations have given us plenty of performance issues in the past because people didn't know their code contained ODR violations, and it turned some analyses into awful exponential-time algorithms. Luckily, they're normally quite simple fixes on the QL side (see for example the old GVN library and the new GVN library for some of the mitigation stuff we've done to guard ourselves against stuff like "a variable with multiple types", or "a field lookup returning multiple fields"). So we can't quite ignore ODR issues in the analyses because we've alerted about it in some other query, since the presence of an ODR violation can prevent the suite from completing.
@MathiasVP - I think having the two You're quite right that my comment was overly general. More precisely, what I was trying to say is that it's good to have a way to detect ODR violations, but we don't necessarily need to try and model the semantics of a C++-like language where ODR violations are well-defined (e.g. by defining differently-typed "overloads" of the variable). I think we're in agreement on that?
Yes, totally in agreement on that.
We have one internally.
Global variables with internal linkage have different name mangling. One of the internal fixes for this PR actually corrected some issues there and includes tests.
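The linkage distinction mentioned here can be sketched in plain C++. This is only an illustration with hypothetical names (`shared_total`, `file_local`, `also_local`, `sum_visible`); it assumes nothing about the mangling scheme the extractor actually uses.

```cpp
// Hypothetical names, for illustration only.

// External linkage: one object shared by every translation unit that declares it.
int shared_total = 3;

// Internal linkage via `static`: each translation unit containing this line
// gets its own, unrelated `file_local`.
static int file_local = 1;

// Internal linkage via an unnamed namespace: same effect as `static` here.
namespace {
int also_local = 2;
}

// Sums the three variables as seen from this translation unit.
int sum_visible() { return shared_total + file_local + also_local; }
```

Because the internal-linkage variables are distinct objects in each translation unit, they should not be merged across files the way external-linkage globals are.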
I think the aggressive use of
Depending on the extraction order, before this change there might be multiple
`GlobalVariable`s per declared global variable. See the tests in
`cpp/ql/test/library-tests/variables/global`. This change ensures that only one
of those `GlobalVariable`s is visible to the user if we can locate a unique
definition. If not, the old situation persists.
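As a rough illustration of the kind of source involved, here is a minimal sketch. It assumes a single translation unit for brevity, whereas in practice the non-defining declarations would typically come from a header included in several files. The names `counter`, `read_counter` and `counter_address` are hypothetical and not taken from the tests.

```cpp
// Hypothetical names; not taken from the CodeQL test suite.

// Declaration that is not a definition, as a header would provide it.
extern int counter;

// The unique definition. With external linkage, every `extern int counter;`
// in any translation unit refers to this one object.
int counter = 42;

// Another non-defining redeclaration, as a second header might contain.
extern int counter;

// Returns the value of the single underlying object.
int read_counter() { return counter; }

// Returns the address of the object; all declarations name the same entity.
int* counter_address() { return &counter; }
```

All three declarations denote one variable, so with a unique definition available they would be expected to map to a single `GlobalVariable`.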
Note that an exception needs to be made for templated variables. Here, the
definition refers to the non-instantiated template, while a declaration that
is not a definition refers to an instantiation. In case the instantiation refers
to a template parameter, the mangled names of the template and the instantiation
will be identical. This happens for example in the following case:
```
template <typename T>
T x = T(42); // Uninstantiated templated variable

template <typename T>
class C {
  T y = x<T>; // Instantiation using a template parameter
};
```
Since the uninstantiated template and the instantiation are two different
entities, we do not unify them as described above.
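For contrast with the dependent instantiation `x<T>`, a concrete instantiation can be sketched as follows. This is a minimal, hypothetical extension of the example: the helper `instantiated_value` and the `public` access specifier are not part of the PR or its tests, and variable templates require C++14 or later.

```cpp
template <typename T>
T x = T(42); // Uninstantiated templated variable

template <typename T>
class C {
public:
  T y = x<T>; // Inside the template, still refers to a template parameter
};

// Hypothetical helper: using C<int> forces the concrete instantiation x<int>,
// which is a different entity from the template x itself.
int instantiated_value() { return C<int>{}.y; }
```

Here `x<int>` and `x<long>` are separate objects, each distinct from the uninstantiated template.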
This LGTM!
I'll approve the PR and leave the merging to whoever merges the internal PR.
To note:
- `ResolveGlobalVariable.qll` was resolved from `ResolveClass.qll`, so it's fairly aggressive with `pragma[noinline]`. I have not checked what happens performance-wise when I remove them. Let me know if it's worth checking this.
- `cpp/ql/test/library-tets/templates/variables`