Java: Unsafe deserialization with Jackson#5900
Conversation
smowton
left a comment
There was a problem hiding this comment.
Looks sensible to me. I've suggested removing the getAnAccess() trickery because there should be use-use flow from earlier to later uses of the same qualifier variable anyhow. Do you know of a particular case where this was necessary?
@aschackmull this will need your review as not targeting experimental
java/ql/src/semmle/code/java/security/UnsafeDeserialization.qll
Outdated
Show resolved
Hide resolved
java/ql/src/semmle/code/java/security/UnsafeDeserialization.qll
Outdated
Show resolved
Hide resolved
java/ql/src/semmle/code/java/security/UnsafeDeserialization.qll
Outdated
Show resolved
Hide resolved
@smowton @aschackmull As I mentioned in the description, I am not sure whether this should go to |
|
I favour inclusion in the existing query but the final call is @aschackmull |
7cd351d to
2faaa75
Compare
|
Also what of the Either way you should add a change-note. |
I added the
I've added a change note. I've never done it before, so I used https://github.com/github/codeql/blob/main/java/change-notes/2020-10-07-fastjson-deserialization-sink.md as an example. |
smowton
left a comment
There was a problem hiding this comment.
There are quite a few additional taint steps being introduced, which aren't easy for me to see the purpose of without an example. Could you write down a short example for each that uses one of the particular new steps?
In some cases these examples might be useful in the relevant qldoc too; in others they might just be to help me review.
| lgtm,codescanning | ||
| * The "Deserialization of user-controlled data" (`java/unsafe-deserialization`) query | ||
| now recognizes `Jackson` deserialization. | ||
|
|
| } | ||
| } | ||
|
|
||
| class ObjectMapperReadMethod extends Method { |
There was a problem hiding this comment.
Every class here: either make private or add qldoc
| JsonParser() { hasQualifiedName("com.fasterxml.jackson.core", "JsonParser") } | ||
| } | ||
|
|
||
| class JacksonType extends RefType { |
There was a problem hiding this comment.
| class JacksonType extends RefType { | |
| class JacksonTypeDescriptorType extends RefType { |
Existing name suggests any type in the com.fasterxml.jackson namespace
| } | ||
| } | ||
|
|
||
| class EnabledJacksonDefaultTyping extends DataFlow2::Configuration { |
There was a problem hiding this comment.
| class EnabledJacksonDefaultTyping extends DataFlow2::Configuration { | |
| class EnableJacksonDefaultTypingConfig extends DataFlow2::Configuration { |
Less similar to existing class EnableJacksonDefaultTyping
| * that configures or creates an `ObjectMapper` via a builder. | ||
| */ | ||
| override predicate isAdditionalFlowStep(DataFlow::Node fromNode, DataFlow::Node toNode) { | ||
| exists(MethodAccess ma, Method m, Expr q | m = ma.getMethod() and q = ma.getQualifier() | |
There was a problem hiding this comment.
| exists(MethodAccess ma, Method m, Expr q | m = ma.getMethod() and q = ma.getQualifier() | | |
| exists(MethodAccess ma, Method m | m = ma.getMethod() | |
Inline the one use of ma.getQualifier()
| override predicate isAdditionalFlowStep(DataFlow::Node fromNode, DataFlow::Node toNode) { | ||
| exists(MethodAccess ma, Method m, Expr q | m = ma.getMethod() and q = ma.getQualifier() | | ||
| m.getDeclaringType() instanceof MapperBuilder and | ||
| m.getReturnType() |
There was a problem hiding this comment.
Could you give an end-to-end example that these steps are designed to address?
There was a problem hiding this comment.
ObjectMapper may be created by JsonMapper.Builder that extends MapperBuilder and uses fluent API. In other words, its methods return this:
JsonMapper$Builder has quite a lot of methods. Therefore, I thought that it maybe better to cover all methods that return the builder instead of listing all the available methods. Furthermore, the list should be updated once new methods are added.
Here is an example:
private static <T> T deserialize(/* tainted */ String string, Class<T> clazz) throws IOException {
PolymorphicTypeValidator ptv =
BasicPolymorphicTypeValidator.builder()
.allowIfSubType("com.test.safe")
.build();
// use a builder to create an ObjectMapper
ObjectMapper mapper = JsonMapper.builder()
.polymorphicTypeValidator(ptv) // sanitizer
.build();
mapper.enableDefaultTyping();
return mapper.readValue(string, clazz); // deserialization happens here
}Builders need to be covered in the query because they have polymorphicTypeValidator() that is a sanitizer.
| * Holds if `fromNode` to `toNode` is a dataflow step that looks like resolving a class. | ||
| */ | ||
| override predicate isAdditionalTaintStep(DataFlow::Node fromNode, DataFlow::Node toNode) { | ||
| exists(MethodAccess ma, RefType returnType | returnType = ma.getMethod().getReturnType() | |
There was a problem hiding this comment.
Could you give an end-to-end example that these steps are designed to address?
There was a problem hiding this comment.
ObjectMapper.readValue() and other similar methods take a serialized object and a target object type. It is dangerous when an attacker can control both the data and the type. That's what happened in CVE-2016-8749.
The type maybe either Class, JavaType or TypeReference. This predicate checks if a method seems to take a class name as a string and convert it to one of these classes. Here is an example:
private static Object deserialize(/* tainted */ String string, /* tainted */ String className) throws IOException {
ObjectMapper mapper = new ObjectMapper();
mapper.enableDefaultTyping();
return mapper.readValue(string, resolveType(className));
}
private static Class resolveType(String className) {
return Class.forName(className);
}Here, resolveType() is simple -- it just calls Class.forName(). But its logic might be more complex that could potentially result it false-negatives. The predicate looks for methods like resolveType() assuming that they create a type descriptor that is known to Jackson. Of course, that may result in false-positives. I thought that should be acceptable trade-off because deserialization vulnerabilities are quite dangerous.
| /** | ||
| * Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson parser. | ||
| */ | ||
| predicate createJacksonJsonParserStep(DataFlow::Node fromNode, DataFlow::Node toNode) { |
There was a problem hiding this comment.
Could you give an end-to-end example that these steps are designed to address?
There was a problem hiding this comment.
ObjectMapper has multiple readValue() and readValues() methods that take a JsonParser. In turn, JsonParser takes JSON that may come from untrusted sources.
This predicate defines steps that create a JsonParser. Here is an example:
private static <T> List<T> deserialize(/* tainted */ String string, Class<T> clazz)
throws IOException {
ObjectMapper mapper = new ObjectMapper();
mapper.enableDefaultTyping();
JsonParser parser = new JsonFactory().createParser(string);
return mapper.readValues(parser, clazz).readAll();
}There was a problem hiding this comment.
Could the createParser be a sink, or could we still encounter sanitisers on the road to readValues?
There was a problem hiding this comment.
Could the
createParserbe a sink,
I don't think so. JsonParser parses JSON and returns primitive values, strings, JSON tokens, etc. It can return a type id or descriptor but as far as I know the parser doesn't create instances of these types.
or could we still encounter sanitisers on the road to
readValues?
I am aware of only two reliable sanitizers that are covered in the query. I was considering adding ObjectMapper.activateDefaultTyping() because it takes PolymorphicTypeValidator. But it turned out that in this case the validator is not always called. I reported this to the project, but they said that's expected. Please see FasterXML/jackson-databind#2524 . Therefore, I didn't add this method as a sanitizer.
| /** | ||
| * Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson `TreeNode`. | ||
| */ | ||
| predicate createJacksonTreeNodeStep(DataFlow::Node fromNode, DataFlow::Node toNode) { |
There was a problem hiding this comment.
Could you give an end-to-end example that these steps are designed to address?
There was a problem hiding this comment.
ObjectMapper has treeToValue() method is sink that takes a TreeNode. Raw JSON data may be converted to a TreeNode and then supplied to treeToValue(). A TreeNode can be created by ObjectMapper.readTree() or ObjectMapper.readValueAsTree() methods. Here is an example:
private static <T> T deserialize(/* tainted */ String string, Class<T> clazz) throws IOException {
ObjectMapper mapper = new ObjectMapper();
mapper.enableDefaultTyping();
TreeNode tree = mapper.readTree(string);
return mapper.treeToValue(tree, clazz);
}
@smowton Sure, I've added examples and some comments. Please let me know if you have questions. You can find complete examples in https://github.com/artem-smotrakov/ql-fun/tree/master/src/main/java/com/gypsyengineer/ql/fun/jackson I noticed that Thanks for the review. I've addressed other comments as well. |
c53305f to
2b16a8e
Compare
|
There were several conflicts in the stub lib. I rebased the branch and had to force-push. |
smowton
left a comment
There was a problem hiding this comment.
OK, this makes sense. The String -> Class heuristic is rather dubious; could you give an example of a Class.forName replacement that we found too difficult to model, and which led you to this overapproximation?
| } | ||
| } | ||
|
|
||
| private class EnableJacksonDefaultTypingConfig extends DataFlow2::Configuration { |
There was a problem hiding this comment.
| private class EnableJacksonDefaultTypingConfig extends DataFlow2::Configuration { | |
| /** | |
| * Tracks flow from qualifiers to `enableDefaultTyping(...)` calls to a subsequent `readValue` or similar call. | |
| * | |
| * Such a `readValue` call could deserialize an arbitrary user-specified class. | |
| */ | |
| private class EnableJacksonDefaultTypingConfig extends DataFlow2::Configuration { |
| override predicate isSink(DataFlow::Node sink) { sink instanceof ObjectMapperReadSink } | ||
| } | ||
|
|
||
| private class SafeObjectMapperConfig extends DataFlow2::Configuration { |
There was a problem hiding this comment.
| private class SafeObjectMapperConfig extends DataFlow2::Configuration { | |
| /** | |
| * Tracks flow from qualifiers to `[set]PolymorphicTypeValidator(...)` calls to a subsequent `readValue` or similar call, including across builder method calls. | |
| * | |
| * Such a `readValue` call is safe because validation will likely prevent instantiating unexpected types. | |
| */ | |
| private class SafeObjectMapperConfig extends DataFlow2::Configuration { |
| } | ||
| } | ||
|
|
||
| private class UnsafeTypeConfig extends TaintTracking2::Configuration { |
There was a problem hiding this comment.
| private class UnsafeTypeConfig extends TaintTracking2::Configuration { | |
| /** | |
| * Tracks flow from a remote source to a type descriptor (e.g. a `java.lang.Class` instance) passed to a Jackson deserialization method. | |
| * | |
| * If this is user-controlled, arbitrary code could be executed while instantiating the user-specified type. | |
| */ | |
| private class UnsafeTypeConfig extends TaintTracking2::Configuration { |
| } | ||
|
|
||
| /** | ||
| * Holds if `fromNode` to `toNode` is a dataflow step that looks like resolving a class. |
There was a problem hiding this comment.
| * Holds if `fromNode` to `toNode` is a dataflow step that looks like resolving a class. | |
| * Holds if `fromNode` to `toNode` is a dataflow step that looks like resolving a class. | |
| * | |
| * Note any method that returns a `Class` or similar is assumed to propagate taint from all | |
| * of its arguments, so methods that accept user-controlled data but sanitize it or use it for some | |
| * completely different purpose before returning a `Class` could result in false positives. | |
| */ |
| /** | ||
| * Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson parser. | ||
| */ | ||
| predicate createJacksonJsonParserStep(DataFlow::Node fromNode, DataFlow::Node toNode) { |
There was a problem hiding this comment.
Could the createParser be a sink, or could we still encounter sanitisers on the road to readValues?
| } | ||
|
|
||
| /** | ||
| * Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson parser. |
There was a problem hiding this comment.
| * Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson parser. | |
| * Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson parser. | |
| * | |
| * For example, a `createParser(userString)` call yields a `JsonParser` which becomes dangerous | |
| * if passed to an unsafely-configured `ObjectMapper`'s `readValue` method. |
| } | ||
|
|
||
| /** | ||
| * Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson `TreeNode`. |
There was a problem hiding this comment.
| * Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson `TreeNode`. | |
| * Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson `TreeNode`. | |
| * | |
| * These are parse trees of user-supplied JSON, which may lead to arbitrary code execution | |
| * if passed to an unsafely-configured `ObjectMapper`'s `treeToValue` method. |
| or | ||
| exists(EnableJacksonDefaultTypingConfig config | config.hasFlowToExpr(ma.getQualifier())) | ||
| or | ||
| exists(RefType argType, int i | i > 0 and argType = ma.getArgument(i).getType() | |
There was a problem hiding this comment.
Add a comment here giving an example of a sink method
For example, you can look at how Apache Camel resolves classes Camel has an interface For some well-known or primitive classes, Otherwise, Technically, we can add a dataflow config that checks whether tainted data reaches one of the methods from the standard Java library which resolve a class. Unfortunately, such a config won't detect flows when an application uses a library or just a separate utility component that resolves classes (for example, imagine that I agree that the current |
|
How about as a compromise, the rule: it's probably a class-loading function if it (a) is external (otherwise we implicitly look inside it for libraries it in turn calls), (b) takes a String, (c) returns a Class, and (d) has a likely name: something containing |
@smowton Sounds good to me. One question though.
How would you prefer to check it? Should we call external everything that don't belong to |
|
Checking for |
|
Actually just remembered |
|
Thanks for the suggestions @smowton ! I've updated |
| | | ||
| m.getReturnType() instanceof JacksonTypeDescriptorType and | ||
| m.getName().toLowerCase().regexpMatch("resolve|load|class|type") and | ||
| m.fromSource() and |
There was a problem hiding this comment.
| m.fromSource() and | |
| not m.fromSource() and |
Methods that are from source should themselves call Class.forName or something else matching this heuristic
There was a problem hiding this comment.
I think I didn't understand what you mean by "external". I'd like this predicate to cover cases explained by the example above
Something like the following:
private static void deserialize(/* tainted */ String data, /* tainted */ String type) throws Exception {
ObjectMapper mapper = new ObjectMapper();
mapper.readValue(data, resolveTypeImpl(type));
}
// this method has complex logic for resolving types
private static JavaType resolveTypeImpl(String type) throws Exception {
...
}I'd like the predicate to catch resolveTypeImpl() if it is in application's code, not only in an external library. Now, I'd rather drop m.fromSource() check. I've also updated the query to explicitly cover Class.forName() and ClassLoader.load() calls.
|
Over to @aschackmull as this is targeting the main query pack |
|
I resolved the merge conflicts, and also removed |
…trakov/ql into unsafe-jackson-deserialization
|
@aschackmull I've updated the qldoc and tried to make imports as private as possible. Unfortunately, the docs don't say much about https://codeql.github.com/docs/ql-language-reference/annotations/#private I am wondering what kind of benefits private imports have besides limiting scope. Does it speed the query up, or maybe reduce memory consumption? |
It's primarily just to limit scope and help maintainability. I guess it might affect compilation speed slightly, but that's just speculation on my part. |
Co-authored-by: Anders Schack-Mulligen <aschackmull@users.noreply.github.com>
| */ | ||
| private predicate hasJsonTypeInfoAnnotation(RefType type) { | ||
| hasFieldWithJsonTypeAnnotation(type.getASupertype*()) or | ||
| hasFieldWithJsonTypeAnnotation(type.getAField().getType()) |
There was a problem hiding this comment.
Should this line be deleted? The predicate hasFieldWithJsonTypeAnnotation already includes the .getAField() bit, so this line means that you allow jumping from a type to one of its fields exactly once or twice, which seems odd and inconsistent with the qldoc.
There was a problem hiding this comment.
Hmm, it looks like the tests actually use this case? But then the restriction to only one or two .getAField() still seem arbitrary. Should it instead be the following?
| hasFieldWithJsonTypeAnnotation(type.getAField().getType()) | |
| hasJsonTypeInfoAnnotation(type.getAField().getType()) |
There was a problem hiding this comment.
Yes, you're right. I've updated the query.
| m = ma.getMethod() and arg = ma.getArgument(i) | ||
| | | ||
| m.getReturnType() instanceof JacksonTypeDescriptorType and | ||
| m.getName().toLowerCase().regexpMatch("resolve|load|class|type") and |
There was a problem hiding this comment.
This doesn't quite do what you want, since regexpMatch matches the entire receiver.
There was a problem hiding this comment.
Yup, I'll update the regex. Thanks!
java/ql/src/semmle/code/java/security/UnsafeDeserializationQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/src/semmle/code/java/security/UnsafeDeserializationQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/src/semmle/code/java/security/UnsafeDeserializationQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/src/semmle/code/java/security/UnsafeDeserializationQuery.qll
Outdated
Show resolved
Hide resolved
aschackmull
left a comment
There was a problem hiding this comment.
I've looked through everything now, still a few inline comments, otherwise I think this is looking good.
Co-authored-by: Anders Schack-Mulligen <aschackmull@users.noreply.github.com>
Co-authored-by: Anders Schack-Mulligen <aschackmull@users.noreply.github.com>
@aschackmull I've addressed your comments. Thanks! |
Deserialization of untrusted data with Jackson is known to be dangerous:
https://www.github.com/mbechler/marshalsec/blob/master/marshalsec.pdf?raw=true
I'd like to propose to add sinks for Jackson to
UnsafeDeserialization.ql:ObjectMapper.ObjectMapper.enableDefaultTyping()call.JsonTypeInfoannotation.See https://cowtowncoder.medium.com/on-jackson-cves-dont-panic-here-is-what-you-need-to-know-54cd0d6e8062
There are multiples CVEs for deserialization gadgets for Jackson Databind
https://www.cvedetails.com/vulnerability-list/vendor_id-15866/product_id-42991/Fasterxml-Jackson-databind.html
Those CVEs create quite a lot of noise because many applications don't use polymorphic typing. The query would help check if applications are really affected by such CVEs. On the other hand, if an application actually turns on polymorphic typing, and deserializes data from remote peers, that would be a significant security risk that can be identified by this query.
The changeset is quite big and the query is in the supported query set. Would it be better to create a separate experimental query instead?
I'm going to add tests and update docs once we know whether this should be part of the supported query or a separate one. Until the questions above gets an answer, I'd like to keep this pull request in draft mode but earlier feedback is of course very welcome.Results:
2.16.4.