Java: add models of JSON-java, aka org.json #6325
Generated file changes for java
- Others,"``com.esotericsoftware.kryo.io``, ``com.esotericsoftware.kryo5.io``, ``com.fasterxml.jackson.databind``, ``com.unboundid.ldap.sdk``, ``org.apache.commons.codec``, ``org.apache.commons.jexl2``, ``org.apache.commons.jexl3``, ``org.apache.directory.ldap.client.api``, ``org.apache.ibatis.jdbc``, ``org.dom4j``, ``org.hibernate``, ``org.jooq``, ``org.xml.sax``, ``org.xmlpull.v1``, ``play.mvc``",7,12,82,,,,14,18,,
+ Others,"``com.esotericsoftware.kryo.io``, ``com.esotericsoftware.kryo5.io``, ``com.fasterxml.jackson.databind``, ``com.unboundid.ldap.sdk``, ``org.apache.commons.codec``, ``org.apache.commons.jexl2``, ``org.apache.commons.jexl3``, ``org.apache.directory.ldap.client.api``, ``org.apache.ibatis.jdbc``, ``org.dom4j``, ``org.hibernate``, ``org.jooq``, ``org.json``, ``org.xml.sax``, ``org.xmlpull.v1``, ``play.mvc``",7,237,82,,,,14,18,,
- Totals,,84,2428,296,13,6,6,107,33,1,66
+ Totals,,84,2653,296,13,6,6,107,33,1,66
+ org.json,,,225,,,,,,,,,,,,,,,195,30 |
Marcono1234 left a comment:
Great that this pull request covers the complete JSON-java project!
Hopefully the following review comments are helpful.
| "org.json;XMLXsiTypeConverter;true;convert;;;Argument[0];ReturnValue;taint", | ||
| "org.json;CDL;false;rowToJSONArray;;;Argument[0];ReturnValue;taint", | ||
| "org.json;CDL;false;rowToJSONObject;;;Argument[0..1];ReturnValue;taint", | ||
| "org.json;CDL;false;toJSONArray;;;Argument[0..1];ReturnValue;taint", |
| "org.json;JSONTokener;false;nextTo;;;Argument[-1];ReturnValue;taint", | ||
| "org.json;JSONTokener;false;nextValue;;;Argument[-1];ReturnValue;taint", | ||
| "org.json;JSONTokener;false;syntaxError;;;Argument[0..1];ReturnValue;taint", | ||
| "org.json;JSONTokener;false;toString;;;Argument[-1];ReturnValue;taint", |
These should probably all consider subtypes because JSONTokener has the subclasses HTTPTokener and XMLTokener.
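For readers unfamiliar with the row format, here is a minimal sketch of how one of these rows breaks down. It assumes the nine semicolon-separated columns are package, type, subtypes flag, method name, signature, an extension field, input, output and kind; `SummaryRow` and `withSubtypes` are hypothetical helpers for illustration, not part of CodeQL.

```java
// Hypothetical helper, not part of CodeQL: splits one model row into its columns.
// The column meanings are an assumption based on the rows shown in this review.
final class SummaryRow {
    final String pkg, type, name, signature, ext, input, output, kind;
    final boolean subtypes;

    SummaryRow(String row) {
        // A negative limit keeps empty columns such as the blank signature field.
        String[] c = row.split(";", -1);
        if (c.length != 9) {
            throw new IllegalArgumentException("expected 9 columns: " + row);
        }
        pkg = c[0];
        type = c[1];
        subtypes = Boolean.parseBoolean(c[2]);
        name = c[3];
        signature = c[4];
        ext = c[5];
        input = c[6];
        output = c[7];
        kind = c[8];
    }

    // Returns the same row with the subtypes column set to true.
    String withSubtypes() {
        return String.join(";", pkg, type, "true", name, signature, ext, input, output, kind);
    }

    public static void main(String[] args) {
        SummaryRow r = new SummaryRow(
            "org.json;JSONTokener;false;nextTo;;;Argument[-1];ReturnValue;taint");
        System.out.println(r.subtypes);
        System.out.println(r.withSubtypes());
    }
}
```

Under that reading, the suggestion amounts to changing the third column from `false` to `true` on the JSONTokener rows.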
| "org.json;JSONPointer$Builder;false;append;;;Argument[-1];ReturnValue;value", | ||
| "org.json;JSONPointer$Builder;false;build;;;Argument[-1];ReturnValue;taint", | ||
| "org.json;JSONStringer;false;toString;;;Argument[-1];ReturnValue;taint", | ||
| "org.json;JSONTokener;false;JSONTokener;;;Argument[0];Argument[-1];taint", |
Should this also cover the static method dehexchar?
I left individual chars out as too far-fetched a taint vector
| "org.json;JSONArray;false;optNumber;;;Argument[-1];ReturnValue;taint", | ||
| "org.json;JSONArray;false;optQuery;;;Argument[-1];ReturnValue;taint", | ||
| "org.json;JSONArray;false;optString;;;Argument[-1];ReturnValue;taint", | ||
| // Default values that may be returned by the `opt*` functions above: |
Really minor, but in Java they are usually called "methods":
- // Default values that may be returned by the `opt*` functions above:
+ // Default values that may be returned by the `opt*` methods above:
| "org.json;JSONArray;false;optLong;;;Argument[1];ReturnValue;value", | ||
| "org.json;JSONArray;false;optNumber;;;Argument[1];ReturnValue;value", | ||
| "org.json;JSONArray;false;optString;;;Argument[1];ReturnValue;value", | ||
| "org.json;JSONArray;false;put;(boolean);;Argument[0];Argument[-1];taint", |
Might be missing value flow from put to the return value (since the methods return this).
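A minimal sketch of why that flow matters, using a hypothetical stand-in class (`FluentArray` is not `org.json.JSONArray`; it only mimics a `put` that returns `this`):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a fluent array type; it is not org.json.JSONArray.
final class FluentArray {
    private final List<Object> items = new ArrayList<>();

    // Stores the value and returns the receiver so calls can be chained.
    FluentArray put(Object value) {
        items.add(value);
        return this;
    }

    @Override
    public String toString() {
        return items.toString();
    }

    public static void main(String[] args) {
        FluentArray base = new FluentArray();
        // The chained result is the same object as `base`.
        FluentArray chained = base.put("untrusted").put(true);
        System.out.println(chained == base);
        System.out.println(chained);
    }
}
```

Because `chained` and `base` are the same object, an analysis that only tracks taint into the qualifier would miss code that keeps using the returned reference, unless a value step from `Argument[-1]` to `ReturnValue` is also modeled.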
| "org.json;JSONTokener;false;nextClean;;;Argument[-1];ReturnValue;taint", | ||
| "org.json;JSONTokener;false;nextString;;;Argument[-1];ReturnValue;taint", | ||
| "org.json;JSONTokener;false;nextTo;;;Argument[-1];ReturnValue;taint", | ||
| "org.json;JSONTokener;false;nextValue;;;Argument[-1];ReturnValue;taint", |
Should this also cover skipTo? Maybe as taint flow from argument 0 to the return value?
| "org.json;JSONTokener;false;nextValue;;;Argument[-1];ReturnValue;taint", | ||
| "org.json;JSONTokener;false;syntaxError;;;Argument[0..1];ReturnValue;taint", | ||
| "org.json;JSONTokener;false;toString;;;Argument[-1];ReturnValue;taint", | ||
| "org.json;JSONWriter;true;JSONWriter;;;Argument[-1];Argument[0];taint", |
Out of pure interest, is this actually supported by CodeQL? I.e. modeling flow from the to-be-created instance to one of the constructor arguments.
Not yet (Go can do this, Java can't). Will add a comment that this doesn't work yet.
I'd be curious to know more about how this is designed to work in Go. As-is, this line is never going to work so might as well remove it (the input Argument[-1] to a constructor in Java is the result of the implicit malloc, which is passed into the constructor as the value of the this parameter, and there isn't really any way that flow can reach that Node).
The single-step relation (via the FunctionInput helper class) does some backward SSA use -> def walking, here: https://github.com/github/codeql-go/blob/main/ql/src/semmle/go/dataflow/FunctionInputsAndOutputs.qll#L124
For Java I expect this step would be something like: whenever taint should propagate from the result of a constructor (denoted Argument[-1]), introduce an edge from any post-update node concerning any variable that gets assigned the result of the constructor (or of the constructor itself if there is none) to the post-update node of the argument being tainted.
So here if we had
    void f(Appendable sink, String source) {
        JSONWriter w = new JSONWriter(sink);
        w.write(source);
    }
Then the w.write(source) would propagate taint from source to the post-update node of w as usual, but then the rule for new JSONWriter(sink) would wire that post-update node to sink's post-update node (nb. Go's use of SSA definitions as the post-update nodes of corresponding uses means less SSA graph walking is necessary there).
This is pretty brittle in Go and would be here too, and is designed to pick up the simplest cases of writer-wrapping.
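The runtime behaviour that such a rule tries to approximate can be sketched without the library. `WrappingWriter` below is a hypothetical minimal wrapper around an `Appendable`, not the real `JSONWriter`; the point is only that data written through the wrapper ends up in the object that was passed to the constructor.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical minimal wrapper; it only mimics the shape of a writer
// that forwards everything to the Appendable it was constructed with.
final class WrappingWriter {
    private final Appendable out;

    WrappingWriter(Appendable out) {
        this.out = out;
    }

    // Forwards the text to the wrapped Appendable.
    WrappingWriter write(String s) {
        try {
            out.append(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return this;
    }

    // Mirrors f(sink, source) from the comment above and returns what the sink now holds.
    static String demo(String source) {
        StringBuilder sink = new StringBuilder();
        WrappingWriter w = new WrappingWriter(sink);
        w.write(source);
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo("tainted"));
    }
}
```

Here `sink` is never mentioned after the constructor call, yet it holds the written data, which is the flow from the wrapper back to the constructor argument that the step described above is meant to capture.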
| "org.json;Property;false;toProperties;;;Argument[0];MapKey of ReturnValue;taint", | ||
| "org.json;Property;false;toProperties;;;Argument[0];MapValue of ReturnValue;taint", | ||
| "org.json;Property;false;toJSONObject;;;MapKey of Argument[0];ReturnValue;taint", | ||
| "org.json;Property;false;toJSONObject;;;MapValue of Argument[0];ReturnValue;taint", |
For consistency might be good to switch this because toJSONObject comes before toProperties (alphabetically and in the javadoc).
| "org.json;Property;false;toJSONObject;;;MapValue of Argument[0];ReturnValue;taint", | ||
| "org.json;XML;false;escape;;;Argument[0];ReturnValue;taint", | ||
| "org.json;XML;false;stringToValue;;;Argument[0];ReturnValue;taint", | ||
| "org.json;XML;false;toJSONObject;;;Argument[0];ReturnValue;taint", |
Should this also cover taint from the XMLParserConfiguration argument to the return value? XMLParserConfiguration allows specifying how xsi:type elements are converted, see withXsiTypeMap (then it would also be necessary to model that class).
Though maybe covering this is not worth it.
I'll leave this as relevant taint flow seems very far-fetched
| "org.json;XML;false;escape;;;Argument[0];ReturnValue;taint", | ||
| "org.json;XML;false;stringToValue;;;Argument[0];ReturnValue;taint", | ||
| "org.json;XML;false;toJSONObject;;;Argument[0];ReturnValue;taint", | ||
| "org.json;XML;false;toString;;;Argument[0];ReturnValue;taint", |
Should this also cover the tagName argument of the toString methods?
Thanks @Marcono1234 @aschackmull, this is ready for re-review.
Generated file changes for java
- Others,"``com.esotericsoftware.kryo.io``, ``com.esotericsoftware.kryo5.io``, ``com.fasterxml.jackson.databind``, ``com.unboundid.ldap.sdk``, ``org.apache.commons.codec``, ``org.apache.commons.jexl2``, ``org.apache.commons.jexl3``, ``org.apache.directory.ldap.client.api``, ``org.apache.ibatis.jdbc``, ``org.dom4j``, ``org.hibernate``, ``org.jooq``, ``org.xml.sax``, ``org.xmlpull.v1``, ``play.mvc``",7,12,82,,,,14,18,,
+ Others,"``com.esotericsoftware.kryo.io``, ``com.esotericsoftware.kryo5.io``, ``com.fasterxml.jackson.databind``, ``com.unboundid.ldap.sdk``, ``org.apache.commons.codec``, ``org.apache.commons.jexl2``, ``org.apache.commons.jexl3``, ``org.apache.directory.ldap.client.api``, ``org.apache.ibatis.jdbc``, ``org.dom4j``, ``org.hibernate``, ``org.jooq``, ``org.json``, ``org.xml.sax``, ``org.xmlpull.v1``, ``play.mvc``",7,248,82,,,,14,18,,
- Totals,,84,2465,296,13,6,6,107,33,1,66
+ Totals,,84,2701,296,13,6,6,107,33,1,66
+ org.json,,,236,,,,,,,,,,,,,,,198,38 |
Remove unnecessary import
Co-authored-by: Anders Schack-Mulligen <aschackmull@users.noreply.github.com>
Generated file changes for java
- Others,"``com.esotericsoftware.kryo.io``, ``com.esotericsoftware.kryo5.io``, ``com.fasterxml.jackson.databind``, ``com.unboundid.ldap.sdk``, ``org.apache.commons.codec``, ``org.apache.commons.jexl2``, ``org.apache.commons.jexl3``, ``org.apache.directory.ldap.client.api``, ``org.apache.ibatis.jdbc``, ``org.dom4j``, ``org.hibernate``, ``org.jooq``, ``org.mvel2``, ``org.xml.sax``, ``org.xmlpull.v1``, ``play.mvc``",7,12,98,,,,14,18,,
+ Others,"``com.esotericsoftware.kryo.io``, ``com.esotericsoftware.kryo5.io``, ``com.fasterxml.jackson.databind``, ``com.unboundid.ldap.sdk``, ``org.apache.commons.codec``, ``org.apache.commons.jexl2``, ``org.apache.commons.jexl3``, ``org.apache.directory.ldap.client.api``, ``org.apache.ibatis.jdbc``, ``org.dom4j``, ``org.hibernate``, ``org.jooq``, ``org.json``, ``org.mvel2``, ``org.xml.sax``, ``org.xmlpull.v1``, ``play.mvc``",7,248,98,,,,14,18,,
- Totals,,84,2465,313,13,6,6,107,33,1,66
+ Totals,,84,2701,313,13,6,6,107,33,1,66
+ org.json,,,236,,,,,,,,,,,,,,,,198,38 |
Strategy: as with javax.json, I have adopted the approach of associating monolithic taint with JSON objects as a whole rather than with elements, map values and so on, on the assumption that 99% of use cases are about de/serialization rather than using the objects as containers.
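As a rough illustration of what "monolithic" means here, the toy class below carries a single taint flag for the whole object. `MonolithicObject` and its methods are hypothetical and have nothing to do with how CodeQL is implemented; the sketch only shows the resulting over-approximation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of monolithic taint: one flag for the whole object,
// rather than one flag per key or per value. Purely illustrative.
final class MonolithicObject {
    private final Map<String, String> fields = new LinkedHashMap<>();
    private boolean tainted;

    // Storing any tainted value taints the whole object.
    void put(String key, String value, boolean valueTainted) {
        fields.put(key, value);
        tainted = tainted || valueTainted;
    }

    // Reading any key reports the object-wide flag, whichever key is asked for.
    boolean readIsTainted(String key) {
        return tainted;
    }

    // Serializing the object carries the same object-wide flag.
    boolean serializedIsTainted() {
        return tainted;
    }

    public static void main(String[] args) {
        MonolithicObject o = new MonolithicObject();
        o.put("safe", "constant", false);
        o.put("user", "untrusted", true);
        System.out.println(o.readIsTainted("safe"));
        System.out.println(o.serializedIsTainted());
    }
}
```

In this toy model, reading the `safe` key is also reported as tainted. That imprecision is the price of the simpler model, and it matters little if objects are mostly serialized or deserialized whole.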