Java: improve qhelp for java/unsafe-deserialization #21807

Open
owen-mc wants to merge 8 commits into github:main from owen-mc:java/improve-qhelp-unsafe-deserialization
Conversation

@owen-mc
Contributor

@owen-mc owen-mc commented May 7, 2026

Clarify that deserialization that follows a schema is safe.

Copilot AI review requested due to automatic review settings May 7, 2026 09:52
@owen-mc owen-mc requested a review from a team as a code owner May 7, 2026 09:52
@github-actions
Contributor

github-actions Bot commented May 7, 2026

QHelp previews:

csharp/ql/src/Security Features/CWE-502/UnsafeDeserialization.qhelp

Unsafe deserializer

Deserializing an object from untrusted input may result in security problems, such as denial of service or remote code execution.

Note that a deserialization method is only dangerous if it can instantiate arbitrary classes. Serialization frameworks that use a schema to instantiate only expected, predefined types are generally not tracked by this query. Such frameworks are generally safe with respect to arbitrary-class-instantiation and gadget-chain attacks when the schema is trusted and does not permit user-controlled type resolution. However, care must be taken to ensure the schema strictly limits the allowed types. Permitting common standard library classes can still leave the application vulnerable to gadget-chain attacks.

Recommendation

Avoid using an unsafe deserialization framework.

Example

In this example, a string is deserialized using a JavaScriptSerializer with a simple type resolver. Using a type resolver means that arbitrary code may be executed.

using System.Web.Script.Serialization;

class Bad
{
    public static object Deserialize(string s)
    {
        JavaScriptSerializer sr = new JavaScriptSerializer(new SimpleTypeResolver());
        // BAD
        return sr.DeserializeObject(s);
    }
}

To fix this specific vulnerability, we avoid using a type resolver. In other cases, it may be necessary to use a different deserialization framework.

using System.Web.Script.Serialization;

class Good
{
    public static object Deserialize(string s)
    {
        // GOOD
        JavaScriptSerializer sr = new JavaScriptSerializer();
        return sr.DeserializeObject(s);
    }
}

References

csharp/ql/src/Security Features/CWE-502/UnsafeDeserializationUntrustedInput.qhelp

Deserialization of untrusted data

Deserializing an object from untrusted input may result in security problems, such as denial of service or remote code execution.

Note that a deserialization method is only dangerous if it can instantiate arbitrary classes. Serialization frameworks that use a schema to instantiate only expected, predefined types are generally not tracked by this query. Such frameworks are generally safe with respect to arbitrary-class-instantiation and gadget-chain attacks when the schema is trusted and does not permit user-controlled type resolution. However, care must be taken to ensure the schema strictly limits the allowed types. Permitting common standard library classes can still leave the application vulnerable to gadget-chain attacks.

Recommendation

Avoid deserializing objects from an untrusted source. If that is not possible, make sure to use a safe deserialization framework.

Example

In this example, text from an HTML text box is deserialized using a JavaScriptSerializer with a simple type resolver. Using a type resolver means that arbitrary code may be executed.

using System.Web.UI.WebControls;
using System.Web.Script.Serialization;

class Bad
{
    public static object Deserialize(TextBox textBox)
    {
        JavaScriptSerializer sr = new JavaScriptSerializer(new SimpleTypeResolver());
        // BAD
        return sr.DeserializeObject(textBox.Text);
    }
}

To fix this specific vulnerability, we avoid using a type resolver. In other cases, it may be necessary to use a different deserialization framework.

using System.Web.UI.WebControls;
using System.Web.Script.Serialization;

class Good
{
    public static object Deserialize(TextBox textBox)
    {
        JavaScriptSerializer sr = new JavaScriptSerializer();
        // GOOD: no unsafe type resolver
        return sr.DeserializeObject(textBox.Text);
    }
}

In the following example, a potentially untrusted stream and type name are deserialized using a DataContractJsonSerializer, which is known to be vulnerable when the type is user-supplied.

using System.Runtime.Serialization.Json;
using System.IO;
using System;

class BadDataContractJsonSerializer
{
    public static object Deserialize(string type, Stream s)
    {
        // BAD: stream and type are potentially untrusted
        var ds = new DataContractJsonSerializer(Type.GetType(type));
        return ds.ReadObject(s);
    }
}

To fix this specific vulnerability, we use a hardcoded Plain Old CLR Object (POCO) type. In other cases, it may be necessary to use a different deserialization framework.

using System.Runtime.Serialization.Json;
using System.IO;
using System;

class Poco
{
    public int Count;

    public string Comment;
}

class GoodDataContractJsonSerializer
{
    public static Poco Deserialize(Stream s)
    {
        // GOOD: while stream is potentially untrusted, the instantiated type is hardcoded
        var ds = new DataContractJsonSerializer(typeof(Poco));
        return (Poco)ds.ReadObject(s);
    }
}

References

java/ql/src/Security/CWE/CWE-502/UnsafeDeserialization.qhelp

Deserialization of user-controlled data

Deserializing untrusted data using any deserialization framework that allows the construction of arbitrary serializable objects is easily exploitable and in many cases allows an attacker to execute arbitrary code. Even before a deserialized object is returned to the caller of a deserialization method a lot of code may have been executed, including static initializers, constructors, and finalizers. Automatic deserialization of fields means that an attacker may craft a nested combination of objects on which the executed initialization code may have unforeseen effects, such as the execution of arbitrary code.
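
The point that code runs before the caller ever receives an object can be sketched with the JDK alone. The following is a minimal illustration, not part of the query or its qhelp: SideEffectDemo, Gadget, Expected, hookRuns and readExpecting are hypothetical names, and the private readObject hook stands in for whatever initialization code a real gadget class would carry.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; none of these names come from the query or its qhelp.
class SideEffectDemo {
  // Counts how many times the deserialization hook below has run.
  static final AtomicInteger hookRuns = new AtomicInteger();

  // Stand-in for a "gadget": a serializable class whose readObject hook has a side effect.
  static class Gadget implements Serializable {
    private static final long serialVersionUID = 1L;

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
      in.defaultReadObject();
      hookRuns.incrementAndGet(); // executes inside ObjectInputStream.readObject()
    }
  }

  // The type the application expects to receive.
  static class Expected implements Serializable {
    private static final long serialVersionUID = 1L;
  }

  static byte[] serialize(Object o) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(o);
    }
    return bytes.toByteArray();
  }

  // Returns true if the cast to Expected failed, i.e. the caller never obtained an object.
  static boolean readExpecting(byte[] data) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      Expected ignored = (Expected) in.readObject(); // the cast is checked only after readObject() returns
      return false;
    } catch (ClassCastException wrongType) {
      return true;
    }
  }
}
```

Calling SideEffectDemo.readExpecting on a serialized Gadget reports a failed cast, yet hookRuns has already been incremented. A cast on the result of readObject is therefore not a defense.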

There are many different serialization frameworks. This query currently supports Kryo, XmlDecoder, XStream, SnakeYaml, JYaml, JsonIO, YAMLBeans, HessianBurlap, Castor, Burlap, Jackson, Jabsorb, Jodd JSON, Flexjson, Gson, JMS, and Java IO serialization through ObjectInputStream/ObjectOutputStream.

Note that a deserialization method is only dangerous if it can instantiate arbitrary classes. Serialization frameworks that use a schema to instantiate only expected, predefined types are generally not tracked by this query. For example, Apache Avro's deserialization methods follow a schema and are therefore generally safe with respect to arbitrary-class-instantiation and gadget-chain attacks when the schema is trusted and does not permit user-controlled type resolution. However, care must be taken to ensure the schema strictly limits the allowed types. Permitting common standard library classes can still leave the application vulnerable to gadget-chain attacks.
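
As a rough illustration of the schema-driven idea, the sketch below decodes a message using a fixed table of type tags, so nothing in the input can name a class to instantiate. It is an assumption-laden toy, not how Apache Avro is implemented: SchemaSketch, Ping, Text, frame and the one-byte tag layout are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical toy format: one tag byte followed by the fields of that type.
class SchemaSketch {
  static final class Ping {
    final int id;
    Ping(int id) { this.id = id; }
  }

  static final class Text {
    final String body;
    Text(String body) { this.body = body; }
  }

  // The "schema" is the switch below: only Ping and Text can ever be constructed.
  static Object decode(byte[] data) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      int tag = in.readUnsignedByte();
      switch (tag) {
        case 1:
          return new Ping(in.readInt());
        case 2:
          return new Text(in.readUTF());
        default:
          // Unknown tags are rejected; the input is never used to look up a class by name.
          throw new IOException("unknown type tag " + tag);
      }
    }
  }

  // Test helper: writes a tag byte followed by a single int payload.
  static byte[] frame(int tag, int payload) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeByte(tag);
      out.writeInt(payload);
    }
    return bytes.toByteArray();
  }
}
```

The safety here comes from the closed set of cases. If the table were extended with general-purpose library types, or if the tag were replaced by a class name resolved at run time, the caveat above about gadget chains would apply again.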

Recommendation

Avoid deserialization of untrusted data if at all possible. If the architecture permits it then use other formats instead of serialized objects, for example JSON or XML. However, these formats should not be deserialized into complex objects because this provides further opportunities for attack. For example, XML-based deserialization attacks are possible through libraries such as XStream and XmlDecoder.

Alternatively, a tightly controlled whitelist can limit the vulnerability of code, but be aware of the existence of so-called Bypass Gadgets, which can circumvent such protection measures.

Recommendations specific to particular frameworks supported by this query:

FastJson - com.alibaba:fastjson

  • Secure by Default: Partially
  • Recommendation: Call com.alibaba.fastjson.parser.ParserConfig#setSafeMode with the argument true before deserializing untrusted data.

FasterXML - com.fasterxml.jackson.core:jackson-databind

  • Secure by Default: Yes
  • Recommendation: Don't call com.fasterxml.jackson.databind.ObjectMapper#enableDefaultTyping and don't annotate any object fields with com.fasterxml.jackson.annotation.JsonTypeInfo passing either the CLASS or MINIMAL_CLASS values to the annotation. Read this guide.

Kryo - com.esotericsoftware:kryo and com.esotericsoftware:kryo5

  • Secure by Default: Yes for com.esotericsoftware:kryo5 and for com.esotericsoftware:kryo >= v5.0.0
  • Recommendation: Don't call com.esotericsoftware.kryo(5).Kryo#setRegistrationRequired with the argument false on any Kryo instance that may deserialize untrusted data.

ObjectInputStream - Java Standard Library

  • Secure by Default: No
  • Recommendation: Use a validating input stream, such as org.apache.commons.io.serialization.ValidatingObjectInputStream.
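
For readers who cannot add the Commons IO dependency, the JDK's own java.io.ObjectInputFilter (available since Java 9) offers a comparable allowlist mechanism. The sketch below is one possible way to use it and is not taken from the qhelp; FilterSketch, Point, ALLOWED, check and readFiltered are hypothetical names. As noted above for allowlists in general, this limits exposure but does not remove it if the allowed classes are themselves usable as gadgets.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Set;

// Hypothetical sketch of an allowlist filter using only the JDK (Java 9+).
class FilterSketch {
  static class Point implements Serializable {
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;
    Point(int x, int y) { this.x = x; this.y = y; }
  }

  // The only classes the stream is allowed to instantiate.
  static final Set<Class<?>> ALLOWED = Set.of(Point.class);

  // Rejects every class that is not explicitly allowed.
  static ObjectInputFilter.Status check(ObjectInputFilter.FilterInfo info) {
    Class<?> c = info.serialClass();
    if (c == null) {
      return ObjectInputFilter.Status.UNDECIDED; // not a class check (e.g. a size or depth check)
    }
    return ALLOWED.contains(c)
        ? ObjectInputFilter.Status.ALLOWED
        : ObjectInputFilter.Status.REJECTED;
  }

  static Object readFiltered(byte[] data) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      in.setObjectInputFilter(FilterSketch::check); // must be set before the first readObject()
      return in.readObject(); // throws InvalidClassException when the filter rejects a class
    }
  }

  static byte[] serialize(Object o) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(o);
    }
    return bytes.toByteArray();
  }
}
```

With this filter installed, a serialized Point is read back normally, while a stream containing any other class, such as java.util.ArrayList, fails with java.io.InvalidClassException before that class is instantiated.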

SnakeYAML - org.yaml:snakeyaml

  • Secure by Default: As of version 2.0.
  • Recommendation: For versions before 2.0, pass an instance of org.yaml.snakeyaml.constructor.SafeConstructor to org.yaml.snakeyaml.Yaml's constructor before using it to deserialize untrusted data.

XML Decoder - Standard Java Library

  • Secure by Default: No
  • Recommendation: Do not use with untrusted user input.

ObjectMessage - Java EE/Jakarta EE

  • Secure by Default: Depends on the JMS implementation.
  • Recommendation: Do not use with untrusted user input.

Example

The following example calls readObject directly on an ObjectInputStream that is constructed from untrusted data, and is therefore inherently unsafe.

public class MyObject {
  public int field;
  MyObject(int field) {
    this.field = field;
  }
}

public MyObject deserialize(Socket sock) throws IOException, ClassNotFoundException {
  try(ObjectInputStream in = new ObjectInputStream(sock.getInputStream())) {
    return (MyObject)in.readObject(); // BAD: in is from untrusted source
  }
}

Rewriting the communication protocol to only rely on reading primitive types from the input stream removes the vulnerability.

public MyObject deserialize(Socket sock) throws IOException {
  try(DataInputStream in = new DataInputStream(sock.getInputStream())) {
    return new MyObject(in.readInt()); // GOOD: read only an int
  }
}

References

python/ql/src/Security/CWE-502/UnsafeDeserialization.qhelp

Deserialization of user-controlled data

Deserializing untrusted data using any deserialization framework that allows the construction of arbitrary serializable objects is easily exploitable and in many cases allows an attacker to execute arbitrary code. Even before a deserialized object is returned to the caller of a deserialization method a lot of code may have been executed, including static initializers, constructors, and finalizers. Automatic deserialization of fields means that an attacker may craft a nested combination of objects on which the executed initialization code may have unforeseen effects, such as the execution of arbitrary code.

There are many different serialization frameworks. This query currently supports Pickle, Marshal and Yaml.

Note that a deserialization method is only dangerous if it can instantiate arbitrary classes. Serialization frameworks that use a schema to instantiate only expected, predefined types are generally not tracked by this query. Such frameworks are generally safe with respect to arbitrary-class-instantiation and gadget-chain attacks when the schema is trusted and does not permit user-controlled type resolution. However, care must be taken to ensure the schema strictly limits the allowed types. Permitting common standard library classes can still leave the application vulnerable to gadget-chain attacks.

Recommendation

Avoid deserialization of untrusted data if at all possible. If the architecture permits it then use other formats instead of serialized objects, for example JSON.

If you need to use YAML, use the yaml.safe_load function.

Example

The following example calls pickle.loads directly on a value provided by an incoming HTTP request. Pickle then creates a new value from untrusted data, and is therefore inherently unsafe.

from django.conf.urls import url
import pickle

def unsafe(pickled):
    return pickle.loads(pickled)

urlpatterns = [
    url(r'^(?P<object>.*)$', unsafe)
]

Changing the code to use json.loads instead of pickle.loads removes the vulnerability.

from django.conf.urls import url
import json

def safe(pickled):
    return json.loads(pickled)

urlpatterns = [
    url(r'^(?P<object>.*)$', safe)
]

References

ruby/ql/src/queries/security/cwe-502/UnsafeDeserialization.qhelp

Deserialization of user-controlled data

Deserializing untrusted data using any method that allows the construction of arbitrary objects is easily exploitable and, in many cases, allows an attacker to execute arbitrary code.

Note that a deserialization method is only dangerous if it can instantiate arbitrary classes or objects. Serialization frameworks that use a schema to instantiate only expected, predefined types are generally not tracked by this query. Such frameworks are generally safe with respect to arbitrary-class-instantiation and gadget-chain attacks when the schema is trusted and does not permit user-controlled type resolution. However, care must be taken to ensure the schema strictly limits the allowed types. Permitting common standard library classes can still leave the application vulnerable to gadget-chain attacks.

Recommendation

Avoid deserialization of untrusted data if possible. If the architecture permits it, use serialization formats that cannot represent arbitrary objects. For libraries that support it, such as the Ruby standard library's JSON module, ensure that the parser is configured to disable deserialization of arbitrary objects.

If deserializing an untrusted YAML document using the psych gem, prefer the safe_load and safe_load_file methods over load and load_file, as the former will safely handle untrusted data. Avoid passing untrusted data to the load_stream method. In psych version 4.0.0 and above, the load method can safely be used.

If deserializing an untrusted XML document using the ox gem, do not use parse_obj, and do not use load with the non-default :object mode. Instead, use the load method in the default mode or, better, explicitly set a safe mode such as :hash.

To safely deserialize Property List files using the plist gem, ensure that you pass marshal: false when calling Plist.parse_xml.

Example

The following example calls the Marshal.load, JSON.load, YAML.load, Oj.load and Ox.parse_obj methods on data from an HTTP request. Since these methods are capable of deserializing to arbitrary objects, this is inherently unsafe.

require 'base64'
require 'json'
require 'yaml'
require 'oj'
require 'ox'

class UserController < ActionController::Base
  def marshal_example
    data = Base64.decode64 params[:data]
    object = Marshal.load data
    # ...
  end

  def json_example
    object = JSON.load params[:json]
    # ...
  end

  def yaml_example
    object = YAML.load params[:yaml]
    # ...
  end

  def oj_example
    object = Oj.load params[:json]
    # ...
  end

  def ox_example
    object = Ox.parse_obj params[:xml]
    # ...
  end
end

Using JSON.parse and YAML.safe_load instead, as in the following example, removes the vulnerability. Similarly, calling Oj.load with any mode other than :object is safe, as is calling Oj.safe_load. Note that there is no safe way to deserialize untrusted data using Marshal.

require 'json'
require 'yaml'
require 'oj'

class UserController < ActionController::Base
  def safe_json_example
    object = JSON.parse params[:json]
    # ...
  end

  def safe_yaml_example
    object = YAML.safe_load params[:yaml]
    # ...
  end

  def safe_oj_example
    object = Oj.load params[:json], { mode: :strict }
    # or
    object = Oj.safe_load params[:json]
    # ...
  end
end

References

Contributor

Copilot AI left a comment

Pull request overview

Updates the qhelp documentation for the java/unsafe-deserialization query to better explain when deserialization is dangerous, aiming to clarify that schema-driven deserialization (that doesn’t allow arbitrary class instantiation) is generally not in scope.

Changes:

  • Adds explanatory text distinguishing “arbitrary class instantiation” deserialization from schema-based deserialization.
  • Uses Apache Avro as an example of schema-following deserialization.
  • Minor whitespace/formatting cleanups in the qhelp text.
Summary per file:

  • File: java/ql/src/Security/CWE/CWE-502/UnsafeDeserialization.qhelp
  • Description: Adds guidance about schema-based deserialization being generally safer / out of scope, plus minor formatting tweaks.

Copilot's findings

  • Files reviewed: 1/1 changed files
  • Comments generated: 1

Comment thread java/ql/src/Security/CWE/CWE-502/UnsafeDeserialization.qhelp Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@BazookaMusic

Does it also make sense to add the same note to the qhelp for the same CWE in other languages too?

@owen-mc owen-mc requested review from a team as code owners May 8, 2026 13:07
@owen-mc
Contributor Author

owen-mc commented May 8, 2026

Good idea, @BazookaMusic. I've done that for the queries where it makes sense.

3 participants