
fix(spanner_dbapi): replace insecure pickle with json for partition deserialization#17014

Open
sinhasubham wants to merge 2 commits into main from dbApi-vlnr

Conversation

@sinhasubham
Contributor

@sinhasubham sinhasubham commented May 11, 2026

This PR resolves a critical insecure deserialization vulnerability (potential remote code execution) in the spanner_dbapi module (b/510871112). Previously, the module used pickle.loads() to decode partition IDs supplied by users via the RUN PARTITION statement, allowing an attacker to craft a payload that executes arbitrary code during deserialization.
We have fully eliminated pickle usage in this module and migrated to standard json serialization.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request replaces the use of pickle with json for serializing and deserializing partition IDs to mitigate security risks associated with insecure deserialization. It introduces _serialize_value and _deserialize_value helper functions to handle specific types like bytes, datetime, and protobuf messages. Review feedback points out that MessageToDict defaults to camelCase, which could break compatibility with code expecting snake_case, and suggests using preserving_proto_field_name=True. Additionally, the reviewer noted that protobuf messages are currently deserialized as dictionaries rather than original message objects, which may lead to issues with nested field types.
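The camelCase hazard the review flags can be shown without protobuf: MessageToDict renames fields like partition_token to partitionToken by default, so any consumer that indexes the resulting dict by the proto's snake_case field names breaks. The conversion below is a stdlib sketch of that renaming, not the protobuf library itself.

```python
def to_camel(name: str) -> str:
    """Mimic MessageToDict's default snake_case -> camelCase renaming."""
    head, *rest = name.split("_")
    return head + "".join(word.capitalize() for word in rest)


# A dict produced with the default renaming...
serialized = {to_camel("partition_token"): "deadbeef"}

# ...no longer answers to the proto field name, so existing consumers
# doing serialized["partition_token"] would raise KeyError.
assert "partitionToken" in serialized
assert "partition_token" not in serialized
```

Passing preserving_proto_field_name=True, as the review suggests, keeps the snake_case names and avoids this mismatch.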

Comment threads (3, outdated): packages/google-cloud-spanner/google/cloud/spanner_dbapi/partition_helper.py
@sinhasubham sinhasubham marked this pull request as ready for review May 12, 2026 06:00
@sinhasubham sinhasubham requested a review from a team as a code owner May 12, 2026 06:00


def _serialize_value(val: Any) -> Any:
Contributor


I think the chosen solution approach has a design flaw: it tries to manually serialize a high-level Statement object into JSON. That means it needs to account for every query parameter type a Statement can contain, and that list is long and, most importantly, not fixed. Someone could add support for a new high-level type tomorrow, and that would break this serialization unless they also add it to this serializer.

We can 'fix' all of this by adding all supported types here, and also add a test that catches it if someone adds a new type in the future that is not in this list. I added 5ef6394 to illustrate this. But as you can see, the code becomes quite bloated, and the test uses meta-programming to verify that this serialization stays consistent with the _make_value_pb function (which is a code smell).

I would suggest that we try a different strategy for this, and instead use a two-step serialization strategy:

  1. First convert the Statement into an ExecuteSqlRequest using the standard functions in the client library (or at least convert the query parameters of the Statement into the Struct that would have been used for the ExecuteSqlRequest).
  2. Then serialize the ExecuteSqlRequest (or the Struct with the query parameters).

This removes the need to manually map all possible high-level query parameter types, as ExecuteSqlRequest is guaranteed to only use well-known types for query parameters, and this list is fixed.
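The two-step strategy can be sketched in plain Python, independent of the Spanner client's real API: step 1 lowers every high-level parameter onto a small, closed set of wire-level types (analogous to building the ExecuteSqlRequest / params Struct), and step 2 serializes that fixed representation. The lowering rules below are illustrative assumptions, not the library's actual type mapping.

```python
import json
from datetime import datetime
from decimal import Decimal
from typing import Any


def lower_param(val: Any) -> Any:
    """Step 1: map high-level types onto a fixed wire representation."""
    if isinstance(val, Decimal):
        return str(val)          # e.g. NUMERIC travels as a string
    if isinstance(val, datetime):
        return val.isoformat()   # e.g. TIMESTAMP travels as RFC 3339 text
    if isinstance(val, bytes):
        return val.hex()         # e.g. BYTES travels as encoded text
    if isinstance(val, (str, int, float, bool, type(None))):
        return val
    # New high-level types need a lowering rule here, but the JSON
    # layer below never changes: it only ever sees well-known types.
    raise TypeError(f"no lowering rule for {type(val).__name__}")


def serialize_params(params: dict) -> str:
    """Step 2: the lowered struct contains only JSON-native values."""
    return json.dumps({k: lower_param(v) for k, v in params.items()})


serialize_params({"id": 7, "created": datetime(2026, 5, 11)})
```

The payoff is the one the comment describes: adding a new high-level parameter type means adding a lowering rule, while the serialization layer itself stays closed over a fixed set of well-known types.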
