feat: Reworked MutateRows to use the data client by gkevinzheng · Pull Request #1290 · googleapis/python-bigtable

gkevinzheng · 2026-02-05T22:15:27Z

Changes Made:

Rerouted MutateRows to use the data client's Table.bulk_mutate_rows function.
Separated bulk_mutate_rows into a function that constructs the _MutateRowsOperation object and the part that starts the operation. This is so we could inspect the operation's errors for custom error handling.
Reworked error handling in Table.mutate_rows. Since we must return a list of Status objects, we process the list of errors obtained from the _MutateRowsOperation back into a corresponding list of error statuses.
Reworked retries in Table.mutate_rows as follows:
- retry.deadline -> operation_timeout
- timeout/mutation_timeout -> attempt_timeout
- A null retry deadline now defaults to the table default mutate rows operation timeout instead of being defined as no timeout. The docstring has been updated to reflect that. A retry deadline of 0.0 will still be a no-retry option, and is implemented by setting the list of retriable errors to the empty list, and the operation timeout to the default mutate rows operation timeout.
A timestamp of None in set_cell, which is the default, will now generate a client-side timestamp representing the current time, rather than defaulting to using the server-side timestamp. The docstrings for this have been updated to reflect this.

gemini-code-assist · 2026-02-05T22:15:56Z

Summary of Changes

Hello @gkevinzheng, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant refactoring of the Table.mutate_rows method, transitioning its implementation to leverage the data client's bulk mutation capabilities. This change not only streamlines the mutation process but also brings substantial improvements to error handling and retry logic, providing more precise control and predictable behavior. Additionally, the default timestamp generation for set_cell has been clarified to ensure client-side timestamp usage when no explicit timestamp is provided.

Highlights

Rerouted MutateRows: The Table.mutate_rows method has been refactored to utilize the data client's Table.bulk_mutate_rows function, aligning with modern client architecture.
Separated Operation Construction: The bulk_mutate_rows functionality is now split into two parts: one for constructing the _MutateRowsOperation object and another for initiating the operation. This separation allows for more granular inspection of operation errors.
Reworked Error Handling: Error handling in Table.mutate_rows has been enhanced. It now processes exceptions from _MutateRowsOperation into a corresponding list of google.rpc.status_pb2.Status objects, ensuring a consistent return type for all mutations.
Updated Retry Logic: The retry mechanism for Table.mutate_rows has been updated. retry.deadline is now mapped to operation_timeout, and timeout/mutation_timeout are mapped to attempt_timeout. A None retry deadline now defaults to the table's default mutate rows operation timeout, while a 0.0 deadline explicitly disables retries.
Client-Side Timestamp for set_cell: When timestamp is None in set_cell, it will now generate a client-side timestamp representing the current time, rather than defaulting to a server-side timestamp. Docstrings have been updated to reflect this change.

Changelog

google/cloud/bigtable/data/_async/client.py
- Introduced _get_mutate_rows_operation to encapsulate the creation of the bulk mutate operation object.
- Modified bulk_mutate_rows to utilize the new _get_mutate_rows_operation method.
google/cloud/bigtable/data/_sync_autogen/client.py
- Added _get_mutate_rows_operation for synchronous bulk mutate operation object creation.
- Updated bulk_mutate_rows to call _get_mutate_rows_operation.
google/cloud/bigtable/row.py
- Updated docstrings for _set_cell and set_cell to clarify that timestamp=None now defaults to the current system time (client-side).
- Adjusted _set_cell logic to generate a client-side timestamp when timestamp is None or _SERVER_SIDE_TIMESTAMP.
google/cloud/bigtable/table.py
- Removed timeout and RetryError imports, added GoogleAPICallError, MutationsExceptionGroup, RowMutationEntry, TABLE_DEFAULT, code_pb2, and status_pb2 imports.
- Removed _retriable_internal_server_error import.
- Completely re-implemented mutate_rows to use the data client's _get_mutate_rows_operation.
- Implemented new error handling within mutate_rows to catch MutationsExceptionGroup and translate individual errors into status_pb2.Status objects.
- Revised retry logic: retry.deadline is now operation_timeout, timeout/mutation_timeout is attempt_timeout.
- Changed None retry deadline to default to TABLE_DEFAULT.MUTATE_ROWS.
- Implemented 0.0 retry deadline by setting retryable_errors to an empty list.
- Removed the _RetryableMutateRowsWorker class and associated helper functions (_compile_mutation_entries, _check_row_table_name, _check_row_type).
tests/system/v2_client/test_data_api.py
- Removed the _add_test_error_handler utility function.
- Updated test_table_mutate_rows_retries_timeout to reflect the new retry mechanism, specifically testing the 0.0 deadline behavior.
- Modified test_table_mutate_rows_input_errors to expect ValueError for invalid mutation counts (zero or too many) instead of InvalidArgument or TooManyMutationsError.
tests/unit/v2_client/test_row.py
- Added constants _TIME_NS_RETURN_VALUE and _EXPECTED_DEFAULT_TIMESTAMP_MICROS for mocking time.
- Updated _set_cell_helper to use mock.patch("time.time_ns") for consistent timestamp generation in tests.
- Added test_direct_row_set_cell_with_server_side_timestamp to verify handling of _SERVER_SIDE_TIMESTAMP.
- Removed unused google.rpc imports.
tests/unit/v2_client/test_table.py
- Removed extensive unit tests for the deprecated _compile_mutation_entries, _check_row_table_name, _check_row_type, and _RetryableMutateRowsWorker components.
- Refactored _table_mutate_rows_helper to mock the new _get_mutate_rows_operation and verify the new error processing logic.
- Introduced _MockRow helper class for testing.
- Updated test cases for mutate_rows to assert against the new operation_timeout, attempt_timeout, and retryable_errors parameters passed to the underlying operation.

Activity

No specific activity (comments, reviews, progress updates) was provided in the context for this pull request.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request effectively refactors MutateRows to use the new data client, which is a great improvement for consistency and maintainability. The separation of operation creation and execution, along with the updated error and retry handling, are well-implemented. I've added a couple of suggestions to improve code clarity in the new logic. Overall, this is a solid set of changes.

gemini-code-assist · 2026-02-05T22:17:44Z

google/cloud/bigtable/row.py

+        if timestamp is None or timestamp == mutations._SERVER_SIDE_TIMESTAMP:
+            timestamp_micros = timestamp


The logic for handling the timestamp argument is a bit condensed and could be made more explicit for better readability and maintainability. Expanding the if condition to handle the None and _SERVER_SIDE_TIMESTAMP cases separately would make the intent clearer.

Suggested change

if timestamp is None or timestamp == mutations._SERVER_SIDE_TIMESTAMP:

timestamp_micros = timestamp

if timestamp is None:

# Let SetCell handle client-side timestamp generation by passing None.

timestamp_micros = None

elif timestamp == mutations._SERVER_SIDE_TIMESTAMP:

# Use server-side timestamp.

timestamp_micros = mutations._SERVER_SIDE_TIMESTAMP

I kind of agree with this, it's easier to skim. But I also think the current code is totally fine

Another way to improve readability would just be to add a comment:

if timestamp is None or timestamp == mutations._SERVER_SIDE_TIMESTAMP: # preserve special-case values (client-side or server-side timestamp) timestamp_micros = timestamp

gemini-code-assist · 2026-02-05T22:17:44Z

google/cloud/bigtable/table.py

+        retryable_errors = RETRYABLE_MUTATION_ERRORS
+        operation_timeout = retry.deadline
+
+        # The data client cannot take in zero or null values for deadline, so we set it to
+        # the default if that is the case. It shouldn't affect the behavior of the retry
+        # if a 0.0 deadline is set.
+        if not retry.deadline:
+            operation_timeout = TABLE_DEFAULT.MUTATE_ROWS
+            if retry.deadline == 0.0:
+                retryable_errors = []


This block of code for handling retry.deadline can be refactored for better clarity. The current implementation with if not retry.deadline: handles both None and 0.0 cases, which can be a bit confusing. Explicitly checking for None and 0.0 would make the logic easier to follow and more maintainable.

Suggested change

retryable_errors = RETRYABLE_MUTATION_ERRORS

operation_timeout = retry.deadline

# The data client cannot take in zero or null values for deadline, so we set it to

# the default if that is the case. It shouldn't affect the behavior of the retry

# if a 0.0 deadline is set.

if not retry.deadline:

operation_timeout = TABLE_DEFAULT.MUTATE_ROWS

if retry.deadline == 0.0:

retryable_errors = []

retryable_errors = RETRYABLE_MUTATION_ERRORS

if retry.deadline is None:

# The data client cannot take in null values for deadline, so we set it to the default.

operation_timeout = TABLE_DEFAULT.MUTATE_ROWS

elif retry.deadline == 0.0:

# A deadline of 0.0 indicates a no-retry strategy.

# We set a default operation timeout and an empty list of retryable errors.

operation_timeout = TABLE_DEFAULT.MUTATE_ROWS

retryable_errors = []

else:

operation_timeout = retry.deadline

I agree this looks a bit cleaner

I'd also suggest doing retry.deadline == 0, instead of the 0.0 comparison

daniel-sanche · 2026-02-07T00:15:05Z

google/cloud/bigtable/table.py

+        retryable_errors = RETRYABLE_MUTATION_ERRORS
+        operation_timeout = retry.deadline
+
+        # The data client cannot take in zero or null values for deadline, so we set it to
+        # the default if that is the case. It shouldn't affect the behavior of the retry
+        # if a 0.0 deadline is set.
+        if not retry.deadline:
+            operation_timeout = TABLE_DEFAULT.MUTATE_ROWS
+            if retry.deadline == 0.0:
+                retryable_errors = []


I agree this looks a bit cleaner

I'd also suggest doing retry.deadline == 0, instead of the 0.0 comparison

daniel-sanche · 2026-02-07T00:29:45Z

google/cloud/bigtable/table.py

+            # and set the status of that row entry to that grpc status. if none of the
+            # errors for a given index have gRPC status codes, return an UNKNOWN status
+            # with the first error message of each index.
+            for idx, errors in operation.errors.items():


You shouldn't have to go into the operation object like this. Can't you do something like:

try: self._table_impl.mutate_rows( mutation_entries, operation_timeout=operation_timeout, attempt_timeout=attempt_timeout, retryable_errors=retryable_errors, ) except MutationsExceptionGroup as exc_group: for error: FailedMutationEntryError in exc_group.exceptions: return_statuses[error.index] = status_pb2.Status( code=error.__cause__.grpc_status_code.value[0], message=error.__cause__.message, details=error.__cause__.details, ) return_statuses

(It looks like it could be a bit more complicated, since MutationExceptionGroup doesn't guarantee all exceptions are FailedMutationEntryErrors. But looking at the only place this exception seems to be raised, maybe that comment is wrong? I think you can change it to guarantee all sub-seceptions are FailedMutationEntryErrors, which would make parsing here simple)

daniel-sanche · 2026-02-07T00:34:07Z

google/cloud/bigtable/data/_async/client.py

+            attempt_timeout,
+            retryable_exceptions=retryable_excs,
+        )
+        return operation


I'm not a big fan of polluting this file like this.

I mentioned in another comment that we should be able to get the information we need by parsing the exception group, which means we wouldn't need to do this

But even if we ignore that thread, I'd say it could be cleaner to just copy these three lines building the operation directly in the shim, instead of adding this helper. That way we're polluting the shim codebase, instead of the data client

What do you think?

daniel-sanche · 2026-02-07T00:36:23Z

google/cloud/bigtable/row.py

+        if timestamp is None or timestamp == mutations._SERVER_SIDE_TIMESTAMP:
+            timestamp_micros = timestamp


I kind of agree with this, it's easier to skim. But I also think the current code is totally fine

daniel-sanche · 2026-02-07T00:43:08Z

google/cloud/bigtable/row.py

+        if timestamp is None or timestamp == mutations._SERVER_SIDE_TIMESTAMP:
+            timestamp_micros = timestamp


Another way to improve readability would just be to add a comment:

if timestamp is None or timestamp == mutations._SERVER_SIDE_TIMESTAMP: # preserve special-case values (client-side or server-side timestamp) timestamp_micros = timestamp

feat: Reworked MutateRows to use the data client

9d83a7b

gkevinzheng requested review from a team as code owners February 5, 2026 22:15

product-auto-label bot added the size: xl Pull request size is extra large. label Feb 5, 2026

gkevinzheng requested a review from daniel-sanche February 5, 2026 22:15

product-auto-label bot added the api: bigtable Issues related to the googleapis/python-bigtable API. label Feb 5, 2026

gemini-code-assist bot reviewed Feb 5, 2026

View reviewed changes

daniel-sanche requested changes Feb 7, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Reworked MutateRows to use the data client#1290

feat: Reworked MutateRows to use the data client#1290
gkevinzheng wants to merge 1 commit intov3_stagingfrom
mutate-rows

gkevinzheng commented Feb 5, 2026

Uh oh!

gemini-code-assist bot commented Feb 5, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Feb 5, 2026

Uh oh!

daniel-sanche Feb 7, 2026

Uh oh!

daniel-sanche Feb 7, 2026

Uh oh!

gemini-code-assist bot Feb 5, 2026

Uh oh!

daniel-sanche Feb 7, 2026

Uh oh!

daniel-sanche Feb 7, 2026

Uh oh!

daniel-sanche Feb 7, 2026 •

edited

Loading

Uh oh!

daniel-sanche Feb 7, 2026

Uh oh!

daniel-sanche Feb 7, 2026

Uh oh!

daniel-sanche Feb 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		if timestamp is None or timestamp == mutations._SERVER_SIDE_TIMESTAMP:
		timestamp_micros = timestamp

-        if timestamp is None or timestamp == mutations._SERVER_SIDE_TIMESTAMP:
-            timestamp_micros = timestamp
+        if timestamp is None:
+            # Let SetCell handle client-side timestamp generation by passing None.
+            timestamp_micros = None
+        elif timestamp == mutations._SERVER_SIDE_TIMESTAMP:
+            # Use server-side timestamp.
+            timestamp_micros = mutations._SERVER_SIDE_TIMESTAMP

Conversation

gkevinzheng commented Feb 5, 2026

Uh oh!

gemini-code-assist bot commented Feb 5, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

daniel-sanche Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

daniel-sanche Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

daniel-sanche Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

daniel-sanche Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

daniel-sanche Feb 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

daniel-sanche Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

daniel-sanche Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

daniel-sanche Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

daniel-sanche Feb 7, 2026 •

edited

Loading