Summary
The 3.0 release bundles several large changes:
- Dropping Python 2 support. Python 3.8+ is now required.
- Complete rewrite of the protocol stack from static class definitions to
dynamic generation from Apache Kafka's JSON schema files.
- Significant performance work on the encode/decode hot path and byte
handling.
- Added broker-side feature support that has been accumulating (producer
transactional improvements from Kafka 2.5; likely some consumer features
from Kafka 2.4) while blocked on improved/flexible protocol support.
- Internal consolidation on an event-loop based network and task layer (
kafka.net)
and the removal of kafka.client_async and kafka.conn.
- Refactor of
KafkaConsumer internals to a single background IO thread,
matching the existing KafkaProducer architecture.
The public API of the high-level clients is expected to remain compatible
for users on Python 3.8+. This issue tracks the work items and invites
feedback from anyone who might be affected.
Goals
User-facing
- Drop Python 2.7 and 3.7 support. 3.0 requires Python 3.8+. Users on
older Python should pin to the 2.x line, which will continue to receive
critical fixes for a limited time after 3.0 ships.
- Added broker feature support:
- Producer transactional improvements from Apache Kafka 2.5 - KIP-360
producer epoch handling and related robustness fixes, and KIP-447
Producer scalability for exactly once semantics.
- Likely (pending decision): consumer features from Apache Kafka 2.4 -
KIP-320 log truncation detection, KIP-392 Fetch from follower replica, and
KIP-429 Incremental cooperative rebalancing.
- Performance improvements on the producer and consumer hot paths:
dynamically-compiled encode/decode methods via exec / compile rather
than interpreted field iteration, and reduced bytes copies in the
network and record-batch layers. Measurable throughput and latency wins
expected; will publish benchmark numbers before 3.0 ships.
Internal
- Complete rewrite of the protocol stack from hand-written static class
definitions to dynamic generation from the JSON schema files Apache Kafka
publishes alongside the broker. This makes it much easier to track
upstream protocol changes: a new broker version's new fields/requests can
be incorporated by updating the schema files rather than hand-editing
Python classes.
- Consolidate on
kafka.net as the sole network layer. Delete
kafka.client_async, kafka.conn and related sync-IO scaffolding. All three high-level
clients already use kafka.net.compat.KafkaNetClient by default on master,
this just finishes the migration and removes the dead legacy code path.
- Refactor
KafkaConsumer internals to a background-IO-thread model,
preserving the public API. This matches KafkaProducer's existing Sender
architecture and eliminates a class of latent concurrency bugs around the
current dual-thread model (main thread + HeartbeatThread). The JVM
client moved to a similar architecture in Apache Kafka 3.7.
- Clean up
KafkaAdminClient internals to take advantage of coroutines
(fewer synchronous poll loops, simpler retry and node-routing logic).
- Deterministic in-process mock broker test harness to provide reliable
regression coverage. Similar to JVM client mock features.
- Establish async architectural foundation for optionally exposing
async-native interfaces in a future release. No new public async API
ships in 3.0.
Work items
Expected breaking changes
Python 2.7 and 3.7 are no longer supported. 3.0 requires Python 3.8+.
This is the most impactful breaking change for users. If you cannot upgrade
your Python version, pin your dependency to the 2.x line.
The public API of KafkaConsumer, KafkaProducer, and KafkaAdminClient
is expected to remain compatible on Python 3.8+. Code that does not pass a
custom kafka_client argument or reach into private internals should
continue to work unchanged.
Users who explicitly pass a custom kafka_client will need to either drop the
argument (falling back to the default KafkaNetClient) or update their code.
We believe this affects a very small number of users, but please comment here
if this affects you and you have concerns.
The protocol module layout will change as part of the dynamic-generation
rewrite. Code that imports from kafka.protocol.* internals - request
classes, response classes, field definitions - may need updating. The
high-level client APIs remain stable.
Consumer threading semantics change subtly: the HeartbeatThread is
replaced by a single background IO thread that also handles fetching and
coordinator work. User-facing behavior (rebalance listener invocation
timing, poll() blocking semantics, commit_sync / commit_async
contracts) is intended to remain unchanged. If you rely on specific
undocumented threading behavior, please flag it here.
Dynamic broker version checks for very old brokers are no longer
supported. To connect to a broker that does not support ApiVersions
will require passing an explicit api_version= parameter. This only
impacts super-old brokers: 0.9 and earlier.
Internal modules and functions that are not part of the documented public
API may be removed, renamed, or restructured without deprecation.
Non-goals for 3.0
- No removal of broker-version support. Every Apache Kafka broker
version currently supported will continue to be supported; we're adding
feature support for newer brokers, not dropping older ones (though note
requirement to pass api_version explicitly for very old brokers.
- No new public async API. Internal use of coroutines lays groundwork,
but async def methods on the high-level clients are a separate decision
for a future release.
- No asyncio integration.
kafka.net keeps its own bespoke event loop;
bridging to asyncio.EventLoop is out of scope.
- No changes to message format, compression, or producer idempotence
semantics beyond the 2.5 transactional improvements listed above.
Feedback welcome
Particularly valuable would be feedback from:
- Users still on Python 2.7 or 3.7 who will need to upgrade - any concerns
with maintenance on 2.x release branches?
- Users passing a custom
kafka_client or subclassing KafkaClient.
- Users importing from
kafka.protocol.* internals in their own code.
- Users relying on specific consumer threading behavior (for example,
rebalance listener invocation timing relative to poll()).
- Users who have opinions on which of the 2.4 consumer features are worth
prioritizing (log truncation detection, fetch from follower,
incremental cooperative rebalancing), especially anyone affected by the
current non-cooperative rebalancing behavior.
Please comment below or open a separate discussion if you'd prefer.
Related PRs and sub-issues will be linked to this issue and to the 3.0
milestone as they land.
Summary
The 3.0 release bundles several large changes:
dynamic generation from Apache Kafka's JSON schema files.
handling.
transactional improvements from Kafka 2.5; likely some consumer features
from Kafka 2.4) while blocked on improved/flexible protocol support.
kafka.net)and the removal of
kafka.client_asyncandkafka.conn.KafkaConsumerinternals to a single background IO thread,matching the existing
KafkaProducerarchitecture.The public API of the high-level clients is expected to remain compatible
for users on Python 3.8+. This issue tracks the work items and invites
feedback from anyone who might be affected.
Goals
User-facing
older Python should pin to the 2.x line, which will continue to receive
critical fixes for a limited time after 3.0 ships.
producer epoch handling and related robustness fixes, and KIP-447
Producer scalability for exactly once semantics.
KIP-320 log truncation detection, KIP-392 Fetch from follower replica, and
KIP-429 Incremental cooperative rebalancing.
dynamically-compiled encode/decode methods via
exec/compileratherthan interpreted field iteration, and reduced
bytescopies in thenetwork and record-batch layers. Measurable throughput and latency wins
expected; will publish benchmark numbers before 3.0 ships.
Internal
definitions to dynamic generation from the JSON schema files Apache Kafka
publishes alongside the broker. This makes it much easier to track
upstream protocol changes: a new broker version's new fields/requests can
be incorporated by updating the schema files rather than hand-editing
Python classes.
kafka.netas the sole network layer. Deletekafka.client_async,kafka.connand related sync-IO scaffolding. All three high-levelclients already use
kafka.net.compat.KafkaNetClientby default on master,this just finishes the migration and removes the dead legacy code path.
KafkaConsumerinternals to a background-IO-thread model,preserving the public API. This matches
KafkaProducer's existingSenderarchitecture and eliminates a class of latent concurrency bugs around the
current dual-thread model (main thread +
HeartbeatThread). The JVMclient moved to a similar architecture in Apache Kafka 3.7.
KafkaAdminClientinternals to take advantage of coroutines(fewer synchronous poll loops, simpler retry and node-routing logic).
regression coverage. Similar to JVM client mock features.
async-native interfaces in a future release. No new public async API
ships in 3.0.
Work items
byte copies)
KafkaConsumerbackground IO thread refactor (multiple PRs)kafka.client_asyncandkafka.connkafka.client_asyncandkafka.connmodules and its dependenciesClusterMetadatatakes ownership of its own refresh schedulingKafkaAdminClientinternal async refactorExpected breaking changes
Python 2.7 and 3.7 are no longer supported. 3.0 requires Python 3.8+.
This is the most impactful breaking change for users. If you cannot upgrade
your Python version, pin your dependency to the 2.x line.
The public API of
KafkaConsumer,KafkaProducer, andKafkaAdminClientis expected to remain compatible on Python 3.8+. Code that does not pass a
custom
kafka_clientargument or reach into private internals shouldcontinue to work unchanged.
Users who explicitly pass a custom
kafka_clientwill need to either drop theargument (falling back to the default
KafkaNetClient) or update their code.We believe this affects a very small number of users, but please comment here
if this affects you and you have concerns.
The protocol module layout will change as part of the dynamic-generation
rewrite. Code that imports from
kafka.protocol.*internals - requestclasses, response classes, field definitions - may need updating. The
high-level client APIs remain stable.
Consumer threading semantics change subtly: the
HeartbeatThreadisreplaced by a single background IO thread that also handles fetching and
coordinator work. User-facing behavior (rebalance listener invocation
timing,
poll()blocking semantics,commit_sync/commit_asynccontracts) is intended to remain unchanged. If you rely on specific
undocumented threading behavior, please flag it here.
Dynamic broker version checks for very old brokers are no longer
supported. To connect to a broker that does not support
ApiVersionswill require passing an explicit
api_version=parameter. This onlyimpacts super-old brokers: 0.9 and earlier.
Internal modules and functions that are not part of the documented public
API may be removed, renamed, or restructured without deprecation.
Non-goals for 3.0
version currently supported will continue to be supported; we're adding
feature support for newer brokers, not dropping older ones (though note
requirement to pass
api_versionexplicitly for very old brokers.but
async defmethods on the high-level clients are a separate decisionfor a future release.
kafka.netkeeps its own bespoke event loop;bridging to
asyncio.EventLoopis out of scope.semantics beyond the 2.5 transactional improvements listed above.
Feedback welcome
Particularly valuable would be feedback from:
with maintenance on 2.x release branches?
kafka_clientor subclassingKafkaClient.kafka.protocol.*internals in their own code.rebalance listener invocation timing relative to
poll()).prioritizing (log truncation detection, fetch from follower,
incremental cooperative rebalancing), especially anyone affected by the
current non-cooperative rebalancing behavior.
Please comment below or open a separate discussion if you'd prefer.
Related PRs and sub-issues will be linked to this issue and to the
3.0milestone as they land.