Releases: dpkp/kafka-python
2.0.0
2.0.0 (Feb 10, 2020)
This release includes breaking changes for any application code that has not
migrated from older Simple-style classes to newer Kafka-style classes.
Deprecation
- Remove deprecated SimpleClient, Producer, Consumer, Unittest (jeffwidman / PR #1196)
Admin Client
- Use the controller for topic metadata requests (TylerLubeck / PR #1995)
- Implement list_topics, describe_topics, and describe_cluster (TylerLubeck / PR #1993)
- Implement eq and hash for ACL objects (TylerLubeck / PR #1955)
- Fixes KafkaAdminClient returning
IncompatibleBrokerVersion
when passing anapi_version
(ian28223 / PR #1953) - Admin protocol updates (TylerLubeck / PR #1948)
- Fix describe config for multi-broker clusters (jlandersen / PR #1869)
Miscellaneous Bugfixes / Improvements
- Enable SCRAM-SHA-256 and SCRAM-SHA-512 for sasl (swenzel / PR #1918)
- Fix slots usage and use more slots (carsonip / PR #1987)
- Optionally return OffsetAndMetadata from consumer.committed(tp) (dpkp / PR #1979)
- Reset conn configs on exception in conn.check_version() (dpkp / PR #1977)
- Do not block on sender thread join after timeout in producer.close() (dpkp / PR #1974)
- Implement methods to convert a Struct object to a pythonic object (TylerLubeck / PR #1951)
Test Infrastructure / Documentation / Maintenance
- Update 2.4.0 resource files for sasl integration (dpkp)
- Add kafka 2.4.0 to CI testing (vvuibert / PR #1972)
- convert test_admin_integration to pytest (ulrikjohansson / PR #1923)
- xfail test_describe_configs_topic_resource_returns_configs (dpkp / Issue #1929)
- Add crc32c to README and docs (dpkp)
- Improve docs for reconnect_backoff_max_ms (dpkp / PR #1976)
- Fix simple typo: managementment -> management (timgates42 / PR #1966)
- Fix typos (carsonip / PR #1938)
- Fix doc import paths (jeffwidman / PR #1933)
- Update docstring to match conn.py's (dabcoder / PR #1921)
- Do not log topic-specific errors in full metadata fetch (dpkp / PR #1980)
- Raise AssertionError if consumer closed in poll() (dpkp / PR #1978)
- Log retriable coordinator NodeNotReady, TooManyInFlightRequests as debug not error (dpkp / PR #1975)
- Remove unused import (jeffwidman)
- Remove some dead code (jeffwidman)
- Fix a benchmark to Use print() function in both Python 2 and Python 3 (cclauss / PR #1983)
- Fix a test to use ==/!= to compare str, bytes, and int literals (cclauss / PR #1984)
- Fix benchmarks to use pyperf (carsonip / PR #1986)
- Remove unused/empty .gitsubmodules file (jeffwidman / PR #1928)
- Remove deprecated
ConnectionError
(jeffwidman / PR #1816)
1.4.7
1.4.7 (Sep 30, 2019)
This release is focused on KafkaConsumer performance, Admin Client
improvements, and Client concurrency. The KafkaConsumer iterator implementation
has been greatly simplified so that it just wraps consumer.poll(). The prior
implementation will remain available for a few more releases using the optional
KafkaConsumer config: legacy_iterator=True
. This is expected to improve
consumer throughput substantially and help reduce heartbeat failures / group
rebalancing.
Major thanks to @carsonip @Baisang @iv-m @davidheitman @cardy31 @ulrikjohansson @iAnomaly @Wayde2014 @ossdev07 @commanderdishwasher @justecorruptio @melor @rustyrothwurt @sachiin @jacky15 and @rikonen for submitting PRs; thanks as well to everyone that
submitted bug reports and issues, and to @jeffwidman and @tvoinarovskyi for code
reviews, comments, testing, debugging, and helping to maintain kafka-python!
Client
- Send socket data via non-blocking IO with send buffer (@dpkp / PR #1912)
- Rely on socket selector to detect completed connection attempts (@dpkp / PR #1909)
- Improve connection lock handling; always use context manager (@melor @dpkp / PR #1895)
- Reduce client poll timeout when there are no in-flight requests (@dpkp / PR #1823)
KafkaConsumer
- Do not use wakeup when sending fetch requests from consumer (@dpkp / PR #1911)
- Wrap
consumer.poll()
for KafkaConsumer iteration (@dpkp / PR #1902) - Allow the coordinator to auto-commit on old brokers (@justecorruptio / PR #1832)
- Reduce internal client poll timeout for (legacy) consumer iterator interface (@dpkp / PR #1824)
- Use dedicated connection for group coordinator (@dpkp / PR #1822)
- Change coordinator lock acquisition order (@dpkp / PR #1821)
- Make
partitions_for_topic
a read-through cache (@Baisang / PR #1781,#1809) - Fix consumer hanging indefinitely on topic deletion while rebalancing (@commanderdishwasher / PR #1782)
Miscellaneous Bugfixes / Improvements
- Fix crc32c avilability on non-intel architectures (@ossdev07 / PR #1904)
- Load system default SSL CAs if
ssl_cafile
is not provided (@iAnomaly / PR #1883) - Catch py3 TimeoutError in BrokerConnection send/recv (@dpkp / PR #1820)
- Added a function to determine if bootstrap is successfully connected (@Wayde2014 / PR #1876)
Admin Client
- Add ACL api support to KafkaAdminClient (@ulrikjohansson / PR #1833)
- Add
sasl_kerberos_domain_name
config to KafkaAdminClient (@jeffwidman / PR #1852) - Update
security_protocol
config documentation for KafkaAdminClient (@cardy31 / PR #1849) - Break FindCoordinator into request/response methods in KafkaAdminClient (@jeffwidman / PR #1871)
- Break consumer operations into request / response methods in KafkaAdminClient (@jeffwidman / PR #1845)
- Parallelize calls to
_send_request_to_node()
in KafkaAdminClient (@davidheitman / PR #1807)
Test Infrastructure / Documentation / Maintenance
- Add Kafka 2.3.0 to test matrix and compatibility docs (dpkp / PR #1915)
- Convert remaining
KafkaConsumer
tests topytest
(@jeffwidman / PR #1886) - Bump integration tests to 0.10.2.2 and 0.11.0.3 (@jeffwidman / #1890)
- Cleanup handling of
KAFKA_VERSION
env var in tests (@jeffwidman / PR #1887) - Minor test cleanup (@jeffwidman / PR #1885)
- Use
socket.SOCK_STREAM
in test assertions (@iv-m / PR #1879) - Sanity test for
consumer.topics()
andconsumer.partitions_for_topic()
(@Baisang / PR #1829) - Cleanup seconds conversion in client poll timeout calculation (@jeffwidman / PR #1825)
- Remove unused imports (@jeffwidman / PR #1808)
- Cleanup python nits in RangePartitionAssignor (@jeffwidman / PR #1805)
- Update links to kafka consumer config docs (@jeffwidman)
- Fix minor documentation typos (@carsonip / PR #1865)
- Remove unused/weird comment line (@jeffwidman / PR #1813)
- Update docs for
api_version_auto_timeout_ms
(@jeffwidman / PR #1812)
1.4.6
1.4.6 (Apr 2, 2019)
This is a patch release primarily focused on bugs related to concurrency,
SSL connections and testing, and SASL authentication.
Major thanks to @pt2pham , @isamaru , @braedon , @gingercookiemage , for submitting
PRs to help fix many of these issues. And major thanks to everyone that
submitted bug reports and issues. And thanks always to @jeffwidman and @tvoinarovskyi for code
reviews, comments, testing, debugging, and helping to maintain this project!
Client Concurrency Issues (Race Conditions / Deadlocks)
- Fix race condition in
protocol.send_bytes
(isamaru / PR #1752) - Do not call
state_change_callback
with lock (dpkp / PR #1775) - Additional BrokerConnection locks to synchronize protocol/IFR state (dpkp / PR #1768)
- Send pending requests before waiting for responses (dpkp / PR #1762)
- Avoid race condition on
client._conns
in send() (dpkp / PR #1772) - Hold lock during
client.check_version
(dpkp / PR #1771)
Producer Wakeup / TimeoutError
- Dont wakeup during
maybe_refresh_metadata
-- it is only called by poll() (dpkp / PR #1769) - Dont do client wakeup when sending from sender thread (dpkp / PR #1761)
SSL - Python3.7 Support / Bootstrap Hostname Verification / Testing
- Wrap SSL sockets after connecting for python3.7 compatibility (dpkp / PR #1754)
- Allow configuration of SSL Ciphers (dpkp / PR #1755)
- Maintain shadow cluster metadata for bootstrapping (dpkp / PR #1753)
- Generate SSL certificates for local testing (dpkp / PR #1756)
- Rename ssl.keystore.location and ssl.truststore.location config files (dpkp)
- Reset reconnect backoff on SSL connection (dpkp / PR #1777)
SASL - OAuthBearer support / api version bugfix
- Fix 0.8.2 protocol quick detection / fix SASL version check (dpkp / PR #1763)
- Update sasl configuration docstrings to include supported mechanisms (dpkp)
- Support SASL OAuthBearer Authentication (pt2pham / PR #1750)
Miscellaneous Bugfixes
- Dont force metadata refresh when closing unneeded bootstrap connections (dpkp / PR #1773)
- Fix possible AttributeError during conn._close_socket (dpkp / PR #1776)
- Return connection state explicitly after close in connect() (dpkp / PR #1778)
- Fix flaky conn tests that use time.time (dpkp / PR #1758)
- Add py to requirements-dev (dpkp)
- Fixups to benchmark scripts for py3 / new KafkaFixture interface (dpkp)
1.4.5
1.4.5 (Mar 14, 2019)
This release is primarily focused on addressing lock contention
and other coordination issues between the KafkaConsumer and the
background heartbeat thread that was introduced in the 1.4 release.
Consumer
- connections_max_idle_ms must be larger than request_timeout_ms (jeffwidman / PR #1688)
- Avoid race condition during close() / join heartbeat thread (dpkp / PR #1735)
- Use last offset from fetch v4 if available to avoid getting stuck in compacted topic (keithks / PR #1724)
- Synchronize puts to KafkaConsumer protocol buffer during async sends (dpkp / PR #1733)
- Improve KafkaConsumer join group / only enable Heartbeat Thread during stable group (dpkp / PR #1695)
- Remove unused
skip_double_compressed_messages
(jeffwidman / PR #1677) - Fix commit_offsets_async() callback (Faqa / PR #1712)
Client
- Retry bootstrapping after backoff when necessary (dpkp / PR #1736)
- Recheck connecting nodes sooner when refreshing metadata (dpkp / PR #1737)
- Avoid probing broker versions twice on newer brokers (dpkp / PR #1738)
- Move all network connections and writes to KafkaClient.poll() (dpkp / PR #1729)
- Do not require client lock for read-only operations (dpkp / PR #1730)
- Timeout all unconnected conns (incl SSL) after request_timeout_ms (dpkp / PR #1696)
Admin Client
- Fix AttributeError in response topic error codes checking (jeffwidman)
- Fix response error checking in KafkaAdminClient send_to_controller (jeffwidman)
- Fix NotControllerError check (jeffwidman)
Core/Protocol
- Fix default protocol parser version / 0.8.2 version probe (dpkp / PR #1740)
- Make NotEnoughReplicasError/NotEnoughReplicasAfterAppendError retriable (le-linh / PR #1722)
Bugfixes
- Use copy() in metrics() to avoid thread safety issues (emeric254 / PR #1682)
Test Infrastructure
- Mock dns lookups in test_conn (dpkp / PR #1739)
- Use test.fixtures.version not test.conftest.version to avoid warnings (dpkp / PR #1731)
- Fix test_legacy_correct_metadata_response on x86 arch (stanislavlevin / PR #1718)
- Travis CI: 'sudo' tag is now deprecated in Travis (cclauss / PR #1698)
- Use Popen.communicate() instead of Popen.wait() (Baisang / PR #1689)
Compatibility
- Catch thrown OSError by python 3.7 when creating a connection (danjo133 / PR #1694)
- Update travis test coverage: 2.7, 3.4, 3.7, pypy2.7 (jeffwidman, dpkp / PR #1614)
- Drop dependency on sphinxcontrib-napoleon (stanislavlevin / PR #1715)
- Remove unused import from kafka/producer/record_accumulator.py (jeffwidman / PR #1705)
- Fix SSL connection testing in Python 3.7 (seanthegeek, silentben / PR #1669)
1.4.4
Bugfixes
- (Attempt to) Fix deadlock between consumer and heartbeat (zhgjun / dpkp #1628)
- Fix Metrics dict memory leak (kishorenc #1569)
Client
- Support Kafka record headers (hnousiainen #1574)
- Set socket timeout for the write-side of wake socketpair (Fleurer #1577)
- Add kerberos domain name config for gssapi sasl mechanism handshake (the-sea #1542)
- Support smaller topic metadata fetch during bootstrap (andyxning #1541)
- Use TypeError for invalid timeout type (jeffwidman #1636)
- Break poll if closed (dpkp)
Admin Client
- Add KafkaAdminClient class (llamahunter #1540)
- Fix list_consumer_groups() to query all brokers (jeffwidman #1635)
- Stop using broker-errors for client-side problems (jeffwidman #1639)
- Fix send to controller (jeffwidman #1640)
- Add group coordinator lookup (jeffwidman #1641)
- Fix describe_groups (jeffwidman #1642)
- Add list_consumer_group_offsets() (jeffwidman #1643)
- Remove support for api versions as strings from KafkaAdminClient (jeffwidman #1644)
- Set a clear default value for
validate_only
/include_synonyms
(jeffwidman #1645) - Bugfix: Always set this_groups_coordinator_id (jeffwidman #1650)
Consumer
- Fix linter warning on import of ConsumerRebalanceListener (ben-harack #1591)
- Remove ConsumerTimeout (emord #1587)
- Return future from commit_offsets_async() (ekimekim #1560)
Core / Protocol
- Add protocol structs for {Describe,Create,Delete} Acls (ulrikjohansson #1646/partial)
- Pre-compile pack/unpack function calls (billyevans / jeffwidman #1619)
- Don't use
kafka.common
internally (jeffwidman #1509) - Be explicit with tuples for %s formatting (jeffwidman #1634)
Documentation
- Document connections_max_idle_ms (jeffwidman #1531)
- Fix sphinx url (jeffwidman #1610)
- Update remote urls: snappy, https, etc (jeffwidman #1603)
- Minor cleanup of testing doc (jeffwidman #1613)
- Various docstring / pep8 / code hygiene cleanups (jeffwidman #1647)
Test Infrastructure
- Stop pinning
pylint
(jeffwidman #1611) - (partial) Migrate from
Unittest
topytest
(jeffwidman #1620) - Minor aesthetic cleanup of partitioner tests (jeffwidman #1618)
- Cleanup fixture imports (jeffwidman #1616)
- Fix typo in test file name (jeffwidman)
- Remove unused ivy_root variable (jeffwidman)
- Add test fixtures for kafka versions 1.0.2 -> 2.0.1 (dpkp)
- Bump travis test for 1.x brokers to 1.1.1 (dpkp)
Logging / Error Messages
- raising logging level on messages signalling data loss (sibiryakov #1553)
- Stop using deprecated log.warn() (jeffwidman #1615)
- Fix typo in logging message (jeffwidman)
Compatibility
1.4.3
Compatibility
- Fix for python 3.7 support: remove 'async' keyword from SimpleProducer (dpkp #1454)
Client
- Improve BrokerConnection initialization time (romulorosa #1475)
- Ignore MetadataResponses with empty broker list (dpkp #1506)
- Improve connection handling when bootstrap list is invalid (dpkp #1507)
Consumer
- Check for immediate failure when looking up coordinator in heartbeat thread (dpkp #1457)
Core / Protocol
- Always acquire client lock before coordinator lock to avoid deadlocks (dpkp #1464)
- Added AlterConfigs and DescribeConfigs apis (StephenSorriaux #1472)
- Fix CreatePartitionsRequest_v0 (StephenSorriaux #1469)
- Add codec validators to record parser and builder for all formats (tvoinarovskyi #1447)
- Fix MemoryRecord bugs re error handling and add test coverage (tvoinarovskyi #1448)
- Force lz4 to disable Kafka-unsupported block linking when encoding (mnito #1476)
- Stop shadowing
ConnectionError
(jeffwidman #1492)
Documentation
- Document methods that return None (jeffwidman #1504)
- Minor doc capitalization cleanup (jeffwidman)
- Adds add_callback/add_errback example to docs (Berkodev #1441)
- Fix KafkaConsumer docstring for request_timeout_ms default (dpkp #1459)
Test Infrastructure
- Skip flakey SimpleProducer test (dpkp)
- Fix skipped integration tests if KAFKA_VERSION unset (dpkp #1453)
Logging / Error Messages
- Stop using deprecated log.warn() (jeffwidman)
- Change levels for some heartbeat thread logging (dpkp #1456)
- Log Heartbeat thread start / close for debugging (dpkp)
1.4.2
Bugfixes
- Close leaked selector in version check (dpkp #1425)
- Fix
BrokerConnection.connection_delay()
to return milliseconds (dpkp #1414) - Use local copies in
Fetcher._fetchable_partitions
to avoid mutation errors (dpkp #1400) - Fix error var name in
_unpack
(j2gg0s #1403) - Fix KafkaConsumer compacted offset handling (dpkp #1397)
- Fix byte size estimation with kafka producer (blakeembrey #1393)
- Fix coordinator timeout in consumer poll interface (braedon #1384)
Client
- Add
BrokerConnection.connect_blocking()
to improve bootstrap to multi-address hostnames (dpkp #1411) - Short-circuit
BrokerConnection.close()
if already disconnected (dpkp #1424) - Only increase reconnect backoff if all addrinfos have been tried (dpkp #1423)
- Make BrokerConnection .host / .port / .afi immutable to avoid incorrect 'metadata changed' checks (dpkp #1422)
- Connect with sockaddrs to support non-zero ipv6 scope ids (dpkp #1433)
- Check timeout type in KafkaClient constructor (asdaraujo #1293)
- Update string representation of SimpleClient (asdaraujo #1293)
- Do not validate
api_version
against known versions (dpkp #1434)
Consumer
- Avoid tight poll loop in consumer when brokers are down (dpkp #1415)
- Validate
max_records
in KafkaConsumer.poll (dpkp #1398) - KAFKA-5512: Awake heartbeat thread when it is time to poll (dpkp #1439)
Producer
- Validate that serializers generate bytes-like (or None) data (dpkp #1420)
Core / Protocol
- Support alternative lz4 package: lz4framed (everpcpc #1395)
- Use hardware accelerated CRC32C function if available (tvoinarovskyi #1389)
- Add Admin CreatePartitions API call (alexef #1386)
Test Infrastructure
- Close KafkaConsumer instances during tests (dpkp #1410)
- Introduce new fixtures to prepare for migration to pytest (asdaraujo #1293)
- Removed pytest-catchlog dependency (asdaraujo #1380)
- Fixes racing condition when message is sent to broker before topic logs are created (asdaraujo #1293)
- Add kafka 1.0.1 release to test fixtures (dpkp #1437)
Logging / Error Messages
1.4.1
1.4.0
This is a substantial release. Although there are no known 'showstopper' bugs as of release, we do recommend you test any planned upgrade to your application prior to running in production.
Some of the major changes include:
- We have officially dropped python 2.6 support
- The KafkaConsumer now includes a background thread to handle coordinator heartbeats
- API protocol handling has been separated from networking code into a new class, KafkaProtocol
- Added support for kafka message format v2
- Refactored DNS lookups during kafka broker connections
- SASL authentication is working (we think)
- Removed several circular references to improve gc on close()
Thanks to all contributors -- the state of the kafka-python community is strong!
Detailed changelog are listed below:
Client
- Fixes for SASL support
- Refactor SASL/gssapi support (dpkp #1248 #1249 #1257 #1262 #1280)
- Add security layer negotiation to the GSSAPI authentication (asdaraujo #1283)
- Fix overriding sasl_kerberos_service_name in KafkaConsumer / KafkaProducer (natedogs911 #1264)
- Fix typo in _try_authenticate_plain (everpcpc #1333)
- Fix for Python 3 byte string handling in SASL auth (christophelec #1353)
- Move callback processing from BrokerConnection to KafkaClient (dpkp #1258)
- Use socket timeout of request_timeout_ms to prevent blocking forever on send (dpkp #1281)
- Refactor dns lookup in BrokerConnection (dpkp #1312)
- Read all available socket bytes (dpkp #1332)
- Honor reconnect_backoff in conn.connect() (dpkp #1342)
Consumer
- KAFKA-3977: Defer fetch parsing for space efficiency, and to raise exceptions to user (dpkp #1245)
- KAFKA-4034: Avoid unnecessary consumer coordinator lookup (dpkp #1254)
- Handle lookup_coordinator send failures (dpkp #1279)
- KAFKA-3888 Use background thread to process consumer heartbeats (dpkp #1266)
- Improve KafkaConsumer cleanup (dpkp #1339)
- Fix coordinator join_future race condition (dpkp #1338)
- Avoid KeyError when filtering fetchable partitions (dpkp #1344)
- Name heartbeat thread with group_id; use backoff when polling (dpkp #1345)
- KAFKA-3949: Avoid race condition when subscription changes during rebalance (dpkp #1364)
- Fix #1239 regression to avoid consuming duplicate compressed messages from mid-batch (dpkp #1367)
Producer
- Fix timestamp not passed to RecordMetadata (tvoinarovskyi #1273)
- Raise non-API exceptions (jeffwidman #1316)
- Fix reconnect_backoff_max_ms default config bug in KafkaProducer (YaoC #1352)
Core / Protocol
- Add kafka.protocol.parser.KafkaProtocol w/ receive and send (dpkp #1230)
- Refactor MessageSet and Message into LegacyRecordBatch to later support v2 message format (tvoinarovskyi #1252)
- Add DefaultRecordBatch implementation aka V2 message format parser/builder. (tvoinarovskyi #1185)
- optimize util.crc32 (ofek #1304)
- Raise better struct pack/unpack errors (jeffwidman #1320)
- Add Request/Response structs for kafka broker 1.0.0 (dpkp #1368)
Bugfixes
- use python standard max value (lukekingbru #1303)
- changed for to use enumerate() (TheAtomicOption #1301)
- Explicitly check for None rather than falsey (jeffwidman #1269)
- Minor Exception cleanup (jeffwidman #1317)
- Use non-deprecated exception handling (jeffwidman a699f6a)
- Remove assertion with side effect in client.wakeup() (bgedik #1348)
- use absolute imports everywhere (kevinkjt2000 #1362)
Test Infrastructure
- Use 0.11.0.2 kafka broker for integration testing (dpkp #1357 #1244)
- Add a Makefile to help build the project, generate docs, and run tests (tvoinarovskyi #1247)
- Add fixture support for 1.0.0 broker (dpkp #1275)
- Add kafka 1.0.0 to travis integration tests (dpkp #1365)
- Change fixture default host to localhost (asdaraujo #1305)
- Minor test cleanups (dpkp #1343)
- Use latest pytest 3.4.0, but drop pytest-sugar due to incompatibility (dpkp #1361)
Documentation
- Expand metrics docs (jeffwidman #1243)
- Fix docstring (jeffwidman #1261)
- Added controlled thread shutdown to example.py (TheAtomicOption #1268)
- Add license to wheel (jeffwidman #1286)
- Use correct casing for MB (jeffwidman #1298)
Logging / Error Messages
- Fix two bugs in printing bytes instance (jeffwidman #1296)
1.3.5
Bugfixes
- Fix partition assignment race condition (jeffwidman #1240)
- Fix consumer bug when seeking / resetting to the middle of a compressed messageset (dpkp #1239)
- Fix traceback sent to stderr not logging (dbgasaway #1221)
- Stop using mutable types for default arg values (jeffwidman #1213)
- Remove a few unused imports (jameslamb #1188)
Client
- Refactor BrokerConnection to use asynchronous receive_bytes pipe (dpkp #1032)
Consumer
- Drop unused sleep kwarg to poll (dpkp #1177)
- Enable KafkaConsumer beginning_offsets() and end_offsets() with older broker versions (buptljy #1200)
- Validate consumer subscription topic strings (nikeee #1238)
Documentation
- Small fixes to SASL documentation and logging; validate security_protocol (dpkp #1231)
- Various typo and grammar fixes (jeffwidman)