Apache Kudu 1.15.0
Upgrade Notes
Obsoletions
- The kudu-mapreduce integration has been removed in the 1.15.0 release. Similar functionality and capabilities now exist via the Apache Spark, Apache Hive, Apache Impala, and Apache NiFi integrations. See KUDU-3142 for details.
Deprecations
- Support for Python 2.x and Python 3.4 and earlier is deprecated and may be removed in the next minor release.
New features
- Kudu now experimentally supports multi-row transactions. Currently only INSERT and INSERT_IGNORE operations are supported. See here for a design overview of this feature.
- Kudu now supports Raft configuration change for Kudu masters and CLI tools for orchestrating addition and removal of masters in a Kudu cluster. These tools substantially simplify the process of migrating to multiple masters, recovering a dead master and removing masters from a Kudu cluster. For detailed steps, see the latest administration documentation. This feature is evolving and the steps to add, remove and recover masters may change in the future. See KUDU-2181 for details.
- Kudu now supports table comments directly on Kudu tables which are automatically synchronized when the Hive Metastore integration is enabled. These comments can be added at table creation time and changed via table alteration.
- Kudu now experimentally supports per-table size limits based on leader disk space usage or number of rows. When generating new authorization tokens, Masters will now consider the size limits and strip tokens of INSERT and UPDATE privileges if either limit is reached. To enable this feature, set the --enable_table_write_limit master flag; adjust the --table_disk_size_limit and --table_row_count_limit flags as desired or use the kudu table set_limit tool to set limits per table.
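The per-table limit rule in the last item can be pictured with a small Python model. This is only an illustrative sketch, not Kudu's implementation: the function names, parameters, the extra privilege names, and the "greater than or equal" comparison are assumptions made for the example. Only the rule itself (strip INSERT and UPDATE when either limit is reached) comes from the notes above.

```python
from typing import Optional, Set

# Privileges the release notes say are stripped once a limit is reached.
WRITE_PRIVILEGES = {"INSERT", "UPDATE"}


def limit_reached(current: int, limit: Optional[int]) -> bool:
    """True if a limit is configured and usage has reached it (assumed: >=)."""
    return limit is not None and current >= limit


def token_privileges(requested: Set[str], disk_size: int, row_count: int,
                     disk_size_limit: Optional[int] = None,
                     row_count_limit: Optional[int] = None) -> Set[str]:
    """Hypothetical model of the privileges left in a newly issued token."""
    if (limit_reached(disk_size, disk_size_limit)
            or limit_reached(row_count, row_count_limit)):
        # Either limit being reached is enough to drop the write privileges.
        return set(requested) - WRITE_PRIVILEGES
    return set(requested)
```

In this model a table with no configured limits always keeps its requested privileges, and a table that hits only one of the two limits still loses INSERT and UPDATE.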
Optimizations and improvements
- It is now possible to change the Kerberos Service Principal Name using the --principal flag. The default SPN is still kudu/_HOST. Clients connecting to a cluster using a non-default SPN must set the sasl_protocol_name or saslProtocolName to match the SPN base (e.g. "kudu" if the SPN is "kudu/_HOST") in the client builder or the Kudu CLI. See KUDU-1884 for details.
- Kudu RPC now supports TLSv1.3. Kudu servers and clients automatically negotiate TLSv1.3 for Kudu RPC if OpenSSL (or the Java runtime, correspondingly) on each side supports TLSv1.3. If necessary, use the newly introduced flag --rpc_tls_ciphersuites to customize TLSv1.3-specific cipher suites on the server side. See KUDU-2871 for details.
- TLS cipher renegotiation for TLSv1.2 and earlier protocol versions is now explicitly disabled. See KUDU-1926 for details.
- Location assignment for Kudu clients is now disabled by default because it brings little benefit while putting extra load on Kudu masters. Reducing that load is essential when many clients run in a cluster. To enable location assignment for clients, override the default by setting --master_client_location_assignment_enabled=true for Kudu masters.
- The behavior of the C++ client's closest-replica selection, the default mode, was updated to match the behavior of the Java client. Instead of picking a random replica each time, a static value is used for each process, ensuring that the selection remains deterministic and can benefit from better caching. See KUDU-3248 for details.
- The Web UI /rpcz endpoint now displays information on whether an RPC connection is protected by TLS, and if so, provides information on the negotiated TLS cipher suite.
- Tooling requests and C++ client requests bound for leader masters will now be retried in the event the masters cannot be reached.
- Cluster tooling will now validate that the master argument contains no duplicate values. See KUDU-3226 for details.
- The error message output by the Kudu Java client on an attempt to write into a non-existent table partition now contains the table's name. See KUDU-3267 for details.
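For the Kerberos SPN item in this list, the "SPN base" that clients must pass as sasl_protocol_name or saslProtocolName is the part of the principal before the slash. A minimal Python sketch of that derivation follows; the helper name is hypothetical and is not part of any Kudu client API.

```python
def sasl_protocol_name_for(spn: str) -> str:
    """Return the service part of an SPN of the form 'service/host'.

    Hypothetical helper for illustration only.
    """
    base, sep, _host = spn.partition("/")
    if not base or not sep:
        raise ValueError(f"expected an SPN of the form 'service/host', got {spn!r}")
    return base


# The example from the release notes: SPN "kudu/_HOST" has the base "kudu".
default_base = sasl_protocol_name_for("kudu/_HOST")
```

The resulting string is what a client would supply when the servers run with a non-default --principal value.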
Fixed Issues
- Fixed a bug in the Kudu tablet servers that could result in a crash when performing an incremental backup of rows that had many batches of updates. See KUDU-3291 for more details.
- The Kudu Java client will now retry scans bound for tablets hosted on quiescing tablet servers at replicas on other tablet servers. See KUDU-3213 for more details.
- Fixed a race between the scheduling of a maintenance op and the destruction of a tablet. This could previously lead to a crash. See KUDU-3268 for more details.
- Fixed a crash in the Kudu C++ client introduced with KUDU-1802. See KUDU-3254 for details.
- Fixed a bug in the Kudu Java client that caused AUTO_FLUSH_BACKGROUND sessions to hang in a call to the KuduSession.flush() method. Another sign of the bug was stuck data ingest workloads based on the Java client (e.g., kudu-spark applications) with the message "java.lang.AssertionError: This Deferred was already called!" in the logs. See KUDU-3277 for details.
- Fixed a crash in Kudu servers caused by the lack of the getrandom(2) system call in Linux kernel versions earlier than 3.17, by instead using /dev/random for UUID generation in the Boost library. The crash includes the following message in the logs: "terminate called after throwing an instance of 'boost::wrapexcept<boost::uuids::entropy_error>'". See the fix for a sample stack trace.
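To judge whether a host falls into the kernel range named in the last item, the kernel release string can be compared against 3.17. The following Python sketch is an assumption-laden convenience, not something shipped with Kudu: the function name is hypothetical, and it only inspects the leading "major.minor" part of the release string.

```python
import re


def kernel_has_getrandom(release: str) -> bool:
    """True if a Linux kernel release string is 3.17 or later.

    Hypothetical helper: parses only the leading 'major.minor' digits,
    e.g. '3.10.0-1160.el7.x86_64' is read as (3, 10).
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= (3, 17)
```

On a Linux host the input could come from platform.release() in the standard library; on other operating systems that value is not a Linux kernel version and the check is meaningless.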
Wire Protocol compatibility
Kudu 1.15.0 is wire-compatible with previous versions of Kudu:
- Kudu 1.15 clients may connect to servers running Kudu 1.0 or later. If the client uses features that are not available on the target server, an error will be returned.
- Rolling upgrade between Kudu 1.14 and Kudu 1.15 servers is believed to be possible, though it has not been sufficiently tested. Users are encouraged to shut down all nodes in the cluster, upgrade the software, and then restart the daemons on the new version.
- Kudu 1.0 clients may connect to servers running Kudu 1.15 with the exception of the below-mentioned restrictions regarding secure clusters.
The authentication features introduced in Kudu 1.3 place the following limitations on wire compatibility between Kudu 1.15 and versions earlier than 1.3:
- If a Kudu 1.15 cluster is configured with authentication or encryption set to "required", clients older than Kudu 1.3 will be unable to connect.
- If a Kudu 1.15 cluster is configured with authentication and encryption set to "optional" or "disabled", older clients will still be able to connect.
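The two secure-cluster rules above reduce to a single predicate. The Python sketch below encodes them as stated in these notes; the function and parameter names are hypothetical, and the setting values are the "required", "optional", and "disabled" strings quoted above.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Parse '1.2.0' or '1.15' into a (major, minor) tuple."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def client_can_connect(client_version: str, authentication: str,
                       encryption: str) -> bool:
    """Model of the pre-1.3 client restriction on a Kudu 1.15 cluster.

    Hypothetical helper: clients at 1.3 or later are not restricted by
    this rule; older clients are blocked if either setting is 'required'.
    """
    if major_minor(client_version) >= (1, 3):
        return True
    return authentication != "required" and encryption != "required"
```

For example, a 1.2 client is rejected by a cluster with authentication set to "required" even when encryption is "optional", because either setting alone is enough to block it.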
Incompatible Changes in Kudu 1.15.0
- The hash computation for empty strings in the FastHash implementation was updated to conform with the handling in Apache Impala. For the Bloom filter predicate pushdown feature, which uses FastHash, this makes Kudu clients older than version 1.15.0 incompatible with Kudu servers at version 1.15.0, and Kudu clients at version 1.15.0 or newer incompatible with Kudu servers older than version 1.15.0. Both the client library and the Kudu servers need to be updated to version 1.15.0 or above if using the Bloom filter predicate feature. One manifestation of this incompatibility is the following message in the logs: "Not implemented: call requires unsupported application feature flags: 4". See KUDU-3286 for details.
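The Bloom filter incompatibility amounts to requiring that client and server sit on the same side of 1.15.0. A minimal Python sketch of that check follows, with hypothetical names. It assumes that pairs where both sides predate 1.15.0 keep working as before, which these notes imply but do not state.

```python
from typing import Tuple

FASTHASH_CHANGE = (1, 15, 0)  # release in which the empty-string hash changed


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse '1.15' or '1.15.0' into a three-part tuple, padding with zeros."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def bloom_filter_pushdown_compatible(client: str, server: str) -> bool:
    """True if client and server agree on the FastHash behavior.

    Hypothetical helper: both sides must be at or above 1.15.0, or both
    below it (the latter is an assumption, see the text above).
    """
    client_new = parse_version(client) >= FASTHASH_CHANGE
    server_new = parse_version(server) >= FASTHASH_CHANGE
    return client_new == server_new
```

A mixed pair such as a 1.14.0 client against a 1.15.0 server returns False here, which corresponds to the "unsupported application feature flags" error quoted above.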
Client Library Compatibility
- The Kudu 1.15 Java client library is API- and ABI-compatible with Kudu 1.14. Applications written against Kudu 1.14 will compile and run against the Kudu 1.15 client library and vice-versa.
- The Kudu 1.15 C++ client is API- and ABI-forward-compatible with Kudu 1.14. Applications written and compiled against the Kudu 1.14 client library will run without modification against the Kudu 1.15 client library. Applications written and compiled against the Kudu 1.15 client library will run without modification against the Kudu 1.14 client library.
- The Kudu 1.15 Python client is API-compatible with Kudu 1.14. Applications written against Kudu 1.14 will continue to run against the Kudu 1.15 client and vice-versa.
Known Issues and Limitations
Please refer to the Known Issues and Limitations section of the documentation.
Contributors
Kudu 1.15.0 includes contributions from 12 people, including 2 first-time contributors:
- Abhishek Chennaka
- shenxingwuying
Thank you for your contributions!