jihoonson released this
Aug 10, 2018
Druid 0.12.2 contains stability improvements and bug fixes from 13 contributors. Major improvements include:
The full list of changes is here: https://github.com/apache/incubator-druid/pulls?q=is%3Apr+milestone%3A0.12.2+is%3Aclosed
Documentation for this release is at: http://druid.io/docs/0.12.2-rc1
We have fixed several bugs in the Kafka indexing service, most of them race conditions that occurred when incrementally publishing segments.
Added by @jihoonson in #5805. Added by @surekhasaharan in #5899. Added by @surekhasaharan in #5900. Added by @jihoonson in #5905. Added by @jihoonson in #5907. Added by @jihoonson in #5996.
We have also fixed some bugs in the general data ingestion logic. In particular, we fixed a bug that produced incorrect segment data when using auto-encoded long columns with any compression.
Added by @jihoonson in #5932. Added by @clintropolis in #6045.
The Coordinator now manages segments more stably, especially during segment balancing. We have fixed an unexpected segment imbalance caused by conflicting decisions between the Coordinator rule runner and the balancer.
Added by @clintropolis in #5528. Added by @clintropolis in #5529. Added by @clintropolis in #5532. Added by @clintropolis in #5555. Added by @clintropolis in #5591. Added by @clintropolis in #5888. Added by @clintropolis in #5928.
We have fixed incorrect lexicographic sorting in topN queries and incorrect filter application in nested queries. A ClassCastException that occurred when caching topN queries with Float dimensions has also been fixed.
Added by @drcrallen in #5650. Added by @gianm in #5653.
0.12.2 is a minor release and compatible with 0.12.1. If you're updating from an earlier version than 0.12.1, please see release notes of the relevant intermediate versions for additional notes.
Thanks to everyone who contributed to this release!
@acdn-ekeddy @awelsh93 @clintropolis @drcrallen @gianm @jihoonson @jon-wei @kaijianding @leventov @michas2 @Oooocean @samarthjain @surekhasaharan
jihoonson released this
Jun 8, 2018
Druid 0.12.1 contains stability improvements and bug fixes from 10 contributors. Major improvements include:
The full list of changes is here: https://github.com/druid-io/druid/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aclosed+milestone%3A0.12.1
Documentation for this release is at: http://druid.io/docs/0.12.1
The loadstatus API of Coordinators returns the percentage of segments actually loaded in the cluster versus segments that should be loaded in the cluster. The performance of this API has been greatly improved.
Added by @jon-wei in #5632.
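As a rough illustration of how the loadstatus percentages might be consumed, here is a minimal Python sketch. It assumes the Coordinator listens on localhost:8081 and that the endpoint path is /druid/coordinator/v1/loadstatus, returning a JSON map of datasource name to loaded percentage; verify both against the documentation for your version. The datasource names in the usage note are made up.

```python
import json
from urllib.request import urlopen

# Assumed Coordinator address; adjust for your cluster.
COORDINATOR = "http://localhost:8081"


def fetch_load_status(base_url=COORDINATOR):
    """Fetch per-datasource load percentages (endpoint path is an assumption)."""
    with urlopen(base_url + "/druid/coordinator/v1/loadstatus") as resp:
        return json.loads(resp.read().decode("utf-8"))


def not_fully_loaded(status):
    """Return the datasources whose loaded percentage is below 100, sorted by name."""
    return sorted(name for name, pct in status.items() if pct < 100.0)
```

For example, not_fully_loaded({"wikipedia": 100.0, "events": 87.5}) lists only "events", which can be handy in a deployment script that waits for segments to finish loading.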
Druid can now limit the amount of memory used by HttpPostEmitter to 10% of the available JVM heap, thereby avoiding OutOfMemory errors from buffered events.
Added by @jon-wei in #5300.
There were some bugs in Kerberos authentication, such as authentication failures without cookies and broken authentication when a Router is used. These have been fixed. See #5596, #5706, and #5766 for more details.
Added by @nishantmonu51 in #5596. Added by @b-slim in #5706. Added by @jon-wei in #5766.
A Coordinator could get stuck if it lost leadership while starting. This bug has now been fixed.
Added by @jihoonson in #5554.
SegmentMetadataQuery is supposed to use the interval of druid.query.segmentMetadata.defaultHistory if the interval is not specified, but it instead queried all segments, incurring an unexpected performance hit. SegmentMetadataQuery now respects the defaultHistory option again.
Added by @gianm in #5489.
Druid now supports the HTTP OPTIONS request by fixing its auth handling.
Added by @jon-wei in #5615.
The Kafka indexing service allowed retrying tasks to overwrite the segments in deep storage written by previously failed tasks. However, this caused another bug in which the same segment ID could have different data on historicals and in deep storage. This bug has now been fixed by using a unique segment path for each Kafka index task.
Added by @dclim in #5692.
0.12.1 is a minor release and compatible with 0.12.0. If you're updating from an earlier version than 0.12.0, please see release notes of the relevant intermediate versions for additional notes.
@dclim @gianm @JeKuOrdina @jihoonson @jon-wei @leventov @niketh @nishantmonu51 @pdeva
jon-wei released this
Mar 9, 2018
Druid 0.12.0 contains over a hundred performance improvements, stability improvements, and bug fixes from almost 40 contributors. This release adds major improvements to the Kafka indexing service.
Other major new features include:
The full list of changes is here: https://github.com/druid-io/druid/pulls?utf8=%E2%9C%93&q=is%3Apr%20is%3Aclosed%20milestone%3A0.12.0
Documentation for this release is at: http://druid.io/docs/0.12.0/
The Kafka indexing service now supports incremental handoffs, as well as decoupling the number of segments created by a Kafka indexing task from the number of Kafka partitions. Please see #4815 (comment) for more information.
Added by @pjain1 in #4815.
Druid now supports priorities for indexing task locks. When an indexing task needs to acquire a lock on a datasource + interval, higher-priority tasks can now preempt lower-priority tasks. Please see http://druid.io/docs/0.12.0-rc1/ingestion/tasks.html#task-priority for more information.
Added by @jihoonson in #4550.
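The preemption rule described above can be pictured with a toy model. This Python sketch is not Druid's lock implementation; the Lock class, the try_acquire function, and the choice that equal priorities do not preempt are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Lock:
    """Hypothetical lock on one datasource + interval."""
    task_id: str
    priority: int


def try_acquire(current: Optional[Lock], task_id: str, priority: int) -> Tuple[Lock, Optional[str]]:
    """Return (lock holder after the request, id of the preempted task or None)."""
    if current is None:
        # Nobody holds the lock, so the requester gets it.
        return Lock(task_id, priority), None
    if priority > current.priority:
        # A strictly higher priority preempts the current holder.
        return Lock(task_id, priority), current.task_id
    # Otherwise the current holder keeps the lock (assumed behaviour for ties).
    return current, None
```

In this model a request from a priority-75 task against a lock held by a priority-25 task takes over the lock and reports the old holder as preempted, while the reverse request leaves the lock untouched.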
Indexing tasks create entries in the "pendingSegments" table in the metadata store; prior to 0.12.0, these temporary entries were not automatically cleaned up, leading to possible cluster performance degradation over time. Druid 0.12.0 allows the coordinator to automatically clean up unused entries in the pending segments table. This feature is enabled by setting druid.coordinator.kill.pendingSegments.on=true in coordinator properties.
Added by @jihoonson in #5149.
Compacting segments (merging a set of segments within a given interval to create a set with larger but fewer segments) is a common Druid batch ingestion use case. Druid 0.12.0 now supports a Compaction Task that merges all segments within a given interval into a single segment. Please see http://druid.io/docs/0.12.0-rc1/ingestion/tasks.html#compaction-task for more details.
Added by @jihoonson in #4985.
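A compaction task is submitted as a JSON task spec. The Python sketch below builds a minimal spec, assuming the task type is "compact" and that the required fields are "dataSource" and "interval"; check the linked task documentation for the exact field list. The datasource name and interval are placeholders.

```python
import json


def compaction_task(data_source, interval):
    """Build a minimal compaction task spec (field names are assumptions)."""
    return {
        "type": "compact",
        "dataSource": data_source,
        # ISO-8601 interval covering the segments to merge.
        "interval": interval,
    }


spec = compaction_task("wikipedia", "2017-01-01/2018-01-01")
print(json.dumps(spec, indent=2))
```

The printed JSON is what would be posted to the Overlord's task submission endpoint.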
New z-score and p-value test statistics post-aggregators have been added to the druid-stats extension. Please see http://druid.io/docs/0.12.0-rc1/development/extensions-core/test-stats.html for more details.
Added by @chunghochen in #4532.
A numeric quantiles sketch aggregator has been added to the druid-datasketches extension.
Added by @AlexanderSaydakov in #5002.
Druid 0.12.0 includes a new authentication/authorization extension that provides Basic HTTP authentication and simple role-based access control. Please see http://druid.io/docs/0.12.0-rc1/development/extensions-core/druid-basic-security.html for more information.
Added by @jon-wei in #5099.
Previously, clients could inadvertently overwhelm a broker by sending too many requests, which were queued in an unbounded Jetty worker pool queue. Clients typically close the connection after a client-side timeout, but the broker would continue to process these requests, giving the appearance of being unresponsive. Meanwhile, clients would continue to retry, adding more requests to an already overloaded broker.
The newly introduced properties druid.server.http.queueSize and druid.server.http.enableRequestLimit in the broker configuration and historical configuration allow users to configure request rejection to prevent clients from overwhelming brokers and historicals with queries.
Added by @himanshug in #4540.
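The idea behind request rejection can be sketched with a bounded queue: once the queue is full, new requests are turned away immediately instead of piling up. The Python sketch below is only a conceptual model and does not reflect Jetty or Druid internals; BoundedRequestQueue and its queue_size argument are hypothetical names loosely mirroring druid.server.http.queueSize.

```python
import queue


class BoundedRequestQueue:
    """Toy model of a request queue that rejects work when full."""

    def __init__(self, queue_size):
        self._q = queue.Queue(maxsize=queue_size)

    def submit(self, request):
        """Return True if the request was queued, False if it was rejected."""
        try:
            self._q.put_nowait(request)
            return True
        except queue.Full:
            # The queue is at capacity, so the request is rejected right away.
            return False


q = BoundedRequestQueue(queue_size=2)
results = [q.submit(i) for i in range(3)]
```

With a queue size of 2 and no worker draining the queue, the first two submissions are accepted and the third is rejected, so a retrying client gets a fast failure rather than adding to the backlog.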
For developers of custom ingestion parsing extensions, it is now possible for InputRowParsers to return multiple InputRows from a single input row. This can simplify ingestion pipelines by reducing the need for input transformations outside of Druid. Added by @pjain1 in #5081.
When creating new segments, Druid stores some pre-processed data in temporary buffers. Prior to 0.12.0, these buffers were always kept in temporary files on disk. In 0.12.0, PR #4762 by @leventov allows these temporary buffers to be stored in off-heap memory, thus reducing the number of disk I/O operations during ingestion. To enable using off-heap memory for these buffers, the druid.peon.defaultSegmentWriteOutMediumFactory property needs to be configured accordingly. If using off-heap memory for the temporary buffers, please ensure that -XX:MaxDirectMemorySize is increased to accommodate the higher direct memory usage.
Please see http://druid.io/docs/0.12.0-rc1/configuration/indexing-service.html#SegmentWriteOutMediumFactory for configuration details.
PR #4704 by @jihoonson allows the user to configure a number of processing threads to be used for parallel merging of intermediate GroupBy results that have been spilled to disk. Prior to 0.12.0, this merging step would always take place within a single thread.
Please see http://druid.io/docs/0.12.0-rc1/configuration/querying/groupbyquery.html#parallel-combine for configuration details.
Various improvements and features have been added to Druid SQL, by @gianm in the following PRs:
Please see below for changes between 0.11.0 and 0.12.0 that you should be aware of before upgrading. If you're updating from an earlier version than 0.11.0, please see release notes of the relevant intermediate versions for additional notes.
Please note that after upgrading to 0.12.0, it is no longer possible to downgrade to a version older than 0.11.0, due to changes made in #4762. It is still possible to roll back to version 0.11.0.
The Metamarkets java-util library has been brought into Druid. As a result, the following package references have changed:
com.metamx.common -> io.druid.java.util.common
com.metamx.emitter -> io.druid.java.util.emitter
com.metamx.http -> io.druid.java.util.http
com.metamx.metrics -> io.druid.java.util.metrics
This will affect the druid.monitoring.monitors configuration. References to monitor classes under the old com.metamx.metrics.* package will need to be updated to reference io.druid.java.util.metrics.* instead, e.g. io.druid.java.util.metrics.JvmMonitor.
If classes under the com.metamx packages shown above are referenced in other configurations such as log4j2.xml, those references will need to be updated as well.
Extension developers will need to update their code to use the new Druid packages as well.
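For configuration files, the package renames can be applied mechanically. The following Python sketch rewrites the four package prefixes listed in these notes in a piece of text; the migrate function is a hypothetical helper, and a plain string replacement like this should be reviewed before being applied to real files.

```python
# Old package prefix -> new package prefix, as listed in these release notes.
RENAMES = {
    "com.metamx.common": "io.druid.java.util.common",
    "com.metamx.emitter": "io.druid.java.util.emitter",
    "com.metamx.http": "io.druid.java.util.http",
    "com.metamx.metrics": "io.druid.java.util.metrics",
}


def migrate(text):
    """Rewrite references to the old Metamarkets packages in the given text."""
    for old, new in RENAMES.items():
        # Match on the trailing dot so only package-qualified names are touched.
        text = text.replace(old + ".", new + ".")
    return text


old_line = 'druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]'
new_line = migrate(old_line)
```

Here new_line references io.druid.java.util.metrics.JvmMonitor, matching the example given for the druid.monitoring.monitors configuration.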
The Caffeine cache extension has been moved out of an extension, into core Druid. In addition, the Caffeine cache is now the default cache implementation. Please remove druid-caffeine-cache if present from the extension list when upgrading to 0.12.0. More information can be found at #4810.
The semantics of the earlyMessageRejectPeriod configuration have changed. The earlyMessageRejectPeriod will now be added to (task start time + task duration) instead of just (task start time) when determining the bounds of the message window. Please see #4990 for more information.
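A small worked example may make the earlyMessageRejectPeriod change concrete. The Python sketch below computes the upper bound of the message window under the old and new semantics; the function name and the sample values (a one-hour task with a one-hour reject period) are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def window_upper_bound(task_start, task_duration, early_reject_period, new_semantics=True):
    """Return the latest message timestamp the task would accept."""
    # 0.12.0 adds the period to (start + duration); earlier versions used start only.
    base = task_start + task_duration if new_semantics else task_start
    return base + early_reject_period


start = datetime(2018, 3, 9, 0, 0)
duration = timedelta(hours=1)
period = timedelta(hours=1)

old_bound = window_upper_bound(start, duration, period, new_semantics=False)
new_bound = window_upper_bound(start, duration, period)
```

With these values the old bound is 01:00 and the new bound is 02:00, so after upgrading the same setting accepts messages timestamped one task duration later than before.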
In 0.12.0, there are protocol changes between the Kafka supervisor and Kafka indexing tasks, as well as changes to the metadata formats persisted on disk. Therefore, to support a rolling upgrade, all Middle Managers must be upgraded before the Overlord. Note that this ordering differs from the standard upgrade order and is only necessary when using the Kafka indexing service. If you are not using the Kafka indexing service, or can tolerate downtime for the Kafka supervisor, you can upgrade in any order.
Until the Overlord is upgraded, all Kafka indexing tasks will behave as before (even if they have been upgraded), which means no decoupling and no incremental handoffs. Once the Overlord is upgraded, new tasks started by the upgraded Overlord will support the new features.
Please see #4815 for more info.
Once both the overlord and middle managers are rolled back, a new set of tasks should be started, which will work properly. However, the current set of tasks may fail during a roll back. Please see #4815 for more info.
The ColumnSelectorFactory API has changed. Aggregator extension authors and any others who use ColumnSelectorFactory will need to update their code accordingly. Please see #4886 for more details.
The Aggregator.reset() method has been removed because it was deprecated and unused. Please see #5177 for more info.
The DataSegmentPusher interface has changed, and the push() method now has an additional replaceExisting parameter. Please see #5187 for details.
The Escalator interface has changed: the createEscalatedJettyClient method has been removed. Please see #5322 for more details.
@a2l007 @akashdw @AlexanderSaydakov @b-slim @ben-manes @benvogan @chuanlei @chunghochen @clintropolis @daniel-tcell @dclim @dpenas @drcrallen @egor-ryashin @elloooooo @Fokko @fuji-151a @gianm @gvsmirnov @himanshug @hzy001 @Igosuki @jihoonson @jon-wei @KenjiTakahashi @kevinconaway @leventov @mh2753 @niketh @nishantmonu51 @pjain1 @QiuMM @QubitPi @quenlang @Shimi @skyler-tao @xanec @yuppie-flu @zhangxinyu1