GSoC: The final phase
August 23, 2017
Since the last Google Summer of Code post a few weeks ago in which I promised to post weekly, I’ve mainly worked on the kafka-cdi
library, adding custom serializers/deserializers to automatically handle objects of any type T
.
Generic serializers and deserializers for Kafka
I ran into every problem imaginable - null serializers, problems with the library and problems with abstract class serialization. Here is the list of what was done:
Added
GenericSerializer
andGenericDeserializer
classes, which use Jackson’s Json object mapper. I looked intogson
as a possibility but Jackson was the better choice for what we needed. (Gson also doesn’t enforce strict deserialization, which I’m not a huge fan of) #17, plus an example of a generic gson ser/des.Used Producer/Consumer overloaded serializer/deserializer constructors. It turns out that if you don’t pass the serializers in manually, Kafka will attempt to construct them itself using the default constructor (no typing), which is quite obvious when you eventually get some distance from the problem! #17
Default to generic serdes to handle unknown types (right now Apache only supports standard data types) #17
Added unit tests for all classes in the serialization package #17, #23
Added jackson type info for abstract classes (in our case, we only needed it for the
Variant
class and its subtypes) #889Matthias refactored the library’s
DelegationKafkaConsumer
to support custom Consumers, which were previously only consuming records of typeString, String
#16
All PRs for this can be found here: #16, #17, #23, #889, plus an example of a generic gson ser/des.
And the mailing list threads: #1, #2
Kafka test environment for UPS consumers / producers
Polina worked on a more solid test environment which allows us to test real consumers/producers we are using in the UnifiedPush Server.
We’re using Debezium for setting up the Kafka Cluster in the mock environment and Mockito and Arquillian for constructing our test jar with the dependencies we need.
So far the tests are working well but we’re running into issues with circular dependencies, which means we’re probably going to have to try rethink our Kafka
module.
The PR for this can be found here: #895
JMS removal
Matthias refactored the Token Loader
class to replace the JMS queue with Kafka (#894).
After this, (and given the possibility of being able to send any objects with the generic ser/des) we slowly started replacing the rest of the push message flow with Kafka consumers, producers and streams:
We replaced the old notification router with a new streams based approach #900
We added a producer in the
Push Notification Sender
endpoint, for producing messages read by the streams #896We added a consumer in the
Token Loader
class, which reads from the streams output topics #900We added a consumer in the
Notification Dispatcher
class, which consumes messages from theMessageHolderWithTokens
producer. #897, #894 (token loader producer)
We still have to test everything against all variant types, which we’re going to use the UPS mock data loader for. I’m still trying to get it working with a local server though.
Metrics processing
Our plan for the next and last week of GSoC is to focus on logging responses and stats (for example, delivery successes and failures) to various topics for processing and analysis with Kafka streams at a later stage.
Polina has already come up with a few jiras (see our gsoc_2017
label) and we’ve sent an email to the mailing list asking for ideas from the community as well.
After that, we’ll be publishing our final reports including what’s left to do, and we’ll probably be doing a “wrap-up webinar”, going through our thoughts on the last three months of work and giving a small demo on the final result. Stay tuned!