Debezium Blog
Debezium provides a way to run the connectors directly within Debezium from the very beginning of the project. The way how it was provided has changed over the time and it still evolves. This article will describe another evolution step in this regard - new implementation of Debezium engine.
In the previous blog post, we have shown how to leverage Debezium to train neural-network model with the existing data from the database and use this pre-trained model to classify images newly stored into the database. In this blog post, we will move it one step further - we will use Debezium to create multiple data streams from the database and use one of the streams for continuous learning and to improve our model, and the second one for making predictions on the data. When the model is constantly improved or adjusted to recent data samples, this approach is known as online machine learning. Online learning is only suitable for some use cases, and implementing an online variant of a given algorithm may be challenging or even impossible. However, in situations where online learning is possible, it becomes a very powerful tool as it allows one to react to the changes in the data in real-time and avoids the need to re-train and re-deploy new models, thus saving the hardware and operational costs. As the streams of data become more and more common, e.g. with the advent of IoT, we can expect online learning to become more and more popular. It’s usually a perfect fit for analyzing streaming data in use cases where it’s possible.