What is Redis Data Integration (RDI)?

Last updated 18 May 2024

Question

What is Redis Data Integration (RDI)?

Answer

RDI is a data integration product from Redis (the company) that simplifies bringing data into Redis Enterprise. Data can be ingested from existing (non-Redis) systems, transformed, and loaded into Redis Enterprise. RDI comes with out-of-the-box support for Debezium Server, an open-source distributed platform for change data capture (CDC). Here is a brief explanation of how RDI can be used to implement a CDC pattern:

Initial data sync

  • Debezium takes a baseline snapshot of the data that already exists in the source database and that you want to mirror to Redis. It streams this snapshot data to the RDI database instance.
  • Then, the data is transformed within this RDI database by the transformation engine (based on RedisGears). A declarative configuration defines how this transformation is performed.
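
To make the idea of a declarative transformation more concrete, here is a minimal Python sketch. The configuration keys (`key_prefix`, `key_column`, `rename`, `drop`) and the function `apply_transformation` are hypothetical names chosen for illustration; they are not RDI's actual configuration schema.

```python
# Illustrative only: this config layout is an assumption, not RDI's real schema.
TRANSFORM_CONFIG = {
    "key_prefix": "customer",    # prefix for the target Redis key
    "key_column": "id",          # source column used to build the key
    "rename": {"fname": "first_name", "lname": "last_name"},
    "drop": ["internal_notes"],  # columns that should not reach Redis
}


def apply_transformation(row: dict, config: dict) -> tuple[str, dict]:
    """Turn one source row into a (redis_key, fields) pair according to config."""
    key = f"{config['key_prefix']}:{row[config['key_column']]}"
    fields = {}
    for column, value in row.items():
        if column in config["drop"]:
            continue  # skip columns the config excludes
        fields[config["rename"].get(column, column)] = value
    return key, fields


row = {"id": 42, "fname": "Ada", "lname": "Lovelace", "internal_notes": "vip"}
key, fields = apply_transformation(row, TRANSFORM_CONFIG)
print(key, fields)
```

The point of the declarative style is that the mapping rules live in data rather than in code, so changing what lands in Redis means editing the configuration, not the function.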

Capturing ongoing data changes

  • Data changes are captured from the source database using a Debezium connector. A Debezium connector for a relational database system typically reads the transaction (or write-ahead) log to capture changes.
  • Debezium Server pushes those data changes into an RDI database instance that buffers them in Redis Streams. This RDI database also keeps the required state information and metadata.
  • Then, the data is transformed within the RDI database based on the transformation configuration.
  • Finally, the transformation result is loaded into a pre-configured target Redis database.
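
The flow described in the steps above can be sketched with a small in-memory simulation. This is not RDI code: a `deque` stands in for the Redis Streams buffer, a plain `dict` stands in for the target Redis database, and the `table:id` key format is an assumption. The events use a simplified form of the Debezium change event envelope, where `op` is `r` (snapshot read), `c` (create), `u` (update), or `d` (delete).

```python
from collections import deque

# In-memory stand-ins (assumptions): a deque plays the role of the Redis Streams
# buffer, and a dict plays the role of the target Redis database.
stream = deque()
target_db = {}


def capture(op, table, before=None, after=None):
    """Append a simplified Debezium-style change event to the buffer."""
    stream.append({"op": op, "table": table, "before": before, "after": after})


def process_pending():
    """Drain the buffer in arrival order, applying each change to the target."""
    while stream:
        event = stream.popleft()
        # Deletes carry the old row in "before"; other ops carry the new row in "after".
        row = event["before"] if event["op"] == "d" else event["after"]
        key = f"{event['table']}:{row['id']}"  # hypothetical key format
        if event["op"] == "d":
            target_db.pop(key, None)
        else:  # "r" (snapshot read), "c" (create), "u" (update)
            target_db[key] = {name: str(value) for name, value in row.items()}


capture("r", "customer", after={"id": 1, "name": "Ada"})    # baseline snapshot row
capture("c", "customer", after={"id": 2, "name": "Grace"})  # insert
capture("u", "customer", before={"id": 1, "name": "Ada"},
        after={"id": 1, "name": "Ada L."})                   # update
capture("d", "customer", before={"id": 2, "name": "Grace"}) # delete
process_pending()
print(target_db)
```

Processing events strictly in the order they were buffered is what keeps the target consistent with the source: the update to customer 1 is applied after its snapshot row, and customer 2 is removed only after it was created.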

In the simplest terms, RDI and Debezium Server let us automatically capture data written to other databases (Oracle, PostgreSQL, MySQL, Microsoft SQL Server, and MariaDB are currently supported) and synchronize it into hashes (or JSON documents) in Redis Enterprise. In general, RDI is not limited to CDC. It can also be used to implement other data integration patterns, since any source that can write to an RDI data stream is valid. However, CDC is currently the main use case.
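
As a rough illustration of the difference between the two target formats, the Python sketch below converts one source row both ways. The helper names `to_hash_fields` and `to_json_document` are hypothetical and not part of RDI. The sketch relies only on the fact that Redis hash values are stored as strings, whereas a JSON document preserves value types.

```python
import json


def to_hash_fields(row: dict) -> dict:
    """Represent a row as hash fields; every value is stored as a string."""
    return {name: str(value) for name, value in row.items()}


def to_json_document(row: dict) -> str:
    """Represent a row as a JSON document; numbers and booleans keep their types."""
    return json.dumps(row, sort_keys=True)


row = {"id": 7, "active": True, "balance": 12.5}
print(to_hash_fields(row))
print(to_json_document(row))
```

A hash is usually sufficient for flat relational rows, while a JSON document is the better fit when consumers need typed or nested values.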
