How to Prepare for Active Data Replication

Planning to implement active data replication? Read on to learn how it works, the obstacles you could encounter along the way, and how to prepare for replication properly.

What Is Active/Active Replication?

Your organization may run multiple databases, and replicating data across them can save you significant time and effort. Active/active replication synchronizes changes between these databases. Each database holds the same application tables, yet continues to operate independently. A setup could involve just two databases, but it could equally involve many more.

This process is often used for geographical replication or migrations. For example, active/active replication could be used while migrating on-premises data to the cloud. It also helps with disaster recovery: should a disaster remove one database copy, another can automatically take its place.

If you implement active/active replication by running change data capture, you could encounter data collisions (for example, duplicate primary keys) that complicate matters. A little planning and preparation can prevent most of these and allow you to integrate all databases.

Key Considerations for Active Data Replication:

Primary Keys

If your application generates primary keys from a database sequence, be prepared to adapt it. Otherwise, an active/active relationship will likely result in duplicate keys. You can prevent these collisions with the following three measures:

Start the sequence at a different value on each site, and increment it by the number of active sites, so that no two sites ever generate the same value.
Add a column to every table, for example a site identifier, to form a unique composite key.
Review each site periodically to make sure it hasn't used up its allocated batch of numbers.

You might also introduce triggers so that the application itself doesn't need to change. Keep the possible impact on application performance in mind when you make these changes: indexes could be affected, and a sequence column will no longer simply increment by one.
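The sequence-based measures can be illustrated with a minimal Python sketch. It is not tied to any particular database: the function names, the site count, and the site labels are all hypothetical, and a real deployment would configure the database's own sequence start and increment instead.

```python
from itertools import count, islice


def site_sequence(site_index: int, site_count: int):
    """Keys for one site: start at site_index + 1, step by site_count."""
    if not 0 <= site_index < site_count:
        raise ValueError("site_index must be in range(site_count)")
    return count(start=site_index + 1, step=site_count)


def composite_key(site_id: str, local_id: int) -> tuple[str, int]:
    """Alternative: keep local sequences and qualify each key with the site."""
    return (site_id, local_id)


SITE_COUNT = 3  # hypothetical number of active sites
keys_by_site = {
    site: list(islice(site_sequence(site, SITE_COUNT), 4))
    for site in range(SITE_COUNT)
}
print(keys_by_site)
# {0: [1, 4, 7, 10], 1: [2, 5, 8, 11], 2: [3, 6, 9, 12]}

all_keys = [key for keys in keys_by_site.values() for key in keys]
assert len(all_keys) == len(set(all_keys))  # no value is generated twice

# The same local id on two sites still yields distinct composite keys.
assert composite_key("london", 42) != composite_key("sydney", 42)
```

Note that adding a site later changes the required increment, which is one reason to decide on the number of active sites before choosing this scheme.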

Triggers

Database triggers can be used to implement business logic. This is often the case with traditional OLTP applications. These triggers should fire when the application changes the database. However, in most cases, you don't want them to fire again when the replication process applies the same changes. This means you must adapt the triggers or disable them at a session level.
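As a rough illustration of adapting a trigger, the sketch below uses Python's built-in sqlite3 module. The tables, the trigger, and the `replication_flag` guard table are all hypothetical, and the guard stands in for whatever session-level switch your database or replication tool actually provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE order_audit (order_id INTEGER, note TEXT);

-- Hypothetical guard: 1 while replicated changes are being applied.
CREATE TABLE replication_flag (applying INTEGER NOT NULL);
INSERT INTO replication_flag VALUES (0);

-- Business-logic trigger, adapted to skip replicated changes.
CREATE TRIGGER orders_audit AFTER INSERT ON orders
WHEN (SELECT applying FROM replication_flag) = 0
BEGIN
    INSERT INTO order_audit VALUES (NEW.id, 'created by application');
END;
""")


def apply_replicated_insert(order_id: int, amount: float) -> None:
    """Apply a change captured on another site without firing the trigger."""
    conn.execute("UPDATE replication_flag SET applying = 1")
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, amount))
    finally:
        conn.execute("UPDATE replication_flag SET applying = 0")
    conn.commit()


conn.execute("INSERT INTO orders VALUES (1, 10.0)")  # application write
conn.commit()
apply_replicated_insert(2, 20.0)                     # replicated write

audit_rows = conn.execute("SELECT order_id FROM order_audit").fetchall()
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(audit_rows, order_count)  # [(1,)] 2
```

Both rows reach the `orders` table, but only the application's own insert produces an audit record.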

Cascade Constraints

What does a cascade constraint do? Basically, it deletes, updates, or nullifies records in a child table when the referenced row in the parent table changes. Be aware of the applications that use these constraints and prepare to avoid potential replication issues. Because cascade constraints act automatically, the target database may already have deleted the child row by the time the replication, which captured both deletes on the source, tries to apply the second one. It's possible to instruct the replication to handle this case and simply register a warning. If you don't do this, many data replication tools will register it as a failed process.
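The following sketch reproduces the situation with Python's built-in sqlite3 module. The schema and the `apply_delete` helper are hypothetical; the point is only that a replicated delete which matches no row can be recorded as a warning instead of being treated as a failure.

```python
import sqlite3

target = sqlite3.connect(":memory:")
target.execute("PRAGMA foreign_keys = ON")  # SQLite enforces cascades only when enabled
target.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id) ON DELETE CASCADE
);
INSERT INTO customers VALUES (1);
INSERT INTO orders VALUES (100, 1);
""")

ALLOWED_TABLES = {"customers", "orders"}
warnings = []


def apply_delete(table: str, row_id: int) -> bool:
    """Apply one replicated delete; warn instead of failing if the row is gone."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    cursor = target.execute(f"DELETE FROM {table} WHERE id = ?", (row_id,))
    if cursor.rowcount == 0:
        warnings.append((table, row_id))  # the target's cascade already removed it
        return False
    return True


# The source captured both deletes: the parent row and the cascaded child row.
results = [apply_delete("customers", 1), apply_delete("orders", 100)]
target.commit()

print(results)   # [True, False]
print(warnings)  # [('orders', 100)]
```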

Collisions

Worried about collisions? You should be! Even the best preparation in the world cannot rule them out entirely. After all, the databases operate independently, and a collision happens when one user updates a row at the same time as another user working on a different database. The application you use might prevent these collisions from occurring. For example, a CRM application might only allow account managers to make changes to their own accounts. However, this doesn't apply in every case. Where the application does not stop collisions from occurring, you'll have to resolve them during the data replication process. Resist the temptation to let your systems drift out of sync while you resolve a collision: this will only worsen the problem! Keep systems in sync throughout.
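One possible way to detect and resolve a collision while changes are applied is sketched below in Python. It assumes each change record carries the row as the source saw it before the update, and it resolves conflicts with a "latest commit wins" rule. Both the `Change` record and that rule are assumptions made for illustration, not a policy prescribed here, and the rule depends on commit times being comparable across sites.

```python
from dataclasses import dataclass


@dataclass
class Change:
    key: int
    before: dict         # row as the source site saw it before its update
    after: dict          # row after the source site's update
    committed_at: float  # commit time, assumed comparable across sites


def apply_update(rows: dict, last_commit: dict, change: Change) -> str:
    """Apply a replicated update, detecting collisions via the before-image."""
    current = rows.get(change.key)
    if current == change.before:
        outcome = "applied"  # target row matches what the source saw
    elif change.committed_at > last_commit.get(change.key, float("-inf")):
        outcome = "collision_incoming_wins"
    else:
        return "collision_local_kept"
    rows[change.key] = dict(change.after)
    last_commit[change.key] = change.committed_at
    return outcome


# A local user already changed row 7 from "open" to "closed" at time 105.
rows = {7: {"status": "closed"}}
last_commit = {7: 105.0}

stale = Change(7, {"status": "open"}, {"status": "on hold"}, committed_at=103.0)
newer = Change(7, {"status": "open"}, {"status": "escalated"}, committed_at=110.0)

first = apply_update(rows, last_commit, stale)
second = apply_update(rows, last_commit, newer)
print(first, second, rows[7])
# collision_local_kept collision_incoming_wins {'status': 'escalated'}
```

Whatever rule you choose, every site has to apply the same one so that all copies converge on the same row.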

Managed Data Replication from Gravity

Gravity is a team of data engineers that provides data integration and replication services. Our work empowers clients to move their data from multiple sources into a single warehouse. Data security is a top priority for the team at Gravity, and you'll benefit from the highest levels of security as standard when you use our service. Our services are equipped to manage multiple destinations, and we offer real-time monitoring, a full audit trail, and alerts when a job requires your attention. Gravity data engineers are on hand to help handle any data emergencies you encounter.
