How to Migrate PostgreSQL Data Source to AWS RDS PostgreSQL Without Downtime

Recently, Google Cloud made a significant blunder: it mistakenly deleted all of the data of an investment company (Unisuper, which manages a fund of roughly 80 billion USD) across all of its regions on Google Cloud, so thoroughly that not a single backup was left on the platform.

Unisuper is an Australian superannuation fund that provides retirement services for employees in Australia's higher education and research sectors. As of June 30, 2023, it had over 615,000 members and managed 124 billion Australian dollars (about 80 billion USD).

This incident led to a complete paralysis of Unisuper's fund services, and as of today (May 12th), more than a week later, the roughly 80 billion USD belonging to its more than 600,000 members remains inaccessible. Fortunately, Unisuper had considered this risk beforehand and did not put all of its eggs in one basket: it also kept data backups on another cloud, which saved its fund business and made system recovery possible.

The event has sparked widespread concern in the industry about the data security of cloud service providers. Cloud computing gives businesses flexibility and convenience, but keeping business data on a single cloud is risky. For a large financial institution like Unisuper, the damage to credibility and reputation will be long-lasting, and customers will inevitably ask: "Is my money safe?", "Is my information secure?", "Why should I believe this kind of incident will not happen again?"

How to Prevent Cloud Providers from Deleting Databases with One Click?

Although the probability of such an incident is low, when it does happen it can be a fatal blow to the enterprise. The Google Cloud incident is a wake-up call for every enterprise and organization that hosts its business on the cloud, forcing them to re-examine how to protect the lifeblood of the enterprise: the security of its business data.

The solution we are going to discuss today is the multi-cloud strategy that Unisuper adopted, although the particular form Unisuper used has a flaw.

Because Unisuper only kept backup data on another cloud, and that backup was not real-time, it took more than a week to complete the recovery.

What we need is a more comprehensive multi-cloud strategy: maintain a database instance on another cloud and, through a real-time synchronization link, continuously replicate the primary database's data to that instance, so that the data on both sides stays consistent at all times.

To implement this strategy, the data in the current cloud environment must be migrated to another cloud. In this article, the migration target is AWS RDS PostgreSQL.

Why Choose AWS

Amazon Web Services (AWS) is the most comprehensive and widely used cloud in the world, providing more than 200 full-featured services from data centers around the globe (from the AWS official website).

One of the reasons for choosing AWS RDS PostgreSQL as an example in this article is AWS's security. AWS's core infrastructure is built to meet the security requirements of the military, global banks, and other highly sensitive organizations. Moreover, AWS supports 90 security standards and compliance certifications, and all AWS services that store customer data have the ability to encrypt data.

What Problems Exist in Cross-Cloud Migration?

What we are going to do is a cross-cloud migration, which sounds simple but is quite complicated in practice. Most cloud vendors' migration products only let data come in, not go out: they offer good support for migrating self-built databases onto their own cloud, but rarely meet the need to migrate data off it.

For security reasons, cloud databases usually sit in closed intranet environments. Without a migration product that supports moving data out, you have to open a public network access address to migrate the data, which hands attackers a prime opportunity to strike at the database. The enterprise then risks having the database compromised, important data leaked, or, worse still, the data wiped entirely, with years of business accumulation destroyed in an instant.

In addition, there are a series of issues that must be considered:

  • Business Availability: The migration must not affect the business; in other words, the business cannot be stopped during migration. This raises many questions: how do you migrate both the existing data and the incremental data completely? How do you handle performance fluctuations during migration? How do you switch applications over smoothly? And so on.
  • Schema Initialization and DDL Synchronization: When the number of tables to migrate is huge, and DDL operations keep occurring in the source database during the migration, achieving an efficient and stable migration is a real challenge.
  • Migration Data Quality: During a large-scale migration, how do you ensure data consistency? And if the cutover goes wrong, how do you roll back effectively to keep the business available?
  • Limiting Damage if the Migration Fails: Different clouds differ in features and performance, and the migration is complex. If the cutover fails, keeping the business available is another key issue to plan for.

Therefore, we need a tool that does not require opening public network access, is full-featured, stable, and fast, can monitor the migration task status in real time, and can guarantee the consistency of the migration results.

How to Solve Cross-Cloud Migration Problems?

This is where NineData's data replication feature comes in; it is designed specifically for the pain points above. Let's first look at some of NineData's capabilities:

  • Support for Multiple Cloud Vendors: NineData integrates with each cloud vendor's private network environment, so the cloud database does not need to open public network access for the migration. For less common cloud vendors, it also provides a gateway that likewise reaches the database without exposing it to the public network, keeping the migration link safe and efficient.
  • Business Stays Online During Migration: NineData provides structure migration, full data migration, and log-based CDC incremental data migration. During the migration, the source PostgreSQL database can continue to serve traffic normally. NineData automatically completes the structure migration and full data migration, then starts monitoring, collecting, parsing, and replicating the source's transaction logs (WAL) in real time, so incremental updates on the source are replicated to the target as they happen. Once NineData enters the incremental phase and replication is delay-free, the business can run read-only verification against the target and use NineData's data comparison tool to verify consistency. After verification passes, the business can be stopped and switched over, so the total downtime for the business is very short. (A pre-flight check for the source database is sketched after this list.)
  • Strong Replication Performance: During a migration, replication speed is an important factor in whether the business can be switched over successfully. NineData relies on log analysis, intelligent sharding, dynamic batching, data merging, and its own data formats to keep both full and incremental replication fast. Currently, NineData's full data replication reaches up to 200 GB/hour, and incremental replication reaches up to 20,000 records/second.
  • Complete Migration Rollback Plan: Features and performance differ substantially between cloud databases, which can require a lot of business adaptation and performance tuning, so the cutover carries real risk. To reduce that risk, businesses usually prepare a rollback plan. NineData's CDC incremental replication collects and parses logs from the source in real time and synchronizes increments to the target. Before the business switches from the source to the target, you can set up a reverse replication task in NineData so that new data generated after the cutover is synchronized back to the source in real time, keeping a complete copy of the business data on the source. If the target's features or performance cause problems for the business, you can switch back to the source at any time, effectively avoiding a failed migration.
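
Before configuring the replication task, it can help to verify that the source PostgreSQL instance is ready for log-based capture. The snippet below is a minimal, hedged sketch: it assumes the CDC mechanism relies on PostgreSQL logical decoding (the tool's exact requirements may differ) and uses the psycopg2 driver with placeholder connection details.

```python
# Minimal pre-flight sketch: inspect the settings that logical-decoding-based
# CDC on PostgreSQL usually depends on. Connection details are placeholders,
# and the exact requirements of your replication tool may differ.
import psycopg2

conn = psycopg2.connect(
    host="source-db.example.com",  # hypothetical source endpoint
    port=5432,
    dbname="postgres",
    user="repl_check",
    password="********",
)
cur = conn.cursor()

for setting in ("wal_level", "max_replication_slots", "max_wal_senders"):
    cur.execute("SHOW " + setting)  # SHOW does not accept bind parameters
    print(f"{setting} = {cur.fetchone()[0]}")

# For logical decoding, wal_level is normally 'logical', and the slot/sender
# limits should leave headroom for at least one more replication slot.
cur.close()
conn.close()
```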

With the capabilities above, NineData can easily handle migration between different cloud vendors. Let's walk through the steps.

Step One: Add the Source and Target Data Sources

  1. Log in to the NineData console, click on Data Source Management > Data Sources, and then click on Create Data Source on the page, and select the required cloud vendor.

  2. Following the page prompts, add the data source using the private network connection method, then click Create Data Source to finish. Repeat this step for both the source data source and the target data source. (If you want to sanity-check the connection details yourself first, see the sketch below.)
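
Optionally, you can sanity-check the connection details before (or after) registering them in the console. The sketch below is only an illustration: the endpoints, database names, and credentials are placeholders, and it simply confirms that each instance accepts a connection.

```python
# Connectivity sanity check for the source and target PostgreSQL instances.
# All endpoints and credentials below are placeholders; replace with your own.
import psycopg2

ENDPOINTS = {
    "source": "postgresql://app:********@source-db.example.com:5432/appdb",
    "target": "postgresql://app:********@mydb.xxxxxxxx.us-east-1.rds.amazonaws.com:5432/appdb?sslmode=require",
}

for name, dsn in ENDPOINTS.items():
    try:
        with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
            cur.execute("SELECT version()")
            print(f"{name}: OK - {cur.fetchone()[0]}")
    except Exception as exc:
        print(f"{name}: connection failed - {exc}")
```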

Step Two: Configure and Start the Replication Task

  1. Log in to the NineData console, click on Data Replication > Data Replication, and then click on Create Replication.

  2. Following the page prompts, configure the replication task. To ensure that both the existing data and the incremental data on the source are migrated to the target in full, select structure replication, full replication, and incremental replication as the replication types.

  3. After the configuration is complete, start the task. For all the synchronization objects you configured, NineData first performs a full migration of the existing data, then switches to real-time migration of incremental data newly written on the source, so that nothing is missed. Once the incremental data on the target catches up with the source, the task panel shows a delay of 0 seconds, meaning the data on the target is up to date. (An independent freshness spot check is sketched below.)
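
If you want an independent check in addition to the delay shown on the task panel, a simple heartbeat works well: write a uniquely tagged row on the source and measure how long it takes to appear on the target. The sketch below is illustrative only; the heartbeat table, connection strings, and polling interval are all assumptions, and the table must be included in the task's synchronization objects.

```python
# Heartbeat sketch: insert a tagged row on the source, then poll the target
# until it appears. All connection strings and the helper table are
# placeholders; the table must be part of the replication objects.
import time
import uuid
import psycopg2

SOURCE_DSN = "postgresql://app:********@source-db.example.com:5432/appdb"
TARGET_DSN = "postgresql://app:********@mydb.xxxxxxxx.us-east-1.rds.amazonaws.com:5432/appdb"

token = str(uuid.uuid4())

# Write the heartbeat row on the source.
src = psycopg2.connect(SOURCE_DSN)
with src, src.cursor() as cur:
    cur.execute(
        "CREATE TABLE IF NOT EXISTS repl_heartbeat (token text, ts timestamptz DEFAULT now())"
    )
    cur.execute("INSERT INTO repl_heartbeat (token) VALUES (%s)", (token,))
src.close()

# Poll the target until the row shows up.
started = time.time()
tgt = psycopg2.connect(TARGET_DSN)
tgt.autocommit = True  # read a fresh snapshot on every poll
cur = tgt.cursor()
while True:
    cur.execute("SELECT 1 FROM repl_heartbeat WHERE token = %s", (token,))
    if cur.fetchone():
        print(f"heartbeat visible on target after {time.time() - started:.1f}s")
        break
    time.sleep(1)
tgt.close()
```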

Step Three (Optional): Verify the Integrity of Synchronized Data on the Target End

In addition to replication itself, NineData provides a comparison feature that checks the synchronized data between the source and the target, so you can confirm the integrity of the data on the target.

  1. Log in to the NineData console, click on Data Replication > Data Replication, and then click on the replication task ID created in Step Two.

  2. Click on the Data Comparison tab to display the comparison results (if Enable Data Consistency Comparison is not checked in the task configuration of Step Two, you need to click Enable Data Comparison here).

  3. After a while, you can click Re-compare on the page to verify how the latest incremental data has synchronized. (If you also want a quick manual spot check, see the sketch below.)
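
For an extra manual spot check on top of the built-in comparison, a quick row-count comparison per table is easy to script. The sketch below uses placeholder table names and connection strings; matching row counts only rule out gross mismatches, they do not prove the contents are identical.

```python
# Spot check: compare row counts for a few tables between source and target.
# Table names and connection strings are placeholders; this complements, not
# replaces, the built-in data consistency comparison.
import psycopg2

SOURCE_DSN = "postgresql://app:********@source-db.example.com:5432/appdb"
TARGET_DSN = "postgresql://app:********@mydb.xxxxxxxx.us-east-1.rds.amazonaws.com:5432/appdb"
TABLES = ["public.orders", "public.customers"]  # hypothetical table names

def row_counts(dsn):
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        result = {}
        for table in TABLES:
            cur.execute(f"SELECT count(*) FROM {table}")
            result[table] = cur.fetchone()[0]
        return result

source_counts = row_counts(SOURCE_DSN)
target_counts = row_counts(TARGET_DSN)
for table in TABLES:
    status = "OK" if source_counts[table] == target_counts[table] else "MISMATCH"
    print(f"{table}: source={source_counts[table]} target={target_counts[table]} {status}")
```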

Step Four (Optional): Configure Task Exception Alerts

Since this is a long-running task, you may want the system to monitor its status in real time and notify you immediately when something goes wrong.

  1. Log in to the NineData console, click on Data Replication > Data Replication, and then click on the replication task ID created in Step Two.

  2. Click on Configure Alerts in the upper right corner.

  3. Enter the Policy Name and click Save Configuration. You can use the built-in default rules, which send an SMS reminder when the task fails or the replication delay reaches 10 minutes or more, or create custom notification rules to fit your needs.

Summary

By following the steps above, you can migrate your business data completely from another cloud to AWS RDS PostgreSQL. Once the incremental replication delay is zero, you can perform the cutover whenever you need and switch business traffic to the new cloud.

If you would rather use AWS RDS PostgreSQL as a multi-active node for your business, you can also keep this migration link running continuously, and NineData will keep the data on both ends consistent in real time. If one cloud vendor fails, the business can quickly switch to the other cloud. (A rough application-side fallback is sketched below.)
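
On the application side, the switch can be as simple as falling back to the standby endpoint when the primary does not respond. The sketch below is a rough illustration with placeholder endpoints; a real failover procedure should also verify replication lag and guard against split-brain writes before accepting traffic.

```python
# Rough failover sketch: try the primary endpoint first, fall back to the
# standby on the other cloud if it is unreachable. Endpoints are placeholders;
# production failover should also check replication lag before taking writes.
import psycopg2

PRIMARY_DSN = "postgresql://app:********@source-db.example.com:5432/appdb"
STANDBY_DSN = "postgresql://app:********@mydb.xxxxxxxx.us-east-1.rds.amazonaws.com:5432/appdb"

def get_connection():
    for dsn in (PRIMARY_DSN, STANDBY_DSN):
        try:
            return psycopg2.connect(dsn, connect_timeout=3)
        except psycopg2.OperationalError:
            continue
    raise RuntimeError("neither the primary nor the standby is reachable")

conn = get_connection()
print("connected to:", conn.get_dsn_parameters().get("host"))
conn.close()
```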

At this point, you have completed a cross-cloud database migration without stopping the business, minimizing the impact on your online services.