
Data replication

NineData data replication supports full data synchronization and incremental data synchronization between two data sources.

Function Description

NineData supports data replication between various data sources, and the specific support for each function is detailed in the table below.

Source Data Source | Target Data Source | Schema Replication | Full Replication | Incremental Replication | Schema Comparison | Data Comparison
MySQLMySQL✔️✔️✔️✔️✔️
PostgreSQL✔️✔️✔️✔️
Oracle✔️✔️✔️✔️
Doris✔️✔️✔️✔️
SelectDB✔️✔️✔️✔️
ClickHouse✔️✔️✔️✔️
Elasticsearch✔️✔️✔️
Kafka✔️✔️
DWS✔️✔️✔️
Greenplum✔️✔️✔️✔️
Redshift✔️✔️
StarRocks✔️✔️✔️✔️
SingleStore✔️✔️
TiDB✔️✔️✔️✔️
Hive✔️✔️✔️
ADB PostgreSQL✔️✔️✔️
GaussDB✔️✔️✔️
Database Grouping✔️✔️
OpenGauss✔️✔️✔️✔️
KingBase✔️✔️✔️✔️
OracleMySQL✔️✔️✔️✔️
PostgreSQL✔️✔️✔️✔️
Oracle✔️✔️✔️✔️✔️
OB-Oracle✔️✔️✔️✔️✔️
Doris✔️✔️✔️✔️
SelectDB✔️✔️
ClickHouse✔️✔️✔️
Kafka✔️✔️
DWS✔️✔️✔️
Greenplum✔️✔️✔️✔️
StarRocks✔️✔️
SingleStore✔️✔️✔️✔️
TiDB✔️✔️✔️✔️
Hive✔️✔️✔️
ADB PostgreSQL✔️✔️✔️✔️
GaussDB✔️✔️✔️
KingBase✔️✔️✔️✔️
PostgreSQLMySQL✔️✔️✔️✔️
PostgreSQL✔️✔️✔️✔️✔️
Oracle✔️✔️✔️✔️
Doris✔️✔️✔️✔️
SelectDB✔️✔️
ClickHouse✔️✔️✔️
Kafka✔️✔️
DWS✔️✔️✔️
Greenplum✔️✔️✔️✔️
StarRocks✔️✔️✔️✔️
SingleStore✔️✔️
TiDB✔️✔️✔️✔️
Hive✔️✔️✔️
ADB PostgreSQL✔️✔️✔️✔️
Sybase✔️✔️✔️
GaussDB✔️✔️✔️
SQL ServerMySQL✔️✔️
SQL Server✔️✔️✔️✔️✔️
PostgreSQL✔️✔️✔️
Doris✔️✔️
StarRocks✔️✔️
TDSQL MySQLTDSQL MySQL✔️✔️✔️✔️
TiDBMySQL✔️✔️✔️✔️
PostgreSQL✔️✔️✔️✔️
Oracle✔️✔️✔️✔️
ClickHouse✔️✔️✔️
Greenplum✔️✔️✔️✔️
TiDB✔️✔️✔️✔️
Hive✔️✔️✔️
ADB PostgreSQL✔️✔️✔️✔️
DaMengDaMeng✔️✔️
KingBaseMySQL✔️✔️✔️✔️
PostgreSQL✔️✔️✔️✔️✔️
Oracle✔️✔️✔️✔️
KingBase✔️✔️✔️✔️✔️
OpenGaussMySQL✔️✔️✔️
PostgreSQL✔️✔️✔️
Oracle✔️✔️✔️
Doris✔️✔️✔️
SelectDB✔️✔️✔️
StarRocks✔️✔️✔️
GaussDB✔️✔️✔️
OpenGauss✔️✔️✔️
GaussDBMySQL✔️✔️✔️
PostgreSQL✔️✔️✔️
Oracle✔️✔️✔️
Doris✔️✔️✔️
SelectDB✔️✔️✔️
DWS✔️✔️✔️
StarRocks✔️✔️✔️
GaussDB✔️✔️✔️
OpenGauss✔️✔️✔️
DWSMySQL✔️✔️✔️
PostgreSQL✔️✔️✔️
Oracle✔️✔️✔️
Doris✔️✔️✔️
SelectDB✔️✔️✔️
DWS✔️✔️✔️✔️
StarRocks✔️✔️✔️
GaussDB✔️✔️✔️
ClickHouseMySQL✔️✔️
PostgreSQL✔️✔️
Oracle✔️✔️
Doris✔️✔️
SelectDB✔️✔️
ClickHouse✔️✔️✔️
Greenplum✔️✔️
StarRocks✔️✔️
ADB PostgreSQLMySQL✔️✔️✔️
PostgreSQL✔️✔️✔️
Oracle✔️✔️✔️
TiDB✔️✔️✔️
ADB PostgreSQL✔️✔️✔️
OB-MySQLMySQL✔️✔️✔️
OB-MySQL✔️✔️✔️
OB-OracleOB-Oracle✔️✔️
GreenplumMySQL✔️✔️✔️
PostgreSQL✔️✔️✔️
Oracle✔️✔️✔️
Doris✔️✔️
SelectDB✔️✔️
ClickHouse✔️✔️
Greenplum✔️✔️✔️✔️
StarRocks✔️✔️
SingleStore✔️✔️
TiDB✔️✔️✔️
Hive✔️✔️✔️
SelectDBMySQL✔️✔️
PostgreSQL✔️✔️
Oracle✔️✔️
Doris✔️✔️
SelectDB✔️✔️
ClickHouse✔️✔️
Greenplum✔️✔️✔️✔️
StarRocks✔️✔️
SingleStore✔️✔️
StarRocksMySQL✔️✔️
PostgreSQL✔️✔️
Oracle✔️✔️
Doris✔️✔️
SelectDB✔️✔️
ClickHouse✔️✔️
Greenplum✔️✔️✔️✔️
StarRocks✔️✔️
SingleStore✔️✔️
SingleStoreMySQL✔️✔️
PostgreSQL✔️✔️
Oracle✔️✔️
Doris✔️✔️
SelectDB✔️✔️
ClickHouse✔️✔️
Greenplum✔️✔️✔️✔️
StarRocks✔️✔️
SingleStore✔️✔️
DB2DB2✔️✔️
RedisRedis✔️✔️✔️
MongoDBMongoDB✔️✔️✔️✔️✔️
SybasePostgreSQL✔️✔️✔️
Sybase✔️✔️✔️
KafkaMySQL✔️
ClickHouse✔️
Kafka✔️
HiveMySQL✔️✔️✔️
PostgreSQL✔️✔️✔️
Oracle✔️✔️✔️
Greenplum✔️✔️✔️
TiDB✔️✔️✔️
Hive✔️✔️
Database GroupingMySQL✔️✔️
Database Grouping✔️✔️

Typically, a complete data replication process consists of three phases:

  1. Structure replication: Synchronize the database and table structures in the source data source to the target data source.
  2. Full data replication: Synchronize all existing data from the source data source to the target data source.
  3. Incremental data replication: Synchronize ongoing data changes in the source data source to the target data source in real time.

After replication is complete, you can first verify the compatibility of the data and structure in the target data source. If the verification passes, you can switch your business to the target data source to achieve a smooth migration.

The above three stages can be performed separately. For example, you can separately perform structure replication, full data replication or incremental data replication according to business needs.

tip

You cannot perform structure replication and incremental data replication at the same time without selecting full data replication.
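The combination rule in the tip above can be expressed as a small check. This is only an illustrative sketch: the phase names `schema`, `full`, and `incremental` and the function `is_valid_selection` are hypothetical and are not NineData identifiers. Treating an empty selection as invalid is also an assumption.

```python
from typing import Set

# Hypothetical phase names for illustration; the NineData console uses its own labels.
SCHEMA, FULL, INCREMENTAL = "schema", "full", "incremental"


def is_valid_selection(phases: Set[str]) -> bool:
    """Return True if the chosen phases form an allowed combination."""
    # Assumption: at least one known phase must be selected.
    if not phases or not phases <= {SCHEMA, FULL, INCREMENTAL}:
        return False
    # Structure + incremental without full replication is the disallowed case.
    if SCHEMA in phases and INCREMENTAL in phases and FULL not in phases:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_selection({SCHEMA, FULL, INCREMENTAL}))
    print(is_valid_selection({SCHEMA, INCREMENTAL}))
```

Each phase on its own, and any pair that includes full data replication, passes the check.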

Prerequisite

  • The source and target data sources have been added to NineData. For instructions, see Adding Data Sources.

  • In scenarios that do not involve table structure replication, the target data source must contain the structure of the table to be replicated.

  • When MySQL is the source, Binlog must be enabled with the following parameter settings:

    • binlog_format=ROW
    • binlog_row_image=FULL
    tip

    If the source data source is a standby database, the log_slave_updates parameter must be turned on so that the complete Binlog can be obtained.
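One way to verify the MySQL prerequisites before creating a task is to compare the server variables against the required values. The sketch below assumes you have already collected the variables (for example from the output of `SHOW VARIABLES`) into a dictionary. The function name `check_binlog_settings` and the `is_standby` flag are hypothetical and not part of NineData.

```python
from typing import Dict, List

# Required values taken from the prerequisites above; log_bin=ON means Binlog is enabled.
REQUIRED = {"log_bin": "ON", "binlog_format": "ROW", "binlog_row_image": "FULL"}


def check_binlog_settings(variables: Dict[str, str], is_standby: bool = False) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    required = dict(REQUIRED)
    if is_standby:
        # A standby source must also write replicated changes to its own Binlog.
        required["log_slave_updates"] = "ON"
    problems = []
    for name, expected in required.items():
        actual = variables.get(name)
        # Compare case-insensitively, since servers may report values in either case.
        if actual is None or actual.upper() != expected:
            problems.append(f"{name} should be {expected}, found {actual}")
    return problems


if __name__ == "__main__":
    server_vars = {"log_bin": "ON", "binlog_format": "MIXED", "binlog_row_image": "FULL"}
    for problem in check_binlog_settings(server_vars):
        print(problem)
```

The check is deliberately offline: it only inspects values you pass in, so it can be run against a saved variable dump as well as a live server.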

Restrictions

  • Data replication applies only to user databases in the data source; system databases are not replicated. For example, the information_schema, mysql, performance_schema, and sys databases in MySQL data sources are not replicated.
  • The source account must have the SELECT privilege (to replicate the database structure and full data), the SHOW VIEW privilege (to replicate views), and the REPLICATION CLIENT and REPLICATION SLAVE privileges (to replicate incremental data) on the objects to be replicated. The target account must have DML and DDL privileges.
  • Before performing data synchronization, evaluate the performance of the source and target data sources. Full data initialization consumes read and write resources on both data sources and increases database load, so it is recommended to perform data synchronization during off-peak hours.
  • If the source data contains views, functions, stored procedures, triggers, or events, the definer of these objects is changed in the target data source to the account that the current synchronization task uses to access the target data source.
  • Ensure that each table to be synchronized has a primary key or unique constraint and that column names are unique. Otherwise, the same data may be synchronized repeatedly.
  • If the source contains triggers, the system does not synchronize them until incremental synchronization ends.
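For a MySQL source, the privileges listed above can be granted with standard `GRANT` statements. The sketch below only builds the statement text. The function name `source_grants` and the example account and database names are hypothetical, and the inputs are not escaped, so it is meant for illustration rather than for untrusted input.

```python
from typing import List


def source_grants(user: str, host: str, database: str, incremental: bool = True) -> List[str]:
    """Build GRANT statements for the source account of a replication task."""
    account = f"'{user}'@'{host}'"
    # SELECT covers structure and full data; SHOW VIEW covers views.
    statements = [f"GRANT SELECT, SHOW VIEW ON `{database}`.* TO {account};"]
    if incremental:
        # Replication privileges are global in MySQL, so they are granted ON *.*
        statements.append(
            f"GRANT REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO {account};"
        )
    return statements


if __name__ == "__main__":
    for statement in source_grants("repl_user", "%", "shop"):
        print(statement)
```

If a task does not include incremental replication, passing `incremental=False` leaves out the replication privileges.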

Steps

Commercialization Notice

NineData’s data replication product has been commercialized. You can still use 10 replication tasks for free, with the following considerations:

  • Among the 10 replication tasks, you can include 1 task, with a specification of Micro.

  • Tasks with a status of do not count towards the 10-task limit. If you have already created 10 replication tasks and want to create more, you can terminate previous replication tasks and then create new ones.

  • When creating replication tasks, you can only select the specifications you have purchased. Specifications that have not been purchased are grayed out and cannot be selected. If you need to purchase additional specifications, please contact us through the customer service icon at the bottom right of the page.

  1. Log in to the NineData Console.

  2. Click on > in the left navigation bar.

  3. On the page, click on in the top right corner.

  4. On the tab, configure according to the table below, and click .

    Parameter | Description
    Enter the name of the data synchronization task for easy search and management. Please use meaningful names. Up to 64 characters are supported.
    Data source where the synchronization object is located.
    Data source receiving the synchronized object.
    Choose what to copy to the target data source.
    • : Only synchronize the schema of the source data source, not the data.
    • : Synchronize all objects and data from the source data source, i.e., full data replication.
    • : After full synchronization, perform incremental synchronization based on the source data source's logs. Click the setting icon to uncheck certain operation types according to your needs. Once unchecked, these operations will be ignored during incremental synchronization.
    Note: You can also click to expand to select the handling strategy in case of same-named tables or identical data.
    • (Select ):
      • : Stop the task when a table with the same name is detected during the precheck stage.
      • : When a table with the same name is detected during the precheck stage, display a prompt and continue the task. During structural replication, ignore this same-named table. If you are also replicating data, the data will be appended to the same-named table without overwriting existing data.
      • : When a table with the same name is detected during the precheck stage, display a prompt and continue the task. During structural replication, delete the same-named table in the target database and replicate the table structure based on the source database. If you are also replicating data, the data will be written after the table structure replication is completed.
      • (Selectable when performing both structural and data replication): When a table with the same name is detected during the precheck stage, display a prompt and continue the task. During structural replication, retain the table structure in the target database and clear the data in the same-named table when data replication starts, then replicate from the original table.
    • (Unselected when is selected):
      • : Stop the task when data is detected in the target table during the precheck stage.
      • : Ignore the data detected in the target table during the precheck stage and append other data.
      • : Delete the data detected in the target table during the precheck stage and rewrite.
  5. On the tab, configure the following parameters, then click .

    Parameter | Description
    Choose what to replicate. You can select to replicate all contents of the source database, or choose , select the items to be replicated in the list, and click > to add them to the right list.
    (Optional)Click to add a blacklist record, select the databases or objects to be added to the blacklist. These contents will not be replicated. This is used to exclude certain databases or objects in the case of full database replication for or .
    • Left dropdown: Select the name of the database to be added to the blacklist.
    • Right dropdown: Select the objects in the corresponding database. You can click to select multiple objects; leave it blank to add the entire database to the blacklist.
    If you want to add multiple databases to the blacklist, you can click the Add button below.

    If you need to create multiple identical replication links, you can create a configuration file and import it when creating a new task. Click at the top right corner, then click Download Template to download the configuration file template to your local machine. After editing, click to upload the configuration file for batch import.

    Parameter | Description
    source_table_name | Name of the source table containing the objects to be synchronized.
    destination_table_name | Name of the target table receiving the synchronized objects.
    source_schema_name | Name of the source schema containing the objects to be synchronized.
    destination_schema_name | Name of the target schema receiving the synchronized objects.
    source_database_name | Name of the source database containing the objects to be synchronized.
    target_database_name | Name of the target database receiving the synchronized objects.
    column_list | List of columns to be synchronized.
    extra_configuration | Additional configuration information, where you can configure the following:
    • Column Mapping: column_name, destination_column_name
    • Column Value: column_value
    • Data Filtering: filter_condition
    tip
    • An example of the extra_configuration content is as follows:

      {
        "column_name": "created_time", // Original column name to be mapped
        "destination_column_name": "migrated_time", // Target column name mapped to "migrated_time"
        "column_value": "current_timestamp()", // Change the column value to the current timestamp
        "filter_condition": "id != 0" // Only synchronize rows where ID is not 0.
      }
    • For a complete example of the configuration file, refer to the downloaded template.

  6. On the tab, choose different operations based on the selected replication type. If updates occur in the source and target data sources during the configuration mapping phase, you can click the button in the upper right corner of the page to refresh the information of the source and target data sources.

    • Includes : Configure the table name after synchronization to the target data source. Click .

    • Excludes : The system automatically selects the same-named database in the target data source, if it exists; otherwise, you need to manually select the target database. The table names and column names in the target database need to match the synchronization objects. If they don't match, you can manually map the table names and column names.

    You can click on the right side of the page to customize the column names after synchronization to the target data source. Additionally, you can set , and only data that meets the filtering conditions will be synchronized to the target data source. Taking the test data table employees as an example, if you set the filtering condition to emp_no>=10005, data with emp_no less than 10005 will not be synchronized to the target data source.

  7. On the tab, wait for the system to complete the precheck. After passing the precheck, click .

    tip
    • You can check . After the synchronization task is completed, automatically start data consistency comparison based on the source data source to ensure data consistency on both ends. The startup timing of based on your selection of is as follows:
      • : Starts after structural replication is completed.
      • + , : Starts after full replication is completed.
      • + + , : Starts when incremental data is first consistent with the source data source and is 0 seconds. You can click to view the synchronization delay on the page. ![sync_delay](../image/sync_delay.png)
    • If the precheck fails, you need to click in the column on the right of the target check item, identify the reason for the failure, manually fix it, and then click to perform the precheck again until it passes.
    • For check items with as , you can repair or ignore them based on the specific situation.
  8. On the page, you will receive a prompt , indicating that the synchronization task has started. At this point, you can:

    • Click to view the execution status of the synchronization task at each stage.
    • Click to return to the task list page.
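When preparing many rows for the batch-import configuration file described in step 5, it can help to generate the extra_configuration value programmatically. The sketch below uses only the four keys shown in the example above. The function name `build_extra_configuration` is hypothetical, and it is an assumption that the template accepts plain JSON without comments; check the downloaded template for the exact format.

```python
import json
from typing import Optional


def build_extra_configuration(
    column_name: str,
    destination_column_name: str,
    column_value: Optional[str] = None,
    filter_condition: Optional[str] = None,
) -> str:
    """Return an extra_configuration value as a JSON string."""
    config = {
        "column_name": column_name,  # original column name to be mapped
        "destination_column_name": destination_column_name,  # name in the target
    }
    # Optional keys are included only when a value is supplied.
    if column_value is not None:
        config["column_value"] = column_value
    if filter_condition is not None:
        config["filter_condition"] = filter_condition
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(
        build_extra_configuration(
            "created_time",
            "migrated_time",
            column_value="current_timestamp()",
            filter_condition="id != 0",
        )
    )
```

Generating the value with `json.dumps` avoids quoting mistakes that are easy to make when editing the file by hand.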

View sync results

  1. Log in to the NineData Console.

  2. Click on > in the left navigation bar.

  3. On the page, click on for the target synchronization task. The page details are as follows.


    Serial Number | Function | Description
    1 | Synchronization Delay | The data synchronization delay between the source data source and the target data source. 0 seconds indicates no delay between the two ends. At this point, you can choose to switch your business to the target data source for smooth migration.
    2 | Configure Alerts | After configuring alerts, the system will notify you in the way you choose when the task fails. For more information, please refer to Operations and Monitoring Introduction.
    3 | More |
    • Pause: Pause the task. Only tasks in the Running state can be selected.
    • Terminate: End tasks that are incomplete or in listening mode (i.e., in incremental synchronization). After terminating the task, it cannot be restarted. Please proceed with caution. If the synchronization objects include triggers, trigger replication options will pop up; choose as needed.
    • Delete: Delete the task. Once a task is deleted, it cannot be recovered. Please proceed with caution.
    4 | Structural Replication (Displayed for scenarios that include structural replication) | Display the progress and detailed information of structural replication.
    • Click Logs on the right side of the page: View the execution logs of structural replication.
    • Click refresh on the right side of the page: View the latest information.
    • Click View DDL in the Actions column on the right of the target object in the list: View SQL playback.
    5 | Full Copy (Displayed for scenarios that include full copy) | Display the progress and detailed information of full copy.
    • Click Monitoring on the right side of the page: View various monitoring indicators during the full copy process. During full copy, you can also click Rate Limit Settings on the right of the monitoring indicator page to limit the rate of writing to the target data source in rows per second.
    • Click Logs on the right side of the page: View the execution logs of the full copy.
    • Click refresh on the right side of the page: View the latest information.
    6 | Incremental Copy (Displayed for scenarios that include incremental copy) | Display various monitoring indicators for incremental copy.
    • Click on the right of the page: View the operations currently being executed by the current copy task, including:
      • : The thread number currently in progress for the copy task.
      • : Details of the SQL statement currently being executed by the thread.
      • : The response time of the current thread. An increase in this value may indicate that the current thread may be stuck for some reason.
      • : The timestamp when the current thread started.
      • : The status of the current thread.
    • Click Rate Limit Settings on the right of the page: Limit the rate of writing to the target data source in rows per second.
    • Click Logs on the right side of the page: View the execution logs of the incremental copy.
    • Click refresh on the right side of the page: View the latest information.
    7 | Modify Object | Display the modification records of synchronized objects.
    • Click Modify Synchronized Object on the right of the page to configure synchronized objects.
    • Click refresh on the right side of the page: View the latest information.
    8 | Data Comparison | Display the results of data comparison between the source data source and the target data source, including Structure Comparison and Data Comparison. If you have not enabled data comparison, click Enable Data Comparison in the page.
    • Click Re-compare on the right side of the page: Re-initiate the comparison of data between the current source and target ends.
    • Click Logs on the right side of the page: View the execution logs of consistency comparison.
    • Click Monitoring (displayed for data comparison only): View the trend chart of RPS (records compared per second) during the comparison. Click Details to view earlier records.
    • Click details in the Actions column on the right of the comparison list (displayed on the Data Comparison page only in case of inconsistency): View detailed comparison results between the source and target ends.
    • Click sql in the Actions column on the right (displayed in case of inconsistency): Generate change SQL. You can copy this SQL directly to the target data source for execution to modify inconsistent content.
    9 | View Reverse | Displayed for bidirectional replication tasks only. Click to view the replication details from the target data source to the source data source.
    10 | Expand | Display detailed information about the current replication task, including Replication Type, Replication Object, Start Time, and more.
