Skip to main content

MySQL to Doris

NineData data copying supports full data synchronization and incremental data synchronization from MySQL to Doris data sources.

Prerequisites

  • The source data source and target data source have been added to NineData. For instructions on how to add a data source, see Adding Data Sources.

  • The database type of the source is MySQL or a MySQL-like database, such as MySQL, MariaDB, PolarDB MySQL, TDSQL-C, GaussDB MySQL, Aurora, etc.

  • The target database is Doris 1.2.0, 1.1.0, 1.0.0, or 0.15.0 version.

  • The source data source must have Binlog enabled, and the Binlog-related parameters are set as follows:

    • binlog_format=ROW
    • binlog_row_image=FULL
    tip

    If the source data source is a standby database, to ensure complete Binlog logs, you also need to enable the log_slave_updates parameter.

Usage Limitations

  • The table names in the source data source must comply with Doris' table naming rules (must start with a letter, can contain letters, numbers, and underscores, but cannot contain spaces or other special characters).

  • The data copying function is only for user databases in the data source; system databases will not be copied. For example, databases like information_schema, mysql, performance_schema, and sys in MySQL-type data sources will not be copied.

  • The account in the source data source must have SELECT permission for the objects to be copied (structure copying, full copying), REPLICATION CLIENT, REPLICATION SLAVE permission (incremental copying). The account in the target data source must have DML, DDL permissions.

  • Evaluate the performance of the source data source and target data source before executing data synchronization, and it is recommended to execute data synchronization during the business off-peak period. Otherwise, during full data initialization, it will consume certain read and write resources of the source data source and target data source, leading to an increase in database load.

  • In the case of table-level synchronization, do not use online DDL change tools (e.g., gh-ost, pt-online-schema-change) to modify the synchronization objects in the source database during the synchronization process, as it may cause synchronization failure.

  • Ensure that each table in the synchronization object has a primary key or unique constraint, and column names are unique, otherwise, it may result in duplicate synchronization of the same data.

Operation Steps

  1. Log in to the NineData Console.

  2. Click on > in the left navigation bar.

  3. On the page, click on in the upper right corner.

  4. On the tab, configure as per the table below and click on .

    Parameter
    Description
    Enter a name for the data synchronization task. For ease of future retrieval and management, please use a meaningful name. Maximum support is 64 characters.
    The data source where the synchronization object resides.
    The data source that will receive the synchronized object.
    Choose the content to be copied to the target data source.
    • : Only syncs the database table structure from the source data source, without syncing the data.
    • : Syncs all objects and data from the source data source, i.e., full data copy.
    • : After full sync, performs incremental sync based on the source data source's logs. Click the setting icon to selectively uncheck certain operation types according to your needs. These unchecked operations will be ignored during incremental sync.
    Note: You can also click to expand to choose the handling strategy for tables with the same name or data in the target.
    • (to be selected with )
      • : Stop the task when a table with the same name is detected during pre-check.
      • : Prompt and continue the task when a table with the same name is detected during pre-check. Ignore this table during structure copy. If you're also doing data copy, the data will be appended to the table without overwriting existing data.
      • : Prompt and continue the task when a table with the same name is detected during pre-check. During structure copy, delete the same-named table in the target database and recopy the table structure based on the source database. If you're also doing data copy, the data will be written after the table structure copy.
      • (optional when doing both structure and data copy): Prompt and continue the task when a table with the same name is detected during pre-check. During structure copy, keep the table structure in the target database and clear the data in the same-named table when data copy starts, then recopy from the original table.
    • (to be selected when is not chosen)
      • : Stop the task when data exists in the target table is detected during pre-check.
      • : Ignore the portion of data in the target table detected during pre-check and append other data.
      • : Delete the portion of data in the target table detected during pre-check and rewrite.
  5. On the tab, configure the following parameters and then click .

    Parameter
    Description
    Choose the content to be copied. You can choose to copy all contents from the source database, or choose . In list, select the content you want to copy, and click > to add it to the right list.
    (optional)Click to add a blacklist record. Choose the database or object to be added to the blacklist; these contents will not be copied. This is used to exclude certain databases or objects in the full database copy of or .
    • Left dropdown: Choose the database name to be added to the blacklist.
    • Right dropdown: Choose the objects in the corresponding database. You can click to select multiple objects; leave it blank to include the entire database in the blacklist.
    If you want to add multiple databases to the blacklist, you can click the Add button below.
  6. On the tab, choose different operations based on the selected copy type.

    • Including : Configure the table name, Key, and Distribute Key after syncing the target table to the target data source, then click . For different types of tables, the default Data Model and Key definitions used after copying to Doris are as follows.

      MySQL Table TypeDoris Data ModelDoris Key DefinitionDoris Distribute Key Definition
      Tables with primary keyUnique ModelAll primary key columns, in the order defined by the source database's primary key column definition.All primary key columns.
      Tables without primary key but with unique keyUnique ModelAll unique key columns, in the order defined by the source database's unique key definition.All unique key columns.
      Tables without primary key and unique keyDuplicate ModelDefault selection of the first 2 columns in the table.Keep the same as Key.
      tip

      You can click on the right side of the page to customize the names of columns after synchronization to the target data source. Additionally, you can set to synchronize only the data that meets the filtering conditions to the target data source. For example, using the MySQL official test data table "employees" as an example, setting the filtering condition to emp_no>=10005 means that data with emp_no less than 10005 will not be synchronized to the target data source.

    • Excluding : The target database with the same name as the target data source is automatically selected by the system. If it doesn't exist, you need to manually select the target database. The table names and column names in the target database need to be consistent with the synchronized objects. If they are not, you can manually map table names and column names.

  7. On the tab, wait for the system to complete the pre-check. After a successful pre-check, click .

    tip
    • You can check . After the synchronization task is completed, automatically start data consistency comparison based on the source data source. The startup time of depends on your choice of :
      • : Starts after structure copy is completed.
      • + , : Starts after full copy is completed.
      • + + , : Starts when incremental data is consistent with the source data source for the first time and is set to 0 seconds. You can click to view the synchronization delay in the page.![sync_delay](../image/sync_delay.png)
    • If the pre-check fails, you need to click in the column on the right side of the target check item to investigate the cause of the failure, manually fix it, and click to rerun the pre-check until it passes.
    • For check items where is , you can fix or ignore based on the specific situation.
  8. On the page, when prompted with , the synchronization task begins to run. At this point, you can perform the following operations:

    • Click to view the execution status of the synchronization task at each stage.
    • Click to return to the task list page .

View Synchronization Results

  1. Log in to the NineData Console.

  2. Click on > in the left navigation bar.

  3. On the page, click on the of the target synchronization task. The page is explained as follows.

    result

    Number
    Function
    Description
    1Synchronization DelayThe data synchronization delay between the source data source and the target data source. 0 seconds indicates no delay between the two ends. At this point, you can choose to switch the business to the target data source to achieve a smooth migration.
    2Configure AlertsAfter configuring alerts, the system will notify you through the selected method in case of task failure. For more information, see Operations Monitoring Introduction.
    3More
    • Pause: Pause the task. Only tasks with a status of Running are available for selection.
    • Terminate: End tasks that are incomplete or in monitoring (i.e., in incremental synchronization). After terminating the task, it cannot be restarted, so please proceed with caution. If the synchronization objects contain triggers, trigger copy options will pop up. Select as needed.
    • Delete: Delete the task. Once the task is deleted, it cannot be recovered, so please proceed with caution.
    4Structure Copy (displayed in scenarios including structure copy)Display the progress and detailed information of structure copy.
    • Click Logs on the right side of the page: View the execution logs of structure copy.
    • Click refresh on the right side of the page: View the latest information.
    • Click View DDL in the Operation column on the right side of the target object in the list: View SQL playback.
    5Full Copy (displayed in scenarios including full copy)Display the progress and detailed information of full copy.
    • Click Monitoring on the right side of the page: View various monitoring indicators during full copy. During full copy, you can also click Rate Limit Settings on the right side of the monitoring indicator page to limit the rate at which data is written to the target data source. The unit is rows/second.
    • Click Logs on the right side of the page: View the execution logs of full copy.
    • Click refresh on the right side of the page: View the latest information.
    6Incremental Copy (displayed in scenarios including incremental copy)Display various monitoring indicators for incremental copy.
    • Click View Threads on the right side of the page under : View the operations currently being performed by the current replication task, including:
      • : Shows the thread number currently in progress for the replication task, which is divided into multiple threads for replication operations.
      • : Details of the SQL statement being executed by the current thread.
      • : The response time of the current thread. If this value increases, it indicates that the current thread may be stuck due to certain reasons.
      • : The timestamp when the current thread was started.
      • : The status of the current thread.
    • Click Rate Limit Settings on the right side of the page: Limit the rate at which data is written to the target data source. The unit is rows/second.
    • Click Logs on the right side of the page: View the execution logs of incremental copy.
    • Click refresh on the right side of the page: View the latest information.
    7Modify ObjectDisplay the modification records of synchronized objects.
    • Click Modify Synchronized Object on the right side of the page to configure the synchronized object .
    • Click refresh on the right side of the page: View the latest information.
    8Data ComparisonDisplay the results of data comparison between the source data source and the target data source. If you have not enabled data comparison, click Enable Data Comparison in the page.
    • Click Redo Comparison on the right side of the page: Reinitiate a comparison of data between the current source and target.
    • Click Logs on the right side of the page: View the execution logs of consistency comparison.
    • Click Monitoring on the right side of the page: View the trend chart of RPS (records per second) for comparison. Click Details to view records from earlier times.
    • Click details in the Operation column in the comparison list (displayed in case of inconsistency): View details of the source and target comparisons.
    • Click sql in the Operation column in the comparison list (displayed in case of inconsistency): Generate change SQL. You can copy this SQL directly to the target data source for execution to modify inconsistent content.
    9ExpandDisplay detailed information about the current replication task, including Replication Type, Replication Object, Start Time, etc.

Appendix 1: Mapping Table of MySQL and Doris Data Types

During the data replication process, MySQL data types will be mapped to the corresponding Doris data types.

Category MySQL Data Type Doris Data Type
Numeric TINYINT TINYINT
TINYINT UNSIGNED SMALLINT
SMALLINT SMALLINT
SMALLINT UNSIGNED INT
MEDIUMINT INT
MEDIUMINT UNSIGNED BIGINT
INT INT
INT UNSIGNED BIGINT
BIGINT BIGINT
BIGINT UNSIGNED LARGEINT
BIT(M) INT
Decimal Decimal
Numeric Decimal
Float Float
Double DOUBLE
BOOL|BOOLEAN BOOLEAN
DATE AND TIME DATE DATEV2 (Doris version ≥ 1.2.0)
DATE (Doris version < 1.2.0)
DATETIME[(fsp)] DATETIME (without FSP parameter from the source and Doris version ≥ 1.2.0, or Doris version ≤ 1.2.0)
DATETIMEV2 (with FSP parameter from the source and Doris version ≥ 1.2.0)
Timestamp[(fsp)] DATETIME (without FSP parameter from the source and Doris version ≥ 1.2.0, or Doris version ≤ 1.2.0)
DATETIMEV2 (with FSP parameter from the source and Doris version ≥ 1.2.0)
Time[(fsp)] VARCHAR
YEAR[(4)] INT
STRING CHAR|VARCHAR CHAR
VARCHAR VARCHAR
Note: To avoid data migration loss, VARCHAR(n) will be adjusted to VARCHAR(3*n) after migration to Doris.
BINARY|VARBINARY STRING
TINYTEXT|TEXT|MEDIUMTEXT|LONGTEXT STRING
TINYBLOB|BLOB|MEDIUMBOLB|LONGBLOB STRING
ENUM STRING
SET STRING
JSON JSON JSONB (Doris version ≥ 1.2.0)
STRING (Doris version < 1.2.0)

Appendix 2: Checklist of Pre-Check Items

Check ItemCheck Content
Target Database Data Existence CheckCheck if data exists for the objects to be replicated in the target database
Target Database Same Name Object Existence CheckCheck if the objects to be replicated already exist in the target database
Check Object Name ComplianceCheck if the database name and table name comply with Doris naming conventions
Source Data Source Connection CheckCheck the status of the source data source gateway, instance accessibility, and accuracy of username and password
Target Data Source Connection CheckCheck the status of the target data source gateway, instance accessibility, and accuracy of username and password
Target Database Permission CheckCheck if the account permissions in the target database meet the requirements
Source Database Permission CheckCheck if the account permissions in the source database meet the requirements
Check if Source Database log_slave_updates is SupportedCheck if log_slave_updates is set to ON when the source database is a slave
Source Data Source and Target Data Source Version CheckCheck if the versions of the source database and target database are compatible
Check if Source Database is Enabled with BinlogCheck if the source database is enabled with Binlog
Check if Source Database Binlog Format is SupportedCheck if the binlog format of the source database is 'ROW'
Check if Source Database binlog_row_image is SupportedCheck if the binlog_row_image of the source database is 'FULL'

Introduction to Data Replication