MySQL to Doris
NineData data copying supports full data synchronization and incremental data synchronization from MySQL to Doris data sources.
Prerequisites
The source data source and target data source have been added to NineData. For instructions on how to add a data source, see Adding Data Sources.
The database type of the source is MySQL or a MySQL-like database, such as MySQL, MariaDB, PolarDB MySQL, TDSQL-C, GaussDB MySQL, Aurora, etc.
The target database is Doris.
The source data source must have Binlog enabled, and the Binlog-related parameters are set as follows:
binlog_format
=ROW
binlog_row_image
=FULL
tipIf the source data source is a standby database, to ensure complete Binlog logs, you also need to enable the
log_slave_updates
parameter.
Usage Limitations
The table names in the source data source must comply with Doris' table naming rules (must start with a letter, can contain letters, numbers, and underscores, but cannot contain spaces or other special characters).
The data copying function is only for user databases in the data source; system databases will not be copied. For example, databases like
information_schema
,mysql
,performance_schema
, andsys
in MySQL-type data sources will not be copied.The account in the source data source must have
SELECT
permission for the objects to be copied (structure copying, full copying),REPLICATION CLIENT
,REPLICATION SLAVE
permission (incremental copying). The account in the target data source must have DML, DDL permissions.Evaluate the performance of the source data source and target data source before executing data synchronization, and it is recommended to execute data synchronization during the business off-peak period. Otherwise, during full data initialization, it will consume certain read and write resources of the source data source and target data source, leading to an increase in database load.
In the case of table-level synchronization, do not use online DDL change tools (e.g., gh-ost, pt-online-schema-change) to modify the synchronization objects in the source database during the synchronization process, as it may cause synchronization failure.
Ensure that each table in the synchronization object has a primary key or unique constraint, and column names are unique, otherwise, it may result in duplicate synchronization of the same data.
Operation Steps
NineData’s data replication product has been commercialized. You can still use 10 replication tasks for free, with the following considerations:
Among the 10 replication tasks, you can include 1 task, with a specification of Micro.
Tasks with a status of do not count towards the 10-task limit. If you have already created 10 replication tasks and want to create more, you can terminate previous replication tasks and then create new ones.
When creating replication tasks, you can only select the you have purchased. Specifications that have not been purchased will be grayed out and cannot be selected. If you need to purchase additional specifications, please contact us through the customer service icon at the bottom right of the page.
Log in to the NineData Console.
Click on > in the left navigation bar.
On the page, click on in the upper right corner.
On the tab, configure as per the table below and click on .
Parameter Description Enter a name for the data synchronization task. For ease of future retrieval and management, please use a meaningful name. Maximum support is 64 characters. The data source where the synchronization object resides. The data source that will receive the synchronized object. Choose the content to be copied to the target data source. - : Only syncs the database table structure from the source data source, without syncing the data.
- : Syncs all objects and data from the source data source, i.e., full data copy.
- : After full sync, performs incremental sync based on the source data source's logs. Click the icon to selectively uncheck certain operation types according to your needs. These unchecked operations will be ignored during incremental sync.
- (to be selected with )
- : Stop the task when a table with the same name is detected during pre-check.
- : Prompt and continue the task when a table with the same name is detected during pre-check. Ignore this table during structure copy. If you're also doing data copy, the data will be appended to the table without overwriting existing data.
- : Prompt and continue the task when a table with the same name is detected during pre-check. During structure copy, delete the same-named table in the target database and recopy the table structure based on the source database. If you're also doing data copy, the data will be written after the table structure copy.
- (optional when doing both structure and data copy): Prompt and continue the task when a table with the same name is detected during pre-check. During structure copy, keep the table structure in the target database and clear the data in the same-named table when data copy starts, then recopy from the original table.
- (to be selected when is not chosen)
- : Stop the task when data exists in the target table is detected during pre-check.
- : Ignore the portion of data in the target table detected during pre-check and append other data.
- : Delete the portion of data in the target table detected during pre-check and rewrite.
On the tab, configure the following parameters and then click .
Parameter Description Choose the content to be copied. You can choose to copy all contents from the source database, or choose . In list, select the content you want to copy, and click > to add it to the right list. (optional) Click to add a blacklist record. Choose the database or object to be added to the blacklist; these contents will not be copied. This is used to exclude certain databases or objects in the full database copy of or . - Left dropdown: Choose the database name to be added to the blacklist.
- Right dropdown: Choose the objects in the corresponding database. You can click to select multiple objects; leave it blank to include the entire database in the blacklist.
On the tab, choose different operations based on the selected copy type.
Including : Configure the table name, Key, and Distribute Key after syncing the target table to the target data source, then click . For different types of tables, the default Data Model and Key definitions used after copying to Doris are as follows.
MySQL Table Type Doris Data Model Doris Key Definition Doris Distribute Key Definition Tables with primary key Unique Model All primary key columns, in the order defined by the source database's primary key column definition. All primary key columns. Tables without primary key but with unique key Unique Model All unique key columns, in the order defined by the source database's unique key definition. All unique key columns. Tables without primary key and unique key Duplicate Model Default selection of the first 2 columns in the table. Keep the same as Key. tipYou can click on the right side of the page to customize the names of columns after synchronization to the target data source. Additionally, you can set to synchronize only the data that meets the filtering conditions to the target data source. For example, Taking the test data table
employees
as an example, setting the filtering condition toemp_no>=10005
means that data with emp_no less than 10005 will not be synchronized to the target data source.Excluding : The target database with the same name as the target data source is automatically selected by the system. If it doesn't exist, you need to manually select the target database. The table names and column names in the target database need to be consistent with the synchronized objects. If they are not, you can manually map table names and column names.
On the tab, wait for the system to complete the pre-check. After a successful pre-check, click .
tip- You can check . After the synchronization task is completed, automatically start data consistency comparison based on the source data source. The startup time of depends on your choice of :
- : Starts after structure copy is completed.
- + , : Starts after full copy is completed.
- + + , : Starts when incremental data is consistent with the source data source for the first time and is set to 0 seconds. You can click to view the synchronization delay in the page.![sync_delay](../image/sync_delay.png)
- If the pre-check fails, you need to click in the column on the right side of the target check item to investigate the cause of the failure, manually fix it, and click to rerun the pre-check until it passes.
- For check items where is , you can fix or ignore based on the specific situation.
- You can check . After the synchronization task is completed, automatically start data consistency comparison based on the source data source. The startup time of depends on your choice of :
On the page, when prompted with , the synchronization task begins to run. At this point, you can perform the following operations:
- Click to view the execution status of the synchronization task at each stage.
- Click to return to the task list page .
View Synchronization Results
Log in to the NineData Console.
Click on > in the left navigation bar.
On the page, click on the of the target synchronization task. The page is explained as follows.
Number Function Description 1 Synchronization Delay The data synchronization delay between the source data source and the target data source. 0 seconds indicates no delay between the two ends. At this point, you can choose to switch the business to the target data source to achieve a smooth migration. 2 Configure Alerts After configuring alerts, the system will notify you through the selected method in case of task failure. For more information, see Operations Monitoring Introduction. 3 More - Pause: Pause the task. Only tasks with a status of Running are available for selection.
- Terminate: End tasks that are incomplete or in monitoring (i.e., in incremental synchronization). After terminating the task, it cannot be restarted, so please proceed with caution. If the synchronization objects contain triggers, trigger copy options will pop up. Select as needed.
- Delete: Delete the task. Once the task is deleted, it cannot be recovered, so please proceed with caution.
4 Structure Copy (displayed in scenarios including structure copy) Display the progress and detailed information of structure copy. - Click Logs on the right side of the page: View the execution logs of structure copy.
- Click on the right side of the page: View the latest information.
- Click View DDL in the Operation column on the right side of the target object in the list: View SQL playback.
5 Full Copy (displayed in scenarios including full copy) Display the progress and detailed information of full copy. - Click Monitoring on the right side of the page: View various monitoring indicators during full copy. During full copy, you can also click Rate Limit Settings on the right side of the monitoring indicator page to limit the rate at which data is written to the target data source. The unit is rows/second.
- Click Logs on the right side of the page: View the execution logs of full copy.
- Click on the right side of the page: View the latest information.
6 Incremental Copy (displayed in scenarios including incremental copy) Display various monitoring indicators for incremental copy. - Click View Threads on the right side of the page under : View the operations currently being performed by the current replication task, including:
- : Shows the thread number currently in progress for the replication task, which is divided into multiple threads for replication operations.
- : Details of the SQL statement being executed by the current thread.
- : The response time of the current thread. If this value increases, it indicates that the current thread may be stuck due to certain reasons.
- : The timestamp when the current thread was started.
- : The status of the current thread.
- Click Rate Limit Settings on the right side of the page: Limit the rate at which data is written to the target data source. The unit is rows/second.
- Click Logs on the right side of the page: View the execution logs of incremental copy.
- Click on the right side of the page: View the latest information.
7 Modify Object Display the modification records of synchronized objects. - Click Modify Synchronized Object on the right side of the page to configure the synchronized object .
- Click on the right side of the page: View the latest information.
8 Data Comparison Display the results of data comparison between the source data source and the target data source. If you have not enabled data comparison, click Enable Data Comparison in the page. - Click Redo Comparison on the right side of the page: Reinitiate a comparison of data between the current source and target.
- Click Logs on the right side of the page: View the execution logs of consistency comparison.
- Click Monitoring on the right side of the page: View the trend chart of RPS (records per second) for comparison. Click Details to view records from earlier times.
- Click in the Operation column in the comparison list (displayed in case of inconsistency): View details of the source and target comparisons.
- Click in the Operation column in the comparison list (displayed in case of inconsistency): Generate change SQL. You can copy this SQL directly to the target data source for execution to modify inconsistent content.
9 Expand Display detailed information about the current replication task, including Replication Type, Replication Object, Start Time, etc.
Appendix 1: Mapping Table of MySQL and Doris Data Types
During the data replication process, MySQL data types will be mapped to the corresponding Doris data types.
Category | MySQL Data Type | Doris Data Type |
---|---|---|
Numeric | TINYINT | TINYINT |
TINYINT UNSIGNED | SMALLINT | |
SMALLINT | SMALLINT | |
SMALLINT UNSIGNED | INT | |
MEDIUMINT | INT | |
MEDIUMINT UNSIGNED | BIGINT | |
INT | INT | |
INT UNSIGNED | BIGINT | |
BIGINT | BIGINT | |
BIGINT UNSIGNED | LARGEINT | |
BIT(M) | INT | |
Decimal | Decimal | |
Numeric | Decimal | |
Float | Float | |
Double | DOUBLE | |
BOOL|BOOLEAN | BOOLEAN | |
DATE AND TIME | DATE | DATEV2 (Doris version ≥ 1.2.0) DATE (Doris version < 1.2.0) |
DATETIME[(fsp)] | DATETIME (without FSP parameter from the source and Doris version ≥ 1.2.0, or Doris version ≤ 1.2.0) DATETIMEV2 (with FSP parameter from the source and Doris version ≥ 1.2.0) | |
Timestamp[(fsp)] | DATETIME (without FSP parameter from the source and Doris version ≥ 1.2.0, or Doris version ≤ 1.2.0) DATETIMEV2 (with FSP parameter from the source and Doris version ≥ 1.2.0) | |
Time[(fsp)] | VARCHAR | |
YEAR[(4)] | INT | |
STRING | CHAR|VARCHAR | CHAR |
VARCHAR | VARCHAR Note: To avoid data migration loss, VARCHAR(n) will be adjusted to VARCHAR(3*n) after migration to Doris. | |
BINARY|VARBINARY | STRING | |
TINYTEXT|TEXT|MEDIUMTEXT|LONGTEXT | STRING | |
TINYBLOB|BLOB|MEDIUMBOLB|LONGBLOB | STRING | |
ENUM | STRING | |
SET | STRING | |
JSON | JSON | JSONB (Doris version ≥ 1.2.0) STRING (Doris version < 1.2.0) |
Appendix 2: Checklist of Pre-Check Items
Check Item | Check Content |
---|---|
Target Database Data Existence Check | Check if data exists for the objects to be replicated in the target database |
Target Database Same Name Object Existence Check | Check if the objects to be replicated already exist in the target database |
Check Object Name Compliance | Check if the database name and table name comply with Doris naming conventions |
Source Data Source Connection Check | Check the status of the source data source gateway, instance accessibility, and accuracy of username and password |
Target Data Source Connection Check | Check the status of the target data source gateway, instance accessibility, and accuracy of username and password |
Target Database Permission Check | Check if the account permissions in the target database meet the requirements |
Source Database Permission Check | Check if the account permissions in the source database meet the requirements |
Check if Source Database log_slave_updates is Supported | Check if log_slave_updates is set to ON when the source database is a slave |
Source Data Source and Target Data Source Version Check | Check if the versions of the source database and target database are compatible |
Check if Source Database is Enabled with Binlog | Check if the source database is enabled with Binlog |
Check if Source Database Binlog Format is Supported | Check if the binlog format of the source database is 'ROW' |
Check if Source Database binlog_row_image is Supported | Check if the binlog_row_image of the source database is 'FULL' |