MySQL to SelectDB
NineData's data replication supports both full and incremental synchronization of data from MySQL to SelectDB data sources.
Prerequisites
Ensure both the source and target data sources have been added to NineData. For instructions on how to add, please refer to Adding Data Sources.
The source database should be of MySQL or similar types, such as MySQL, MariaDB, PolarDB MySQL, TDSQL-C, GaussDB MySQL, Aurora, etc.
The target data source should be of SelectDB type.
The source data source must have Binlog enabled, and the Binlog-related parameters should be set as follows:
binlog_format
=ROW
binlog_row_image
=FULL
tipIf the source data source is a standby database, ensure the
log_slave_updates
parameter is also enabled to obtain complete Binlog logs.
Usage Restrictions
- The names of databases and tables in the source data source must adhere to SelectDB's naming conventions (begin with a letter, can contain letters, numbers, and underscores, but no spaces or other special characters).
- The data replication function is only for the user databases in the data source, and the system databases will not be replicated. For example:
information_schema
,mysql
,performance_schema
,sys
databases in MySQL type data sources will not be replicated. - The account for source data must have SELECT (for replicate database structure and full data), SHOW VIEW (for replicate views), and REPLICATION CLIENT, REPLICATION SLAVE (for replicate incremental data) privileges on the objects to be replicated. The account for target must have DML and DDL privileges.
- Before performing data synchronization, user need to evaluate the performance of the source data source and the target data source, and it is recommended to perform data synchronization during off-peak time. Otherwise, the full data initialization will occupy a certain amount of read and write resources of the source data source and the target data source, increasing database load.
- During the synchronization process, if the source data contains views, functions, stored procedures, triggers, and events, after synchronizing to the target data source, the definer of the above objects information will be modified in the target data source to the account that accesses the target data source in the current synchronization task.
- It is necessary to ensure that each table in the synchronization object has a primary key or unique constraint, and the column name is unique, otherwise the same data may be synchronized repeatedly.
- During the synchronization process, if there are triggers in the source, the system will not synchronize the triggers until the incremental synchronization ends.
Procedure
NineData’s data replication product has been commercialized. You can still use 10 replication tasks for free, with the following considerations:
Among the 10 replication tasks, you can include 1 task, with a specification of Micro.
Tasks with a status of do not count towards the 10-task limit. If you have already created 10 replication tasks and want to create more, you can terminate previous replication tasks and then create new ones.
When creating replication tasks, you can only select the you have purchased. Specifications that have not been purchased will be grayed out and cannot be selected. If you need to purchase additional specifications, please contact us through the customer service icon at the bottom right of the page.
Log in to the NineData console.
Click on > in the left navigation bar.
On the page, click on in the top right corner.
On the tab, configure as per the table below and click .
Parameter Description Enter a name for the data synchronization task, for ease of search and management. Please use a meaningful name with a maximum of 64 characters. Data source of the objects to be synchronized. Data source that will receive the synchronized objects. Choose the content to be copied to the target data source. - : Only sync the schema of the source data source, without syncing data.
- : Sync all objects and data from the source data source, i.e., full data replication.
- : After full sync, perform incremental sync based on the source data source's logs. Click the icon to deselect specific operation types as needed. After deselecting, these operations will be ignored during incremental sync.
- (to be selected when is selected)
- : Stop the task when duplicate table is detected during pre-check.
- : Send a prompt and continue the task when duplicate table is detected during pre-check. Ignore the duplicate table during schema replication. If you are also copying data, the data will be appended to the duplicate table without overwriting the existing data.
- : Send a prompt and continue the task when duplicate table is detected during pre-check. During schema replication, delete the duplicate table in the target database and re-copy the table structure based on the source database. If you are also copying data, the data will be written after the table structure replication.
- (optional when copying both schema and data): Send a prompt and continue the task when duplicate table is detected during pre-check. During schema replication, retain the table structure in the target database, clear data in the duplicate table when data replication begins, and then copy from the original table.
- (to be selected when is not selected)
- : Stop the task when data is detected in the target table during pre-check.
- : Ignore the data in the target table when data is detected during pre-check. Append and write other data.
- : Delete the data detected in the target table during pre-check and rewrite.
On the tab, configure the parameters below and click .
Parameter Description Choose the content to be copied. You can choose to copy all content from the source database, or choose , select the content to be copied in the list, and click > to add them to the list. (optional) Click to add a blacklist record. Choose the databases or objects to be included in the blacklist; these will not be copied. This is useful to exclude certain databases or objects from full replication in or . - Left dropdown: Choose the database name to be included in the blacklist.
- Right dropdown: Choose the objects in the corresponding database. You can select multiple objects by clicking, or leave it blank to include the entire database in the blacklist.
On the tab, choose different actions based on the selected replication type.
Including : Configure the table name, Key, and Distribute Key after synchronization to the target data source, and click . For different types of tables, the default Data Model and Key definitions used after replication to SelectDB are as follows.
MySQL Table Type SelectDB Data Model SelectDB Key Definition SelectDB Distribute Key Definition Tables with Primary Key Unique Model All primary key columns, following the order defined in the source database's primary key columns. All primary key columns. Tables without Primary Key but with Unique Key Unique Model All unique key columns, following the order defined in the source database's unique key columns. All unique key columns. Tables without Primary Key and Unique Key Duplicate Model By default, select the first 2 columns in the table. Same as Key. tipYou can click on the right side of the page to customize the column names after synchronization to the target data source. Additionally, you can set to only sync data that meets the filtering conditions to the target data source. For example, Taking the test data table
employees
as an example, setting the filter condition toemp_no>=10005
means that data with emp_no less than 10005 will not be synchronized to the target data source.Excluding : The target database with the same name as the target data source is automatically selected by the system. If it does not exist, you need to manually select the target database. The table and column names in the target database need to match the synchronization objects. If they do not match, you can manually map the table and column names.
On the tab, wait for the system to complete the pre-check. After passing the pre-check, click .
tip- You can select . After the synchronization task is completed, automatic data consistency comparison based on the source data source is initiated to ensure data consistency on both ends. Depending on your selection of , the initiation timing for is as follows:
- : Starts after schema replication is completed.
- + , : Starts after full replication is completed.
- + + , : Starts when incremental data is first consistent with the source data source and is 0 seconds. You can click to view the synchronization delay in the page. ![sync_delay](../image/sync_delay.png)
- If the pre-check fails, click in the column on the right of the target check item to diagnose the reason for failure. Manually fix it and click to rerun the pre-check until it passes.
- Check items with as can be repaired or ignored based on the specific situation.
- You can select . After the synchronization task is completed, automatic data consistency comparison based on the source data source is initiated to ensure data consistency on both ends. Depending on your selection of , the initiation timing for is as follows:
On the page, when prompted with , the synchronization task starts running. At this point, you can:
- Click to view the execution status of each stage of the synchronization task.
- Click to return to the task list page.
Viewing Synchronization Results
Log in to the NineData console.
Click on > in the left navigation bar.
On the page, click on the of the target synchronization task, and the page will display as follows.
Number Function Description 1 Synchronization Delay The synchronization delay between the source data source and the target data source. 0 seconds means no delay between the two ends. At this point, you can choose to switch the business to the target data source for a smooth migration. 2 Configure Alerts After configuring alerts, the system will notify you in the way you choose when the task fails. For more information, see Operation Monitoring Introduction. 3 More - Pause: Pause the task. Only tasks with the status Running can be paused.
- Terminate: End tasks that are unfinished or in progress (i.e., in incremental sync). After terminating a task, you cannot restart it, so please use caution. If the synchronization objects contain triggers, trigger copying options will appear. Choose as needed.
- Delete: Delete the task. Once the task is deleted, it cannot be recovered, so please use caution.
4 Structure Replication (displayed in scenarios involving structure replication) Display the progress and detailed information of structure replication. - Click Logs on the right side of the page: View the execution logs of structure replication.
- Click on the right side of the page: View the latest information.
- Click View DDL in the Operation column on the right of the target object in the list: View SQL playback.
5 Full Replication (displayed in scenarios involving full replication) Display the progress and detailed information of full replication. - Click Monitoring on the right side of the page: View various monitoring indicators during full replication. During full replication, you can also click Rate Limit Settings on the right of the monitoring indicator page to limit the rate at which data is written to the target data source per second. The unit is rows/second.
- Click Logs on the right side of the page: View the execution logs of full replication.
- Click on the right side of the page: View the latest information.
6 Incremental Replication (displayed in scenarios involving incremental replication) Display various monitoring indicators for incremental replication. - Click on the right side of the page: View the operations being executed by the current replication task, including:
- : The replication task is executed by multiple threads, displaying the current thread number being executed.
- : Details of SQL statements being executed by the current thread.
- : The response time of the current thread. If this value increases, it indicates that the current thread may be stuck due to certain reasons.
- : The timestamp when the current thread started.
- : The status of the current thread.
- Click Rate Limit Settings on the right: Limit the rate at which data is written to the target data source per second. The unit is rows/second.
- Click Logs on the right: View the execution logs of incremental replication.
- Click on the right: View the latest information.
7 Modify Objects Display modification records of synchronization objects. - Click Modify Synchronization Objects on the right to configure synchronization objects.
- Click on the right: View the latest information.
8 Data Comparison Display the data comparison results between the source data source and the target data source. If you have not enabled data comparison, click Enable Data Comparison in the page. - Click Recompare on the right: Reinitiate data comparison between the current source and target ends.
- Click Logs on the right: View the execution logs of consistency comparison.
- Click Monitoring on the right: View the trend chart of comparison RPS (records per second). Click Details to view records from earlier times.
- Click in the Operation column on the right of the comparison list (displayed in case of inconsistencies): View the comparison details between the source and target ends.
- Click in the Operation column on the right of the comparison list (displayed in case of inconsistencies): Generate change SQL. You can directly copy this SQL to the target data source for execution to modify the inconsistent content.
9 Expand Display detailed information of the current replication task, including Replication Type, Replication Objects, Start Time, and so on.
Appendix 1: Mapping Table for MySQL and SelectDB Data Types
During the data replication process, MySQL data types are mapped to corresponding SelectDB data types.
Category | MySQL Data Type | SelectDB Data Type |
---|---|---|
Numeric | TINYINT | TINYINT |
TINYINT UNSIGNED | SMALLINT | |
SMALLINT | SMALLINT | |
SMALLINT UNSIGNED | INT | |
MEDIUMINT | INT | |
MEDIUMINT UNSIGNED | BIGINT | |
INT | INT | |
INT UNSIGNED | BIGINT | |
BIGINT | BIGINT | |
BIGINT UNSIGNED | LARGEINT | |
BIT(M) | INT | |
Decimal | Decimal | |
Numeric | Decimal | |
Float | Float | |
Double | DOUBLE | |
BOOL|BOOLEAN | BOOLEAN | |
DATE AND TIME | DATE | DATEV2 (SelectDB version ≥ 2.2.x) DATE (SelectDB version < 2.2.x) |
DATETIME[(fsp)] | DATETIME (source without FSP parameter and SelectDB version ≥ 2.2.x, or SelectDB version ≤ 2.2.x) DATETIMEV2 (source with FSP parameter and SelectDB version ≥ 2.2.x) | |
Timestamp[(fsp)] | DATETIME (source without FSP parameter and SelectDB version ≥ 2.2.x, or SelectDB version ≤ 2.2.x) DATETIMEV2 (source with FSP parameter and SelectDB version ≥ 2.2.x) | |
Time[(fsp)] | VARCHAR | |
YEAR[(4)] | INT | |
STRING | CHAR|VARCHAR | CHAR |
VARCHAR | VARCHAR Note: To avoid data migration loss, VARCHAR(n) will be adjusted to VARCHAR(3*n) after migration to SelectDB. | |
BINARY|VARBINARY | STRING | |
TINYTEXT|TEXT|MEDIUMTEXT|LONGTEXT | STRING | |
TINYBLOB|BLOB|MEDIUMBOLB|LONGBLOB | STRING | |
ENUM | STRING | |
SET | STRING | |
JSON | JSON | JSONB (SelectDB version ≥ 2.2.x) STRING (SelectDB version < 2.2.x) |
Appendix 2: Checklist of Pre-Check Items
Check Item | Check Content |
---|---|
Target Database Data Existence Check | Check if data exists for the objects to be replicated in the target database |
Target Database Same Name Object Existence Check | Check if the objects to be replicated already exist in the target database |
Check Object Name Compliance | Check if the database name and table name comply with Doris naming conventions |
Source Data Source Connection Check | Check the status of the source data source gateway, instance accessibility, and accuracy of username and password |
Target Data Source Connection Check | Check the status of the target data source gateway, instance accessibility, and accuracy of username and password |
Target Database Permission Check | Check if the account permissions in the target database meet the requirements |
Source Database Permission Check | Check if the account permissions in the source database meet the requirements |
Check if Source Database log_slave_updates is Supported | Check if log_slave_updates is set to ON when the source database is a slave |
Source Data Source and Target Data Source Version Check | Check if the versions of the source database and target database are compatible |
Check if Source Database is Enabled with Binlog | Check if the source database is enabled with Binlog |
Check if Source Database Binlog Format is Supported | Check if the binlog format of the source database is 'ROW' |
Check if Source Database binlog_row_image is Supported | Check if the binlog_row_image of the source database is 'FULL' |