Skip to main content

MySQL to SelectDB

NineData's data replication supports both full and incremental synchronization of data from MySQL to SelectDB data sources.

Prerequisites

  • Ensure both the source and target data sources have been added to NineData. For instructions on how to add, please refer to Adding Data Sources.

  • The source database should be of MySQL or similar types, such as MySQL, MariaDB, PolarDB MySQL, TDSQL-C, GaussDB MySQL, Aurora, etc.

  • The target data source should be of SelectDB type.

  • The source data source must have Binlog enabled, and the Binlog-related parameters should be set as follows:

    • binlog_format=ROW
    • binlog_row_image=FULL
    tip

    If the source data source is a standby database, ensure the log_slave_updates parameter is also enabled to obtain complete Binlog logs.

Usage Restrictions

  • The names of databases and tables in the source data source must adhere to SelectDB's naming conventions (begin with a letter, can contain letters, numbers, and underscores, but no spaces or other special characters).
  • The data replication function is only for the user databases in the data source, and the system databases will not be replicated. For example: information_schema, mysql, performance_schema, sys databases in MySQL type data sources will not be replicated.
  • The account for source data must have SELECT (for replicate database structure and full data), SHOW VIEW (for replicate views), and REPLICATION CLIENT, REPLICATION SLAVE (for replicate incremental data) privileges on the objects to be replicated. The account for target must have DML and DDL privileges.
  • Before performing data synchronization, user need to evaluate the performance of the source data source and the target data source, and it is recommended to perform data synchronization during off-peak time. Otherwise, the full data initialization will occupy a certain amount of read and write resources of the source data source and the target data source, increasing database load.
  • During the synchronization process, if the source data contains views, functions, stored procedures, triggers, and events, after synchronizing to the target data source, the definer of the above objects information will be modified in the target data source to the account that accesses the target data source in the current synchronization task.
  • It is necessary to ensure that each table in the synchronization object has a primary key or unique constraint, and the column name is unique, otherwise the same data may be synchronized repeatedly.
  • During the synchronization process, if there are triggers in the source, the system will not synchronize the triggers until the incremental synchronization ends.

Procedure

  1. Log in to the NineData console.

  2. Click on > in the left navigation bar.

  3. On the page, click on in the top right corner.

  4. On the tab, configure as per the table below and click .

    Parameter
    Description
    Enter a name for the data synchronization task, for ease of search and management. Please use a meaningful name with a maximum of 64 characters.
    Data source of the objects to be synchronized.
    Data source that will receive the synchronized objects.
    Choose the content to be copied to the target data source.
    • : Only sync the schema of the source data source, without syncing data.
    • : Sync all objects and data from the source data source, i.e., full data replication.
    • : After full sync, perform incremental sync based on the source data source's logs. Click the setting icon to deselect specific operation types as needed. After deselecting, these operations will be ignored during incremental sync.
    Note: You can also click on to choose the handling strategy when there are objects with the same name or data in the target table.
    • (to be selected when is selected)
      • : Stop the task when duplicate table is detected during pre-check.
      • : Send a prompt and continue the task when duplicate table is detected during pre-check. Ignore the duplicate table during schema replication. If you are also copying data, the data will be appended to the duplicate table without overwriting the existing data.
      • : Send a prompt and continue the task when duplicate table is detected during pre-check. During schema replication, delete the duplicate table in the target database and re-copy the table structure based on the source database. If you are also copying data, the data will be written after the table structure replication.
      • (optional when copying both schema and data): Send a prompt and continue the task when duplicate table is detected during pre-check. During schema replication, retain the table structure in the target database, clear data in the duplicate table when data replication begins, and then copy from the original table.
    • (to be selected when is not selected)
      • : Stop the task when data is detected in the target table during pre-check.
      • : Ignore the data in the target table when data is detected during pre-check. Append and write other data.
      • : Delete the data detected in the target table during pre-check and rewrite.
  5. On the tab, configure the parameters below and click .

    Parameter
    Description
    Choose the content to be copied. You can choose to copy all content from the source database, or choose , select the content to be copied in the list, and click > to add them to the list.
    (optional)Click to add a blacklist record. Choose the databases or objects to be included in the blacklist; these will not be copied. This is useful to exclude certain databases or objects from full replication in or .
    • Left dropdown: Choose the database name to be included in the blacklist.
    • Right dropdown: Choose the objects in the corresponding database. You can select multiple objects by clicking, or leave it blank to include the entire database in the blacklist.
    If you want to include multiple databases in the blacklist, click the Add button below.
  6. On the tab, choose different actions based on the selected replication type.

    • Including : Configure the table name, Key, and Distribute Key after synchronization to the target data source, and click . For different types of tables, the default Data Model and Key definitions used after replication to SelectDB are as follows.

      MySQL Table TypeSelectDB Data ModelSelectDB Key DefinitionSelectDB Distribute Key Definition
      Tables with Primary KeyUnique ModelAll primary key columns, following the order defined in the source database's primary key columns.All primary key columns.
      Tables without Primary Key but with Unique KeyUnique ModelAll unique key columns, following the order defined in the source database's unique key columns.All unique key columns.
      Tables without Primary Key and Unique KeyDuplicate ModelBy default, select the first 2 columns in the table.Same as Key.
      tip

      You can click on the right side of the page to customize the column names after synchronization to the target data source. Additionally, you can set to only sync data that meets the filtering conditions to the target data source. For example, using the MySQL official test data "employees" table as an example, setting the filter condition to emp_no>=10005 means that data with emp_no less than 10005 will not be synchronized to the target data source.

    • Excluding : The target database with the same name as the target data source is automatically selected by the system. If it does not exist, you need to manually select the target database. The table and column names in the target database need to match the synchronization objects. If they do not match, you can manually map the table and column names.

  7. On the tab, wait for the system to complete the pre-check. After passing the pre-check, click .

    tip
    • You can select . After the synchronization task is completed, automatic data consistency comparison based on the source data source is initiated to ensure data consistency on both ends. Depending on your selection of , the initiation timing for is as follows:
      • : Starts after schema replication is completed.
      • + , : Starts after full replication is completed.
      • + + , : Starts when incremental data is first consistent with the source data source and is 0 seconds. You can click to view the synchronization delay in the page. ![sync_delay](../image/sync_delay.png)
    • If the pre-check fails, click in the column on the right of the target check item to diagnose the reason for failure. Manually fix it and click to rerun the pre-check until it passes.
    • Check items with as can be repaired or ignored based on the specific situation.
  8. On the page, when prompted with , the synchronization task starts running. At this point, you can:

    • Click to view the execution status of each stage of the synchronization task.
    • Click to return to the task list page.

Viewing Synchronization Results

  1. Log in to the NineData console.

  2. Click on > in the left navigation bar.

  3. On the page, click on the of the target synchronization task, and the page will display as follows.

    result-selectdb

    Number
    Function
    Description
    1Synchronization DelayThe synchronization delay between the source data source and the target data source. 0 seconds means no delay between the two ends. At this point, you can choose to switch the business to the target data source for a smooth migration.
    2Configure AlertsAfter configuring alerts, the system will notify you in the way you choose when the task fails. For more information, see Operation Monitoring Introduction.
    3More
    • Pause: Pause the task. Only tasks with the status Running can be paused.
    • Terminate: End tasks that are unfinished or in progress (i.e., in incremental sync). After terminating a task, you cannot restart it, so please use caution. If the synchronization objects contain triggers, trigger copying options will appear. Choose as needed.
    • Delete: Delete the task. Once the task is deleted, it cannot be recovered, so please use caution.
    4Structure Replication (displayed in scenarios involving structure replication)Display the progress and detailed information of structure replication.
    • Click Logs on the right side of the page: View the execution logs of structure replication.
    • Click refresh on the right side of the page: View the latest information.
    • Click View DDL in the Operation column on the right of the target object in the list: View SQL playback.
    5Full Replication (displayed in scenarios involving full replication)Display the progress and detailed information of full replication.
    • Click Monitoring on the right side of the page: View various monitoring indicators during full replication. During full replication, you can also click Rate Limit Settings on the right of the monitoring indicator page to limit the rate at which data is written to the target data source per second. The unit is rows/second.
    • Click Logs on the right side of the page: View the execution logs of full replication.
    • Click refresh on the right side of the page: View the latest information.
    6Incremental Replication (displayed in scenarios involving incremental replication)Display various monitoring indicators for incremental replication.
    • Click on the right side of the page: View the operations being executed by the current replication task, including:
      • : The replication task is executed by multiple threads, displaying the current thread number being executed.
      • : Details of SQL statements being executed by the current thread.
      • : The response time of the current thread. If this value increases, it indicates that the current thread may be stuck due to certain reasons.
      • : The timestamp when the current thread started.
      • : The status of the current thread.
    • Click Rate Limit Settings on the right: Limit the rate at which data is written to the target data source per second. The unit is rows/second.
    • Click Logs on the right: View the execution logs of incremental replication.
    • Click refresh on the right: View the latest information.
    7Modify ObjectsDisplay modification records of synchronization objects.
    • Click Modify Synchronization Objects on the right to configure synchronization objects.
    • Click refresh on the right: View the latest information.
    8Data ComparisonDisplay the data comparison results between the source data source and the target data source. If you have not enabled data comparison, click Enable Data Comparison in the page.
    • Click Recompare on the right: Reinitiate data comparison between the current source and target ends.
    • Click Logs on the right: View the execution logs of consistency comparison.
    • Click Monitoring on the right: View the trend chart of comparison RPS (records per second). Click Details to view records from earlier times.
    • Click details in the Operation column on the right of the comparison list (displayed in case of inconsistencies): View the comparison details between the source and target ends.
    • Click sql in the Operation column on the right of the comparison list (displayed in case of inconsistencies): Generate change SQL. You can directly copy this SQL to the target data source for execution to modify the inconsistent content.
    9ExpandDisplay detailed information of the current replication task, including Replication Type, Replication Objects, Start Time, and so on.

Appendix 1: Mapping Table for MySQL and SelectDB Data Types

During the data replication process, MySQL data types are mapped to corresponding SelectDB data types.

Category MySQL Data Type SelectDB Data Type
Numeric TINYINT TINYINT
TINYINT UNSIGNED SMALLINT
SMALLINT SMALLINT
SMALLINT UNSIGNED INT
MEDIUMINT INT
MEDIUMINT UNSIGNED BIGINT
INT INT
INT UNSIGNED BIGINT
BIGINT BIGINT
BIGINT UNSIGNED LARGEINT
BIT(M) INT
Decimal Decimal
Numeric Decimal
Float Float
Double DOUBLE
BOOL|BOOLEAN BOOLEAN
DATE AND TIME DATE DATEV2 (SelectDB version ≥ 2.2.x)
DATE (SelectDB version < 2.2.x)
DATETIME[(fsp)] DATETIME (source without FSP parameter and SelectDB version ≥ 2.2.x, or SelectDB version ≤ 2.2.x)
DATETIMEV2 (source with FSP parameter and SelectDB version ≥ 2.2.x)
Timestamp[(fsp)] DATETIME (source without FSP parameter and SelectDB version ≥ 2.2.x, or SelectDB version ≤ 2.2.x)
DATETIMEV2 (source with FSP parameter and SelectDB version ≥ 2.2.x)
Time[(fsp)] VARCHAR
YEAR[(4)] INT
STRING CHAR|VARCHAR CHAR
VARCHAR VARCHAR
Note: To avoid data migration loss, VARCHAR(n) will be adjusted to VARCHAR(3*n) after migration to SelectDB.
BINARY|VARBINARY STRING
TINYTEXT|TEXT|MEDIUMTEXT|LONGTEXT STRING
TINYBLOB|BLOB|MEDIUMBOLB|LONGBLOB STRING
ENUM STRING
SET STRING
JSON JSON JSONB (SelectDB version ≥ 2.2.x)
STRING (SelectDB version < 2.2.x)

Appendix 2: Checklist of Pre-Check Items

Check ItemCheck Content
Target Database Data Existence CheckCheck if data exists for the objects to be replicated in the target database
Target Database Same Name Object Existence CheckCheck if the objects to be replicated already exist in the target database
Check Object Name ComplianceCheck if the database name and table name comply with Doris naming conventions
Source Data Source Connection CheckCheck the status of the source data source gateway, instance accessibility, and accuracy of username and password
Target Data Source Connection CheckCheck the status of the target data source gateway, instance accessibility, and accuracy of username and password
Target Database Permission CheckCheck if the account permissions in the target database meet the requirements
Source Database Permission CheckCheck if the account permissions in the source database meet the requirements
Check if Source Database log_slave_updates is SupportedCheck if log_slave_updates is set to ON when the source database is a slave
Source Data Source and Target Data Source Version CheckCheck if the versions of the source database and target database are compatible
Check if Source Database is Enabled with BinlogCheck if the source database is enabled with Binlog
Check if Source Database Binlog Format is SupportedCheck if the binlog format of the source database is 'ROW'
Check if Source Database binlog_row_image is SupportedCheck if the binlog_row_image of the source database is 'FULL'

Data Replication Introduction