MySQL to Redshift

NineData data replication supports both full data synchronization and incremental data synchronization from MySQL to Redshift data sources.

Prerequisites

Source and target data sources have been added to NineData. For instructions on how to add data sources, please refer to Adding Data Sources.
The source database type is MySQL or a MySQL-like database, such as MySQL, MariaDB, PolarDB MySQL, TDSQL-C, GaussDB MySQL, Aurora, etc.
The target database is a Redshift provisioned cluster or Redshift Serverless.
In the case of incremental replication, the source data source must have Binlog enabled, and the Binlog-related parameters are set as follows:
- binlog_format=ROW
- binlog_row_image=FULL
tip
If the source data source is a standby server, to ensure complete Binlog logs are obtained, you also need to enable the log_slave_updates parameter.

Usage Restrictions

Incremental replication process does not currently support synchronizing DDL operations on the source side.
Structural replication is not currently supported; you need to manually create tables on the target side that match the objects to be synchronized from the source.
The data replication functionality is only for user databases in the data source; system databases are not replicated. For example, in MySQL-type data sources, databases like information_schema, mysql, performance_schema, and sys will not be replicated.
The account of the source data source must have SELECT, REPLICATION CLIENT, and REPLICATION SLAVE permissions for the objects to be replicated (incremental replication), and the account of the target data source must have DML permissions.
Evaluate the performance of both the source and target data sources before executing data synchronization, and it is recommended to perform data synchronization during off-peak hours for the business. Otherwise, during full data initialization, there will be a certain consumption of reading and writing resources on both the source and target data sources, leading to an increase in database load.
It is recommended to ensure that each table in the synchronized objects has a primary key or unique constraint, and column names are unique. Otherwise, the same data may be synchronously replicated repeatedly.

Operation Steps

Commercialization Notice

NineData’s data replication product has been commercialized. You can still use 10 replication tasks for free, with the following considerations:

Among the 10 replication tasks, you can include 1 task, with a specification of Micro.
Tasks with a status of do not count towards the 10-task limit. If you have already created 10 replication tasks and want to create more, you can terminate previous replication tasks and then create new ones.
When creating replication tasks, you can only select the you have purchased. Specifications that have not been purchased will be grayed out and cannot be selected. If you need to purchase additional specifications, please contact us through the customer service icon at the bottom right of the page.

Log in to the NineData console.
Click on > in the left navigation bar.
On the page, click on in the top right corner.

On the tab, configure according to the table below and click .

Parameter	Description
	Enter a name for the data synchronization task for easy later retrieval and management. Please use meaningful names. Up to 64 characters are supported.
	The data source where the synchronization object is located.
	The data source that receives the synchronized object.
	Select the content to be copied to the target data source. : Synchronize all objects and data from the source data source, i.e., full data replication. : After full synchronization, perform incremental synchronization based on the source data source's logs. Click on the icon to unselect certain operation types as needed. These operations will be ignored during incremental synchronization after unselecting them.
IAM Role	An IAM Role with permissions to execute commands in Redshift. During the replication process, NineData first copies MySQL data to NineData's Amazon S3 service, and then from S3 to your Redshift. This Role is used to authorize NineData to execute copy commands in your Redshift to copy data from S3 to Redshift. For information on obtaining the IAM Role, please refer to the section on this page.
	The should only be selected when is chosen. Options are: : Perform incremental replication based on the start time of the current replication task. : Choose the starting point for incremental replication. You can select the time based on your business location's timezone.
	: Stop the task when data is detected in the target table during the pre-check stage. : Ignore the data detected in the target table during the pre-check stage and append other data. : Delete the data detected in the target table during the pre-check stage and rewrite it.

On the tab, configure the following parameters and then click .

Parameter	Description
	Choose what to copy. You can select to copy all contents of the source library, or you can choose . In the list, select the content to be copied, click > to add it to the list on the right.
(optional)	Click to add a blacklist record. Select the library or object to be added to the blacklist; these contents will not be copied. This is used to exclude certain libraries or objects in full library replication with or . Left dropdown: Select the library name to be added to the blacklist. Right dropdown: Select the corresponding objects in the library. You can click on multiple objects for multiple selections. Leave it blank to add the entire database to the blacklist. If you want to add multiple libraries to the blacklist, you can click the Add button below to add a row.

Parameter

Description

Choose what to copy. You can select to copy all contents of the source library, or you can choose . In the list, select the content to be copied, click > to add it to the list on the right.

(optional)

Click to add a blacklist record. Select the library or object to be added to the blacklist; these contents will not be copied. This is used to exclude certain libraries or objects in full library replication with or .

Left dropdown: Select the library name to be added to the blacklist.
Right dropdown: Select the corresponding objects in the library. You can click on multiple objects for multiple selections. Leave it blank to add the entire database to the blacklist.

If you want to add multiple libraries to the blacklist, you can click the Add button below to add a row.

On the tab, the system matches the target data source's schema with the same name by default. If it doesn't exist, you need to manually select the target schema. By default, table names and column names in the target library need to match the synchronized objects. If they don't match, you can manually map the table names and column names. After configuration, click Save and Pre-check.
On the tab, wait for the system to complete the pre-check. After the pre-check passes, click .
tip
- If the pre-check fails, you need to click on the right of the target check item in the column to investigate the cause of the failure and manually fix it. Click to re-run the pre-check until it passes.
- Check items with in can be repaired or ignored depending on the specific situation.
On the page, when is displayed, the synchronization task starts running. At this point, you can:
- Click to view the execution status of each stage of the synchronization task.
- Click to return to the task list page.

Viewing Sync Results

Log in to the NineData console.
Click on > in the left navigation bar.

On the page, click on the of the target sync task. The page is explained as follows.

redshift_result

Serial Number	Function	Description
1	Sync Delay	The data synchronization delay between the source data source and the target data source. 0 seconds indicates no delay between the two ends, allowing you to switch the business to the target data source for smooth migration.
2	Configure Alarm	After configuring the alarm, the system will notify you in the way you choose when the task fails. For more information, see Operations and Monitoring Introduction.
3	More	Pause: Pause the task. Only tasks with the status Running can be selected. Terminate: End the task that is unfinished or in listening mode (i.e., in incremental sync). After terminating the task, it cannot be restarted, so please proceed with caution. If the synchronized objects contain triggers, a trigger copy option will pop up, choose as needed. Delete: Delete the task. Once the task is deleted, it cannot be recovered, so please proceed with caution.
4	Full Copy (displayed in scenarios involving full copy)	Display the progress and details of the full copy. Click Monitoring on the right side of the page: View various monitoring indicators during the full copy process. During the full copy process, you can also click Flow Control Settings on the right side of the monitoring indicator page to limit the rate of writing to the target data source in rows per second. Click Logs on the right side of the page: View the execution logs of the full copy. Click on the right side of the page: View the latest information.
5	Incremental Copy (displayed in scenarios involving incremental copy)	Display various monitoring indicators for incremental copy. Click on the right side of the page: View the operations currently being performed in the current copy task, including: : The copy task is performed by multiple threads, showing the thread number currently in progress. : Details of the SQL statement currently being executed by the thread. : The response time of the current thread. If this value increases, the current thread may be stuck for some reason. : The timestamp when the current thread was started. : The status of the current thread. Click Flow Control Settings on the right side of the page: Limit the rate of writing to the target data source in rows per second. Click Logs on the right side of the page: View the execution logs of the incremental copy. Click on the right side of the page: View the latest information.
6	Modify Object	Display the modification records of synchronized objects. Click Modify Synchronized Object on the right of the page to configure synchronized objects. Click on the right side of the page: View the latest information.
7	Expand	Display detailed information about the current copy task, including Copy Type, Copy Object, Start Time, etc.

Appendix 1: Mapping Table of MySQL and Redshift Data Types

During the data copying process, MySQL data types are mapped to corresponding Redshift data types.

Category	MySQL Data Type	Redshift Data Type
Numeric	TINYINT	SMALLINT
	TINYINT UNSIGNED	SMALLINT
	SMALLINT	SMALLINT
	SMALLINT UNSIGNED	INTEGER
	MEDIUMINT	INTEGER
	MEDIUMINT UNSIGNED	BIGINT
	INT	INTEGER
	INT UNSIGNED	BIGINT
	BIGINT	BIGINT
	BIGINT UNSIGNED	BIGINT
	BIT(M)	BOOL
	Decimal	Decimal
	Numeric	Numeric
	Float	REAL
	Double	REAL
	BOOL, BOOLEAN	BOOL
DATE AND TIME	DATE	DATE
	DATETIME[(fsp)]	timestamp
	Timestamp[(fsp)]	timestamptz
	Time[(fsp)]	time
	YEAR[(4)]	INT
STRING	CHAR	CHAR
	VARCHAR	VARCHAR
	BINARY/VARBINARY	VARBINARY
	TINYTEXT/TEXT/MEDIUMTEXT/LONGTEXT	VARCHAR
	TINYBLOB/BLOB/MEDIUMBOLB/LONGBLOB	VARBINARY
	ENUM	VARCHAR
	SET	VARCHAR
JSON	JSON	VARCHAR

Appendix 2: Pre-check Items Overview

Check Item	Check Item Description	Does Failure Prevent Task Continuation?	Failure Reasons and Remediation Strategies
Source Data Source Connection Test	Check the connectivity of the source data source.	Yes	Failure Reason: The connection between NineData and the source database is not established. Remediation Method: Self-built Database: Check if the database is configured with a firewall and confirm if NineData's service IP is allowed. ECS Self-built Database: Confirm the security group settings of the ECS instance where the data source is located, allowing access from NineData's IP. If not allowed, modify the security group to open NineData IP and perform the pre-check again. Cloud Database: Confirm the security group \| whitelist settings of the cloud database, allowing access from NineData's service IP. If not allowed, modify the security group \| whitelist to open NineData IP and perform the pre-check again. JDBC Connection Failure: Confirm if the database account and password are incorrect. If incorrect, adjust the data source configuration and perform the pre-check again.
Target Data Source Connection Test	Check the connectivity of the target data source.	Yes	Failure Reason: The connection between NineData and the target data source is not established. Remediation Method: Cloud Database: Confirm the security group \| whitelist settings of the cloud database, allowing access from NineData's service IP. If not allowed, modify the security group \| whitelist to open NineData IP and perform the pre-check again. JDBC Connection Failure: Confirm if the database account and password are incorrect. If incorrect, adjust the data source configuration and perform the pre-check again.
Source Data Source Permission Check	Check the account permissions of the source data source.	Yes	Failure Reason: Insufficient permissions for the source database. Remediation Method: Assign SELECT, REPLICATION CLIENT, and REPLICATION SLAVE permissions to the account of the source data source, and perform the pre-check again.
Target Data Source Permission Check	Check the account permissions of the target data source.	Yes	Failure Reason: Insufficient permissions for the target database. Remediation Method: Assign CREATE/ALTER/DROP TABLE, SELECT, INSERT, UPDATE, DELETE, REFERENCES, CREATE, TEMPORARY, and USAGE permissions to the account of the target data source, and perform the pre-check again.
Check if the Source Database has Binlog Enabled	Check if the source database has Binlog enabled.	Yes	Failure Reason: Binlog is not enabled for your source database. Remediation Method: Enable Binlog and perform the pre-check again.
Check if the binlog format of the source database supports row mode	Check if the binlog format of the source database is row mode	Yes	Failure Reason: The binlog format of your source database is not row mode. Remediation Method: Please refer to the MySQL official documentation to modify it, and perform the pre-check again.
Check if binlog_row_image is full	Check if binlog_row_image of the source database is full	Yes	Failure Reason: The binlog_row_image of your source database is not full. Remediation Method: Please refer to the MySQL official documentation to modify it, and perform the pre-check again.
Check for the Existence of Objects with the Same Name	Check if there are objects with the same name as the ones to be copied in the target	Yes: When the same-name object handling strategy is set to "pre-check error and stop the task". No: In other cases.	Failure Reason: There are objects with the same name in the target data source. Remediation Method: Modify the same-name object handling strategy and perform the pre-check again. Delete the objects with the same name and perform the pre-check again.
Check for Data Existence	Check if the target data source already contains data for the copied objects	Yes: When the same-name object handling strategy is "pre-check error and stop the task". No: In other cases.	Failure Reason: There is duplicate data in the target data source. Remediation Method: Modify the handling strategy for existing data and perform the pre-check again. Manually clear the existing data in the target data source and perform the pre-check again.

Data Replication Introduction

MySQL to Redshift

Prerequisites​

Usage Restrictions​

Operation Steps​

Viewing Sync Results​

Appendix 1: Mapping Table of MySQL and Redshift Data Types​

Appendix 2: Pre-check Items Overview​

Related Documents​