Archive & Clean
NineData supports Archive & Clean functionality, allowing you to archive unused data to another database for backup purposes, freeing up storage space in the operational database and optimizing performance. This feature supports both one-time execution and periodic automatic execution.
Background Information
After running online data for a period of time, with a significant increase in data volume, the performance of writing, updating, and querying business data may be affected. At the same time, data from a certain period of time ago is often no longer frequently queried but still occupies storage space in the database. To meet the requirements of performance optimization and online storage space, we often need to manage Archive & Clean for data that meets certain conditions, and simultaneously clean up archived data.
NineData's Archive & Clean functionality supports one-time or periodic archiving and cleaning tasks, ensuring that your operational database maintains high performance and stable operation at all times.
Prerequisites
- You have already created or joined an organization, and the organization has subscribed to . Please ensure that your annual or monthly subscription is still active. For more information, please refer to Manage Organizations.
- Your current account has been switched to the target organization. For more information, please refer to Switching to an Organization.
- The table to be archived must have a timestamp field.
- The target database type is MySQL, PostgreSQL.
Submit Archive & Clean Task
Log in to the NineData console.
In the left navigation pane, click > .
On the page, click .
On the page, configure according to the table below.
Parameter Description Enter the name of the data archiving task. For ease of subsequent retrieval and management, please use a meaningful name. Supports up to 64 characters. (Optional) Enter remarks, such as a description of the purpose of the data archiving task. Select the strategy for the data archiving task. Choose either or . Supports the following three options: - : Archives data from the source to the destination, storing the required data from the source into the destination.
- : Archives data from the source to the destination, and then deletes the archived data from the source.
- : Deletes data from the destination without archiving.
The data source and schema where the data to be archived is located. The data source and schema where the archived data will be stored. Not visible when is selected. Select the execution frequency of the task. - : Executes only once.
- : Creates a periodic task that executes cyclically according to the selected period.
Select the timezone of your target location. Required for periodic tasks. Choose whether the task should be executed on specific days of the week or month.
Select or to execute the task on specific days of the week or month. Check to execute the task every day.Required for periodic tasks. Select the time of day when the periodic task should be executed during the selected cycle. Clicking Now sets the current time as the task execution time. Optional only when the task includes a cleansing operation. After data cleanup in the source data source, the storage space will not be automatically released. Configuring this option allows you to periodically execute an Optimize operation to reclaim table space.
Choose how many times to execute the Optimize operation after performing data cleanup.Limits the duration of task execution, in hours. For example, you can set the archiving task to start at 0:00 during the business off-peak hours and set the execution duration to 6 hours. Even if the task is not completed by 6:00 in the morning, it will automatically stop to avoid impacting business operations. (MySQL only) When configuring data archiving to the target end, the following strategies are applied when encountering data conflicts: - : Ignore the current operation causing the conflict and continue to the next operation.
- : Overwrite the target data.
Click to go to the page to configure the data to be archived.
Parameter Description Defines system variables, where the $bizdate
variable automatically fetches the current system time. You can customize the format of this system time. For instructions on defining time formats, please refer to the appendix.
This feature is mainly used in scenarios such as writing the names of newly created tables and WHERE conditions in data archiving and cleanup tasks. For example, if you set toyyyyMMdd
, when creating a new table on the target to store archived data, set the table name tonew_table_${bizdate}
. If the current date is May 10, 2024, the table name will becomenew_table_20240510
.Selects the table in the source where the archived data resides. Selects the table in the target where the archived data will be stored. The table structure of this table needs to be consistent with the source table. If they are not identical, you can manually map table names and column names by clicking on the right side of .
Clicking on the icon on the right side allows you to create a new table in the target database. You don't need to map table names and column names manually as the system will automatically store the archived data in this new table according to the structure of the original table. When specifying the table name, you can use the defined , for example,new_table_${bizdate}
.Selects a time field to determine whether data needs to be archived. This field typically reflects the time when the row data was generated, such as the order creation time in your order table. If the data type of the field selected in is numeric (BIGINT, INT) or string (CHAR, VARCHAR), you need to define the time format here to ensure NineData can smoothly read these data types of fields. For instructions on defining time formats, please refer to the appendix. Based on the selected , selects how many days of data need to be archived, with optional values ranging from 1 to 3650 days. For example, if you need to archive or clean data older than one year, enter 365 here. - : Manually map table names and column names in the data archiving scenario. For example, storing data from a column in the source table to a specified column in the target table.
- : Preview the archiving and cleanup statements that the system will execute under the current configuration.
- : Delete the current record.
Adds a new record. Click . At the node, wait for the task status to change to , then click in the upper right corner of the page. In the pop-up window, select the workflow approvers, click , and enter the node. At the node, you can also perform the following operations:
- Click the icon in the upper right corner of the page to refresh the task status.
- Click in the upper right corner of the page to directly withdraw the task or edit and resubmit it.
- Click in the upper right corner of the page to copy a new task with the same configuration.
tip- Depending on the configuration of the approval process, the number of approvers to be selected here may vary. Please select according to the actual situation.
If the feature is enabled in the current approval process, this section will display . You do not need to manually specify an approver; all personnel authorized to approve the current work order will receive approval notifications and be able to approve it.
- If the data source is not configured with an approval process, the task will automatically skip the and nodes, proceed to the node, and change the task status to . You can then proceed directly to Step 8.
When the task status changes to , you can perform the following operations before approval:
- Click the icon in the upper right corner of the page to refresh the task status.
- Click in the upper right corner of the page to directly withdraw the task or edit and resubmit it.
- Click to change the approver of the task.
- Click in the page to view the execution objects of this task.
When the task status changes to , click in the upper right corner of the page for the task's , choose , and click .
- : Execute the task immediately.
- (only for one-time tasks): Choose a time to execute the task.
Wait for the task status to change to .
Submit Data Cleanup Task
Log in to the NineData console.
In the left navigation pane, click > .
On the page, click .
On the page, configure according to the table below.
Parameter Description Enter the name of the task. For ease of later search and management, please use a meaningful name. Supports up to 64 characters. (Optional) Enter remarks, such as a description of the purpose of this task. Select the policy of the Archive & Clean task. Choose here. Supports the following three options: - : Archive the target data, copying the data that needs to be archived from the source database to the target database.
- : Archive the target data first, then delete the archived data from the source database.
- : This operation only deletes the target data and does not perform archiving.
The data source and database where the data to be cleaned up is located. Select the execution frequency of the task. - : Execute only once.
- : Create a periodic task that executes cyclically according to the selected period.
Select the time zone of your target region. Select for periodic tasks, choose or to execute the task on the specified day of the week or month. Check to execute daily. For periodic tasks, select the time of day when the task will be executed during the selected period. Click Now to set the current time as the task execution time. Click to go to the page to configure the data to be cleaned up.
Parameter Description Select the table where the data to be cleaned up is located in the source table. Select a timestamp field as the basis for determining whether data needs to be cleaned up. This field usually reflects the time when the data in the row was generated. For example, the order creation time field in your order table. Based on the selected earlier, select how many days of data to clean up. Valid values are 1~3650 days. For example, if you need to clean up data older than a year, enter 365 here. (Optional) Set filtering conditions. Only data that meets the filtering conditions will be cleaned up. If not set, all data that meets the conditions will be cleaned up by default. Taking the test data table employees
as an example, setting the filtering condition toemp_no>=10005
means that data with emp_no less than 10005 will not be cleaned up.: Delete the current record. Add a new record. Click . At the node, wait for the task status to change to , then click in the upper right corner of the page. In the pop-up window, select the approver(s), click , and enter the node. At the node, you can also:
- Click the icon in the upper right corner of the page to refresh the task status.
- Click in the upper right corner of the page to directly withdraw the task or edit and resubmit it.
- Click in the upper right corner of the page to copy a new task with the same configuration.
tip- Depending on the configuration of the approval process, the number of approvers to be selected here may vary. Please select according to the actual situation.
If the feature is enabled in the current approval process, this section will display . You do not need to manually specify an approver; all personnel authorized to approve the current work order will receive approval notifications and be able to approve it.
- If the data source is not configured with an approval process, the task will automatically skip the and nodes, proceed to the node, and change the task status to . You can then proceed directly to Step 8.
When the task status changes to , you can perform the following operations before approval:
- Click the icon in the upper right corner of the page to refresh the task status.
- Click in the upper right corner of the page to directly withdraw the task or edit and resubmit it.
- Click to change the approver of the task.
- Click in the page to view the execution objects of this task.
When the task status changes to , click in the upper right corner of the page for the task's , choose , and click .
- : Execute the task immediately.
- (only for one-time tasks): Choose a time to execute the task.
Wait for the task status to change to .
Approve Task
Prerequisites
- The current task status is .
- You are either or the for the current task.
Steps
Log in to the NineData console.
In the left navigation pane, click > .
On the page, find the target task and click the task ID or in the column on the right side of the task.
tipAlternatively, you can hover over the icon at the top of the page and click the target task on the tab.
On the page, review the approval work order, and choose , , or based on the actual situation.
Operation Description Forward the task to another user for approval. Approve the application. After clicking , you can optionally enter as needed, and click . Reject the application. After clicking , you also need to enter , and click .
Appendix: Explanation of Time Fields
Category | Time Field | Description |
---|---|---|
Year | yyyy or y | Four-digit current year, e.g., 2024. |
yy | Two-digit current year, e.g., 24. | |
Month | M | Current month, e.g., 8. |
MM | Current month with leading zero if necessary, e.g., 08. | |
MMM | Current month with the suffix "月" in Chinese, e.g., 8月. | |
MMMM | Current month in Chinese, e.g., 八月. | |
Day | d | Day of the month, e.g., 1. |
dd | Day of the month with leading zero if necessary, e.g., 01. | |
ddd | Day of the year, e.g., 214. | |
e | Day of the week, e.g., 4. | |
Week | w | Week of the year, e.g., 31. |
Time Period | a | Time period of the day, e.g., 上午 (morning) or 下午 (afternoon). |
Time | h | Hour of the day, e.g., 9. |
hh | Hour of the day with leading zero if necessary, e.g., 09. | |
m | Minute, e.g., 5. | |
mm | Minute with leading zero if necessary, e.g., 05. | |
s | Second, e.g., 8. | |
ss | Second with leading zero if necessary, e.g., 08. | |
Time Zone | z | Time zone offset with a colon, e.g., +08:00. |
zz | Time zone offset without a colon, e.g., +0800. |
The above time field formats can be freely combined.
Example: yyyyMMdd ahh:mm:ss
will display the current time in the following format: 20240801 下午14:30:00
.