Archive & Clean
NineData supports Archive & Clean functionality, allowing you to archive unused data to another database for backup purposes, freeing up storage space in the operational database and optimizing performance. This feature supports both one-time execution and periodic automatic execution.
Background Information
After running online data for a period of time, with a significant increase in data volume, the performance of writing, updating, and querying business data may be affected. At the same time, data from a certain period of time ago is often no longer frequently queried but still occupies storage space in the database. To meet the requirements of performance optimization and online storage space, we often need to manage Archive & Clean for data that meets certain conditions, and simultaneously clean up archived data.
NineData's Archive & Clean functionality supports one-time or periodic archiving and cleaning tasks, ensuring that your operational database maintains high performance and stable operation at all times.
Prerequisites
- You have already created or joined an organization, and the organization has subscribed to DevOps Enterprise. Please ensure that your annual or monthly subscription is still active. For more information, please refer to Manage Organizations.
- Your current account has been switched to the target organization. For more information, please refer to Switching to an Organization.
- The table to be archived must have a timestamp field.
- The target database type is MySQL, PostgreSQL, MariaDB.
Submit Archive & Clean Task
Log in to the NineData console.
In the left navigation pane, click DevOps > Archive & Clean.
On the Archive & Clean page, click Create Archive & Clean.
On the Create Archive & Clean page, configure according to the table below.
Parameter Description Name Enter the name of the data archiving task. For ease of subsequent retrieval and management, please use a meaningful name. Supports up to 64 characters. Note (Optional) Enter remarks, such as a description of the purpose of the data archiving task. Archive Policy Select the strategy for the data archiving task. Choose either Only Archiving or Archiving + Cleansing. Supports the following three options: - Only Archiving: Archives data from the source to the destination, storing the required data from the source into the destination.
- Archiving + Cleansing: Archives data from the source to the destination, and then deletes the archived data from the source.
- Only Cleansing: Deletes data from the destination without archiving.
Source The data source and schema where the data to be archived is located. Target The data source and schema where the archived data will be stored. Not visible when Only Cleansing is selected. Archiving Frequency Select the execution frequency of the task. - Once-run: Executes only once.
- Recurring: Creates a periodic task that executes cyclically according to the selected period.
Time Zone Select the timezone of your target location. Periodic Required for periodic tasks. Choose whether the task should be executed on specific days of the week or month.
Select Week or Month to execute the task on specific days of the week or month. Check Every day to execute the task every day.Launch Time Required for periodic tasks. Select the time of day when the periodic task should be executed during the selected cycle. Clicking Now sets the current time as the task execution time. After how many times of cleaning, execute Optimize once Optional only when the task includes a cleansing operation. After data cleanup in the source data source, the storage space will not be automatically released. Configuring this option allows you to periodically execute an Optimize operation to reclaim table space.
Choose how many times to execute the Optimize operation after performing data cleanup.Execution time limit (1-24 HR), after which the task will be automatically terminated to avoid affecting the business Limits the duration of task execution, in hours. For example, you can set the archiving task to start at 0:00 during the business off-peak hours and set the execution duration to 6 hours. Even if the task is not completed by 6:00 in the morning, it will automatically stop to avoid impacting business operations. Data Conflict Strategy (MySQL only) When configuring data archiving to the target end, the following strategies are applied when encountering data conflicts: - Ignored: Ignore the current operation causing the conflict and continue to the next operation.
- Replace: Overwrite the target data.
Click Next to go to the Select Objects page to configure the data to be archived.
Parameter Description System Paramter Defines system variables, where the $bizdatevariable automatically fetches the current system time. You can customize the format of this system time. For instructions on defining time formats, please refer to the appendix.
This feature is mainly used in scenarios such as writing the names of newly created tables and WHERE conditions in data archiving and cleanup tasks. For example, if you set System Paramter toyyyyMMdd, when creating a new table on the target to store archived data, set the table name tonew_table_${bizdate}. If the current date is May 10, 2024, the table name will becomenew_table_20240510.Source Table Selects the table in the source where the archived data resides. Target Table Selects the table in the target where the archived data will be stored. The table structure of this table needs to be consistent with the source table. If they are not identical, you can manually map table names and column names by clicking on the right side of Mapping & Filtering.
Clicking on theicon on the right side allows you to create a new table in the target database. You don't need to map table names and column names manually as the system will automatically store the archived data in this new table according to the structure of the original table. When specifying the table name, you can use the defined System Paramter, for example,
new_table_${bizdate}.Time Selects a time field to determine whether data needs to be archived. This field typically reflects the time when the row data was generated, such as the order creation time in your order table. Column Setting If the data type of the time column selected in **Time**is numeric (BIGINT, INT) or a string (CHAR, VARCHAR), you need to configure the initialization method here:**Time Format**: Customize the time format to ensure NineData can correctly read fields of these data types. For how to define the time format, please refer to the Appendix in this document.**Timestamp**: If the time field is stored as a timestamp, you can select**Timestamp**. The system will automatically convert it based on the selected time precision (options include**s**,**ms**,**μs**).
Reserve Days Based on the selected Time, selects how many days of data need to be archived, with optional values ranging from 1 to 3650 days. For example, if you need to archive or clean data older than one year, enter 365 here. Actions - Mapping & Filtering: Manually map table names and column names in the data archiving scenario. For example, storing data from a column in the source table to a specified column in the target table.
- View SQL: Preview the archiving and cleanup statements that the system will execute under the current configuration.
- Delete: Delete the current record.
Add Tables Adds a new record. Click Create Task. At the Pre-Check node, wait for the task status to change to Pending, then click Submit Approval in the upper right corner of the page. In the pop-up window, select the workflow approvers, click OK, and enter the Submit Approval node. At the Pre-Check node, you can also perform the following operations:
- Click the
icon in the upper right corner of the page to refresh the task status.
- Click Withdraw in the upper right corner of the page to directly withdraw the task or edit and resubmit it.
- Click Duplicate in the upper right corner of the page to copy a new task with the same configuration.
tip- Depending on the configuration of the approval process, the number of approvers to be selected here may vary. Please select according to the actual situation.
If the Not Specifying Approvers feature is enabled in the current approval process, this section will display After submission, all approvers can review. Please check the results in the approval process.. You do not need to manually specify an approver; all personnel authorized to approve the current work order will receive approval notifications and be able to approve it.
- If the data source is not configured with an approval process, the task will automatically skip the Pre-Check and Submit Approval nodes, proceed to the Execute Task node, and change the task status to Approved. You can then proceed directly to Step 8.
- Click the
When the task status changes to Pending Approval, you can perform the following operations before approval:
- Click the
icon in the upper right corner of the page to refresh the task status.
- Click Withdraw in the upper right corner of the page to directly withdraw the task or edit and resubmit it.
- Click Transfer to change the approver of the task.
- Click View Objects in the page to view the execution objects of this task.
- Click the
When the task status changes to Approved, click Execute in the upper right corner of the page for the task's Executor, choose Execution Method, and click OK.
- Execute Now: Execute the task immediately.
- Scheduled Execution (only for one-time tasks): Choose a time to execute the task.
Wait for the task status to change to Success.
Submit Data Cleanup Task
Log in to the NineData console.
In the left navigation pane, click DevOps > Archive & Clean.
On the Archive & Clean page, click Create Archive & Clean.
On the Create Archive & Clean page, configure according to the table below.
Parameter Description Name Enter the name of the task. For ease of later search and management, please use a meaningful name. Supports up to 64 characters. Note (Optional) Enter remarks, such as a description of the purpose of this task. Archive Policy Select the policy of the Archive & Clean task. Choose Only Cleansing here. Supports the following three options: - Only Archiving: Archive the target data, copying the data that needs to be archived from the source database to the target database.
- Archiving + Cleansing: Archive the target data first, then delete the archived data from the source database.
- Only Cleansing: This operation only deletes the target data and does not perform archiving.
Source The data source and database where the data to be cleaned up is located. Archiving Frequency Select the execution frequency of the task. - Once-run: Execute only once.
- Recurring: Create a periodic task that executes cyclically according to the selected period.
Time Zone Select the time zone of your target region. Periodic Select for periodic tasks, choose Week or Month to execute the task on the specified day of the week or month. Check Every day to execute daily. Launch Time For periodic tasks, select the time of day when the task will be executed during the selected period. Click Now to set the current time as the task execution time. Click Next to go to the Select Objects page to configure the data to be cleaned up.
Parameter Description Source Table Select the table where the data to be cleaned up is located in the source table. Time Select a timestamp field as the basis for determining whether data needs to be cleaned up. This field usually reflects the time when the data in the row was generated. For example, the order creation time field in your order table. Reserve Days Based on the Time selected earlier, select how many days of data to clean up. Valid values are 1~3650 days. For example, if you need to clean up data older than a year, enter 365 here. Condition (Optional) Set filtering conditions. Only data that meets the filtering conditions will be cleaned up. If not set, all data that meets the conditions will be cleaned up by default. Taking the test data table employeesas an example, setting the filtering condition toemp_no>=10005means that data with emp_no less than 10005 will not be cleaned up.Actions Delete: Delete the current record. Add Tables Add a new record. Click Create Task. At the Pre-Check node, wait for the task status to change to Pending, then click Submit Approval in the upper right corner of the page. In the pop-up window, select the approver(s), click OK, and enter the Submit Approval node. At the Pre-Check node, you can also:
- Click the
icon in the upper right corner of the page to refresh the task status.
- Click Withdraw in the upper right corner of the page to directly withdraw the task or edit and resubmit it.
- Click Duplicate in the upper right corner of the page to copy a new task with the same configuration.
tip- Depending on the configuration of the approval process, the number of approvers to be selected here may vary. Please select according to the actual situation.
If the Not Specifying Approvers feature is enabled in the current approval process, this section will display After submission, all approvers can review. Please check the results in the approval process.. You do not need to manually specify an approver; all personnel authorized to approve the current work order will receive approval notifications and be able to approve it.
- If the data source is not configured with an approval process, the task will automatically skip the Pre-Check and Submit Approval nodes, proceed to the Execute Task node, and change the task status to Approved. You can then proceed directly to Step 8.
- Click the
When the task status changes to Pending Approval, you can perform the following operations before approval:
- Click the
icon in the upper right corner of the page to refresh the task status.
- Click Withdraw in the upper right corner of the page to directly withdraw the task or edit and resubmit it.
- Click Transfer to change the approver of the task.
- Click View Objects in the page to view the execution objects of this task.
- Click the
When the task status changes to Approved, click Execute in the upper right corner of the page for the task's Executor, choose Execution Method, and click OK.
- Execute Now: Execute the task immediately.
- Scheduled Execution (only for one-time tasks): Choose a time to execute the task.
Wait for the task status to change to Success.
Approve Task
Prerequisites
- The current task status is Pending Approval.
- You are either Administrator or the Approver for the current task.
Steps
Log in to the NineData console.
In the left navigation pane, click DevOps > Archive & Clean.
On the Archive & Clean page, find the target task and click the task ID or Details in the Actions column on the right side of the task.
tipAlternatively, you can hover over the
icon at the top of the page and click the target task on the Todo tab.
On the Details page, review the approval work order, and choose Transfer, Approved, or Rejected based on the actual situation.
Operation Description Transfer Forward the task to another user for approval. Approved Approve the application. After clicking Approved, you can optionally enter Reason as needed, and click OK. Rejected Reject the application. After clicking Rejected, you also need to enter Reason for rejection, and click OK.
Appendix: Explanation of Time Format
| Category | Time Field | Description |
|---|---|---|
| Year | yyyy or y | Four-digit current year, e.g., 2024. |
| yy | Two-digit current year, e.g., 24. | |
| Month | M | Current month, e.g., 8. |
| MM | Current month with leading zero if necessary, e.g., 08. | |
| MMM | Current month with the suffix "月" in Chinese, e.g., 8月. | |
| MMMM | Current month in Chinese, e.g., 八月. | |
| Day | d | Day of the month, e.g., 1. |
| dd | Day of the month with leading zero if necessary, e.g., 01. | |
| ddd | Day of the year, e.g., 214. | |
| e | Day of the week, e.g., 4. | |
| Week | w | Week of the year, e.g., 31. |
| Time Period | a | Time period of the day, e.g., 上午 (morning) or 下午 (afternoon). |
| Time | h | Hour of the day, e.g., 9. |
| hh | Hour of the day with leading zero if necessary, e.g., 09. | |
| m | Minute, e.g., 5. | |
| mm | Minute with leading zero if necessary, e.g., 05. | |
| s | Second, e.g., 8. | |
| ss | Second with leading zero if necessary, e.g., 08. | |
| Time Zone | z | Time zone offset with a colon, e.g., +08:00. |
| zz | Time zone offset without a colon, e.g., +0800. |
The above time field formats can be freely combined.
Example: yyyyMMdd ahh:mm:ss will display the current time in the following format: 20240801 下午14:30:00.