Required check items: System status check (services, configuration files, partitions, and storage space), system configuration check (license keys, interfaces, and virtual storage), and hardware health check (memory, NICs, and disks).
Optional check items: Disk performance check. This check increases storage I/O usage, so perform it during off-peak hours (for example, in the early morning on weekends) to avoid affecting core services such as Oracle databases or production systems.
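The off-peak rule above can be sketched as a simple guard that a scheduling script might call before starting the disk performance check. This is a minimal sketch, not part of the platform: the function name `is_off_peak` and the 01:00 to 05:00 weekend window are assumptions chosen to illustrate "early morning on weekends"; adjust them to the actual load pattern of the cluster.

```python
from datetime import datetime, time

# Assumed off-peak window (hypothetical values, not from the platform).
OFF_PEAK_START = time(1, 0)
OFF_PEAK_END = time(5, 0)


def is_off_peak(now: datetime, weekend_only: bool = True) -> bool:
    """Return True if `now` falls inside the assumed off-peak window."""
    # datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    if weekend_only and now.weekday() < 5:
        return False
    return OFF_PEAK_START <= now.time() < OFF_PEAK_END


if __name__ == "__main__":
    if is_off_peak(datetime.now()):
        print("Off-peak: the disk performance check may be started.")
    else:
        print("Peak hours: postpone the disk performance check.")
```

The guard only decides whether the check may run; starting the check itself is still done through the management console.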
Performing health checks and handling the results:
Frequency: Perform a health check once a week, as part of weekly maintenance. Perform an additional health check after any cluster adjustment, such as capacity expansion or SP installation.
Result analysis: After a health check is complete, locate abnormal check items first. Click an abnormal check item to view the description of the anomaly and the suggested solution. For example, when the transfer rates of storage network interfaces are inconsistent, the system suggests using 10GE interfaces.
Closed-loop management: Faults, such as hardware faults, must be fixed immediately. Alerts, such as performance optimization suggestions, must be addressed within one week. Health check reports must be retained and archived.[4]
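The closed-loop rule above (faults immediately, alerts within one week) can be tracked with a small script once the abnormal items have been copied out of a health check report. This is a minimal sketch under stated assumptions: the `Finding` record, the `"fault"` and `"alert"` labels, and the function names are hypothetical and do not reflect the platform's report format or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical severity labels; the real report may use different terms.
FAULT = "fault"
ALERT = "alert"


@dataclass
class Finding:
    item: str            # name of the abnormal check item
    severity: str        # FAULT or ALERT
    found_at: datetime   # when the health check reported it


def fix_deadline(finding: Finding) -> datetime:
    """Faults are due immediately; alerts are due within one week."""
    if finding.severity == FAULT:
        return finding.found_at
    if finding.severity == ALERT:
        return finding.found_at + timedelta(weeks=1)
    raise ValueError(f"unknown severity: {finding.severity!r}")


def triage(findings: list[Finding]) -> list[Finding]:
    """Order findings so the earliest deadline comes first."""
    return sorted(findings, key=fix_deadline)


def overdue(findings: list[Finding], now: datetime) -> list[Finding]:
    """Return findings whose deadline has already passed."""
    return [f for f in findings if fix_deadline(f) < now]
```

A usage example: given one alert and one fault from the same report, `triage` places the fault first, and `overdue` lists any item that is still open after its deadline, which can then be noted in the archived report.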