Datacenters often require maintenance and upgrades. A cloud infrastructure is made up of many interdependent components, so even a seemingly routine or minor change requires you to consider how it could affect the cloud.
Inform Metacloud Support whenever you intend to make a change in your datacenter. Simply submit a Support request with a P4 priority level and provide as much relevant information as possible. Over-informing is preferable to leaving out important details, especially if:
- You have a new Metacloud installation or are unfamiliar with administering it.
- You haven't performed the given maintenance task or upgrade before.
- You are unsure of how a change might affect your cloud. See the following section for some examples of sensitive changes.
Once informed of intended changes, Support can advise you of best practices or actions to avoid. Support will also know not to investigate alerts triggered by events that are expected during routine datacenter maintenance, such as servers going offline. This frees up Support resources to troubleshoot customer problems.
Datacenter Changes that Can Disrupt Cloud Operations
Certain types of datacenter changes are especially sensitive. Without proper planning and proactive communication with Support, changes such as the following can put your cloud out of commission:
Changing even a minor setting in a network segment that your organization controls, such as a firewall rule, can cause issues ranging from Support's inability to monitor your cloud to your entire cluster losing connectivity.
Note: In a typical Metacloud deployment with three Metacloud Control Planes (MCPs), if more than one MCP loses network connectivity, your cloud operations will be severely disrupted.
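As a rough illustration of the note above, the following Python sketch checks whether a planned change would leave more than one MCP offline at the same time. The function name and the MCP names are hypothetical and are not part of any Metacloud tooling. The threshold of one offline MCP is taken from the note and assumes a typical three-MCP deployment.

```python
# Minimal sketch, not a Metacloud tool: all names here are hypothetical.
# Assumption (from the note above): in a typical three-MCP deployment,
# more than one MCP without connectivity severely disrupts the cloud.
MAX_MCPS_OFFLINE = 1


def change_is_safe(currently_offline, affected_by_change):
    """Return True if at most MAX_MCPS_OFFLINE MCPs would be offline.

    currently_offline: set of MCP names that are already unreachable.
    affected_by_change: set of MCP names the planned change may disconnect.
    """
    would_be_offline = set(currently_offline) | set(affected_by_change)
    return len(would_be_offline) <= MAX_MCPS_OFFLINE


if __name__ == "__main__":
    # Hypothetical MCP names, for illustration only.
    print(change_is_safe(set(), {"mcp1"}))      # one MCP affected
    print(change_is_safe({"mcp2"}, {"mcp1"}))   # two MCPs would be offline
```

A check like this is only a planning aid. It does not replace informing Support before you make the change.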
Firmware compatibility is important to maintain across your datacenter, especially with so many interdependent components. Incompatibility across devices can result in problems such as kernel panics or machines failing to boot. For example, if you have two Fabric Interconnect switches connected for failover redundancy and you plan to upgrade the firmware on one, you must make sure the new version is compatible with the firmware on the other switch. Even two different types of devices, such as a switch and an MCP, may need to have compatible firmware versions.
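The kind of check described above can be sketched in Python as a lookup against a compatibility table. Everything in this example is an assumption made for illustration: the table, the function name, and the version strings are placeholders, and real compatibility data would come from the vendor's release notes or from Support.

```python
# Minimal sketch with placeholder data: the version strings below are NOT
# real firmware versions, and the table is hypothetical.
# Each firmware version maps to the peer versions it is assumed to work with.
COMPATIBLE = {
    "A.1": {"A.1", "A.2"},
    "A.2": {"A.1", "A.2", "B.1"},
    "B.1": {"A.2", "B.1"},
}


def upgrade_is_compatible(target_version, peer_version, table=COMPATIBLE):
    """Return True if target_version is listed as compatible with peer_version.

    Unknown target versions are treated as incompatible, so a missing table
    entry blocks the upgrade instead of silently allowing it.
    """
    return peer_version in table.get(target_version, set())


if __name__ == "__main__":
    # Upgrading one switch to "B.1" while its failover peer still runs "A.1".
    print(upgrade_is_compatible("B.1", "A.1"))
    # Upgrading to "A.2" first keeps the pair within the listed combinations.
    print(upgrade_is_compatible("A.2", "A.1"))
```

Treating unknown versions as incompatible is a deliberately conservative choice: if the table has no entry, the safer default is to stop and ask Support.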
Adding, replacing, swapping, or repairing hardware components without preparation can cause power outages, connectivity outages, or data loss in your cloud. Even replacing or installing cables can be disruptive if not handled properly.
Changes Affecting Storage or Boot from Volume
If your cloud integrates a storage driver such as Pure or SolidFire, replacing disk drives or other components can be disruptive and can result in data loss if not handled properly.
See Managing Datacenter Changes with a Ceph Deployment for information on how maintenance can impact storage.