Some common Metacloud issues can occur during the initial build of your Metacloud environment or after a change to the environment, such as adding resources for scaling or replacing a failed hardware component. If you experience any of the following problems, or similar ones, review your hardware and network configurations, especially anything you changed:
- Inability to add new instances to a project
- Inability of the Metacloud Support team to monitor or administer the cloud environment
- Unavailability of Metacloud services
- Lack of redundancy if a given network connection fails
Some of the most common errors related to installation and configuration of components occur in the following categories.
Cabling
Cabling requires close attention, especially for new installations. Incorrect cabling impacts important activities in the following areas:
- Failed out-of-band (OOB) connectivity prevents the Metacloud Support team from administering your cloud environment during the build process. It can also be an issue when the cloud is in production. The OOB connection serves as a backup if Support's primary administrative link is unavailable, and it allows Support to troubleshoot hardware-related issues.
- Failed connectivity between Metacloud Control Planes (MCPs) and switches blocks availability of Metacloud services.
- Lack of redundancy can result in disruption of network activity if a router or switch fails. The Cisco Metacloud Controller Bundle includes two Aggregation Services Routers (ASRs) and two switches for redundancy. Each MCP includes two network interface cards (NICs). Proper redundant cabling connects one NIC on each MCP to each ASR.
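The redundancy rule in the last item (each MCP cabled to both ASRs, one NIC per ASR) can be checked against a cabling record before you power on. The following is a minimal sketch, assuming you keep your cabling as a list of (MCP, NIC, ASR) entries. The device names and the `find_redundancy_gaps` helper are hypothetical and do not come from the installation guide.

```python
from collections import defaultdict

def find_redundancy_gaps(links, mcps, asrs):
    """Return {mcp: [ASRs it is not cabled to]} for every MCP with a gap."""
    connected = defaultdict(set)
    for mcp, _nic, asr in links:
        connected[mcp].add(asr)
    gaps = {}
    for mcp in mcps:
        missing = sorted(set(asrs) - connected[mcp])
        if missing:
            gaps[mcp] = missing
    return gaps

# Hypothetical cabling record: (MCP, NIC, ASR). Names are illustrative only.
links = [
    ("mcp1", "nic0", "asr1"),
    ("mcp1", "nic1", "asr2"),
    ("mcp2", "nic0", "asr1"),
    ("mcp2", "nic1", "asr1"),  # miscabled: both NICs go to asr1
]

print(find_redundancy_gaps(links, ["mcp1", "mcp2"], ["asr1", "asr2"]))
# prints {'mcp2': ['asr2']}
```

An empty result means every MCP in the record reaches both ASRs. The check only validates the record you give it, so it complements a physical inspection against the guide's diagrams rather than replacing one.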
To verify that your cabling is correct, refer to the cabling guidelines and diagrams in the Metacloud Controller Installation Guide. To verify connectivity for Compute services, see the section of the guide titled Cabling Compute Servers to the Nexus 9396 Switches. It includes a table that maps ports on the switch to specific Compute server nodes.
Network Ports
Certain ports must be open in your access control list (ACL) to allow connectivity between your network and your cloud. If any of these ports are closed, the Support team cannot perform a number of critical administrative and maintenance activities for your cloud, including the following:
- Testing for network connectivity
- Downloading and installing software
- Monitoring your cloud for issues
- Running backups
To verify that all necessary ports are open, see the section titled Open Network Ports in the Metacloud Controller Installation Guide for a list of ports and their functions.
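Once you have the port list from the guide, a quick reachability test can show which TCP ports your ACL still blocks. The following is a minimal sketch using only the Python standard library. The helper names `check_tcp_port` and `report_closed_ports` are hypothetical, and the sketch covers TCP only; it cannot confirm UDP ports.

```python
import socket

def check_tcp_port(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

def report_closed_ports(host, ports, timeout=3.0):
    """Return the subset of ports on host that could not be reached."""
    return [port for port in ports if not check_tcp_port(host, port, timeout)]
```

For example, `report_closed_ports("cloud.example.com", [22, 443])` returns the ports that did not accept a connection. The host name and port numbers here are placeholders; substitute the addresses and ports listed in the Open Network Ports section. A port can also appear closed because no service is listening on it, so treat a failure as a prompt to check the ACL, not as proof that the ACL is at fault.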
Virtual LANs (VLANs)
When you add new VLANs within your organization's control, identify and connect them correctly to avoid usage conflicts, as in the following examples:
- The ASRs must be connected to an external network. This network provides a link between your Metacloud Availability Zone (AZ) and the rest of the domain. You must provide the Metacloud team with the ID of the VLAN that is assigned to the external network. The Metacloud default ID is 99, but verify the number with your Metacloud or network administrator. See the Metacloud Controller Installation Guide for a list of default VLAN IDs.
- To add an infrastructure network for purposes such as storage or backup, you must connect it directly to a Nexus 9396 switch, starting with port 41, to separate it from an external network.
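A usage conflict of the kind described above typically means that two networks were given the same VLAN ID. The following is a minimal sketch of a pre-check, assuming you track your planned assignments as a mapping of network name to VLAN ID. The `find_vlan_conflicts` helper and the network names are hypothetical; only the default external ID of 99 comes from this document. The range check reflects the usable IEEE 802.1Q VLAN ID range of 1 through 4094.

```python
def find_vlan_conflicts(assignments):
    """Map each VLAN ID used by more than one network to those networks."""
    by_id = {}
    for network, vlan_id in assignments.items():
        # 0 and 4095 are reserved in IEEE 802.1Q; usable IDs are 1-4094.
        if not 1 <= vlan_id <= 4094:
            raise ValueError(f"{network}: VLAN ID {vlan_id} is outside 1-4094")
        by_id.setdefault(vlan_id, []).append(network)
    return {vid: sorted(nets) for vid, nets in by_id.items() if len(nets) > 1}

# Hypothetical plan: a new backup network reuses the external network's ID.
plan = {"external": 99, "storage": 120, "backup": 99}
print(find_vlan_conflicts(plan))
# prints {99: ['backup', 'external']}
```

An empty result means no two networks in the plan share an ID. It does not confirm that the IDs match what is configured on the ASRs or switches, so still verify the numbers with your Metacloud or network administrator.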
Hardware Failure or Overload
You can't predict the failure of a hardware device within your organization's control; however, being aware of the failure as soon as possible enables you to take corrective action quickly to lessen disruption of your business operations. Such an event will trigger an alert to the Metacloud Support team, who will notify you. See Responding to Resource Issues.
You can anticipate when MCPs are nearing usage capacity by downloading usage reports or viewing metrics and Live Stats in your Dashboard.
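If you export usage figures from a report, a simple threshold check can flag resources that are approaching capacity. The following is a minimal sketch, assuming a mapping of resource name to a (used, total) pair. That data shape, the 80 percent threshold, and the `nearing_capacity` helper are all assumptions for illustration; they do not describe the actual format of Metacloud usage reports.

```python
def nearing_capacity(usage, threshold=0.8):
    """Return names whose used/total ratio meets the threshold, highest first."""
    ratios = {
        name: used / total
        for name, (used, total) in usage.items()
        if total > 0  # skip entries with no reported capacity
    }
    flagged = [(name, ratio) for name, ratio in ratios.items() if ratio >= threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ratio in flagged]

# Hypothetical figures: (units used, units available) per MCP.
usage = {"mcp1": (52, 64), "mcp2": (30, 64), "mcp3": (61, 64)}
print(nearing_capacity(usage))
# prints ['mcp3', 'mcp1']
```

Choose a threshold that leaves enough lead time to add resources before the limit is reached.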