Cross-Cluster Interconnect Issues
This page describes common cross-cluster interconnect issues in a service mesh and how to resolve them.
Cross-Cluster Service Experiences 10s Access Latency
A managed mesh contains two clusters, and both run the same test service. Requests to the test service through the ingress gateway are occasionally delayed by 10 seconds.
Cause Analysis
- The clusters are managed and their services are discovered, but multi-cloud interconnect is not enabled, so the clusters cannot reach each other over the network. Requests sent to the remote cluster fail and fall back to the local cluster's test service, which causes the delay.
- Multi-cloud interconnect is enabled and the clusters are in the same network group, but pod-to-pod network connectivity between the interconnected clusters has not been established.
- The East-West gateways are in an abnormal state.
- Some clusters are down. Multi-cloud interconnect provides network connectivity between clusters but does not handle anomalies in individual services, so an outlier detection policy must be configured.
Solutions
- Enable multi-cloud interconnect.
- Create multiple network groups, place the clusters in different groups, and restart all pods.
- Identify and fix the cause of the East-West gateway failure.
- Enable outlier detection in the destination rule. Once the policy is configured successfully, instances in down clusters are removed automatically, which prevents the delays.
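The outlier detection step above could be expressed as a destination rule along the following lines. This is a minimal sketch that assumes the mesh accepts standard Istio `DestinationRule` resources. The rule name, namespace, host, and every threshold value are illustrative assumptions and should be adapted to the actual test service.

```yaml
# Sketch only: name, namespace, host, and thresholds are assumed values.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: test-service-outlier          # hypothetical rule name
  namespace: default                  # hypothetical namespace
spec:
  host: test-service.default.svc.cluster.local   # hypothetical service host
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 5         # eject an instance after 5 consecutive 5xx errors
      interval: 10s                   # how often instances are analyzed
      baseEjectionTime: 30s           # minimum time an ejected instance stays out
      maxEjectionPercent: 100         # allow all unhealthy instances to be ejected
```

Setting `maxEjectionPercent` to 100 lets every instance in a down cluster be ejected. A lower value keeps some failing instances in rotation, so the delays may persist.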
Traffic in Mesh Only Hits Test Services in Some Clusters
A managed mesh contains two clusters, and multi-cloud interconnect is enabled and configured successfully. However, repeated requests to the test service through the ingress gateway reach the test service in only some of the clusters.
Cause Analysis
- Some test services are in an abnormal state. Check the service status.
- Some test services do not have sidecars injected. Check the sidecar injection status.
- Some test services are configured incorrectly. Check the Service (svc) configuration, such as ports and port names.
- Multi-cloud interconnect was enabled after the test services were created.
Solutions
- Investigate and resolve the causes of service anomalies to restore normal service status.
- Inject sidecars into the services.
- Ensure that the svc configuration is consistent across all test services. The diagnostic function under Service Management -> Service List can help you check it.
- Restart all gateways, including self-built gateways and the North-South and East-West gateways in the data plane clusters.
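As an illustration of a consistent svc configuration, the sketch below shows a Kubernetes `Service` that would be applied unchanged in every cluster. It assumes the mesh follows the Istio port naming convention `<protocol>[-<suffix>]` for protocol selection. The service name, namespace, selector, and port numbers are hypothetical.

```yaml
# Sketch only: apply the same manifest in each cluster so that the name,
# namespace, ports, and port names match. All values are assumed.
apiVersion: v1
kind: Service
metadata:
  name: test-service        # hypothetical; must be identical in every cluster
  namespace: default        # hypothetical; must be identical in every cluster
spec:
  selector:
    app: test-service       # hypothetical pod label
  ports:
    - name: http-web        # port name in the form <protocol>[-<suffix>]
      port: 8080            # assumed service port
      targetPort: 8080      # assumed container port
```

A mismatch in the port number or port name between clusters is one of the configuration differences worth looking for when traffic reaches only some clusters.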
By following these steps, you can troubleshoot and resolve issues related to cross-cluster interconnect in your service mesh.