How to Troubleshoot Spanning-Tree Protocol

In most networks, the optimal Spanning-Tree Protocol (STP) topology is determined as part of the network design and then implemented by configuring STP priority and cost values. Even so, several things can go wrong.

The topology you expect may differ from the one the switches actually build. In some networks, STP was never considered during planning and implementation. In others, it was considered or implemented before the network grew and changed.

In these situations, it is important to analyze the actual STP topology in the operational network. The steps for analyzing the spanning tree are as follows:

  1. Discover the Layer 2 topology. Consult the network documentation if it has been prepared.
  2. Use the “show cdp neighbors” command to help discover the Layer 2 topology.
  3. Once the Layer 2 topology is known, use your knowledge of STP to determine the expected Layer 2 path.
  4. Identify the root bridge. Use the “show spanning-tree vlan <vlan_id>” command to determine which switch is the root bridge.
  5. Use the “show spanning-tree vlan <vlan_id>” command on all switches to find the state of each port and confirm the expected Layer 2 path.
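Step 3 can be worked out on paper, but a small script helps on larger topologies. The following is a minimal Python sketch, not a full STP implementation: the switch names, bridge IDs, and link costs are hypothetical, and the functions elect_root and root_path_costs are illustrative helpers. It assumes the switch with the lowest bridge ID becomes root and that every other switch reaches the root over its lowest-cost path, with ties broken by the lower upstream bridge ID (port IDs are ignored).

```python
import heapq

# Hypothetical lab: each bridge ID is (priority, MAC address). Tuples compare
# field by field, and same-format lowercase MAC strings sort like their values.
BRIDGES = {
    "S1": (24576, "000a.0033.3333"),
    "S2": (32768, "000a.0011.1111"),
    "S3": (32768, "000a.0022.2222"),
}
# Undirected links as (switch, switch, illustrative STP cost).
LINKS = [("S1", "S2", 19), ("S1", "S3", 19), ("S2", "S3", 19)]


def elect_root(bridges):
    """Return the switch with the numerically lowest bridge ID."""
    return min(bridges, key=lambda name: bridges[name])


def root_path_costs(bridges, links, root):
    """Return {switch: (cost_to_root, upstream_switch)} using Dijkstra.

    Equal-cost paths are broken in favor of the upstream switch with the
    lower bridge ID; port IDs are ignored in this sketch.
    """
    neighbors = {name: [] for name in bridges}
    for a, b, cost in links:
        neighbors[a].append((b, cost))
        neighbors[b].append((a, cost))

    best = {}
    queue = [(0, bridges[root], root, "")]
    while queue:
        cost, _, node, upstream = heapq.heappop(queue)
        if node in best:
            continue  # already reached by a better path
        best[node] = (cost, upstream or None)
        for peer, link_cost in neighbors[node]:
            if peer not in best:
                heapq.heappush(queue, (cost + link_cost, bridges[node], peer, node))
    return best


root = elect_root(BRIDGES)
print(root, root_path_costs(BRIDGES, LINKS, root))
```

In this hypothetical lab, S1 wins the election because of its lower priority, even though its MAC address is the highest. S2 and S3 both reach the root directly, so the S2-S3 link is on no root path and one of its ports would be expected to block. That expectation is what step 5 then confirms on the real switches.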

Expected Topology vs Actual Topology

Compare the actual state of the network against the expected state and look for differences. These differences provide clues for troubleshooting the problem. A network administrator should be able to examine the switches, determine the actual topology, and understand what the intended spanning-tree topology should be.
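The comparison itself is simple enough to script. The sketch below is one possible way to do it in Python, assuming the expected and actual port roles and states have already been written down by hand for one switch and one VLAN. The port names and values are hypothetical, and diff_port_states is an illustrative helper, not part of any vendor tool.

```python
def diff_port_states(expected, actual):
    """Return {port: (expected_value, actual_value)} for every mismatch.

    A port that appears in only one of the two inputs is reported with
    None on the missing side.
    """
    mismatches = {}
    for port in sorted(set(expected) | set(actual)):
        want = expected.get(port)
        got = actual.get(port)
        if want != got:
            mismatches[port] = (want, got)
    return mismatches


# Hypothetical (role, state) values for one switch in one VLAN.
expected = {
    "Fa0/1": ("root", "forwarding"),
    "Fa0/2": ("alternate", "blocking"),
}
actual = {
    "Fa0/1": ("root", "forwarding"),
    "Fa0/2": ("designated", "forwarding"),
}

for port, (want, got) in diff_port_states(expected, actual).items():
    print(f"{port}: expected {want}, found {got}")
```

Here only Fa0/2 is reported: it was expected to block but is forwarding, which is exactly the kind of difference worth investigating first.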

Overview of Spanning Tree Status

A quick overview of the spanning tree is a useful starting point for troubleshooting. Using the “show spanning-tree” command without any additional options provides a quick overview of the status of STP for all VLANs.

To limit the output to a particular VLAN, use “show spanning-tree vlan vlan_id”. The output displays the role and status of each port on the switch.

The port role can be Root, Designated, Alternate, and so on, while the port state shows whether the port is forwarding or blocking. The output also shows the bridge ID of the local switch and the bridge ID of the root bridge.

Spanning Tree Failure Consequences

STP can fail in two ways. In the first, STP erroneously blocks a port that should be in the forwarding state. Traffic that would normally pass through this switch may be lost, but the rest of the network remains unaffected.

The second type of failure is much more disruptive, as shown in the figure below. It occurs when STP erroneously moves one or more ports into the forwarding state.

Troubleshooting Spanning-Tree Protocol

Recall that an Ethernet frame header does not contain a TTL field, so any frame that enters a bridging loop continues to be forwarded from switch to switch indefinitely.

Frames whose destination address is recorded in the MAC address table of the switches are simply forwarded to the port associated with that MAC address and do not enter a loop. However, any frame that a switch floods does enter the loop. Flooded traffic can include broadcasts, multicasts, and unicasts whose destination MAC address is unknown to every switch.

What are the signs of an STP failure? The load on all links starts increasing as more and more frames enter the loop. The problem is not limited to the links that form the loop. It also affects other links in the switched network because the frames are flooded on those links too. If the failure is confined to a single VLAN, only the links in that VLAN are affected. Switches and trunks that do not carry that VLAN operate normally.

When a spanning-tree failure creates a bridging loop, traffic increases exponentially because the switches flood broadcasts out multiple ports. Each time the switches forward the frames, they create new copies.
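The growth is easy to see in a toy model. The following Python sketch is a simplification under stated assumptions: every inter-switch port is forwarding (no port is blocked), each switch floods a received broadcast out every inter-switch link except the one it arrived on, and host-facing ports are ignored. The four-switch full mesh and the function name frames_in_flight are hypothetical.

```python
def frames_in_flight(neighbors, ingress, hops):
    """Count copies of one broadcast after each hop when no port is blocking.

    neighbors maps each switch to the switches it is directly linked to.
    Nothing ever removes a copy, because the frame has no TTL.
    """
    # Each in-flight copy is recorded as (sending_switch, receiving_switch).
    in_flight = [(ingress, peer) for peer in neighbors[ingress]]
    counts = [len(in_flight)]
    for _ in range(hops - 1):
        next_hop = []
        for sender, receiver in in_flight:
            for peer in neighbors[receiver]:
                if peer != sender:  # never flood back out the arrival link
                    next_hop.append((receiver, peer))
        in_flight = next_hop
        counts.append(len(in_flight))
    return counts


# Hypothetical full mesh of four switches with STP not blocking anything.
FULL_MESH = {
    "S1": ["S2", "S3", "S4"],
    "S2": ["S1", "S3", "S4"],
    "S3": ["S1", "S2", "S4"],
    "S4": ["S1", "S2", "S3"],
}

print(frames_in_flight(FULL_MESH, "S1", 5))  # [3, 6, 12, 24, 48]
```

In this model a single broadcast doubles at every hop. How fast the copies multiply depends on how many forwarding links each switch has: in a plain three-switch ring the same function reports a constant count, because the copies circulate forever without multiplying. Real networks add to the problem because new broadcasts keep entering the loop.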

When control traffic such as OSPF or EIGRP hello packets enters the loop, the devices running these protocols quickly become overloaded, and their CPUs approach 100 percent utilization.

The switches also change their MAC address tables frequently. If a loop exists, a switch may receive a frame with a particular source MAC address on one port and then receive another frame with the same source MAC address on a different port.

The switch therefore updates its MAC address table twice for the same MAC address. Because of the high link load and CPU utilization, these devices typically become unreachable, which makes it very difficult to diagnose the problem while it is happening.
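The table updates described above can be illustrated with a small Python sketch. It models only source-address learning, the MacTable class is a hypothetical teaching aid rather than how any particular switch is implemented, and the MAC address and port names are made up.

```python
class MacTable:
    """Toy MAC address table that counts how often an address changes port."""

    def __init__(self):
        self.entries = {}  # MAC address -> port it was last learned on
        self.moves = {}    # MAC address -> number of port changes seen

    def learn(self, src_mac, port):
        """Record the source MAC of a received frame against its ingress port."""
        previous = self.entries.get(src_mac)
        if previous is not None and previous != port:
            self.moves[src_mac] = self.moves.get(src_mac, 0) + 1
        self.entries[src_mac] = port


table = MacTable()
# In a loop, copies of the same host's frames arrive on different ports.
for ingress_port in ["Gi0/1", "Gi0/2", "Gi0/1"]:
    table.learn("0050.7966.6800", ingress_port)

print(table.entries, table.moves)
```

After three frames the address has moved twice. In a healthy network a given address normally stays on one port, so a move counter that keeps climbing for many addresses is a strong hint that a loop exists.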

Repairing a Spanning Tree Problem

The first method of resolving the problem is to remove the redundant links in the switched network. A redundant link can be removed either physically or through configuration.

Once the loops are broken, traffic and CPU loads should quickly drop to normal levels, and connectivity to devices should be restored.

Although this restores connectivity, it is not the end of the troubleshooting process. All redundant paths have been removed from the network, so the redundant links now need to be restored.

If the underlying cause of the spanning-tree failure has not been fixed, restoring the redundant links may trigger a new broadcast storm. Before restoring the redundant links, find and correct the original fault.