A Virtual Switching System (VSS) enables a pair of Cisco Catalyst 6500 series switches to act as a single logical network device from the perspective of both lower-layer and upper-layer devices.
How VSS Works
1. When VSS is enabled on both peers, the VSS active and standby roles are negotiated. Both peers forward data, while the VSS active chassis becomes the single point of control for the whole VSS.
2. Configuration, forwarding, event, and state information is synchronized between the VSS active and standby supervisors over the Virtual Switch Link (VSL), starting at startup and continuing whenever an event or change occurs.
3. Virtual Switch Link (VSL): a Layer 2 EtherChannel with up to eight 10 Gbps ports, located either on the supervisor engine alone or on both the supervisor engine and line cards. On the VSL, control traffic has higher priority than data traffic and is never discarded.
4. Multichassis EtherChannel (MEC): an EtherChannel on each access switch whose member links connect to both chassis of the VSS, so the access switch treats the two chassis as a single port-channel neighbor.
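The priority rule for VSL traffic in point 3 can be illustrated with a small model. This is only a sketch under stated assumptions, not how the hardware queues are implemented: the class name `VslQueueModel`, the `data_capacity` limit, and the tail-drop behavior for data frames are all hypothetical and chosen for illustration.

```python
from collections import deque


class VslQueueModel:
    """Toy model of VSL egress queuing: control traffic is served first
    and never dropped; data traffic is tail-dropped when its queue is full.
    All names and the capacity limit are illustrative assumptions."""

    def __init__(self, data_capacity):
        self.control = deque()            # unbounded: control is never discarded
        self.data = deque()               # bounded by data_capacity
        self.data_capacity = data_capacity
        self.dropped_data = 0

    def enqueue(self, frame, is_control):
        if is_control:
            self.control.append(frame)
        elif len(self.data) < self.data_capacity:
            self.data.append(frame)
        else:
            self.dropped_data += 1        # only data frames can be dropped

    def dequeue(self):
        # Strict priority: drain control frames before any data frame.
        if self.control:
            return self.control.popleft()
        if self.data:
            return self.data.popleft()
        return None
```

In this model, a control frame that arrives after several data frames is still transmitted first, and congestion only ever costs data frames.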
VSS Redundancy & High Availability
Stateful Switchover (SSO)
This switchover causes minimal traffic disruption because configuration, forwarding, event, and state information is synchronized between the active and hot-standby supervisors from startup onward and whenever an event or change occurs.
For the software and hardware prerequisites for SSO, and for how to configure it, refer to RPR and SSO Redundancy.
Brief Data Path Disruption after SSO: After the SSO and the recovery of the failed chassis, much of the processing power of the new VSS active supervisor engine is consumed in bringing up a large number of ports simultaneously in the new VSS standby chassis (Layer 3 reconvergence and copying the FIB to remote line cards). As a result, some links might come up before the new standby supervisor engine, which contains the embedded switch fabric, finishes Layer 3 reconvergence and finishes building the FIB and copying it to the line cards. Traffic to those line card ports is lost until the configuration is complete. This condition is especially disruptive if the link is an MEC link. In addition, line cards in the chassis whose supervisor engine is reloading use the switch fabric on the new active supervisor engine, which can cause further packet loss because VSL bandwidth becomes the bottleneck for the throughput of the entire new standby chassis. Two methods are available to reduce data disruption following an SSO; refer to Failed Chassis Recovery.
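The VSL bottleneck described above is easy to quantify with simple arithmetic. The following sketch assumes the VSL limits stated earlier (up to eight 10 Gbps member links); the function names and the offered-load figure in the usage note are hypothetical and only illustrate the calculation.

```python
def vsl_capacity_gbps(num_links, link_gbps=10):
    """Aggregate VSL bandwidth for a given number of member links.
    The 1..8 range follows the VSL description in this article."""
    if not 1 <= num_links <= 8:
        raise ValueError("a VSL has between 1 and 8 member links")
    return num_links * link_gbps


def vsl_load_ratio(offered_gbps, num_links):
    """Ratio of traffic that must cross the VSL to the VSL capacity.
    A value above 1.0 means the VSL is oversubscribed and drops are likely."""
    return offered_gbps / vsl_capacity_gbps(num_links)
```

For example, if the line cards of the recovering chassis offered a hypothetical 60 Gbps across a VSL built from two 10 Gbps links, `vsl_load_ratio(60, 2)` gives 3.0, meaning the offered load is three times what the VSL can carry.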
Route Processor Redundancy (RPR)
When the SSO prerequisites are not met, VSS operates in RPR mode, in which a switchover disrupts traffic. In RPR mode, the VSS active supervisor engine does not synchronize configuration changes or state information with the VSS standby. The VSS standby supervisor engine is only partially initialized, and the switching modules in the VSS standby chassis are not powered up. If a switchover occurs, the VSS standby supervisor engine completes its initialization and powers up the switching modules. Traffic is disrupted for the normal reboot time of the chassis.
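The differences between the two redundancy modes can be summarized as a small data structure. This is a sketch that only restates the behavior described in this article; the class and field names are hypothetical, and the SSO values for standby initialization and powered modules are inferred from the hot-standby description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedundancyMode:
    name: str
    syncs_config_and_state: bool      # active syncs config/state to standby
    standby_fully_initialized: bool   # standby supervisor fully booted
    standby_modules_powered: bool     # switching modules on standby powered up
    switchover_impact: str


SSO = RedundancyMode("SSO", True, True, True,
                     "minimal traffic disruption")
RPR = RedundancyMode("RPR", False, False, False,
                     "traffic disrupted for the normal chassis reboot time")


def operating_mode(prerequisites_met):
    """VSS runs SSO when the prerequisites are met and RPR otherwise."""
    return SSO if prerequisites_met else RPR
```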
Failed Chassis / Supervisor Recovery
When the active supervisor in the VSS active chassis fails, the standby supervisor on the peer chassis takes over the active role. The failed supervisor or chassis recovers by reloading the supervisor engine and takes the standby role after a successful recovery, and the VSS reinitializes the VSL links between the two chassis.
Note that all line cards without DFCs and without NSF configured experience a brief service disruption due to Layer 3 reconvergence. While the failed supervisor engine reloads, the bandwidth of the MEC uplinks to the VSS may be reduced because of the VSL bandwidth bottleneck, until the supervisor completes its recovery and becomes operational again. In addition, the switch fabric of the Catalyst 6500 series is embedded in the supervisor engine, whereas on Nexus 7000 series devices the switch fabric is separate hardware, independent of the supervisor engine. Therefore, a VSS with only one supervisor engine in each chassis does not provide the full high availability and redundancy that a quad-supervisor VSS does.
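The role changes during failure and recovery described in this section can be traced step by step. The sketch below is a simplified model under stated assumptions: the function name, the chassis labels, and the intermediate "reloading" state are hypothetical, and it covers only a failure of the active chassis.

```python
def failover_timeline(roles, failed_chassis):
    """Return role snapshots: before failure, after switchover, after recovery.

    roles maps a chassis label to "active" or "standby".
    Only failure of the current active chassis is modeled.
    """
    if roles.get(failed_chassis) != "active":
        raise ValueError("this sketch only models failure of the active chassis")
    peer = next(c for c in roles if c != failed_chassis)
    # Switchover: the peer takes the active role while the failed one reloads.
    after_failure = {failed_chassis: "reloading", peer: "active"}
    # Recovery: the reloaded chassis rejoins as standby; roles do not revert.
    after_recovery = {failed_chassis: "standby", peer: "active"}
    return [dict(roles), after_failure, after_recovery]
```

Calling `failover_timeline({"switch1": "active", "switch2": "standby"}, "switch1")` yields three snapshots that end with `switch2` active and `switch1` standby, matching the sequence in the text.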
VSS Overview – Cisco.com