A quick story for everyone...
We generally perform software upgrades on all our routers and switches twice a year. It helps keep our network infrastructure current, and it also helps reduce unscheduled downtime.
Last fall we decided to skip the twice-yearly maintenance because there were just too many projects on the docket. This spring we came across a very interesting issue that we had never seen before. We started to notice that multiple Nortel Ethernet Switch 460/470 switches and stacks were rebooting themselves all over our network. It took us a few hours to realize that every switch that had rebooted had just passed approximately 500 days of uptime. All the affected switches were running FW 3.6.0.6 with SW v3.6.4.08. The switches were literally rebooting themselves in the same order in which they had been upgraded almost 500 days earlier.
I'm currently trying to confirm with Nortel that this "bug" has been removed from the 3.7.x software release.
This was one occasion where the network was just too good for itself.
Cheers!
Update: Tuesday June 10, 2008
I received a formal response from Nortel today that included the following:
Analysis of the issue :-
When the BS-470 switches reaches 497 days the system time rolls over and during this period management communication will be lost. This is caused by the use of a 32 bit counter, which when it rolls back to 0, initiates an internal software synchronization to align all timers. This is only loss of IP management and not switching functionality.
This issue still open and can be fixed by rebooting the switches before reaching the 497 day mark.
When I asked whether the problem had been resolved in the v3.7.x software release, I was told it had not. It would seem that a lot of folks just don't expect switches to be running that long these days.
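For anyone wondering where 497 days comes from: Nortel's response only says the counter is 32 bits wide. If we assume it ticks once every hundredth of a second (the same unit SNMP uses for sysUpTime; this tick rate is my assumption, not something Nortel stated), the arithmetic lands almost exactly on 497 days. A rough sketch in Python, with hypothetical function names:

```python
# Assumption: the uptime counter is an unsigned 32-bit value that ticks
# every 1/100 of a second (the unit SNMP sysUpTime uses). Nortel's
# response confirms only the 32-bit width, not the tick rate.
TICKS_PER_SECOND = 100
COUNTER_MODULUS = 2 ** 32
SECONDS_PER_DAY = 86_400


def rollover_days():
    """Days until a 32-bit centisecond counter wraps back to 0."""
    return COUNTER_MODULUS / TICKS_PER_SECOND / SECONDS_PER_DAY


def days_until_rollover(uptime_ticks):
    """Days left before the next wrap, given the current counter value."""
    remaining_ticks = COUNTER_MODULUS - (uptime_ticks % COUNTER_MODULUS)
    return remaining_ticks / TICKS_PER_SECOND / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"Counter wraps after {rollover_days():.1f} days")
    # A switch that has been up for 400 days, expressed in ticks.
    ticks_at_400_days = 400 * SECONDS_PER_DAY * TICKS_PER_SECOND
    print(f"Days left at 400 days uptime: {days_until_rollover(ticks_at_400_days):.1f}")
```

Under that assumption the counter wraps after about 497.1 days, so a switch sitting at 400 days of uptime has a little over three months before it hits the mark.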
Cheers!
2 comments:
This is interesting. We usually power cycle all switches every month during a short 1-2 hour maintenance window. We have had switches go over a year, but not 500 days. We're running FW 3.6.0.7 and SW 3.7.2.13.
Unfortunately, the Known Issues Tool does not have any entries for the ES 470; it will be interesting to see what Nortel says about the problem you discovered.
I've updated the original post to document Nortel's response. It's another reason to make sure you perform a software upgrade or reboot of the switch at least once a year.
Cheers!