Redundancy, bitch! You need it

Ars Technica writes about the disruption that knocked out all of Delta Airlines’ operations today, apparently caused by a fire in their datacenter:

According to the flight captain of JFK-SLC this morning, a routine scheduled switch to the backup generator this morning at 2:30am caused a fire that destroyed both the backup and the primary. Firefighters took a while to extinguish the fire. Power is now back up and 400 out of the 500 servers rebooted, still waiting for the last 100 to have the whole system fully functional

Now, you’d imagine that a company as big as Delta Airlines, with such major operations, would have redundant servers across two or more datacenters, because accidents like this are bound to happen.

Obviously this was an accident, and the fire itself wasn’t Delta Airlines’ fault. What is their fault is that they were cheap enough to keep their operations in a single datacenter, and then ended up taking losses and running their business in chaos when that datacenter went down.

I’m pretty sure they have replication, load balancing and backup servers, but keeping all of them in one datacenter is simply insane. I wonder how they would have reacted if this incident had taken much longer to fix.
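
To make the point concrete, here’s a minimal sketch in Python of what failing over between sites boils down to. The Datacenter class, the pick_datacenter function and the site names are all made up for illustration; I have no idea what Delta’s actual setup looks like, and real failover involves a lot more (health checks, data replication, DNS or routing changes).

```python
from dataclasses import dataclass


@dataclass
class Datacenter:
    # Hypothetical model of a site: just a name and a health flag.
    name: str
    healthy: bool


def pick_datacenter(datacenters):
    """Return the first healthy datacenter in priority order, or None."""
    for dc in datacenters:
        if dc.healthy:
            return dc
    return None


# Illustrative scenario: the primary site has lost power, the secondary has not.
sites = [
    Datacenter("primary", healthy=False),
    Datacenter("secondary", healthy=True),
]

active = pick_datacenter(sites)
print(active.name if active else "total outage")
```

With only one site in the list, a single fire leaves nothing to fail over to, and that’s the whole problem.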

Unfortunately, this isn’t the only case of a major airline suffering downtime, lost profits and chaos. According to the same source, this is the second severe IT-induced travel disruption in recent weeks: Southwest Airlines went through a similar situation with a toasted router in its Dallas data center, which resulted in 2,300 flight cancellations.

After all, it’s just as they say: better safe than sorry!
