Last week we told you about a massive outage involving Amazon’s web services that took more than one hundred thousand websites down for a matter of hours. Now we’ve learned what caused the outage, and it was as simple as a human typo.

An Amazon employee attempting to debug the service’s billing system simply entered a code incorrectly. That was it. It was an established code and the employee was following protocol, but a simple typo told the system to remove more servers than the employee intended. Those servers then needed to be reset, and the outage occurred during that reset, when the systems couldn’t service requests.

Because of the incident, Amazon is making some changes to the way it manages its systems. The ability to remove servers is crucial, but in this instance the tool used was able to remove too many at once.
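For the curious, here is a rough idea of what a guard against that kind of mistake could look like. This is only an illustrative sketch in Python, not Amazon's actual tooling or fix: the function name `plan_removal` and the parameters `max_batch` and `min_capacity` are made up for this example.

```python
def plan_removal(requested, fleet_size, min_capacity, max_batch):
    """Check a server-removal request before acting on it.

    Hypothetical safeguard: all names and limits here are assumptions
    for illustration, not taken from any real Amazon tool.
    """
    if requested <= 0:
        raise ValueError("requested must be a positive number of servers")
    # Refuse requests that take out too many servers in one go.
    if requested > max_batch:
        raise ValueError(
            f"refusing to remove {requested} servers at once (limit {max_batch})"
        )
    # Refuse requests that would leave too few servers to handle traffic.
    if fleet_size - requested < min_capacity:
        raise ValueError(
            f"removal would leave {fleet_size - requested} servers, "
            f"below the minimum of {min_capacity}"
        )
    return requested
```

With a check like this, a request to remove 5 servers out of 100 goes through, while a mistyped 50 is rejected before anything is taken offline.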

So yeah, if you ignore all the technical mumbo jumbo, it was a typo that brought down a significant portion of the Internet.
