Hydra Cloud to the Rescue!

In case you didn’t know, AWS had an outage yesterday in its eu-central-1 region:

Twitter: Fire in datacenter that’s part of awscloud’s eu-central-1 region. Impact limited to single AZ, no one hurt.

Our teams were actually notified of the outage via our own alerting system and were able to trigger a failover to a different region in the EU before AWS posted the incident. This means that our customers experienced almost zero impact.

As a cloud service, we rely on cloud platforms like AWS for our infrastructure. AWS provides an incredibly stable infrastructure, and we value our partnership with them. But we know outages can occur. Therefore, we have designed our product to react to any outage quickly and efficiently, with very little impact to our customers. Our mature telemetry notifies us immediately when something is wrong. And our Hydra Cloud infrastructure enables our product not only to scale out when necessary to meet demand but also to fail over seamlessly, as it did yesterday.
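For readers curious what alert-driven failover can look like in principle, here is a minimal Python sketch. It is an illustration only, not OneLogin’s implementation: the `RegionHealth` class, the `choose_active_region` function, the threshold of three consecutive failed checks, and the region name `eu-standby` are all hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class RegionHealth:
    """Tracks consecutive failed health checks for one region (hypothetical helper)."""
    name: str
    consecutive_failures: int = 0

    def record(self, check_passed: bool) -> None:
        # A passing check resets the counter; a failing check increments it.
        if check_passed:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1


def choose_active_region(regions, failure_threshold=3):
    """Return the name of the first region, in priority order, still considered healthy.

    A region is treated as unhealthy once it has failed `failure_threshold`
    checks in a row. Returns None if every region is unhealthy.
    """
    for region in regions:
        if region.consecutive_failures < failure_threshold:
            return region.name
    return None


# Example: the primary region fails three checks in a row, so traffic
# would be directed to the standby region ("eu-standby" is a placeholder name).
primary = RegionHealth("eu-central-1")
standby = RegionHealth("eu-standby")
for _ in range(3):
    primary.record(check_passed=False)
active = choose_active_region([primary, standby])
```

Requiring several consecutive failures before switching is one common way to avoid failing over on a single transient error; a real system would also need to handle traffic routing and data replication, which this sketch leaves out.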

OneLogin’s Hydra Cloud combines the strengths of our architecture with modern site reliability and scaling approaches (including containerization, microservices, orchestration, service mesh, and dynamic clustering and routing) to achieve new levels of site reliability and performance. Last year we implemented HydraBoost, which leveraged our Hydra Cloud infrastructure to respond to 1 million login requests per minute. Yesterday we saw the power of this infrastructure design in its ability to respond to a major outage.

Every day our engineers work hard to monitor and maintain our infrastructure. They are also constantly designing and implementing ways to improve it and our overall performance. Recently, in fact, they rolled out a change to a background process that automatically updates user records: a job that once took up to 2 hours to complete now takes only a few minutes. We know our customers rely on us for core functionality: accessing the applications they need to do their jobs, or providing access to their solutions. We take this responsibility very seriously, and we are happy to know that our infrastructure worked exactly when it needed to. If you are one of our customers and this is the first time you are hearing about yesterday’s AWS outage, then we have done our job successfully.

About the Author

Alicia Townsend

For almost 40 years, Alicia Townsend has been working with technology as both a consultant and a trainer. She has a passion for empowering others to use technology to make their lives easier. As Director of Content and Documentation at OneLogin, Ms. Townsend works with technical writers, trainers and content marketing writers to inspire and empower everyone to take advantage of what OneLogin’s platform has to offer them.
