Yahoo Japan Consolidates 164 OpenStack Clusters into One

In short, Yahoo Japan is simplifying its cloud systems to make them more reliable and more secure.
Yahoo Japan is consolidating 164 OpenStack clusters into one. This change aims to enhance efficiency and security for its massive user base. The new cloud, Flava, will streamline operations and improve service reliability.
What Happened
Yahoo Japan's parent company, LY Corporation, is undergoing a significant transformation by consolidating its 164 OpenStack clusters into a single cloud infrastructure named "Flava." This decision comes as the company aims to streamline operations and enhance the reliability of its services, which cater to around 300 million monthly users.
The Issue
Previously, Yahoo Japan's cloud infrastructure was heavily customized, making upgrades challenging and complicating maintenance. According to Ryuutarou Inoue, head of LY’s Cloud Infrastructure Unit, the legacy system's complexity hindered the ability to implement timely updates and security patches. The new strategy focuses on adopting a more conventional version of OpenStack, minimizing custom modifications to facilitate easier upgrades.
A New Approach
The new cloud architecture, Flava, will operate on a much larger scale with 500 hosts and over 9,000 virtual machines (VMs). This design aims to achieve three key objectives:
- Pursuing Statelessness: By defining VM root disks as temporary, persistent data is moved to external storage, reducing service disruption during failures.
- Application-Driven Availability: Instead of relying solely on infrastructure for uptime, the design integrates application-level strategies to enhance reliability.
- Faster Recovery: In case of incidents, the focus shifts to maintaining service continuity rather than restoring the previous state, utilizing Infrastructure as Code (IaC) for quick environment rebuilding.
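The stateless and fast-recovery ideas above can be illustrated with a small sketch. It is not LY Corporation's code and does not use the OpenStack API: `ToyCloud`, `VMSpec`, and `reconcile` are hypothetical names for an in-memory model. The sketch assumes that each VM is fully described by a declarative spec, that its root disk is disposable, and that persistent data lives on an external volume, so an unhealthy VM is replaced from its spec rather than repaired.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class VMSpec:
    """Declarative description of a VM: everything needed to rebuild it."""
    name: str
    image: str
    flavor: str
    data_volume: str  # ID of the external volume holding persistent data


@dataclass
class VM:
    vm_id: int
    spec: VMSpec
    healthy: bool = True
    root_disk: dict = field(default_factory=dict)  # ephemeral, lost on rebuild


class ToyCloud:
    """In-memory stand-in for a cloud API (hypothetical, not OpenStack)."""

    def __init__(self):
        self._ids = count(1)
        self.vms = {}      # VM name -> VM
        self.volumes = {}  # volume ID -> persistent data

    def create_vm(self, spec):
        # The volume is created once and reused by every later rebuild.
        self.volumes.setdefault(spec.data_volume, {})
        vm = VM(next(self._ids), spec)
        self.vms[spec.name] = vm
        return vm

    def delete_vm(self, name):
        # Only the VM and its root disk go away; volumes are untouched.
        self.vms.pop(name, None)


def reconcile(cloud, desired):
    """Make the cloud match the desired specs by rebuilding, not repairing."""
    rebuilt = []
    for spec in desired:
        vm = cloud.vms.get(spec.name)
        if vm is None or not vm.healthy or vm.spec != spec:
            cloud.delete_vm(spec.name)
            cloud.create_vm(spec)
            rebuilt.append(spec.name)
    return rebuilt


if __name__ == "__main__":
    cloud = ToyCloud()
    spec = VMSpec("web-1", "ubuntu-24.04", "m1.small", "vol-web-1")
    reconcile(cloud, [spec])

    cloud.volumes["vol-web-1"]["orders"] = 42        # persistent data
    cloud.vms["web-1"].root_disk["tmp"] = "scratch"  # disposable data
    cloud.vms["web-1"].healthy = False               # simulate a failure

    print("rebuilt:", reconcile(cloud, [spec]))
    print("root disk after rebuild:", cloud.vms["web-1"].root_disk)
    print("volume after rebuild:", cloud.volumes["vol-web-1"])
```

In this model the replacement VM starts with an empty root disk while the volume contents are unchanged, which is why recovery reduces to re-running `reconcile` against the same specs.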
Monitoring and Automation
To ensure the health of the new cloud, LY Corporation employs various monitoring tools like Prometheus and Grafana. These tools help detect anomalies early, allowing for prompt responses to potential issues. Inoue mentioned that the company automates many processes, from detecting hardware failures to reintegrating replaced components into the clusters.
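One way to picture the automated path from failure detection to reintegration is as a small state machine. The sketch below is an illustration only: the states, event names, and transition table are assumptions made for the example, not a description of LY Corporation's actual workflow or of any Prometheus or Grafana feature.

```python
from enum import Enum


class HostState(Enum):
    IN_SERVICE = "in_service"
    FAILED = "failed"      # hardware fault detected by monitoring
    DRAINED = "drained"    # workloads moved away, safe to repair
    REPAIRED = "repaired"  # component replaced, awaiting health checks


# Allowed transitions in this hypothetical workflow:
# (current state, event) -> next state.
TRANSITIONS = {
    (HostState.IN_SERVICE, "fault_detected"): HostState.FAILED,
    (HostState.FAILED, "workloads_evacuated"): HostState.DRAINED,
    (HostState.DRAINED, "component_replaced"): HostState.REPAIRED,
    (HostState.REPAIRED, "health_check_passed"): HostState.IN_SERVICE,
    (HostState.REPAIRED, "health_check_failed"): HostState.FAILED,
}


def advance(state, event):
    """Return the next state, or raise if the event is not valid here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} not allowed in state {state.name}"
        ) from None


def run(events, state=HostState.IN_SERVICE):
    """Apply events in order and return the list of visited states."""
    history = [state]
    for event in events:
        state = advance(state, event)
        history.append(state)
    return history


if __name__ == "__main__":
    lifecycle = [
        "fault_detected",
        "workloads_evacuated",
        "component_replaced",
        "health_check_passed",
    ]
    for visited in run(lifecycle):
        print(visited.name)
```

Making the transitions explicit means an out-of-order step, such as returning a host to service before its health check, is rejected instead of being silently applied.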
Security Enhancements
This consolidation follows earlier security incidents that exposed user data and prompted government intervention to improve security measures. By streamlining its cloud infrastructure, LY Corporation aims to strengthen security and privacy for its users and to meet regulatory requirements.
What's Next
As LY Corporation moves forward with the Flava cloud, it plans to contribute functional changes back to the upstream OpenStack project. Doing so keeps its own deployment close to the standard code base, which makes upgrades easier, and it also benefits the broader open-source community.
Pro insight: This consolidation reflects a broader trend in cloud infrastructure toward simpler, more standard deployments that are easier to patch and secure.