Senior Site Reliability Engineer
Microsoft | |
remote work | |
United States, Washington, Redmond | |
Jan 10, 2025 | |
OverviewHalo Studios is the developer of the blockbuster Halo series of video games. As part of Xbox Game Studios, the Halo franchise encompasses games, television series, novels, comics, licensed collectibles, apparel, and more. We are looking for talented new voices to join our team to build the next generation of games and experiences in our award-winning sci-fi universe. Do you want to closely collaborate with teams to drive inspiring, cohesive gameplay experiences? Are you interested in both celebrating and evolving Halo's legacy to create a new generation of stories that millions of players love? If so, the Halo team would love to speak with you about becoming our next Senior Site Reliability Engineer! You will be a key member of the Halo Studios IT Engineering team, responsible for studio data and IT services and infrastructure within our studio. You will be contributing to the architecture and design of new on-prem and cloud infrastructure, while continuing to drive optimization, performance, security, and reliability with cutting edge technologies and automation. You will empower artists, developers, and others in our studio by proactively designing technical solutions to maximize their efficiency. Along the way, you will be a trusted voice who shares your knowledge and expertise within our team and other teams in the studio. You will be joining a fast-paced team that constantly provides new opportunities to learn and grow. Roles at our studio are flexible, and you can work from home up to two days a week in this role. Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
ResponsibilitiesArchitect, implement, and optimize critical hybrid and cloud-based IT infrastructure utilizing Infrastructure as Code (IaC) technologies (e.g., Docker, Terraform, AKS) to ensure high availability, scalability, security, and operational efficiency. Design and implement Azure Networking solutions including Site-to-Site tunnels, App Gateway, Private Endpoints/Private Link, DNS, and network security for Azure resources, ensuring secure and reliable connectivity. Lead the design and implementation of data governance, storage, backup, and disaster recovery solutions for a multi-Petabyte Azure-based environment, ensuring data integrity, security, and performance. Research, evaluate, and integrate emerging tools and methodologies into the technology roadmap, to continuously optimize efficiency, reliability, and scalability. Architect and deliver automation solutions to improve service health, manageability, reliability, telemetry, and alerting, with a focus on resiliency and operational improvements. Produce and maintain clear and accurate technical documentation and design specifications that align with best practices. Collaborate with software engineers, project management, and operations teams to improve and optimize infrastructure and evolve services, ensuring alignment with organizational goals. Demonstrate agility and responsibility by participating in on-call rotations, owning critical issues, communicating effectively to diverse stakeholders, and ensuring rapid resolution and knowledge transfer with an emphasis on learning and improvement. |