Site Reliability Engineer
Multiple Locations: Austin, TX, USA • Seattle, WA, USA • Redwood City, CA, USA • Vancouver, BC, Canada • Orlando, FL, USA
Requisition Number: 174615
Position Title: Site Reliability Engineer III
The EA Product Infrastructure & Engineering (PI&E) group is part of the CTO organization and is the foundation on which EA's games are built and operated. PI&E builds and operates a broad platform of infrastructure services that powers games and enables game development across EA's global ecosystem. Our team's charter is to make EA's games and services available to all players anytime and anywhere. To do this, we focus on the high availability of infrastructure, core services, and studio services. Additionally, we aim to help developers experiment and build new games quickly with on-demand infrastructure services and workflows that ensure rapid development in the cloud. In all of this, we partner across EA to promote infrastructure cost efficiency and optimization.
As a Site Reliability Engineer, your role covers the entire lifecycle of a product, from helping developers with architecture and delivery to on-call incident response and triage. Your primary focus will be automation and continuous integration/delivery, with an emphasis on solving operations issues using software. You will report to the Manager of Systems Engineering.
Responsibilities:
- You will create monitoring, alerting and dashboarding solutions that improve visibility into EA's application performance and business metrics.
- You will develop and troubleshoot distributed, large-scale production systems spanning on-premises and cloud-based hosting.
- You will write code that meets peer-reviewed coding standards.
- You will perform root cause analysis and post-mortems with an eye towards future prevention.
- You will use automation technologies to ensure repeatability, eliminate toil, reduce mean time to detection (MTTD) and mean time to resolution (MTTR), and repair services.
- You will design CI/CD pipelines.
- You will produce documentation and support tooling for online support teams.
Qualifications:
- 5+ years of experience monitoring infrastructure and application uptime and availability to meet service level indicators (SLIs) and service level objectives (SLOs)
- 5+ years of experience with virtualization, containerization, and cloud computing (AWS preferred), including VMware ecosystems, Kubernetes, and Docker
- Systems Administration experience, including an understanding of *nix
- Network experience, including an understanding of standard protocols/components
- Automation and orchestration experience including Chef, Puppet, Terraform, Packer, Jenkins
- Experience writing code in Go, C#, Ruby, or Python
- Experience working with distributed systems
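Electronic Arts Inc. is a leading global interactive entertainment software company. EA delivers games, content, and online services for Internet-connected consoles, personal computers, mobile phones, and tablets.
EEOText: EA is an equal opportunity employer. All employment decisions are made without regard to race, color, national origin, ancestry, sex, gender, gender identity or expression, sexual orientation, age, genetic information, religion, disability, medical condition, pregnancy, marital status, family status, veteran status, or any other characteristic protected by law. We will also consider qualified applicants with criminal records for employment, in accordance with applicable law. EA also makes workplace accommodations for qualified individuals with disabilities, as required by applicable law.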
Date Opened: 2022-07-08 21:40:08.383