Cloud Solution Engineer - Observability

New York Life Insurance

Clinton, NJ, USA
Posted on Tuesday, October 24, 2023

Location Designation: Hybrid 

When you join New York Life, you’re joining a company that values career development, collaboration, innovation, and inclusiveness. We want employees to feel proud about being part of a company that is committed to doing the right thing. You’ll have the opportunity to grow your career while developing personally and professionally through various resources and programs. New York Life is a relationship-based company and appreciates how both virtual and in-person interactions support our culture.

Note: This role follows a hybrid schedule and requires only 3 days per quarter in the office.

 

We are seeking an experienced observability and event management engineer to join our team and help enable and improve these practices throughout the enterprise. The ideal candidate will have experience implementing observability and event management best practices in large, complex organizations with diverse technologies spanning cloud, on-premises, and colocation environments.

 

Key Duties and Responsibilities 

  • Build and maintain strong relationships with teams across the organization, including IT infrastructure, operations, application development, and business partners, ensuring understanding and alignment on observability and event management practices. 

  • Clearly communicate observability and event management concepts and best practices to individual contributors and leadership across the enterprise, building alignment and understanding and conveying the importance of key concepts and governance.

  • Assist teams with observability projects and initiatives spanning various technology stacks, helping identify requirements, technology considerations, and solutions to provide optimal observability. 

  • Partner with SaaS vendors on product features, defects, and contract renewals. 

  • Create and maintain documentation and governance related to observability and event management, ensuring understanding at all levels of the organization. 

  • Conduct enablement and troubleshooting sessions with teams and individuals, ensuring proper and ongoing engagement with monitoring and event management tools. 

  • Leverage Jira and ServiceNow to manage work and communicate project status to leadership and stakeholders. 

  • Mentor junior team members. 

  • Design, maintain, and implement observability and event management architecture documentation and diagrams, ensuring new and evolving technologies are within scope of current and future practices and solutions. 

  • Champion the design, implementation, and support for applications, systems, and IT products crucial for the business's objectives. 

  • Help teams implement observability tools and leverage the available telemetry data to troubleshoot and resolve incidents and problems. 

  • Implement event management concepts, such as event aggregation and correlation patterns, reducing incident noise while meaningfully combining event data. 

  • Leverage observability and event management to improve key incident management metrics, such as mean time to detect and mean time to restore service. 

  • Design, develop, and implement innovative solutions to improve observability and event management practices and processes. 

  • Influence and drive cultural and organizational change from traditional IT Ops to modernized Agile operational philosophies and concepts.

 

Qualifications 

  • 5+ years of experience on public cloud platforms (AWS and Azure).

  • 5+ years of direct experience with observability and event management tools, including New Relic, BigPanda, PagerDuty, and ServiceNow.

  • 5+ years working in application development and/or IT operations in large, complex environments, including on-prem and cloud infrastructure. 

  • Proven track record leveraging core observability concepts, including application performance monitoring, end-user monitoring, and infrastructure monitoring with SaaS solutions. 

  • Experience with programming and scripting languages, such as Go, Python, SQL, JavaScript, and PowerShell. 

  • Experience with Agile methodologies preferred. 

  • Experience with automation tools, such as Terraform. 

  • Excellent written and verbal communication. 

  • SRE and/or DevOps experience preferred, including practices, processes, and tools. 

  • Bachelor’s or master’s degree in computer science or related field preferred but not required.

Salary range: $132,500-$197,500 

Overtime eligible: Exempt 

Discretionary bonus eligible: Yes 

Sales bonus eligible: No 

 

Click here to learn more about our benefits. Starting salary depends on several factors, including previous work experience, specific industry experience, and/or skills required.

Recognized as one of Fortune’s World’s Most Admired Companies, New York Life is committed to improving local communities through a culture of employee giving and volunteerism, supported by the Foundation. We're proud that, due to our mutuality, we operate in the best interests of our policy owners. We invite you to bring your talents to New York Life, so we can continue to help families and businesses “Be Good At Life.” To learn more, please visit LinkedIn, our Newsroom, and the Careers page of www.NewYorkLife.com.

Job Requisition ID: 89576