Cloud Platform Reliability Engineer


Date: Dec 29, 2020

Location: Alpharetta, GA, US

Company: New York Life Insurance Co


A career at New York Life offers many opportunities. To be part of a growing and successful business. To reach your full potential, whatever your specialty. Above all, to make a difference in the world by helping people achieve financial security. It’s a career journey you can be proud of, and you’ll find plenty of support along the way. Our development programs range from skill-building to management training, and we value our diverse and inclusive workplace where all voices can be heard. Recognized as one of Fortune’s World’s Most Admired Companies, New York Life is committed to improving local communities through a culture of employee giving and service, supported by our Foundation. It all adds up to a rewarding career at a company where doing right by our customers is part of who we are, as a mutual company without outside shareholders. We invite you to bring your talents to New York Life, so we can continue to help families and businesses “Be Good At Life.” To learn more, please visit LinkedIn, our Newsroom and the Careers page of


Job Function: 

This position will serve as the Platform Reliability Engineer, helping to ensure customer success in building, operating, recovering and optimizing applications that use AWS databases (RDS and standalone) and AWS Database Migration Service (DMS).

The Cloud Platform Reliability Engineer will be a member of the Cloud Services team and will drive cross-functional technology projects on an IaaS platform to meet critical technology and business requirements. This position is also responsible for automating the availability, performance, maintainability and optimization of the IaaS (AWS) platform at New York Life.


Key Duties and Responsibilities:

  • Primary function is to automate and deliver repeatable, reliable solutions using tools like Terraform and GitHub
  • Automate solutions to eliminate manual work and address recurring problems
  • Maintain and develop our growing Terraform infrastructure-as-code library, which we use to deploy infrastructure and applications (REs handle defects and features)
  • Review architecture diagrams for RDS implementations. 
  • Recommend Reserved Instance purchases for RDS instances.
  • Work with the DBAs to ensure the AWS database environment is sufficiently tuned and cost-optimized. Recommend cost-saving opportunities.
  • Hands-on experience with AWS Database Migration Service (DMS).
  • Train development teams and new users on services and automation capabilities.
  • Participate in remediating any gaps in the CIS Benchmarks for Cloud.
  • Aid with application implementations using primarily Linux technology on AWS
  • Assist with and oversee production releases
  • Ensure alignment to cloud standards and best practices
  • Follow the enterprise change management process to deploy fully tested and documented solutions/applications to a production environment
  • Work with development teams to apply Terraform plans to production
  • Review usage to maximize utilization of deployed resources and reduce spend.
  • Collaborate with Cloud Architects and Solution Engineers to deliver highly available, cost-optimized RDS implementations (including monitoring, diagrams, logging, backups, etc.)
  • Provide tier 3 production operational support
  • Service the L3 Reliability Engineering ServiceNow ticket queue.
  • Potential for off-hours incident response
  • Ability to multi-task and manage tasks with varying priorities
  • Self-motivated, innovative and able to work across diverse technical and non-technical teams. 
  • Must be self-directed and willing to learn new tools and services to stay up to date with our evolving platforms
  • Strong verbal and written communication skills are a must, as is the ability to work effectively in a virtual team setting.


Required Qualifications: 

  • Must have: the ability to write and implement infrastructure and platform automation using Terraform. Fluency is required, as our environment is fully automated.
  • Practical experience operating, monitoring and performance tuning popular relational and NoSQL databases (e.g., Oracle, PostgreSQL, MSSQL, Aurora)
  • Experience with AWS as a public cloud provider (AWS certification a plus)
  • Fluent operating system and administration knowledge of the Linux platform, including shell scripting
  • Working knowledge of DevOps and delivery tools (GitHub)
  • Practical understanding of infrastructure technologies - compute, network, storage
  • Comfortable scripting in one or more of the following: Bash, Python, C#, Java, Perl


Required Experience:

  • 3+ years of overall IT experience, including proficiency with the Linux platform
  • 2+ years with Amazon Web Services
  • 2+ years of experience with at least one relational database technology (PostgreSQL, Oracle, MS SQL)
  • 2+ years of automating solutions and implementing highly scalable, highly available services using Terraform
  • 2+ years of practical experience with one or more of the following: Python, Java, C#, Perl
  • Understanding of Software Development Lifecycle Methodologies
  • Understanding of application development including application servers, middleware, systems management, monitoring, configuration management, capacity planning and performance tuning
  • Self-motivated and able to work across diverse technical and non-technical teams

Education: BS degree in Computer Science or Engineering; equivalent on-the-job, hands-on experience is acceptable.





Job Requisition ID: 82974 

Nearest Major Market: Alpharetta
Nearest Secondary Market: Atlanta
