Senior Associate - AWS Cloud Data Platform Site Reliability Engineer (SRE)
Location Designation: Hybrid - 3 days per week
As part of Technology, you'll have the opportunity to contribute to groundbreaking initiatives that shape New York Life's digital landscape. Leverage cutting-edge technologies like Generative AI to increase productivity, streamline processes, and create seamless experiences for clients, agents, and employees. Your expertise fuels innovation, agility, and growth — driving the company's success.
Role Overview:
The Enterprise Data Management (EDM) team is seeking a skilled Cloud Data Platform SRE to help build and maintain our core data, reporting, and analytics platform for the Insurance & Agency Group at New York Life. You will be responsible for ensuring the reliability, performance, and scalability of our cloud-based data infrastructure, leveraging AWS services to create a robust and secure environment.
What You’ll Do:
- Monitoring and Incident Management:
  - Develop and maintain monitoring, alerting, and logging systems to proactively detect and resolve incidents.
  - Perform root cause analysis and implement solutions to prevent recurrence.
  - Manage incident response, including on-call rotations, triaging, and escalation.
- Infrastructure Automation and Management:
  - Create and manage Infrastructure as Code (IaC) using tools like Terraform.
  - Automate deployments, scaling, backups, and disaster recovery processes.
  - Develop and maintain CI/CD pipelines to ensure smooth deployment and rollback processes.
- Performance and Reliability Optimization:
  - Analyze performance metrics and optimize infrastructure and application performance.
  - Define and enforce Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
  - Conduct capacity planning and scaling to manage anticipated loads.
- Security and Compliance:
  - Implement security best practices, including network security, IAM policies, and encryption.
  - Conduct security audits and compliance checks to ensure regulatory adherence.
  - Respond to security incidents and implement remediation measures.
- Collaboration and Continuous Improvement:
  - Work with development teams to ensure services are reliable, scalable, and easily monitored.
  - Collaborate with cross-functional teams to design, build, and maintain cloud infrastructure.
  - Identify and implement improvements to operational processes and workflows.
- Disaster Recovery and Business Continuity:
  - Design, implement, and test disaster recovery and business continuity plans.
  - Ensure regular backups and replication to minimize data loss and downtime.
What You’ll Bring:
- 3+ years of experience as a Cloud Site Reliability Engineer.
- 1+ years of experience with AWS services (AWS S3, EC2, Glue, Redshift, RDS) in shared service or hybrid environments.
- Proficiency in AWS services (EC2, S3, RDS, Lambda, VPC, CloudWatch, IAM, etc.).
- Strong knowledge of scripting languages (Python, Bash, etc.) and automation tools (Terraform).
- Experience with CI/CD tools and DevOps practices.
- Familiarity with monitoring and logging tools.
- Strong troubleshooting and problem-solving skills.
- Exposure to Machine Learning (ML) and Artificial Intelligence (AI) fields.
- Exposure to industry-standard Data Governance processes and procedures.
Bachelor’s degree in Computer Engineering, Computer Science, MIS, or a related field is preferred but not required.
Pay Transparency
Salary Range: $82,500-$140,000
Overtime eligible: Exempt
Discretionary bonus eligible: Yes
Sales bonus eligible: No
Actual base salary will be determined based on several factors, including but not limited to an individual’s experience, skills, qualifications, and job location. Additionally, employees are eligible for an annual discretionary bonus. In addition to base salary, employees may also be eligible to participate in an incentive program.
Our Benefits
We provide a full package of benefits for employees – and have unique offerings for a modern workforce, including leave programs, adoption assistance, and student loan repayment programs. Based on feedback from our employees, we continue to refine and add benefits to our offering, so that you can flourish both inside and outside of work. To learn more about our comprehensive benefit options, visit our NYL Benefits Site.
Our Diversity Promise
We believe in a diverse workforce because it is our mission to advocate for the financial security and success of people in every community. This is why diversity, equity, and inclusion (DEI) are guiding principles that are embedded in our brand and our culture.
Recognized as one of Fortune’s World’s Most Admired Companies, New York Life is committed to improving local communities through a culture of employee giving and volunteerism, supported by the Foundation. We're proud that, due to our mutuality, we operate in the best interests of our policy owners. To learn more about career opportunities at New York Life, please visit the Careers page of www.NewYorkLife.com.
Job Requisition ID: 90803