Job posting has expired

Senior Site Reliability Engineer

Intercontinental Exchange
United States, Georgia, Atlanta
5660 New Northside Drive Northwest
Mar 25, 2025
Overview

Job Purpose

ICE Mortgage Technology (IMT) is the leading cloud-based platform provider for the mortgage finance industry. ICE Mortgage Technology solutions enable lenders to originate more loans, reduce origination costs, and reduce the time to close, all while ensuring the highest levels of compliance, quality and efficiency.

This is an exciting opportunity for a Senior Engineer on the Site Reliability Engineering team to provide resilient and secure services, design reliable, scalable, and stable systems, and build actionable alerts and automation that prevent incidents and detect performance bottlenecks. A Senior Engineer will also troubleshoot issues quickly to restore service.

Responsibilities

  • Employ deep troubleshooting skills to improve the availability, performance, and security of IMT Services.
  • Work closely with development teams to ensure services are resilient and highly available.
  • Implement proactive monitoring, alerting, trend analysis and self-healing systems.
  • Code and automate applications on the cloud platform.
  • Define and measure KPIs and SLOs.
  • Implement automated deployments, automated tests, and operational tools.
  • Collaborate with Product and Support teams to plan and deploy product releases.
  • Define non-functional requirements as part of the product lifecycle to influence the new designs, standards, and methods for scalable, highly available distributed systems.
  • Partner with other SREs and lead by example, acting as a contributor more than a delegator.
  • Manage incidents under high-stress conditions and tight timelines.
  • Follow the incident management lifecycle. Ensure issues are well documented and follow-up tasks are completed so that incidents do not repeat.

Knowledge and Experience

  • 7+ years of Systems/Applications automation and incident response in 24x7 Production Services environments.
  • BS in Computer Science, Computer Engineering, Math, or equivalent professional experience.
  • Fluency with one or more current-generation scripting languages used by DevOps professionals (PowerShell, Python, Ruby, PHP, Perl) or Java/.NET development.
  • Excellent troubleshooter, utilizing a systematic problem-solving approach.
  • Demonstrated experience in designing, analyzing, and diagnosing large-scale distributed systems.
  • Experience with infrastructure as code and configuration as code, using tools such as Terraform, CloudFormation, Spacelift, Chef, SaltStack, Puppet, and DSC.
  • Knowledge of Windows Server and/or Linux systems internals (system libraries, file systems, kernel) and client-server network protocols.
  • Experience with elastic scaling, fault tolerance and other cloud architecture patterns.
  • Proven strength in SaaS, with experience in massive-scale web operations.
  • Experience operating on AWS or other public Cloud (both PaaS and IaaS offerings).
  • Experience with containerization, Docker, and microservices.
  • Experience with Jenkins and build/deploy automation.

Schedule

This role offers work-from-home flexibility of one day per week.

Intercontinental Exchange, Inc. is an Equal Opportunity and Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, gender identity, national origin or ancestry, age, disability or veteran status, or other protected status.
