Site Reliability Engineer

Philadelphia, Pennsylvania, United States · 57280

Description

Candidates must be US citizens or US permanent residents.

Job Description:

This is an opportunity to work in a department that provides back-end services to teams that create applications that are used by tens of millions of people.
As a member of the SRE team you will partner with development teams to increase speed of deployment, reliability, and availability of applications consumed internally by other development and operations teams. The right candidate will be passionate about automation, reduction of toil, and the measurement of all things (logging, alerting, health-checks).
Our team takes a 'Cloud First' approach, so candidates must understand the inner workings of the cloud with respect to reliability, distributed communication, and security.
A background in development is preferred, since the right candidate will approach automation programmatically. We do not fix things once; we fix things for good.
Our team is responsible for on-call support of the applications we manage.

Core Responsibilities:
• Contribute to a team that is at the forefront of the SRE practice at Comcast
• Ideate, design, engineer, and implement systems and solutions at a scale spanning regions, providers, and business verticals
• Improve service reliability through effective use of monitoring, alerting, break-fix, blameless post-mortems, and engineering of long-term fixes
• Uncover sources of toil and promote automation of these tasks by proposing effective solutions and persuading your teammates to implement them
• Provide documentation to the team that would allow a reasonably skilled, but inexperienced, individual to make it through an on-call rotation without help from a teammate

Requirements

Qualifications:
• Facility with GitHub and the CLI
• Facility with Concourse and/or Jenkins
• Experience deploying and operating applications in the public cloud (AWS, Azure, Google Cloud)
• Previous experience developing applications and/or automation with one of the following languages (Java, Scala, Python, Ruby)
• Facility with Linux
• Experience in building monitoring/alerting/reporting for deployment pipelines and applications
• Firm understanding of the practice of SRE and an ability to communicate with peers and management about that practice
• Firm understanding of Agile methods, Scrum and Kanban

We get excited by candidates who:
• Have an interest in solving problems of scale
• Enjoy facilitating technical discussions between development teams and the SRE team
• Understand the importance of documentation and automation to the SRE practice
• Are confident in their experience, knowledge, and skills and readily share these for the betterment of the team

Benefits

Leading Path is an award-winning Information Technology and Management Consulting firm focused on providing solutions in process, technology, and operations to our government and Fortune 500 clients. We offer a professional work environment with a strong work-life balance. Leading Path provides a comprehensive and competitive benefits package, a 401(k), tuition reimbursement, and opportunities for professional growth and advancement.
