18 Sep Principal Site Reliability Engineer
Contact the recruiter, Shane Paynter, at firstname.lastname@example.org to learn more.
Ringside is working with a fast-paced startup to locate a Principal Site Reliability Engineer to add to their team. This role will ensure production systems run smoothly, apply sound engineering practices to improve deployments, and anticipate and manage potential issues arising from scale or vulnerabilities.
- Identify components of the application susceptible to performance, stability, and scalability issues.
- Develop metrics, monitoring, and alerting to observe the health of the production system.
- Develop or implement visual tools for technical and business teams to observe system health.
- Be proactive in anticipating production issues - outages, slowness, processing delays, errors, and failures - and take corrective action to prevent them.
- Quickly investigate and fix performance, stability, and scalability issues in production.
- Collaborate with SMEs as needed to investigate and resolve issues.
- Scope technical work for the backlog for larger initiatives.
- Maintain a record of system incidents; provide data-driven analysis to identify patterns and offer recommendations for preventative measures.
- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent experience
- 4+ years of DevOps experience
- Experience provisioning infrastructure resources using Terraform
- Experience enabling CI/CD pipelines using tools such as Jenkins, GitLab, CircleCI, or others
- Enjoy collaborating with others to deliver successful solutions