Job Title: Site Reliability DevOps Engineer III
Category: Information Technology
Number of Openings: 1
Location: Mountain View, CA (Remote)
Schedule: Must be able to work Pacific Time (PT) business hours
Employment Type: Contract (Potential for Full-Time Conversion Based on Performance)
Position Overview
We are seeking a Site Reliability DevOps Engineer III to join our Data Services team. This role focuses on building, maintaining, and scaling reliable systems and infrastructure in a 24×7 environment. The ideal candidate will have strong coding skills, deep experience with CI/CD and DevOps tools, and a passion for automation and operational excellence.
Key Responsibilities
- Manage and support large-scale distributed systems and services.
- Implement and maintain monitoring, alerting, and high-availability solutions.
- Support 24×7 operations, including participation in an on-call rotation.
- Build and improve CI/CD pipelines using modern DevOps tools.
- Develop automation scripts and tools in Python, Shell, or similar languages.
- Design and implement scalable back-end services for “web-scale” systems.
- Collaborate across teams to ensure system reliability, performance, and security.
Required Qualifications
- Strong system administration experience with Unix/Linux environments.
- Hands-on experience with cloud hosting platforms (AWS, Rackspace, CIS, OpenStack).
- Proficiency in one or more programming/scripting languages: Python, Shell, C, C++, Java, Go, Perl, Ruby.
- Experience with distributed computing and networking (TCP/IP, routing, SDN).
- Knowledge of CI/CD tools such as GitHub, GitHub Actions, Jenkins, Maven/Gradle, SonarQube, Hudson, Bamboo, or XLR.
- Experience with microservices design and implementation.
- Familiarity with Platform-as-a-Service (PaaS) environments.
- Knowledge of messaging, stream processing, and data store systems (RabbitMQ, Kafka, MongoDB, etc.).
- Experience with container orchestration and clustering (Kubernetes, Apache Mesos).
- Experience with service discovery and load balancing (Consul, HAProxy, etc.).
- Expertise in monitoring tools (CloudWatch, Nagios, New Relic, SPM, Sensu).
- Strong troubleshooting skills in high-demand infrastructure environments.
Preferred Qualifications
- Experience working on large-scale, enterprise-level projects.
- Prior experience at a major tech company.
- Open-source contributions or experience with open-source projects.
Education
- Advanced degree in Engineering, Computer Science, or a related field (or equivalent professional experience).