We are looking for a Site Reliability Engineer to help us keep our production environments running and expand them further.
Responsibilities
- Taking responsibility for the production environments and keeping them running
- Maintaining the backup strategy and providing effective backup models
- Monitoring sites and software to make sure they’re performing properly
- Monitoring and maintaining the core routers
- Anticipating potential problems before they occur
- Conducting post-incident reviews
- Documenting your work to turn findings into repeatable actions
- Mentoring and coaching junior engineers
Requirements and skills
- Experience with Linux operating systems
- Experience with Windows Server operating systems
- Experience with ESXi 6.5 and 7.0
- Experience with GitLab infrastructure is a plus
- Experience with Docker is a plus
- Experience with Ansible is a plus
- Strong networking knowledge: NAT, QoS, routing, traffic mangling, VLANs
- Familiarity with production monitoring systems (Prometheus)
- Analytical and problem-solving skills
This job is perfect for you if you:
- Are open to knowledge sharing
- Understand the importance of last-mile delivery
- Are a quick learner, proactive, and ready to work on your own and in a team
- Have excellent communication skills and a positive attitude
- Are passionate about space