Job Description
We are looking for engineers who are passionate about reliability, performance, and efficiency, and who have experience building tools, services, and automation to manage and improve production services.
Requirements:
- Knowledge of systems internals and security, Linux, networking, and monitoring
- Work to improve the reliability and performance of the next generation of distributed systems and containerized deployments
- Diagnose and troubleshoot complex distributed systems handling millions of queries per second
- Knowledge of Linux cloud services using KVM/QEMU/LVM
- Knowledge of containerisation technologies like Docker, including deploying and troubleshooting containers
- Understanding of cloud platforms like Azure, GCP, and AWS, with the ability to set up, configure, monitor, and troubleshoot various PaaS components like firewalls, VPN gateways, load balancers, storage accounts, networks, and others
- In-depth knowledge of Perl/Go/Python to automate tasks with minimal intervention
- Day-to-day work is heavily command-line driven, which requires a strong understanding of Linux.
- Troubleshoot issues across the entire stack - hardware, software, application, and network
- Knowledge of database technologies, specifically MySQL/NoSQL, is good to have.
- Participate in 24x7 on-call rotations.
- Design, build and maintain core infrastructure that enables scaling to support hundreds of thousands of concurrent users.
- Actively take part in the analysis and system improvement plan.
- Drive performance testing, capacity planning and high availability practices.
- Own implementations of new technologies while ensuring proper testing and documentation.
- Proactively monitor, identify, and solve issues that could impact our infrastructure.
- Be a natural team player with a resourceful attitude.
- Buddy new team members and get them production-ready.