- Work in a team to operate, maintain, and secure an on-premises and cloud-based, open-source Platform as a Service (PaaS) environment that delivers Observability features as efficiently as possible.
- Deploy suitable instrumentation to collect metrics, logs, and traces for monitoring, logging, and distributed tracing, improving the health and availability of systems and services.
- Develop automation solutions to improve operational efficiency.
- Administer, operate, and improve the Observability platform applications and infrastructure, with a focus on high availability and reliability.
- Collaborate with various teams in the design of their monitoring solutions.
- Ensure the Observability platform is safe and secure against cybersecurity threats.
- Perform root cause analysis for outages.
- Serve as a subject matter expert working with technical resources and customers.
- Produce high quality documentation.
- Maintain on-call responsibilities as required and, where possible, optimize the on-call processes.
This is a hybrid position. Hybrid employees can alternate between working remotely and working in the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.