monitoring systems (e.g., Prometheus, Grafana, Datadog) and lead incident response for production outages, working on root cause... managing clusters in production environments. Infrastructure as Code (IaC): Expertise in implementing IaC best practices...
Job Location: Toronto, ON, Canada