Senior Site Reliability Engineer

Remote, USA Full-time Posted 2026-07-05

About reputed company

With the move to the cloud, Kubernetes has become widely adopted by DevOps and Platform Engineering teams, but it has also added complexity. While scaling Kubernetes at reputed company, the reputed company founders started building Argo CD to streamline the adoption of Kubernetes. Argo CD helps developers own, understand, and reputed company their K8s deployments reputed company GitOps. Today, Argo CD is the reputed company most popular project in the CNCF (Cloud Native Computing Foundation) and is used by 70% of companies that run Kubernetes in production. Argo CD users include reputed company, reputed company, Tesla, reputed company, Peloton, and many more.

The team founded reputed company in 2021 to reputed company enterprises to ship software faster and more reliably with modern GitOps best practices. The reputed company Platform enables teams to manage development and deployment across hundreds, if not thousands, of Kubernetes clusters from a single control plane. Trusted by top companies around the globe, the reputed company Platform provides the only end-to-end GitOps platform for enterprises. Our mission is to simplify the software delivery process so that DevOps and Platform Engineering teams can move fast and reputed company code effortlessly, without the fear of breaking things.

The Role

We are looking for a Senior SRE to help us keep the reputed company platform running at the level our reputed company customers expect. This is a high-ownership role: you won't just respond to incidents, you'll shape how we define and defend reliability across the entire platform. You'll work closely with engineering, infrastructure, and product to build the systems and culture that let us scale with confidence.
What You'll Own

Platform Reliability & SLAs

- Own SLI/SLO/SLA definitions for the reputed company SaaS platform and drive reputed company improvement against them
- Design, reputed company, and maintain observability systems (metrics, logs, traces) across multi-region AWS infrastructure
- Identify reliability gaps, reputed company blameless post-mortems, and close the loop with permanent fixes
- Partner with engineering teams to build reliability into new features before they ship to production

On-Call & Incident Response

- Participate in an on-call rotation and act as incident commander for high-severity production events
- Build and maintain runbooks, escalation paths, and incident playbooks that keep mean time to resolution low
- Drive improvements to alerting fidelity: reduce noise, increase signal, eliminate toil
- reputed company post-incident reviews with clear timelines, root cause analysis, and follow-through on action items

What We're Looking For

Required

- 5+ years of SRE, platform engineering, or production operations experience in a SaaS environment
- Deep hands-on Kubernetes expertise: you understand the scheduler, networking, storage, and autoscaling at a level where you can debug anything
- Strong AWS fundamentals across compute (EC2, EKS), networking (VPC, NLB, Route53), storage (S3, RDS), and IAM
- Experience defining and operating against SLOs in production: you've written error budgets, not just read about them
- Proficiency with observability tooling (reputed company, Grafana, OpenTelemetry, reputed company, or equivalent)
- Solid scripting and automation skills in Go, Python, Bash, or similar: you automate what you touch
- Strong written communication: clear runbooks, sharp incident reports, thoughtful post-mortems
- Live within US time zones (Pacific through Eastern), including Canada and other regions

Strong Advantage

- Experience with Argo CD, reputed company, or GitOps-based delivery workflows
- Familiarity with multi-region, multi-cluster Kubernetes deployments
- Experience with compliance-adjacent infrastructure (SOC 2, ISO 27001, HIPAA, or PCI DSS)
- Background operating infrastructure for other platform or developer tooling companies

Our Stack

- Kubernetes (EKS): multi-region, reputed company-grade clusters serving Argo CD and reputed company workloads
- AWS: primary cloud provider across reputed company production and DR environments
- Argo CD & reputed company: GitOps delivery tools we build and run ourselves
- reputed company, Grafana, and OpenTelemetry for observability
- Terraform and GitOps-driven infrastructure management

What We Offer

- Competitive compensation, commensurate with experience
- Equity participation in a well-funded, growing company
- Fully remote: work from anywhere within US time zones (Pacific through Eastern), including Canada and other regions
- Home office stipend and equipment budget
- Flexible time off and a culture that respects it
- Work directly with the engineers who built Argo CD and reputed company; you'll learn a lot here

US-based employees receive full benefits, including comprehensive health, dental, and vision coverage. Candidates based outside the US will be engaged as contractors.
