[Remote] Staff Network Site Reliability Engineer

Remote, USA Full-time Posted 2026-07-05

Note: This is a remote position open to candidates in the USA. The company is leading a new era in infrastructure for the global AI economy. It is seeking a Network Site Reliability Engineer to build and run its network infrastructure, ensuring reliability and operational efficiency as the company scales.

Responsibilities

  • Define and own reliability goals for network services and critical paths (SLIs/SLOs, availability targets, error budgets where it makes sense)
  • Drive reliability improvements across the whole network: not only services, but also site readiness, inter-site connectivity (DCI), and operational standards
  • Own incident response for your areas, lead investigations/postmortems, and turn failures into durable fixes (not repeated firefighting)
  • Build and improve observability: actionable metrics/logs/traces, alerting, and faster debug loops during and after incidents
  • Design safer change workflows: automation, CI/CD, test/staging environments, canarying, rollbacks, and auditability for network changes
  • Work closely with network engineers and platform teams to embed operability into designs and make operations practical and fast

Skills

  • Strong production Linux fundamentals and a structured approach to debugging systems
  • Solid understanding of networking basics and how networks fail (control plane vs data plane, latency/loss, failure domains, etc.)
  • Hands-on experience operating high-availability systems and improving them over time (not just 'keeping the lights on')
  • Ability to write and maintain software/automation (Go is common for us; Python is also welcome)
  • Experience with modern infrastructure tooling (e.g., IaC, CI/CD, container platforms) and comfort automating operational workflows
  • Experience with high-throughput traffic processing: load balancers, tunneling/decap, NAT64, or similar datapath-heavy systems
  • Low-level networking performance/debug background (eBPF/XDP, DPDK, perf/ftrace, kernel networking internals)
  • Experience building network-safe delivery pipelines (testing labs, staged rollouts, automated verification and detection)
  • Background with large-scale network observability/telemetry (e.g., routing telemetry, regression detection at scale)

Benefits

  • Competitive compensation
  • Career growth and learning opportunities
  • Flexibility and ownership
  • Collaborative and innovative culture
  • Opportunity to work on impactful AI projects
  • International environment and talented teams

Company Overview

  • The company provides AI developers and practitioners across startups, enterprises, and science institutes with the means to build applications and rapidly deliver scientific breakthroughs by training and running ML models in a secure, high-performance, and cost-optimized environment. It was founded in 2022 and is headquartered in Amsterdam, NL, with a workforce of 1001-5000 employees. Its website is https://reputed company.com.