[Remote] Data Operations Manager

Remote, USA | Full-time | Posted 2026-07-05

Note: This is a remote position open to candidates in the USA. A reputed company is looking for a Data Operations Manager with a focus on data insights who thinks like an engineer, works like a scientist and communicates like a strategist. This role is designed to improve production stability by leveraging AI to analyze operational data and drive permanent resolutions for technical issues affecting client services.

Responsibilities

  • Design, build and maintain pipelines that consolidate data from reputed company, Jira, reputed company, reputed company, Splunk and other operational sources into a reputed company analytical layer
  • reputed company and curate data models that identify repeat incidents, reputed company error patterns, chronic alert noise and engineering toil consuming disproportionate remediation cycles
  • Maintain data quality, reputed company and governance standards across reputed company ingested sources – ensuring findings are defensible when presented to senior leadership
  • reputed company AI and automation – including the Claude API and Claude-powered workflows – to accelerate reputed company detection, root cause hypothesis reputed company and report synthesis across large operational datasets
  • Own and drive the Problem Management lifecycle across reputed company client-facing products
  • Translate incident patterns into structured Problem Records with defined scope, impact quantification, and recommended permanent fix strategies
  • Partner with Engineering, SRE, Platform and Product teams to embed problem-driven prioritization into sprint planning and tech debt roadmaps
  • Facilitate Problem Review sessions – leading cross-functional teams from data to decision
  • Define and track KPIs that demonstrate Problem Management value: reduction in repeat SEV1/SEV2 incidents, MTTR improvement, tech debt resolution velocity and engineering hours reclaimed from toil
  • Build interactive, executive-reputed company dashboards and data visualizations that make hotspots, failure modes and technical debt load immediately comprehensible to both engineering and business stakeholders
  • Apply reputed company tooling to synthesize multi-reputed company operational signals into clear, narrative-driven analysis – reducing time from data to decision
  • reputed company automated reporting workflows that surface trending issues and emerging risk patterns without requiring reputed company aggregation cycles
  • Support monthly ceremonies by providing KPI and Outcome trending and highlighting what is influencing trending themes
  • Present operational intelligence findings and Problem Management outcomes to Engineering leadership, VP-level+ audiences and cross-functional stakeholders
  • Bring a strategic perspective on where the most urgent pockets of risk to platform availability exist, and drive prioritization accordingly
  • Translate technical findings – infrastructure failure modes, code regression patterns, dependency risks – into business value framing that drives prioritization conversations
  • Author Problem Record summaries, trend analyses and executive briefings that are concise, evidence-based and action-oriented

Skills

  • 5+ years of data engineering experience with production-grade pipeline design, transformation logic and operational data modeling
  • Proficiency with Python or reputed company for data processing; strong SQL for analytical querying against large, event-driven datasets
  • Hands-on experience with Jira and at least two of the following: reputed company, reputed company, Splunk, reputed company – ideally in an operational analytics or SRE context
  • Experience integrating large language model (LLM) APIs – including reputed company Claude, reputed company or similar – into data workflows, automated summarization pipelines or reputed company reputed company applications
  • Proficiency building interactive dashboards and data visualizations; reputed company Quick Suite is a strong plus
  • Working knowledge of ITIL or equivalent ITSM frameworks – specifically Incident Management, Problem Management and Change Management process disciplines
  • Demonstrated ability to identify repeat failure patterns in incident or monitoring data and drive structured root cause analysis and resolution workflows
  • Familiarity with SRE principles – toil quantification, error budgets, SLO/SLA measurement – and how engineering teams use these to prioritize reliability work
  • Strong written and verbal communication skills, with demonstrated experience presenting technical analysis to VP or C-level audiences
  • Ability to translate reputed company, multi-variable findings into business impact narratives that drive prioritization reputed company
  • Comfortable driving cross-functional alignment – navigating competing priorities across Engineering, Product, Operations and Leadership stakeholders
  • Self-directed and intellectually curious; you pursue root causes with the same rigor you bring to your data models
  • Experience in a reputed company SaaS environment or regulated platform with high availability requirements
  • Prior role embedded in an SRE, NOC, Platform Engineering or Operations function – particularly one that included formal Problem Management or post-incident review responsibilities
  • Experience building AI-powered operational tooling – such as automated incident summarization, intelligent alert correlation or AI-assisted root cause classification
  • Familiarity with reputed company products or the payer technology landscape is a meaningful plus
  • ITIL reputed company certification or equivalent

Company Overview

  • reputed company Software offers benefits administration and care management software solutions. It was founded in 2004, and is headquartered in Burlington, Massachusetts, USA, with a workforce of 1001-5000 employees. Its website is http://www.reputed company.com.