Senior Data Engineer (Platform)

Remote, USA | Full-time | Posted 2026-07-05

We are looking for a Senior Data Engineer to build and evolve the data platform powering our global workforce management ecosystem. You will design, implement, and maintain scalable data pipelines that consolidate data from multiple operational systems, transform it into trusted analytical datasets, and make it available for reporting, product analytics, and business intelligence. You should be comfortable working with modern cloud-native data architectures on AWS, building reliable ETL/ELT pipelines, and designing data models optimized for analytical workloads. This role requires a strong engineering mindset, balancing performance, scalability, data quality, and operational excellence while collaborating closely with software engineers, product teams, analysts, and data scientists.

What You Will Own

- Design, build, and maintain scalable batch and streaming data pipelines using AWS-native services and distributed processing frameworks
- Develop ETL/ELT workflows to ingest, consolidate, sanitize, enrich, and transform data from multiple internal and external systems
- Build and optimize AWS Data Lake solutions using Amazon S3, AWS Glue, Amazon Redshift, and Amazon Kinesis Firehose
- Design and implement distributed data processing jobs using Apache Spark, AWS Glue, or equivalent technologies
- Develop orchestration workflows using Apache Airflow (MWAA), AWS Step Functions, or similar workflow orchestration platforms
- Design analytical data models including star schemas, snowflake schemas, dimensional models, and optimized reporting datasets
- Optimize Redshift performance through distribution strategies, sort keys, partitioning, workload tuning, and query optimization
- Build resilient pipelines supporting retries, idempotency, checkpointing, incremental processing, and partial failure recovery
- Implement automated data quality validation, schema evolution, lineage tracking, and governance controls
- Develop infrastructure and deployment automation using Infrastructure as Code and CI/CD pipelines
- Monitor, troubleshoot, and continuously improve the reliability, scalability, and performance of the data platform
- Collaborate with analysts, software engineers, data scientists, and product managers to translate business requirements into scalable data solutions
- Participate in architecture discussions and contribute technical documentation, standards, and best practices

What We Are Looking For

- 5+ years of professional experience building production data pipelines and cloud-based data platforms
- Strong experience with AWS data services including Amazon Redshift, AWS Glue, Amazon S3, and Amazon Kinesis Firehose
- Strong Python programming skills for ETL development, automation, event processing, and scripting
- Advanced SQL expertise including query optimization, window functions, analytical queries, versioned migrations, rollback strategies, and warehouse tuning
- Experience designing scalable ETL/ELT pipelines for both batch and streaming workloads
- Experience with distributed compute and storage using Apache Spark, AWS Glue, or similar distributed processing frameworks
- Strong understanding of data warehousing concepts including dimensional modeling, star schemas, snowflake schemas, partitioning strategies, and analytical data structures
- Experience designing end-to-end data architectures including ingestion, transformation, orchestration, and consumption layers
- Experience implementing workflow orchestration using Apache Airflow (MWAA), AWS Step Functions, or equivalent orchestration tools
- Understanding of data governance, metadata management, security best practices, IAM, encryption, and regulatory compliance considerations
- Experience with Git-based collaborative development workflows, CI/CD pipelines, Infrastructure as Code, deployment approvals, versioned migrations, and safe rollback strategies
- Experience monitoring and maintaining production data infrastructure, ensuring high availability, observability, data quality, and operational reliability
- Strong communication skills with the ability to explain technical concepts to business stakeholders and collaborate effectively across engineering, analytics, and product teams

Nice to Have

- Experience with Apache Iceberg, Delta Lake, Apache Hudi, or other modern open table formats
- Experience with dbt or SQL-based transformation frameworks
- Familiarity with Kafka, Amazon MSK, or other streaming platforms
- Experience with Lakehouse architectures and modern analytical data platforms
- Knowledge of Terraform or AWS CloudFormation
- Experience with containerized data workloads using Docker and container orchestration platforms such as EKS
- Experience implementing DataOps practices and automated testing for data pipelines
- Familiarity with BI platforms such as Tableau, Power BI, Looker, or QuickSight
- Experience implementing data catalogs, lineage, and governance solutions
- Exposure to machine learning feature pipelines or data science infrastructure

Tech Stack

Layer | Technology
Programming | Python, SQL, PySpark
Data Processing | Apache Spark, AWS Glue
Data Storage | Amazon S3, Amazon Redshift, Parquet
Streaming | Amazon Kinesis Firehose, EventBridge
Orchestration | Apache Airflow (MWAA), AWS Step Functions
Data Modeling | Star Schema, Snowflake Schema, Dimensional Modeling
Infrastructure | AWS, IAM, CloudWatch
IaC/CI | Git, GitHub Actions, Terraform, CloudFormation
Observability | CloudWatch (or equivalent observability platforms)
Governance | Data Catalog, Metadata Management, Data Lineage
