[Remote] AI Platform Engineer
Note: This is a remote position open to candidates in the USA. The company develops AI agents and orchestration solutions hosted on Oracle Cloud Infrastructure (OCI). It is seeking an AI Platform Engineer to take responsibility for the platform that supports the company's AI products and customer environments, working closely with the AI Architect and Product Team to manage Oracle Cloud Infrastructure environments.
Responsibilities
- Monitor OCI service health, logs, dashboards, and alerts
- Troubleshoot platform issues and customer environment concerns
- Support development teams with infrastructure questions and deployment needs
- Manage CI/CD pipeline performance and resolve deployment failures
- Execute provisioning and configuration requests
- Update documentation and architecture artifacts as infrastructure evolves
- Participate in standups, planning meetings, and technical reviews
- Review OCI consumption reports, billing dashboards, and cost optimization opportunities
- Conduct IAM and credential audits
- Evaluate reference architecture environments for configuration drift and required updates
- Refine deployment methodologies, runbooks, and related documentation
- Assess OCI roadmap updates and emerging platform capabilities
- Contribute technical documentation, architecture guidance, and internal knowledge-sharing content
- Provision, configure, and manage Oracle Cloud Infrastructure (OCI) environments, including compute, networking, load balancers, API gateways, IAM, containers, and related services
- Manage OCI Functions, Autonomous Database Serverless (ADB-S), and containerized deployment environments
- Build, maintain, and optimize OCI DevOps pipelines, artifact repositories, and deployment automation
- Support OCI GoldenGate planning, configuration, and data replication architectures
- Build automation solutions that improve reliability, scalability, and operational efficiency
- Own customer-facing AI agent deployment methodologies, runbooks, environment configurations, and deployment standards
- Coordinate customer environment provisioning, compartment creation, IAM setup, and related activities
- Manage AI agent environments across development, testing, and production stages
- Support development teams through infrastructure reviews, deployment guidance, and technical troubleshooting
- Maintain the company's reference architectures and deployment frameworks
- Build and maintain Grafana dashboards and reporting solutions for operational monitoring and customer billing
- Build ETL processes that aggregate OCI cost and consumption data
- Monitor platform health, performance, reliability, and resource utilization
- Diagnose and resolve observability gaps before they impact customer environments
- Ensure accurate reporting and billing visibility across customer environments
- Audit OCI IAM policies, Vault usage, credential management processes, and security controls
- Maintain TLS certificate automation using ACME, Let's Encrypt, and OCI Load Balancer integrations
- Support secure architecture reviews and infrastructure compliance initiatives
- Ensure proper access controls, credential rotation, and security best practices across environments
- Create and maintain architecture diagrams, infrastructure maps, deployment workflows, and technical documentation
- Document automation scripts, deployment processes, and operational procedures
- Participate in technical planning sessions with customers and internal stakeholders
- Identify infrastructure risks and recommend scalable solutions
Skills
- Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field
- 2+ years of experience in AI Platform Engineering, Infrastructure Engineering, MLOps, DevOps, or Cloud Engineering
- Strong experience with Oracle Cloud Infrastructure (OCI)
- Experience deploying and supporting AI agents, microservices, or cloud-native applications
- Experience with monitoring and observability platforms such as Grafana, LangFuse, OCI Logging, and Metrics APIs
- Knowledge of TLS, DNS, ACME protocols, Let's Encrypt, and certificate automation
- Experience with CI/CD tools, version control, deployment pipelines, and artifact management
- Proficiency in Python, SQL, and Bash scripting
- Strong technical writing, documentation, and architecture diagramming skills
- Excellent communication and collaboration skills
- Oracle Cloud certifications such as OCI Architect Professional or OCI DevOps Professional
- Experience supporting multi-tenant SaaS or managed-service environments
- Exposure to large language model (LLM) infrastructure and agentic AI frameworks such as MCP or similar technologies
- Experience implementing AI observability platforms such as LangFuse, MLflow, or equivalent tools
- Familiarity with JD Edwards EnterpriseOne, including CNC, Orchestrator Studio, or system administration
- Experience participating in Oracle Partner Network, Oracle ACE, or similar technical communities