G02 - Platform Operations Engineer at FPT Asia Pacific Pte Ltd
Singapore, Singapore
Full Time


Start Date

Immediate

Expiry Date

06 May, 26

Salary

0.0

Posted On

05 Feb, 26

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Cloud Platform Operations, Monitoring, Performance Optimisation, Reliability, Release Management, Incident Management, Troubleshooting, AWS Cloud Infrastructure, Scalability, Security, Cost-Efficiency, Operational Processes, Change Management, Risk Assessment, Infrastructure as Code, CI/CD Pipelines

Industry

IT Services and IT Consulting

Description
Responsibilities:
Lead cloud platform operations for Cloud File Transfer (CFT), with a focus on monitoring, performance optimisation, reliability, release management, and continuous improvement within AWS environments.
Own L2 incident management, troubleshooting, and escalation handling for high-throughput file transfer workflows across multiple agencies, working closely with engineering, security, and agency stakeholders to resolve incidents within defined SLAs.
Manage, design, and continuously optimise AWS cloud infrastructure to ensure scalability, security, cost-efficiency, and high availability of the CFT platform.
Establish, refine, and enforce operational processes, including runbooks, dashboards, daily health checks, incident communication practices, and operational reporting with actionable insights.
Drive change, release, and maintenance management by performing impact analysis, risk assessment, and mitigation planning, and by executing system upgrades and infrastructure improvements to ensure platform stability.
Review testing results to ensure all changes meet operational, performance, and security requirements before release, while defining and improving operational OKRs, SLAs, and reliability metrics.
Contribute to portal and backend enhancements, bug fixes, and operational tooling to continuously improve platform reliability, performance, and maintainability.
Share operational best practices, incident learnings, and technical knowledge within the team and across the programme to improve engineering standards and platform reliability.

Requirements:
Degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
Minimum 2 years of hands-on experience managing production workloads in public cloud environments (preferably AWS).
Strong problem-solving skills across cloud infrastructure, applications, and distributed systems.
Experience handling production incidents with ownership, urgency, and attention to detail.
Experience defining and enforcing operational processes, procedures, and best practices.
Familiarity with maintaining high-availability, secure cloud environments and implementing preventative operational controls.
Understanding of change management, impact assessment, and service reliability improvements.

Preferred:
Experience operating applications on AWS.
Experience working with the following key technologies:
Terraform for infrastructure as code and cloud resource management.
GitLab for CI/CD pipelines and version control.
Strong understanding of AWS services and architecture supporting production workloads.