AI Platform Engineer at DeepLight
Abu Dhabi, Abu Dhabi, United Arab Emirates
Full Time


Start Date

Immediate

Expiry Date

13 Jan, 26

Salary

0.0

Posted On

15 Oct, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Kubernetes, Infrastructure Automation, Container Orchestration, CI/CD, Linux Administration, Docker, Networking Fundamentals, Infrastructure-As-Code, AI/ML Workloads, GPU Scheduling, Resource Management, Monitoring Tools, Scripting Skills, DevOps, MLOps

Industry

IT Services and IT Consulting

Description
DeepLight AI is a specialist AI and data consultancy with extensive experience implementing intelligent enterprise systems across multiple industries, with particular depth in financial services and banking. Our team combines deep expertise in data science, statistical modeling, AI/ML technologies, workflow automation, and systems integration with a practical understanding of complex business operations.

The AI Platform Engineer helps design, build and maintain the on-premises AI infrastructure for our client, a market leader in the environment sector. The engineer plays a key role in enabling large-scale machine learning and generative AI workloads by developing robust, scalable, and secure platform solutions that support data scientists and ML engineers across the organisation. The role requires deep Kubernetes and infrastructure automation experience to optimise performance, streamline deployments, and ensure reliability in an on-prem environment.

· Design, deploy, and manage on-premises Kubernetes clusters for AI and ML workloads.
· Develop and maintain infrastructure-as-code using tools like Terraform, Helm, or Ansible.
· Build and optimise AI/ML pipelines and MLOps workflows for model training, deployment, and monitoring.
· Collaborate with data science and engineering teams to deliver high-performance computing environments for large model training and inference.
· Implement resource management, observability, and scaling strategies for GPU-based workloads.
· Manage containerisation, networking, and storage solutions for AI workloads.
· Ensure security, compliance, and reliability of the AI platform.
· Automate operational processes and continuously improve platform efficiency.

What we need from you:

· Proven experience managing Kubernetes clusters on-premises (not just cloud-managed solutions).
· Experience working in a consultancy capacity.
· Strong background in container orchestration, CI/CD, and automation.
· Proficiency in Linux administration, Docker, and networking fundamentals.
· Hands-on experience with infrastructure-as-code (Terraform, Helm, Ansible, etc.).
· Experience supporting AI/ML workloads (e.g., TensorFlow, PyTorch, Hugging Face, Ray, Kubeflow).
· Familiarity with GPU scheduling and resource management on Kubernetes.
· Knowledge of monitoring and logging tools (Prometheus, Grafana, ELK, etc.).
· Strong scripting skills (Python, Bash, or Go).
· Understanding of DevOps and MLOps best practices.

Benefits & Growth Opportunities:

· Competitive salary and performance bonuses
· Comprehensive health insurance
· Professional development and certification support
· Opportunity to work on cutting-edge AI projects
· International exposure and travel opportunities
· Flexible working arrangements
· Career advancement opportunities in a rapidly growing AI company

This position offers a unique opportunity to shape the future of AI implementation while working with a talented team of professionals at the forefront of technological innovation. The successful candidate will play a crucial role in driving our company's success in delivering transformative AI solutions to our clients.
Responsibilities
The AI Platform Engineer designs, builds, and maintains the on-premises AI infrastructure for clients, enabling large-scale machine learning and generative AI workloads. This role involves developing robust, scalable, and secure platform solutions that support data scientists and ML engineers across the organisation.