AI Infrastructure Engineer at Umpisa Inc.
Manila, Metro Manila, Philippines
Full Time


Start Date

Immediate

Expiry Date

02 Jun, 26

Salary

0.0

Posted On

04 Mar, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

GPU Environments, Deployment Automation, Model Training, Inference Optimization, Prompt Caching, Retrieval-Augmented Generation (RAG), Encryption, CMEK, Vector Search, ANN Algorithms, Distributed Storage, Data Systems, Cloud Platforms, Scrum Team, Agile Framework, Systems Design

Industry

IT Services and IT Consulting

Description
At Umpisa Inc., our mission is to make the Philippines known globally as a tech hub. Umpisa Inc. is a progressive technology services company that partners with select industries, clients, and people to deliver pioneering, industry-changing solutions through digital transformation, modern software development, and venture building. We create world-class, impactful products and solutions that help organizations and individuals live better lives. We offer demanding, challenging, and rewarding careers in software development, product development, emerging technologies, and more for the right candidates.

Essential Skills

- Aligns with our values: Excellence, Integrity, Professionalism, People Success, Customer Success, Fun, Innovation, and Diversity
- Strong communication skills
- Excellent problem-solving and analytical skills
- Comfortable working as part of a self-organizing Scrum team in a scaled agile framework
- A self-starter who loves to collaborate with the team and the client

Job Summary

We are looking for a technical, hands-on AI Infrastructure Engineer to build and scale our AI platform from the ground up. You will work closely with Data Scientists and ML Engineers to design GPU environments, automate deployments, and ensure high-performance model training and inference.

Key Responsibilities

- Define the AI infrastructure architecture strategy
- Lead cross-functional collaboration with the Data Science and Security teams
- Design a multi-region GPU cluster strategy
- Evaluate emerging AI infrastructure technologies
- Establish best practices and governance models

Generative AI Infrastructure & Inference Optimization

- Design and implement inference efficiency initiatives such as prompt/context caching
- Build systems that allow fine-grained control over cache prefixes and retrieval strategies
- Optimize latency and cost efficiency of large-scale LLM inference workloads
- Support Retrieval-Augmented Generation (RAG) architectures
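Prompt/context caching, as referenced above, typically keys precomputed model state by token prefix and evicts by recency. A minimal LRU prefix-cache sketch (the class name and the opaque `state` values are illustrative assumptions, not any specific framework's API):

```python
from collections import OrderedDict

class PrefixCache:
    """Toy LRU cache for prompt prefixes.

    Assumption: keys are token-ID prefixes; values stand in for
    precomputed inference state (e.g. a KV cache), held here as
    opaque objects.
    """

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self._store = OrderedDict()

    def longest_prefix(self, tokens):
        """Return (matched_len, state) for the longest cached prefix."""
        for n in range(len(tokens), 0, -1):
            key = tuple(tokens[:n])
            if key in self._store:
                self._store.move_to_end(key)  # mark as recently used
                return n, self._store[key]
        return 0, None

    def put(self, tokens, state):
        key = tuple(tokens)
        self._store[key] = state
        self._store.move_to_end(key)
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)   # evict least-recently-used
```

A request whose prompt shares a cached prefix can then skip recomputing that portion and pay only for the uncached suffix.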
Secure AI Systems & Encryption

- Architect and implement end-to-end encryption for cached AI content
- Integrate customer-managed encryption keys (CMEK) within cloud environments
- Ensure secure multi-tenant data isolation and compliance with applicable standards

Vector Search & Ranking Systems

- Develop enterprise-ready vector similarity search systems
- Optimize Approximate Nearest Neighbor (ANN) algorithms for scale and latency
- Build ranking models for personalization, recommendation, and monetization
- Contribute to highly scalable embedding search infrastructure

Distributed Storage & Data Systems

- Design and maintain petabyte-scale distributed storage systems
- Implement materialized views with consistent cross-datacenter updates
- Support high-update-throughput systems with low-latency point queries
- Optimize large-scale table scans and distributed data processing

Qualifications

- 5+ years in infrastructure/cloud engineering and IAM
- Extensive experience with large-scale distributed systems
- Experience leading technical teams
- Strong architectural and documentation skills
- Knowledge of AI workload optimization
- Experience with hyperscale cloud platforms such as Google Cloud Platform
- Familiarity with vector databases and ANN indexing techniques
- Exposure to LLM inference optimization techniques
- Experience building infrastructure supporting generative AI applications
- Background in storage engines similar to Google's Mesa/Napa architecture
- Strong systems design skills
- Performance optimization mindset
- Security-first engineering approach
- Experience building enterprise-ready cloud services
- Ability to work in high-scale, production-critical environments
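For the ANN work described above, one classic technique is random-hyperplane locality-sensitive hashing, which buckets vectors by the signs of their projections onto random hyperplanes so that only one bucket needs an exact scan. A toy sketch (illustrative only; production systems would use a library such as FAISS or ScaNN):

```python
import math
import random

class RandomHyperplaneLSH:
    """Toy ANN index via random-hyperplane LSH for cosine similarity."""

    def __init__(self, dim, n_bits=16, seed=0):
        rng = random.Random(seed)
        # One random Gaussian hyperplane per hash bit.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(n_bits)]
        self.buckets = {}
        self.vectors = {}

    def _hash(self, vec):
        # Each bit records which side of a hyperplane the vector falls on.
        return tuple(1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
                     for plane in self.planes)

    def add(self, key, vec):
        self.vectors[key] = vec
        self.buckets.setdefault(self._hash(vec), []).append(key)

    def query(self, vec, k=1):
        # Scan only the bucket sharing the query's signature, then
        # re-rank those candidates by exact cosine similarity.
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb) if na and nb else 0.0
        candidates = self.buckets.get(self._hash(vec), [])
        return sorted(candidates,
                      key=lambda c: cosine(vec, self.vectors[c]),
                      reverse=True)[:k]
```

Nearby vectors tend to share hash signatures, trading a small recall loss for a large reduction in vectors scanned per query.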
Responsibilities
The engineer will be responsible for defining the AI infrastructure architecture strategy, leading cross-functional collaboration, and designing multi-region GPU cluster strategies. Key tasks include optimizing generative AI inference efficiency, implementing security measures like end-to-end encryption, and developing enterprise-ready vector search systems.
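The end-to-end encryption and CMEK duties summarized above commonly follow the envelope-encryption pattern: each record is encrypted with a fresh data key, and that key is itself wrapped by the customer-managed key held in a cloud KMS. A toy illustration of the pattern only (the XOR keystream is a stand-in, not real encryption; real systems call KMS APIs and use an authenticated cipher such as AES-GCM):

```python
import hashlib
import os

def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in cipher: XOR with a SHA-256-derived keystream.
    NOT real encryption; shown only to make the wrapping visible."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(x ^ y for x, y in zip(data, out))

def kms_wrap(cmek: bytes, dek: bytes) -> bytes:
    # Stands in for a cloud KMS "encrypt" call against the customer key.
    return _keystream_xor(cmek, dek)

def kms_unwrap(cmek: bytes, wrapped: bytes) -> bytes:
    return _keystream_xor(cmek, wrapped)  # XOR is its own inverse

def encrypt_record(cmek: bytes, plaintext: bytes):
    dek = os.urandom(32)  # fresh per-record data encryption key
    return kms_wrap(cmek, dek), _keystream_xor(dek, plaintext)

def decrypt_record(cmek: bytes, wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    dek = kms_unwrap(cmek, wrapped_dek)
    return _keystream_xor(dek, ciphertext)
```

Because only the wrapped key references the CMEK, revoking the customer key in the KMS renders every dependent record unreadable, which is the property multi-tenant isolation relies on.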