Edge AI Engineer at Nanoveu Pte Ltd
Singapore, Singapore
Full Time


Start Date

Immediate

Expiry Date

30 Sep, 25

Salary

6000

Posted On

01 Jul, 25

Experience

3 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Embedded Systems, Optimization Techniques, Pruning, Linux, Distillation

Industry

Information Technology/IT

Description

We’re looking for an Edge AI Engineer who’s excited to work hands-on with real silicon and to optimize deep learning models for highly resource-constrained environments. This is a chance to work across the full model deployment stack, from quantization and pruning through to runtime deployment on proprietary hardware. You will work at the intersection of ML, compiler toolchains, and custom silicon, and every model you build will be mapped directly to hardware.

Duties:

  • Design and optimize deep learning models for edge inference (e.g., CNNs, Transformers).
  • Apply compression, quantization, and hardware-aware architecture tuning.
  • Translate and deploy models using ONNX, TFLite, or custom runtimes.
  • Benchmark models for latency, energy efficiency, and accuracy.
  • Collaborate with hardware, firmware, and toolchain engineers to align models with silicon.
  • Build model integration and CI/CD workflows using modern ML toolchains.

Requirements:

  • 1–3 years of hands-on experience deploying ML/DL models to embedded or edge platforms.
  • Strong understanding of model optimization techniques (quantization, pruning, distillation).
  • Experience with ONNX, TFLite, PyTorch, or TensorFlow.
  • Comfortable debugging performance bottlenecks on embedded systems (Linux or bare-metal).

You will have an added advantage if you have the following:

  • Familiarity with TVM, Glow, XLA, or custom inference stacks.
  • Prior deployment to MCUs, NPUs, or custom SoCs.