Edge AI Engineer at Nanoveu Pte Ltd
Singapore, Singapore
Full Time


Start Date

Immediate

Expiry Date

30 Sep, 25

Salary

6000

Posted On

01 Jul, 25

Experience

3 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Embedded Systems, Optimization Techniques, Pruning, Linux, Distillation

Industry

Information Technology/IT

Description

We’re looking for an Edge AI Engineer who’s excited to work hands-on with real silicon and to optimize deep learning models for highly resource-constrained environments. This is a chance to work across the full model deployment stack, from quantization and pruning through to runtime deployment on proprietary hardware. You will work at the intersection of ML, compiler toolchains, and custom silicon, and every model you build will be mapped directly to hardware.

Duties:

  • Design and optimize deep learning models for edge inference (e.g., CNNs, Transformers).
  • Apply compression, quantization, and hardware-aware architecture tuning.
  • Translate and deploy models using ONNX, TFLite, or custom runtimes.
  • Benchmark models for latency, energy efficiency, and accuracy.
  • Collaborate with hardware, firmware, and toolchain engineers to align models with silicon.
  • Build model integration and CI/CD workflows using modern ML toolchains.

Requirements:

  • 1–3 years of hands-on experience deploying ML/DL models to embedded or edge platforms.
  • Strong understanding of model optimization techniques (quantization, pruning, distillation).
  • Experience with ONNX, TFLite, PyTorch, or TensorFlow.
  • Comfortable debugging performance bottlenecks on embedded systems (Linux or bare-metal).

You will have an added advantage if you have the following:

  • Familiarity with TVM, Glow, XLA, or custom inference stacks.
  • Prior deployment to MCUs, NPUs, or custom SoCs.