Senior AI/ML Engineer II at DigitalOcean
Boston, Massachusetts, USA - Full Time


Start Date: Immediate
Expiry Date: 30 Nov, 25
Salary: 183300.0
Posted On: 31 Aug, 25
Experience: 0 year(s) or above
Remote Job: Yes
Telecommute: Yes
Sponsor Visa: No
Skills: Good communication skills
Industry: Computer Software/Engineering

Description

Dive in and do the best work of your career at DigitalOcean. Journey alongside a strong community of top talent who are relentless in their drive to build the simplest scalable cloud. If you have a growth mindset, naturally like to think big and bold, and are energized by the fast-paced environment of a true industry disruptor, you’ll find your place here. We value winning together—while learning, having fun, and making a profound difference for the dreamers and builders in the world.
We’re building the next generation of agentic applications on the GradientAI platform—where multi-agent systems of LLM-powered agents collaborate, make decisions, and adapt at scale. You’ll be part of the team designing robust, scalable, and safe agent workflows that empower developers to build sophisticated AI-driven systems with confidence.
We’re looking for someone with a strong software engineering background and deep expertise in generative AI, multi-agent system design, guardrails, monitoring, and evaluation methodologies. Your work will directly shape how thousands of developers create and scale AI agents on our platform.


Responsibilities
  • Architect and deliver production-grade agentic systems: multi-agent orchestration, workflow management, state/memory handling, and runtime governance.
  • Design and orchestrate modular, LLM-powered agents (e.g., Planner, Tool Executor, QA, Validator) using scalable orchestration patterns (sequential, router, parallel, map-reduce), with clear handoff protocols, shared memory, and structured communication.
  • Define and enforce guardrails and governance: prompt sanitization, access control, audit trails, threat modeling, and strategies for injection defense, hallucination control, misuse prevention, and compliance.
  • Establish evaluation and monitoring methods for multi-agent systems: accuracy, safety, cost, and latency—leveraging observability practices (logs, telemetry, tracing, capturing intermediate outputs) and feedback loops to continuously refine performance.
  • Build fine-tuning and deployment pipelines: supervised fine-tuning, inference optimization, post-deployment updates, and scaling hardened systems with retries, error handling, and fairness checks.
  • Rapidly define and deliver MCPs: identify minimal agent roles and orchestration logic, validate quickly, and expand iteratively into robust multi-agent applications.
  • Integrate seamlessly with the GradientAI platform: ensure agents leverage DigitalOcean services (inference, knowledge bases, Functions, storage, networking) for scale, reliability, and cost-efficiency.
  • Apply strong software engineering practices: testing, CI/CD, code quality, scalable architectures, and distributed system design.
  • Collaborate cross-functionally with product managers, infra teams, design and UX, and other engineers to ship features that developers adopt and trust.
  • Participate in and support operational excellence.
  • Independently ship product features from planning to launch to maintenance with high autonomy.
  • Collaborate with other engineers to find elegant architectures and solutions.