Job Description:
The Machine Learning Operations (MLOps) Specialist plays a key technical role within the Data & Analytics team, responsible for building, deploying, and optimizing machine learning models that drive scalable, production-grade AI capabilities. Reporting to the AI Team Lead, the MLOps Specialist transforms data science prototypes into robust, high-performance solutions by implementing end-to-end pipelines, automating deployment, and operationalizing AI within modern cloud and enterprise platforms. This role is essential to delivering reliable, governed, and reusable AI capabilities across the organization.
The MLOps Specialist will also play a central role in advancing the organization’s MLOps and LLMOps maturity, ensuring models are not only performant but also secure, monitored, and easily maintainable through automation and best practices.
The MLOps Specialist collaborates closely with data scientists, software engineers, and business stakeholders to ensure that machine learning solutions are aligned with business needs and seamlessly integrated into broader systems and workflows.
MAJOR JOB ACCOUNTABILITIES
- Translate data science prototypes into robust, scalable, and reusable machine learning pipelines that cover data ingestion, feature engineering, training, evaluation, and inference.
- Ensure solutions are modular, maintainable, and production-grade by following software engineering best practices.
- Develop and manage CI/CD workflows for automated model deployment, testing, and monitoring using containerized and cloud-native infrastructure.
- Operationalize models through proper registration, version control, rollback policies, and real-time monitoring for drift and performance anomalies.
- Deploy and integrate models within enterprise platforms such as Snowflake (Snowpark), Dataiku, and Azure-native tools while ensuring optimal performance, cost-efficiency, and platform alignment.
- Collaborate with infrastructure and platform teams to ensure system compatibility and performance at scale.
- Implement processes for secure, explainable, and governed model deployment in alignment with regulatory and enterprise risk requirements.
- Maintain visibility over model behavior in production through monitoring dashboards, alerts, and audit logs.
- Contribute to the development and standardization of GenAI workflows and reusable components.
- Share knowledge and tools across teams to promote scalable best practices in ML engineering.
- Participate in agile ceremonies, sprint reviews, and architectural discussions to support cross-team alignment.
WHAT SKILLS AND TRAINING DO YOU NEED?
- Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related technical field.
- 3–6 years of hands-on experience in ML engineering or applied MLOps roles within enterprise-scale environments.
- Proven experience designing, deploying, and maintaining machine learning models using modern MLOps and DevOps practices.
- Hands-on experience with Dataiku as a primary platform for ML orchestration, workflow automation, and operationalization of AI solutions.
- Strong experience with cloud data platforms on Azure, including Snowflake, and with deployment frameworks such as Snowpark or similar container services.
- Proficiency in building and maintaining CI/CD pipelines for ML, including model versioning, monitoring, and rollback strategies.
- Familiarity with GenAI tools and practices is considered a strong asset.
- Relevant certifications (e.g., Dataiku DSS, Snowflake, Azure ML, MLOps platforms) are considered an advantage.
- Experience working in Agile/SAFe environments and cross-functional product teams.
- Bilingualism (English and French) is preferred, to support collaboration across regional and global initiatives.
- Willingness to implement projects or deliver training across our different sites, including those outside Quebec.
- Ability to work closely with teams at our Windsor, Ontario site.
- Strong understanding of ML engineering practices including MLOps, LLMOps, and model lifecycle management.
- Knowledge of cloud-based ML tools, containerized environments, and orchestration systems.
- Familiarity with ML observability, model drift monitoring, and pipeline instrumentation.
- Deep working knowledge of Dataiku as a central AI platform, including its capabilities for orchestrating ML workflows, automating pipelines, and operationalizing AI models at scale.
- Awareness of AI security, compliance, and governance principles in regulated environments.
- Familiarity with data engineering concepts and tools, particularly within Azure environments (e.g., Azure Data Factory), for building scalable data ingestion and transformation pipelines.
- Proficiency in building end-to-end ML pipelines using modern toolchains.
- Skilled in CI/CD for ML, containerized deployments, and model monitoring systems.
- Hands-on experience designing and deploying workflows within Dataiku, including plugins, scenarios, custom recipes, and model versioning.
- Hands-on experience with machine learning frameworks such as TensorFlow, PyTorch, or Keras for developing, training, and optimizing advanced models.
- Experience with Snowpark or similar frameworks for deploying and operationalizing models within cloud-based data platforms.
- Strong coding skills in Python and familiarity with data science libraries and frameworks (e.g., scikit-learn, MLflow, Hugging Face).
- Detail-oriented and committed to production-quality, scalable AI development.
- Highly collaborative and able to work closely with data scientists, engineers, and product teams.
- Adaptable to changing technologies and priorities in a fast-paced AI/ML environment.
- Strong communication skills with the ability to clearly articulate technical concepts to both technical and non-technical stakeholders.
- Curious, proactive, and passionate about building sustainable and impactful AI systems.