Staff/Lead AI Platform Engineer

at Commonwealth Bank

Sydney, New South Wales, Australia

Start Date: Immediate
Expiry Date: 28 Jan, 2025
Salary: Not Specified
Posted On: 23 Jan, 2025
Experience: N/A
Skills: Infrastructure, Version Control, Scripting, EC2, ECS, Docker, Languages, Automation, GitHub, Artifactory
Telecommute: No
Sponsor Visa: No

Description:

  • You are highly experienced in building customer-focussed solutions
  • We are a team of big thinkers who love to push boundaries and create new solutions
  • Together we will build tomorrow’s bank today, using world-leading technology and innovation

TECH SKILLS

We use a broad range of tools, languages, and frameworks. We don’t expect you to know them all, but experience with or exposure to some of these (or equivalents) will set you up for success in this team:
AWS Services: In-depth knowledge of AWS services such as EC2, ECS, S3, Lambda, Step Functions, RDS, DynamoDB, IAM, VPC, Route 53, CloudWatch, EKS
ML Services: Expertise in AWS ML services like SageMaker, AWS Glue, Amazon EMR. Familiarity with Amazon Bedrock, Amazon Q services, NVIDIA GPUs and related frameworks, LLMs.
Model Lifecycle: Experience with the end-to-end ML lifecycle, including data preprocessing, feature engineering, model training, evaluation, and deployment
Scripting: Proficient in automation and scripting (Bash, Python).
IaC Tools: Hands-on experience with infrastructure as code tools like AWS CloudFormation,
Version Control: Proficiency with version control systems like GitHub, GitHub Actions
Monitoring & Observability: Expertise in tools like Grafana, Prometheus
Engineering Tooling: Artifactory, Snyk, Docker

Responsibilities:

  • Provide strategic technical leadership and mentorship, driving best practices for ML platform architecture, deployment and scaling
  • Oversee the design and development of scalable and resilient AI infrastructure, and architect core components with a focus on performance, reliability, and scalability
  • Create a standardised set of tooling for deploying and running applications and setting them up with best practices.
  • Work collaboratively with customer-facing product owners and engineers to design, build and run platforms that they can use to deliver customer value at greater quality, velocity, and safety.
  • Make all platforms entirely self-service, secure, and available within minutes without human approval.
  • Collaborate with data scientists, engineers and stakeholders to define and implement technical requirements. Translate needs into technical solutions and ensure the platform’s reliability through robust monitoring, logging, and alerting systems
  • Develop and maintain comprehensive documentation, including architecture blueprints and best practices, and conduct workshops and training sessions to educate and align the team on platform usage
  • Stay up to date with the latest developments in ML, MLOps, LLMs, GPUs and related concepts
  • Conduct R&D on emerging AWS technologies.


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - Application Programming / Maintenance

Software Engineering

Graduate

Proficient

1

Sydney NSW, Australia