Start Date
Immediate
Expiry Date
26 Sep, 25
Salary
0.0
Posted On
27 Jun, 25
Experience
6 year(s) or above
Remote Job
Yes
Telecommute
Yes
Sponsor Visa
No
Skills
Good communication skills
Industry
Information Technology/IT
JOB DESCRIPTION:
Role Overview:
We are seeking a highly skilled and experienced Senior Data Engineer to join our dynamic team. In this role, you will be responsible for designing, building, and maintaining scalable and reliable data pipelines and infrastructure on AWS. You will play a critical role in enabling our data-driven decision-making processes by ensuring the availability and quality of our data. The ideal candidate will have a strong background in AWS cloud services, Python, SQL, PySpark, Airflow, and infrastructure as code (CDK). Experience with DevOps practices is a significant plus.

Responsibilities:
* Data Pipeline Development: Design, develop, and maintain robust, scalable data pipelines using Python, PySpark, and Airflow to ingest, process, and transform large datasets.
* Cloud Infrastructure (AWS): Architect, build, and manage data infrastructure on AWS using services such as S3, EC2, EMR, Redshift, Glue, and Lambda.
* Infrastructure as Code (CDK): Implement and manage infrastructure as code using AWS CDK to ensure consistency, repeatability, and scalability of the data platform.
* Database Management: Design and optimize database schemas and SQL queries for efficient data storage and retrieval.
* Data Quality and Testing: Implement comprehensive unit-testing strategies to ensure data quality and pipeline reliability.
* Performance Optimization: Identify and resolve performance bottlenecks in data pipelines and infrastructure.
* Collaboration: Work closely with data scientists, analysts, and other engineers to understand data requirements and deliver effective solutions.
* Documentation: Create and maintain thorough documentation of data pipelines, infrastructure, and processes.
* Monitoring and Alerting: Implement monitoring and alerting systems to ensure the health and performance of data pipelines and infrastructure.
* DevOps (Preferred): Contribute to DevOps practices, including CI/CD pipelines, automated deployments, and infrastructure monitoring.

Qualifications:
* Experience: 6-10 years of experience in data engineering or a related field.
* Programming Languages: Strong proficiency in Python and SQL.
* Big Data Technologies: Extensive experience with PySpark for distributed data processing.
* Workflow Orchestration: Proven experience with Airflow for scheduling and managing data pipelines.
* Cloud Computing (AWS): Deep understanding of AWS services and best practices for data engineering.
* Infrastructure as Code (CDK): Hands-on experience with AWS CDK for infrastructure provisioning and management.
* Database Systems: Solid understanding of relational and NoSQL databases.
* Testing: Strong understanding and implementation of unit-testing strategies.
* Problem-Solving: Excellent analytical and problem-solving skills.
* Communication: Strong communication and collaboration skills.
* DevOps (Preferred): Knowledge of DevOps principles and practices, including CI/CD, containerization (Docker), and orchestration.
Job Type: Fixed term contract
Please refer to the job description for details.