Start Date
Immediate
Expiry Date
09 Oct, 25
Salary
$50.00-$55.00 per hour
Posted On
09 Jul, 25
Experience
5+ years
Remote Job
Yes
Telecommute
Yes
Sponsor Visa
No
Skills
Python, Security, Docker, Data Processing, Transformation, SQL, Data Governance, Continuous Improvement, Containerization, Telecommunications
Industry
Information Technology/IT
We are seeking a highly skilled Senior PySpark Developer to join our data engineering team. The ideal candidate will have deep expertise in building scalable data pipelines using PySpark and Spark Streaming, and experience working with modern data platforms on AWS. This role requires strong consulting and communication skills, a passion for innovation, and the ability to work independently in a fast-paced environment. Experience in the telecommunications industry is highly preferred.
REQUIRED SKILLS & QUALIFICATIONS:
· 5+ years of experience in data engineering with a focus on PySpark
· Strong experience with Spark Streaming and real-time data processing
· Hands-on experience with AWS EMR, Airflow, and EKS (a sample orchestration sketch follows this list)
· Proficiency in Apache NiFi for data ingestion and transformation
· Experience working with Iceberg tables and S3-based data lakes
· Solid understanding of AWS Aurora PostgreSQL and API integration
· Proficient in T-SQL, Python, and SQL
· Demonstrated ability to analyze and improve job performance in distributed data environments
· Experience in client-facing roles or consulting environments
· Excellent communication and consulting skills
· Self-starter with a passion for innovation and continuous improvement
· Strong problem-solving skills and attention to detail
· Industry experience in telecommunications is highly preferred
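
For context on the EMR and Airflow requirement above, the sketch below shows one common way an Airflow DAG can submit a PySpark step to an existing EMR cluster and wait for it to finish. This is a minimal illustration, assuming a recent Airflow 2.x install with the Amazon provider package; the DAG id, the S3 script path, and the emr_cluster_id Airflow Variable are placeholders, not details from this posting.

from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.emr import EmrAddStepsOperator
from airflow.providers.amazon.aws.sensors.emr import EmrStepSensor

# Placeholder spark-submit step; the S3 script path is illustrative only.
SPARK_STEP = [
    {
        "Name": "pyspark_pipeline",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", "cluster",
                "s3://my-bucket/jobs/pipeline.py",
            ],
        },
    }
]

with DAG(
    dag_id="emr_pyspark_pipeline",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Add the Spark step to an already-running EMR cluster whose id is kept
    # in an Airflow Variable (hypothetical name: emr_cluster_id).
    add_step = EmrAddStepsOperator(
        task_id="add_spark_step",
        job_flow_id="{{ var.value.emr_cluster_id }}",
        steps=SPARK_STEP,
        aws_conn_id="aws_default",
    )

    # Block until the submitted step completes (or fails).
    wait_for_step = EmrStepSensor(
        task_id="wait_for_spark_step",
        job_flow_id="{{ var.value.emr_cluster_id }}",
        step_id="{{ task_instance.xcom_pull(task_ids='add_spark_step')[0] }}",
        aws_conn_id="aws_default",
    )

    add_step >> wait_for_step
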
PREFERRED QUALIFICATIONS:
· AWS certification (e.g., AWS Certified Data Analytics – Specialty)
· Experience with CI/CD pipelines and containerization (Docker, Kubernetes)
· Familiarity with data governance and security best practices
Job Type: Fixed-term contract
Contract length: 12 months
Pay: $50.00-$55.00 per hour
Work Location: In person
RESPONSIBILITIES:
· Design and develop scalable data pipelines using PySpark and Spark Streaming (a minimal sketch follows this list)
· Implement data workflows on AWS EMR orchestrated with Apache Airflow
· Deploy and manage Spark Streaming applications on Amazon EKS
· Integrate and manage data flows using Apache NiFi
· Work with Apache Iceberg tables stored in Amazon S3
· Enable data access and consumption through AWS Aurora PostgreSQL for downstream APIs
· Write and optimize complex queries using T-SQL for data extraction and transformation
· Identify and implement opportunities to improve job performance, including pipeline optimization, resource tuning, and automation
· Drive innovation by identifying opportunities to improve data workflows and automation
· Work independently and manage multiple priorities in a fast-paced environment
· Collaborate with cross-functional teams, including clients, to deliver high-quality data solutions
· Participate in code reviews, architecture discussions, and performance tuning
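
As an illustration of the streaming and data-lake work described above, here is a minimal PySpark Structured Streaming sketch that appends Kafka records to an Apache Iceberg table in S3. It assumes Spark 3.x with the Iceberg runtime and Kafka connector on the classpath and an already-created target table; the catalog name, warehouse and checkpoint paths, broker address, and topic are placeholders rather than details from this posting.

from pyspark.sql import SparkSession, functions as F

# Session configured for an Iceberg catalog backed by S3; the catalog name
# ("lake") and the warehouse path are placeholders.
spark = (
    SparkSession.builder
    .appName("events_stream_to_iceberg")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "s3a://my-data-lake/warehouse")
    .getOrCreate()
)

# Read a stream from Kafka (broker and topic are illustrative only).
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Minimal transformation: keep the payload and stamp the processing time.
events = (
    raw.selectExpr("CAST(value AS STRING) AS payload")
       .withColumn("processed_at", F.current_timestamp())
)

# Append to an Iceberg table that is assumed to exist already
# (e.g. created beforehand with CREATE TABLE lake.analytics.events ...).
query = (
    events.writeStream
    .format("iceberg")
    .outputMode("append")
    .option("checkpointLocation", "s3a://my-data-lake/checkpoints/events")
    .toTable("lake.analytics.events")
)

query.awaitTermination()
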