Senior Site Reliability Engineer (SRE) at Ericsson
Cairo, Cairo, Egypt
Full Time


Start Date

Immediate

Expiry Date

19 Mar, 26

Salary

0.0

Posted On

19 Dec, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Terraform, AWS, IaC, CI/CD, Observability, Security, Compliance, AI/ML, Kubernetes, Helm, Python, DevSecOps, MLOps, Linux, Networking, Monitoring

Industry

Telecommunications

Description
Develop and maintain infrastructure definitions using Terraform to enable reliable, automated, and repeatable deployments.
Collaborate with cross-functional teams to incorporate IaC principles into CI/CD pipelines, accelerating feature releases and minimizing downtime.
Implement robust observability solutions (e.g., AWS CloudWatch, CloudTrail, AWS Config) to proactively detect and resolve performance bottlenecks.
a. Embed security best practices within the software development lifecycle, covering identity and access management (IAM), networking, VPC, encryption, and monitoring.
b. Ensure adherence to cloud compliance standards (SOC 2, HIPAA, GDPR, etc.), performing regular audits and vulnerability scans to maintain a robust security posture.
Provide operational support for AI/ML models running on AWS, collaborating with data science teams to optimize performance and reliability.
Performance & Cost Optimization
a. Manage and mentor a cross-functional SRE team, promoting a collaborative, results-driven environment and advancing professional growth.
b. Collaborate with product owners, development teams, and stakeholders to align SRE priorities with broader business objectives.

Bachelor's degree in Computer Science, Computer Engineering, or a related field.
Overall Software Development: 6+ years of professional experience in software development.
Site Reliability Engineering: 3+ years of dedicated SRE experience with a primary focus on AWS cloud services and infrastructure.
Cloud Computing Concepts: Deep understanding of virtualization, networking, and storage in public cloud environments.
AWS Proficiency: Demonstrated ability to manage, operate, and secure AWS services (e.g., IAM, S3, EKS, ECS, Fargate, App Runner, Redshift, SNS, SQS, EventBridge, Athena, SageMaker, Aurora, DynamoDB, Cognito, API Gateway).
AWS for AI/ML: Hands-on support of AI/ML model operations on AWS, collaborating with data science teams and optimizing ML workloads.
Kubernetes & Container Management: Proven experience with Kubernetes (preferably EKS) for container orchestration, including deploying and maintaining production workloads.
Helm Package Management: Skilled in creating and managing Helm charts for Kubernetes-based applications.
IaC Frameworks: Proficiency in Terraform and Burrito (if applicable), ensuring production-grade, scalable infrastructure definitions.
Scripting & Automation: Advanced skills in Python (including AWS SDK/boto3), Bash, and/or PowerShell for automating cloud operations.
DevSecOps & GitOps: Hands-on experience integrating security best practices into CI/CD pipelines, leveraging GitOps tools (Argo CD, Flux) for declarative deployments.
MLOps: Working knowledge of machine learning lifecycle management, ensuring robust and efficient AI/ML model deployments.
Linux Administration: Strong background in Linux system management, performance tuning, and troubleshooting.
Networking: Expertise in VPNs, firewalls, routing, switching, DNS, load balancers, and related security considerations.
Monitoring & Observability: Proficiency with one or more monitoring solutions (Datadog, Prometheus, Grafana, CloudWatch) to drive proactive incident response.
Security & Compliance: In-depth familiarity with SOC 2, HIPAA, GDPR, and best practices around IAM, encryption, and network segmentation.
Problem-Solving & Communication: Demonstrated strength in diagnosing complex technical issues and effectively communicating solutions to varied stakeholders.

Other Cloud Environments: Exposure to Azure or further GCP services beyond AI/ML is beneficial.
Advanced Programming/Scripting: Experience in Python, Go or other modern languages is a plus.
Team Leadership: Demonstrated success in building and leading cross-functional teams, including performance management and strategic planning.
AWS Certifications: AWS Certified Solutions Architect (Associate/Professional), AWS Certified DevOps Engineer - Professional, or other relevant certifications. Additional certifications in GCP, Azure, or security (CISSP, CISM) are considered advantageous.

Be inspired by the needs of fast-changing environments. The chance to use your skills and imagination to push the boundaries of what's possible. To build solutions never seen before to some of the world's toughest problems. You'll be challenged, but you won't be alone. You'll be joining a team of diverse innovators, all driven to go beyond the status quo to craft what comes next.

We truly believe that by collaborating with people with different experiences we drive innovation, which is essential for our future growth.

Primary country and city: Egypt (EG) || Cairo
Req ID: 769986
Responsibilities
Develop and maintain infrastructure definitions using Terraform and collaborate with cross-functional teams to enhance CI/CD pipelines. Implement observability solutions and ensure security best practices while providing operational support for AI/ML models on AWS.