Site Reliability Engineer
at KUBRA
Mississauga, ON, Canada
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 25 Nov, 2024 | Not Specified | 30 Aug, 2024 | N/A | Computer Science, Scripting, Python, Go, Information Technology, Automation Tools, Logging, Communication Skills, Honeycomb, Containerization, Infrastructure | No | No |
Description:
Are you passionate about transforming and optimizing complex infrastructures? Do you thrive on solving challenging technical problems and ensuring high availability, security, and performance in cloud environments?
At KUBRA, we’re seeking an enthusiastic and skilled Site Reliability Engineer to join our dynamic team. You’ll work with cutting-edge technologies like Kubernetes, AWS, Terraform, and CI/CD tools to build and maintain robust, scalable, and secure systems.
If you’re someone who loves to innovate, excels in a collaborative environment, and is dedicated to achieving excellence, this is the perfect opportunity for you to make a significant impact.
This is a hybrid opportunity in Mississauga, ON.
WHAT SKILLS DO YOU NEED?
- Bachelor’s degree in Computer Science, Engineering, Information Technology, or equivalent experience.
- AWS Certifications (Solutions Architect, SysOps Administrator, DevOps Engineer) are desirable.
- Kubernetes Certifications (CKA, CKS, CKAD, KCNA) are desirable.
- Experience with a programming language such as Go or Python, and with shell scripting.
- Proficient with Terraform and infrastructure as code principles.
- Demonstrated proficiency in public cloud environments, particularly AWS.
- Hands-on experience with Kubernetes management within AWS EKS.
- Experience with CI/CD automation tools such as CircleCI and ArgoCD.
- Experience with monitoring and logging in cloud environments, using tools such as Prometheus, Grafana, OpenTelemetry, CloudWatch, and Honeycomb.
- In-depth understanding of containerization, microservice architecture, and related technologies.
- Strong communication skills and ability to facilitate effective technical problem-solving.
Responsibilities:
- Ensure that infrastructure and applications have high-quality Service Level Agreements (SLAs) and Service Level Objectives (SLOs) that are measured and adhered to.
- Ensure KUBRA maintains well-documented standards and best practices so that existing and new services are built for high availability and security.
- Ensure appropriate automation and observability exist to achieve a low and continuously improving mean time to recovery (MTTR) for service-impacting incidents.
- Ensure that any incidents are thoroughly investigated and documented appropriately, along with the corresponding problem records with corrective actions.
- Participate in the Architectural Review Process for new and existing services being built for the KUBRA HQ platform, ensuring compliance with standards and best practices for high availability, observability, security, and cost efficiency.
- Work closely with Development, Infrastructure, and Operations teams to lead root cause analysis for major incidents, including leading senior stakeholder communication, driving problem-solving, and debugging with best-practice techniques.
- Design and conduct fault injection experiments to identify potential weak points in high-availability architecture and work with Platform Engineering and Software Engineering teams to remediate any findings.
- Perform periodic audits of applications and infrastructure to ensure compliance with standards and identify necessary remediation.
REQUIREMENT SUMMARY
Min: N/A, Max: 5.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
Graduate
Computer Science, Engineering, Information Technology, or equivalent experience
Proficient
1
Mississauga, ON, Canada