Site Reliability Engineer

at Tecsys Inc

Montréal, QC, Canada

Start Date: Immediate
Expiry Date: 26 Jul, 2024
Salary: USD 51276 Annual
Posted On: 30 Apr, 2024
Experience: N/A
Skills: Jenkins, AWS, Computer Science, Languages, Geography, Ownership, Communication Skills, Platform Development, Teamwork, Large Scale Systems, Teams, Creativity, IT, Orchestration, Iterative Design, GitLab, Integration, Storage, Network Technologies, Azure
Telecommute: No
Sponsor Visa: No

Description:

Having recognized the benefits of remote work for employee wellbeing and the environment, including higher morale, greater productivity, and reduced commuting, we are proud to be a digital-first company. The technologies and programs in which we have invested provide a strong foundation for this. Our digital-first work environment, together with our conveniently located offices and collaborative workspaces, gives our team the freedom and flexibility to work in the way that makes them most productive.

ABOUT US

Tecsys is a fast-growing innovator offering supply chain solutions to organizations ranging from industry-leading healthcare systems, hospitals, and pharmacy businesses to distributors, retailers, and 3PLs. We work with industry leaders to transform their supply chains through technology. If you thrive on tackling difficult challenges, value continuous learning, and want to work in a respectful, comfortable, and dynamic environment, this could be a good fit for you!

REQUIREMENTS:

  • Bachelor’s degree in computer science or related technical discipline.
  • At least 5 years’ experience in systems engineering, with demonstrable technical experience in new platform development, orchestration, product ownership, and iterative design and deployment.
  • Experience designing and deploying large-scale systems, multi-vendor platforms, and globally distributed infrastructure.
  • Strong knowledge of system design; high performance computing; file, block, and storage technologies; integration of compute, storage, and network technologies to deliver cohesive infrastructure solutions.
  • A strong understanding of full-stack automation, with examples of projects you have delivered using it. Our scale will require a lot of automation: as we grow, we aim to rely less on manual intervention, working with both internal and open-source tools to automate day-to-day activities.
  • Self-organize, collaborate, and manage efforts with peers and teams across responsibility areas, languages, geography, and time zones.
  • Be a self-starter, curious, and not afraid to ask questions and challenge the way things are done today.
  • See a problem or opportunity, take ownership and act on it independently.
  • Knowledge of Datadog preferred (or at least a similar or equivalent product).
  • Knowledge of Rapid7 Insight preferred (or at least a similar or equivalent product).
  • Knowledge and experience of AWS or Azure required.
  • Basic knowledge of Java- or .NET-based development required.
  • Knowledge of GitLab (enterprise license) preferred; at minimum, Jenkins experience is required.
  • Experience at a SaaS company is a strong asset.
  • Strong English communication skills, both written and spoken, are essential for effective correspondence with customers, business partners and colleagues beyond the province of Quebec.

ADDITIONAL REQUIREMENTS:

  • Escalation on-call rotation
  • Occasional travel (quarterly offsites, conferences – less than 10%)

At Tecsys, we value creativity, innovation, and teamwork. Our employees enjoy a supportive work environment, competitive compensation packages, and opportunities for career growth and advancement.

Tecsys is an equal opportunity employer. Accommodation is available for applicants selected for an interview.

NB: To apply for this position, you must be a Canadian citizen or a permanent resident of Canada, or hold a valid Canadian work permit.

Responsibilities:

ABOUT THE ROLE

We are looking for a Site Reliability Engineer to work within our “Network and Security Operations Center” (NOC) department. Our NOC team aims to improve the reliability and uptime of our platform and applications in a data-driven way, in support of the needs of internal and external customers.

YOUR RESPONSIBILITIES

  • Collaborate with other Engineering teams to support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
  • Develop tools & automation on top of Azure & AWS to continuously reduce the need for manual intervention.
  • Scale systems sustainably through automation and evolve systems by pushing for changes that improve reliability and velocity.
  • Be on-call.
  • Practice sustainable incident response and blameless postmortems.
  • Implement automated solutions for continuous integration and delivery (CI/CD).
  • Implement monitoring, logging, alerting, and SLA reporting.
  • Implement service monitoring dashboards displaying key metrics.
  • Create and maintain technical documentation.
  • Apply SRE best practices.
  • Take command of high-severity incidents and facilitate their resolution.
  • Provide support for our planning and deployment teams to enable stability, predictability, and scale in our continued growth.
  • Collaborate with members of the Platform Engineering team to implement and support far-reaching strategic efforts, provide constructive feedback, and foster a collaborative environment.
  • Work cross-functionally with internal teams and vendors to manage our growth around the globe, with a strong focus on maintaining the high level of performance, availability, and reliability for our users.


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - System Programming

Software Engineering

Graduate

Computer science or related technical discipline

Proficient

1

Montréal, QC, Canada