Sr Site Reliability Engineer

at IDeaS

Bloomington, MN 55437, USA

Start Date: Immediate
Expiry Date: 26 Sep, 2024
Salary: Not Specified
Posted On: 27 Jun, 2024
Experience: N/A
Skills: Interpersonal Skills, Infrastructure, Relationship Building, Software Development, Agile Methodologies, PostgreSQL, Web Technologies, MySQL, Operations, Computer Science, SQL Server, Relational Databases, Angular, Cloud Services
Telecommute: No
Sponsor Visa: No

Description:

Passionate people. Loyal clients. Leading solutions.
With a rich culture of creative collaboration and professional growth, IDeaS’ team members build successful careers with us.
IDeaS is proud to be a global powerhouse of innovation and excellence; challenge and reward. No matter where we’re working, our teams come together to create leading revenue management solutions that accelerate our clients’ growth through revenue optimization.

NOW WE JUST NEED YOU!

We are seeking a Senior Site Reliability Engineer at IDeaS, a SAS Company. You will play a pivotal role in ensuring the reliability, scalability, and performance of our revenue science software solutions. With a minimum of eight years of experience, you bring a wealth of knowledge and expertise in software development and infrastructure operations. You will serve as a go-to expert in ensuring the stability and efficiency of our systems, collaborating closely with cross-functional teams to address complex challenges. Your strong communication skills will be instrumental as you proactively build relationships and streamline processes to enhance system reliability and performance. You are persistent in the face of roadblocks, resolve them efficiently, and pull in others when necessary. You take the initiative to keep the production environments stable while building a backlog of work that reduces the recurrence of issues and supports long-term scalability. Our systems are data-intensive and require a strong focus on data and machine-learning pipelines.

Responsibilities:

  • Collaborate closely with our development and operations teams to design, implement, and maintain highly available, scalable, and resilient software solutions, with a particular focus on data and ML pipelines.
  • Utilize your expertise in cloud computing and microservices architecture to enhance the reliability and performance of our data-intensive systems.
  • Engage with stakeholders to understand system requirements and ensure that our solutions meet rigorous reliability and performance standards, especially in the context of data processing and machine learning.
  • Actively participate in project scoping, scheduling, and task tracking, identifying potential reliability issues and implementing solutions to address them within our data-centric environment.
  • Implement Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure and monitor the reliability and performance of our data and ML pipelines, ensuring that they meet agreed-upon targets.
  • Collaborate with the performance engineering team to design and implement performance regression test suites tailored to data and ML workloads, ensuring that system performance is continuously monitored and optimized in these critical areas.
  • Take ownership of the reliability and performance of our codebase, providing support to internal and external users as needed, particularly in the context of data processing and ML applications.
  • Collaborate closely with subject matter experts to gain domain-specific insights into data and ML pipelines and document system designs and configurations accordingly.
  • Utilize tools like Jira, Datadog, and GitHub to manage projects, track issues, and collaborate effectively with team members, with a focus on supporting data-intensive workflows.
  • Define success metrics and monitor system performance to ensure that our solutions meet or exceed reliability and performance targets, especially in the context of data processing and ML applications.
  • Proactively identify and address potential reliability issues before they impact system performance, with a particular emphasis on maintaining the integrity and efficiency of our data and ML pipelines.
  • Perform other duties as assigned.


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Computer Software/Engineering

IT Software - Other

Software Engineering

Graduate

Proficient

1

Bloomington, MN 55437, USA