Senior DevOps Engineer - AWS

at 3Pillar

Home Office, Iowa, Czech

Start Date: Immediate
Expiry Date: 10 Feb, 2025
Salary: Not Specified
Posted On: 11 Nov, 2024
Experience: N/A
Skills: Amazon Web Services, Software, Splunk, C++, Ownership, Distributed Systems, Operations, Automation, C, Time Management, Drive, Access Control, Java, Difficult Situations, Communication Skills, Python, DevOps, Computer Science
Telecommute: No
Sponsor Visa: No

Description:

🚀 Join Our Mission at 3Pillar: Elevate Your Impact! 🚀
As a Senior DevOps Engineer, you are responsible for ensuring that our platform is stable and healthy. We break down barriers to run our products by fostering developer-run ownership and empowering developers to build resilient products. We support our developers during the application build phase in software-run principles that include operational design, automation, capacity planning, and monitoring that leads to fault-tolerant, scalable products.

DESIRED CAPABILITIES:

  • Strong attention to detail
  • Excellent communication skills
  • Ability to work well in a team
  • Analytical and problem-solving skills
  • Time management and organizational skills
  • Ability to learn quickly
  • Adaptability and flexibility
  • Proven ability to lead and mentor junior members of the QA team.

MINIMUM QUALIFICATIONS:

  • Bachelor’s degree in computer science, software engineering, or a similar field.
  • Experience in Splunk and SignalFx
  • Experience with Amazon Web Services including RDS
  • Relevant data DevOps, SRE, or general systems engineering experience.
  • Experience in managing large production platforms.
  • Experience architecting and implementing data governance processes and tooling (data catalogues, lineage tools, role-based access control, PII handling)
  • Strong coding ability in Python or other languages such as Java, C#, Golang, C, C++, Perl, Ruby, etc.

ADDITIONAL EXPERIENCE DESIRED:

  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
  • Ability to help debug and optimize code and automate routine tasks.
  • Ability to support many different stakeholders. Experience in dealing with difficult situations and making decisions with a sense of urgency is needed.
  • Interest in designing, analyzing and troubleshooting large-scale distributed systems.
  • Appetite for change and pushing the boundaries of what can be done with automation.
  • Experience in working across development, operations, and product teams to prioritize needs and build relationships is a must.
  • Good handle on the change management and release management aspects of software.

Responsibilities:

  • Plan, manage, and oversee all aspects of the production environment for all merchant loyalty use cases
  • Define strategies for all facets of observability
  • Identify areas of improvement in production
  • Ability to understand MTTR, SLO, SLI definitions and apply them to services.
  • Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.
  • Ensure reliable, fault-tolerant, efficiently scalable and cost-effective services and infrastructure.
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
  • Practice sustainable incident response and blameless postmortems.
  • Ensure that batch production scheduling and processes are accurate and timely.
  • Create and execute queries against big data platforms and relational data tables to identify process issues or to perform mass updates (preferred).
  • Ability to isolate problems between hardware and software.
  • Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns
  • Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
  • Work with a global team spread across tech hubs in multiple geographies and time zones


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - System Programming

Software Engineering

Graduate

Computer science, software engineering, or a similar field

Proficient

1

Home Office, Czech