Senior DevOps Engineer - AWS
at 3Pillar
Home Office, Iowa, Czech
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 10 Feb, 2025 | Not Specified | 11 Nov, 2024 | N/A | Amazon Web Services, Software, Splunk, C++, Ownership, Distributed Systems, Operations, Automation, C, Time Management, Drive, Access Control, Java, Difficult Situations, Communication Skills, Python, Devops, Computer Science | No | No |
Description:
🚀 Join Our Mission at 3Pillar: Elevate Your Impact! 🚀
As a Senior DevOps Engineer, you are responsible for ensuring that our platform is stable and healthy. We break down barriers to running our products by fostering developer-run ownership and empowering developers to build resilient products. During the application build phase, we support our developers in applying software-run principles, including operational design, automation, capacity planning, and monitoring, that lead to fault-tolerant, scalable products.
DESIRED CAPABILITIES:
- Strong attention to detail
- Excellent communication skills
- Ability to work well in a team
- Analytical and problem-solving skills
- Time management and organizational skills
- Ability to learn quickly
- Adaptability and flexibility
- Proven ability to lead and mentor junior members of the QA team.
MINIMUM QUALIFICATIONS:
- Bachelor’s degree in computer science, software engineering, or a similar field.
- Experience with Splunk and SignalFx
- Experience with Amazon Web Services including RDS
- Relevant data DevOps, SRE, or general systems engineering experience.
- Experience in managing large production platforms.
- Experience architecting and implementing data governance processes and tooling (data catalogues, lineage tools, role-based access control, PII handling)
- Strong coding ability in Python or other languages such as Java, C#, Golang, C, C++, Perl, or Ruby.
ADDITIONAL EXPERIENCE DESIRED:
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
- Ability to help debug and optimize code and automate routine tasks.
- Ability to support many different stakeholders. Experience in dealing with difficult situations and making decisions with a sense of urgency is needed.
- Interest in designing, analyzing and troubleshooting large-scale distributed systems.
- Appetite for change and pushing the boundaries of what can be done with automation.
- Experience in working across development, operations, and product teams to prioritize needs and build relationships is a must.
- Good handle on the change management and release management aspects of software.
Responsibilities:
- Plan, manage, and oversee all aspects of the production environment for all merchant loyalty use cases
- Define strategies for all facets of observability
- Identify areas of improvement in production
- Understand MTTR, SLO, and SLI definitions and apply them to services.
- Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.
- Ensure reliable, fault-tolerant, efficiently scalable and cost-effective services and infrastructure.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Practice sustainable incident response and blameless postmortems.
- Ensure that batch production scheduling and processes are accurate and timely.
- Create and execute queries against big data platforms and relational data tables to identify process issues or to perform mass updates (preferred).
- Isolate problems between hardware and software.
- Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns
- Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
- Work with a global team spread across tech hubs in multiple geographies and time zones
REQUIREMENT SUMMARY
Min: N/A, Max: 5.0 year(s)
Information Technology/IT
IT Software - System Programming
Software Engineering
Graduate
Computer science, software engineering, or a similar field
Proficient
1
Home Office, Czech