Senior Site Reliability Engineer

at Publishing.com

Remote, Oregon, USA

Start Date: Immediate
Expiry Date: 23 Dec, 2024
Salary: Not Specified
Posted On: 29 Sep, 2024
Experience: N/A
Skills: Addition, Scripting Languages, Bash, Operational Efficiency, AWS, Scalability, Security, Incident Response, Reliability, Python, Collaboration, Communication Skills, Zapier, Operational Excellence, Scripting, Automation, Kubernetes, Jenkins, HubSpot, Heroku, Infrastructure
Telecommute: No
Sponsor Visa: No

Description:

COMPANY SUMMARY

Publishing.com empowers individuals from all walks of life to generate meaningful income streams through book publishing. As a leading online education platform, we specialize in guiding our students through the processes of writing, publishing, and selling books and audiobooks on major platforms like Amazon and Audible. We are thrilled to announce that Publishing.com has been recognized as the 19th fastest-growing private company in America for 2023, according to the prestigious Inc. 5000 list. Over the past two years, we’ve experienced an incredible 30% year-over-year growth and expanded our team by 500%. Recently, we hit a major milestone by helping 60,000+ students through our programs.
Our mission is to become the premier destination for all publishing-related needs. In line with this vision, we are excited to announce the launch of our latest innovation, Publishing.ai, software designed to further revolutionize the publishing industry. This year marks a significant milestone in our journey toward that goal, as we continue to expand our offerings and support our community of publishers.

REQUIREMENTS

  • Operational Expertise: Strong experience working closely with IT Ops teams to manage infrastructure and operational workflows.
  • Heroku: Experience managing services on Heroku, including scaling, performance optimization, and troubleshooting.
  • Pulumi: Hands-on experience using Pulumi or similar IaC tools to automate cloud infrastructure.
  • Sentry: Expertise in using Sentry for monitoring and alerting, ensuring prompt detection and resolution of system issues.
  • Cloud Platforms: Deep knowledge of cloud platforms such as AWS, Azure, or GCP, with a focus on reliability, security, and scalability.
  • CI/CD Expertise: Strong experience building and managing CI/CD pipelines using tools like Jenkins, GitLab CI, or similar.
  • Automation & Scripting: Proficiency in scripting languages such as Python or Bash to automate tasks and improve operational efficiency.
  • Monitoring & Observability: Experience with observability tools (Sentry, Prometheus, Grafana, Datadog) to ensure system reliability.
  • Collaboration: Excellent communication skills with the ability to collaborate cross-functionally with IT Ops, development, and other teams.
  • Security & Compliance: Strong understanding of cloud security best practices and experience with compliance frameworks such as SOC 2 and GDPR.

PREFERRED SKILLS

  • Incident Response: Experience leading incident response efforts and building automated incident management systems.
  • Terraform or Other IaC Tools: Experience with Terraform or other Infrastructure as Code tools in addition to Pulumi.
  • Kubernetes & Docker: Knowledge of containerization and container orchestration tools such as Docker and Kubernetes.
  • Zapier, Webflow, HubSpot, and other no-code and low-code platforms: As a business, we use various low-code and no-code solutions. Experience monitoring and operating such systems, and contributing to their reliability and operational excellence, is highly desirable.

Responsibilities:

ABOUT THE ROLE

As our first Site Reliability Engineer, you’ll be instrumental in building and maintaining the infrastructure and operational workflows that power our business. You will collaborate closely with the IT team to ensure the reliability, security, and performance of our cloud infrastructure and our website. Additionally, you will work with our software engineers to support their DevOps needs and lead key initiatives such as defining SLAs, SLIs, and SLOs. You will also implement automation, monitoring, and incident management systems to ensure smooth, scalable operations and continuously drive improvements in system reliability and performance.

RESPONSIBILITIES

  • Build highly scalable web applications
  • Propose, design, and implement scalable solutions to address our ever-growing ambitious marketing initiatives
  • Collaborate with other engineers by participating in design reviews and code reviews
  • Work closely with design, product, sales, and marketing teams to gather requirements and identify opportunities for improvement
  • Employ the most recent software development and deployment techniques
  • Implement data-oriented solutions for improving user experience


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - Application Programming / Maintenance

Software Engineering

Graduate

Proficient

1

Remote, USA