Staff Software Engineer - Platform Architecture at b.well Connected Health
Full Time


Start Date

Immediate

Expiry Date

21 Sep, 26

Salary

$220,000

Posted On

23 Jun, 26

Experience

10 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Distributed Systems, Kafka, GraphQL, REST API, FHIR, Python, TypeScript, Java, Spring Boot, MongoDB, ClickHouse, AWS, Kubernetes, Event-Driven Architecture, System Design, AI Architecture

Industry

Hospitals and Health Care

Description
Staff Software Engineer - Platform Architecture
Remote (US) · Full-time

Company Overview
b.well is solving healthcare’s fragmentation problem with our FHIR-based health data management platform. The platform connects data from EHRs, wearables, portals, and other sources, while our intelligence engine personalizes the consumer experience. By simplifying the complex healthcare ecosystem, we make it easy and convenient for consumers to engage and take action, whether that means scheduling care, setting reminders, or accessing health data. For our clients, this means better health outcomes, operational efficiency, and stronger consumer engagement.

Position Overview
The Architecture team defines how that platform is designed and built. We set the patterns, contracts, and standards the rest of engineering builds on, run technical design reviews, and take on the hardest, most cross-cutting systems ourselves. We're hiring a Staff Software Engineer to own work like event-driven data integration on Kafka, large-scale distributed processing, and the API and federated GraphQL layer behind our SDKs, and to partner on making AI a dependable part of the platform.

This is a hands-on role. You design these systems, you build them, and you help operate them in production. You work horizontally across teams rather than inside a single product squad, so the patterns and services you create show up everywhere. The hard problems here are about scale, correctness, distributed-systems behavior, and the real-world messiness of healthcare data.

What you'll do:
* Design and build event-driven distributed systems on Kafka: orchestrator and worker services, schema and contract governance (AsyncAPI, CloudEvents), idempotent consumers, sagas and event choreography, and config-driven pipelines that onboard new data sources without code changes.
* Own API architecture at scale across REST and a federated GraphQL gateway (WunderGraph Cosmo) that powers our SDKs, including schema composition, versioning, contract evolution, and FHIR conformance.
* Partner on AI architecture and enablement: help make LLMs and agents dependable parts of the system (retrieval, evaluation, guardrails, and clear boundaries around protected health data), working alongside the dedicated Applied AI Engineering lead who owns that area.
* Shape FHIR and health-data architecture: terminology services, resource modeling, clinical quality measurement (DQM and CQL), high-volume document storage, and query performance across MongoDB and ClickHouse.
* Help run the technical design review process and document the decisions and standards, so significant designs are reviewed against consistent criteria and the platform stays coherent as it grows.
* Stay close to the code and to production: write real software, build proofs of concept that prove out a pattern before teams adopt it, and use observability and operational data to confirm decisions hold up and to debug incidents when they don't.
* Mentor senior engineers and tech leads on distributed-systems thinking, system design, and the trade-offs behind good architecture.

What we're looking for:
* Deep expertise in large-scale distributed systems and scalable design. You've built and operated systems that handle high volume and real failure modes, and you're fluent in partitioning, back-pressure, idempotency, eventual consistency, and the trade-offs between consistency and availability.
* Strong event-driven architecture experience, hands-on with Kafka: topic and partition design, schema evolution and governance, consumer groups, idempotency, and the operational side of running it in production.
* Strong API design across REST and GraphQL. We run a federated GraphQL gateway, so schema composition, versioning, and contract evolution matter here as much as clean endpoint design.
* Exposure to AI as a component of a system is a plus: LLMs or agents and the practical concerns around them (retrieval, evaluation, guardrails, cost and latency). Deep AI architecture ownership sits with a dedicated Applied AI Engineering hire, so it isn't a primary requirement here. We do expect you to use AI tools like Claude fluently in your own work.
* A well-rounded, hands-on engineer who goes deep in at least one strongly-typed language and moves comfortably across a polyglot stack. You have strong software-design instincts and care about clean interfaces, sensible abstractions, and keeping services loosely coupled. Our backend spans Python (data and AI), TypeScript/Node.js (which runs our FHIR server), and Java/Spring Boot (FHIR processing and backend services), with Kafka, our federated GraphQL gateway, Databricks, ClickHouse, MongoDB, and AWS around it.
* Sound data and storage instincts: data modeling, caching, query optimization, and how batch and streaming workloads (for example on Spark or Databricks) fit together with operational systems.
* Cloud-native and security fundamentals: Kubernetes and AWS, identity and auth (OIDC, token exchange), and multi-tenant isolation in a regulated environment that handles protected health data.
* You can walk someone through a system you've designed and operated end to end: how it works, where it breaks, and the trade-offs you made.
* Healthcare or FHIR experience is a strong plus and something you'll go deep in here. If you haven't worked in the domain yet, that's fine; it's learnable, and we'll help you get there.

What success looks like after 12-18 months:
* The platform's event-driven and API/GraphQL foundations are solid, well-governed, and adopted across teams.
* The platform's AI integration points are dependable and safe to build on, in partnership with the Applied AI Engineering lead.
* FHIR and health-data architecture (terminology, quality measurement, storage) performs well and scales to millions of patients.
* Technical design reviews run efficiently, with clear decisions and standards that teams rely on.
* Engineering moves faster because teams build on proven, reusable platform patterns instead of reinventing them.

Compensation & benefits
The target salary range for this position is $175,000 - $220,000 and is part of a competitive total rewards package including stock options, benefits, and incentive pay for eligible roles. Individual pay may vary from the target range and is determined by a number of factors including experience, location, internal pay equity, and other relevant business considerations. We review all employee pay and compensation programs annually at minimum to ensure competitive and fair pay.

About applying
Data shows that women, people of color, and other underrepresented groups may be less likely to apply for jobs unless they believe they are a perfect match. But b.well holds diversity amongst its key values, and we have a strong commitment to building our workforce and products through that lens. You don't have to check every box in this job description to be a great fit for the role. If you're excited about this position and the prospect of working for b.well, please apply. If it turns out this role isn't for you, there may be other openings that could align with your experience and expertise.

We are committed to an inclusive and diverse b.well. We are an equal opportunity employer. We do not discriminate based on race, ethnicity, color, ancestry, national origin, religion, sex, sexual orientation, gender identity, age, disability, veteran, genetic information, marital status or any other legally protected status.

Responsibilities
Design and build event-driven distributed systems and API architectures using Kafka and federated GraphQL. Lead technical design reviews and mentor senior engineers to ensure platform coherence and scalability.