Software Engineer III - Java, Kafka, Kubernetes at JPMC
Mumbai, Maharashtra, India
Full Time


Start Date

Immediate

Expiry Date

23 Sep, 26

Salary

0.0

Posted On

25 Jun, 26

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Java, Spring Boot, Apache Kafka, Kubernetes, Oracle SQL, Elasticsearch, Linux, CI/CD, Incident Management, Root Cause Analysis, Python, Shell Scripting, GCP, AI-assisted Development, Microservices, Observability

Industry

Financial Services

Description
We have an exciting and rewarding opportunity for you to take your software engineering career to the next level.

As a Software Engineer III at JPMorgan Chase within the Commercial & Investment Bank, you serve as a seasoned member of an agile team to design and deliver trusted market-leading technology products in a secure, stable, and scalable way. You are responsible for carrying out critical technology solutions across multiple technical areas within various business functions in support of the firm’s business objectives.

Job responsibilities

* Provide Level 3 production support for critical Java microservices and batch/streaming workloads.
* Own major incident (P1/P2) triage, troubleshooting, mitigation, and restoration within SLA.
* Perform deep Root Cause Analysis (RCA), including log/metric/trace analysis; deliver corrective and preventive actions.
* Support and tune Kafka-based event streaming: consumer lag issues, rebalancing, partition strategy, retries/DLQ patterns, idempotency, ordering concerns.
* Support Spring Boot services: thread dumps, heap analysis, GC behavior, connection pooling, dependency issues.
* Diagnose and resolve database (Oracle) issues: slow queries, locks/deadlocks, indexing, execution plans, connection pool saturation.
* Support Elasticsearch (Gaia): index health, query performance, mapping issues, shard/replica allocation, ingestion failures.
* Collaborate with development teams on bug fixes, hotfix validation, release readiness, and post-deployment verification.
* Improve observability: enhance dashboards/alerts, log correlation, runbooks, and operational KPIs.
* Participate in on-call rotations, change management, and planned maintenance activities.
* Drive problem management: identify recurring issues, reduce noise, and implement automation/self-healing where feasible.
* Leverage enterprise-authorized AI coding assist tools within the work environment to improve code quality, delivery speed, and productivity across complex deliverables (e.g., code generation/refactoring, unit test creation, documentation), while validating outputs through peer review, automated testing, and secure coding standards; contribute learnings and reusable patterns to improve broader team effectiveness.
* Apply knowledge of tools within the Software Development Life Cycle toolchain, including enterprise-authorized AI-assisted development and automation capabilities, to improve the value realized by automation.

Required qualifications, capabilities, and skills

* Formal training or certification on software engineering concepts and 3+ years applied experience.
* Strong experience in Java production troubleshooting (Java 11+).
* Strong hands-on experience with Spring Boot (REST APIs, configuration, actuator/metrics, dependency management).
* Experience supporting Apache Kafka in production (topics, partitions, consumer groups, offsets, retries/DLQ, schema/versioning concepts).
* Strong knowledge of Oracle (SQL tuning, indexing, locking, performance troubleshooting).
* Experience supporting Elasticsearch (cluster health, index lifecycle, query DSL basics, performance diagnosis).
* Proven ability with incident management and RCA (problem statements, timeline, 5-Whys, action items).
* Strong Linux skills: process inspection, file/log handling, networking basics (netstat/ss, DNS, TLS concepts).
* CI/CD and release support exposure (any of): Jenkins, GitLab CI, ArgoCD, etc.
* Scripting/automation: Shell/Python basics for diagnostics and operational tooling.
* Experience with Kubernetes and cloud platform operations (GKE/GCP or the “GKP” platform): pods, deployments, configmaps/secrets, scaling, resource limits, troubleshooting restarts/OOM.
* Hands-on experience using enterprise-authorized AI-assisted software development tools within the work environment (e.g., for coding, test creation, troubleshooting, or documentation), with demonstrated ability to critically evaluate, validate, and refine AI-generated outputs for correctness, performance, and security.
* Understanding of responsible AI use in engineering workflows, including data sensitivity considerations, secure handling of inputs/outputs, and adherence to resiliency and security expectations; ability to guide peers on safe and effective usage within team practices.

Preferred qualifications, capabilities, and skills

* Experience with schema registries (e.g., Avro/Protobuf concepts) and message compatibility strategies.
* Performance testing and capacity planning exposure (load patterns, bottleneck identification).
* Knowledge of ITIL processes (Incident/Problem/Change) and service management tools (e.g., ServiceNow).
* 3+ years of experience in L3/production support for distributed systems.
* Familiarity with security basics: secrets handling, certificate rotation, least privilege, vulnerability remediation support.
* Experience in regulated/high-availability environments (financial services is a plus).
* Familiarity with observability tools (any of): Splunk/ELK, Prometheus, Grafana, AppDynamics/Dynatrace, OpenTelemetry.

Responsibilities
Provide Level 3 production support for Java microservices and Kafka-based event streaming workloads. Responsible for incident triage, root cause analysis, and improving system observability and automation.