Performance Tester – GenAI at Orion Systems Integrators LLC United Kingdom
Chennai, Tamil Nadu, India
Full Time


Start Date

Immediate

Expiry Date

04 Jun, 26

Salary

0.0

Posted On

06 Mar, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Performance Testing, Generative AI, LLM, JMeter, LoadRunner, k6, Gatling, Dynatrace, AppDynamics, Azure Monitor, AWS CloudWatch, Azure DevOps, Docker, Kubernetes, Python, Java

Industry

IT Services and IT Consulting

Description
Role: Performance Test Engineer – Generative AI
Experience: 5+ years (with hands-on performance testing in GenAI / LLM-based applications)

Role Overview:
We are seeking a skilled and detail-oriented Performance Tester with strong experience in Generative AI (GenAI) projects. The ideal candidate will be responsible for ensuring scalability, reliability, and optimal performance of AI-powered applications, including Large Language Model (LLM) integrations, conversational AI systems, and Retrieval-Augmented Generation (RAG) pipelines. This role requires expertise in performance engineering, cloud platforms, and testing of AI/ML workloads in production environments.

Key Responsibilities

• Performance Strategy & Planning: Define and implement performance testing strategies for GenAI and LLM-based applications. Identify performance bottlenecks across APIs, model inference layers, vector databases, and cloud infrastructure. Establish performance benchmarks, SLAs, and scalability targets for AI-driven systems.
• Performance Testing & Engineering: Design, develop, and execute load, stress, spike, endurance, and scalability tests for GenAI applications. Perform performance testing of LLM-powered APIs (e.g., ChatGPT-like applications) hosted on cloud platforms. Validate latency, throughput, token usage, concurrency handling, and cost-performance trade-offs (a minimal load-test sketch follows this list). Conduct performance validation for RAG pipelines, including embedding generation and vector search. Analyze model inference time, GPU/CPU utilization, memory usage, and autoscaling behavior.
• Tools & Automation: Develop automated performance test scripts using tools such as JMeter, LoadRunner, k6, or Gatling. Monitor system performance using APM tools like Dynatrace, AppDynamics, Azure Monitor, or AWS CloudWatch. Integrate performance testing into CI/CD pipelines using Azure DevOps or similar platforms. Create dashboards and reports for performance metrics and trend analysis.
• Cloud & Infrastructure Testing: Conduct performance testing on AI solutions deployed on Azure, AWS, or GCP. Validate autoscaling configurations, containerized deployments (Docker, Kubernetes), and serverless architectures. Assess performance of vector databases such as Chroma, Pinecone, Weaviate, or FAISS under load.
• Collaboration & Optimization: Collaborate with AI engineers, data scientists, DevOps, and architects to optimize model serving and API performance. Recommend improvements in prompt engineering, caching strategies, batching, and parallelization. Support capacity planning and cost optimization for LLM-based applications.
• Governance & Reporting: Document performance test results, bottlenecks, and optimization recommendations. Ensure compliance with security and data privacy standards in performance environments. Present findings to stakeholders and provide actionable insights.
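To make the latency, throughput, and token-usage validation above concrete, here is a minimal Python sketch of a concurrent load test against an LLM chat API. The endpoint URL, payload shape, and the usage.total_tokens field are assumptions standing in for whatever the system under test exposes; in a real suite, tools like JMeter or k6 would typically carry this workload.

```python
"""Minimal concurrent load-test sketch for an LLM chat API.

Assumptions (not from this posting): an OpenAI-style JSON endpoint at
CHAT_URL whose response includes usage.total_tokens. Adapt the URL,
payload, and auth headers to the system under test.
"""
import asyncio
import statistics
import time

import aiohttp

CHAT_URL = "http://localhost:8000/v1/chat/completions"  # hypothetical endpoint
CONCURRENCY = 20          # simultaneous virtual users
REQUESTS_PER_USER = 10    # requests sent by each virtual user

PAYLOAD = {
    "model": "example-model",  # placeholder model name
    "messages": [{"role": "user", "content": "Summarise load testing in one line."}],
}


async def virtual_user(session, latencies, tokens):
    for _ in range(REQUESTS_PER_USER):
        start = time.perf_counter()
        async with session.post(CHAT_URL, json=PAYLOAD) as resp:
            body = await resp.json()
        latencies.append(time.perf_counter() - start)
        # Token accounting works only if the API reports a usage block.
        tokens.append(body.get("usage", {}).get("total_tokens", 0))


async def main():
    latencies, tokens = [], []
    wall_start = time.perf_counter()
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            *(virtual_user(session, latencies, tokens) for _ in range(CONCURRENCY))
        )
    wall = time.perf_counter() - wall_start
    latencies.sort()
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    print(
        f"requests={len(latencies)} throughput={len(latencies) / wall:.1f} req/s "
        f"p50={statistics.median(latencies):.3f}s p95={p95:.3f}s "
        f"total_tokens={sum(tokens)}"
    )


asyncio.run(main())
```

In practice, the same percentile and throughput figures would be exported to dashboards in Grafana, Azure Monitor, or AWS CloudWatch for trend analysis rather than printed to the console.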
Key Requirements

• Technical Skills: 5+ years of experience in performance testing and engineering. Hands-on experience in performance testing GenAI / LLM-based applications. Experience working with LLM platforms such as OpenAI GPT models, Gemini, Llama 2, Claude, or Grok. Understanding of concepts like tokenization, embeddings, vector search, and RAG architecture. Experience testing AI services hosted on Azure AI Services, Azure ML, AWS Bedrock, or Google Vertex AI. Proficiency in performance testing tools such as JMeter, LoadRunner, k6, or Gatling. Knowledge of API testing tools like Postman or REST Assured. Familiarity with monitoring tools such as Azure Monitor, AWS CloudWatch, Grafana, or Prometheus. Experience with containerization (Docker) and orchestration (Kubernetes). Basic scripting knowledge in Python or Java for test automation. Understanding of CI/CD pipelines and DevOps practices.
• GenAI-Specific Knowledge: Experience testing conversational AI applications and chatbot performance. Knowledge of inference latency optimization techniques for LLMs. Understanding of GPU-based workloads and performance considerations. Exposure to agentic frameworks like LangChain, Semantic Kernel, AutoGen, or CrewAI (preferred). Experience validating performance of vector databases such as Chroma, Pinecone, Weaviate, or FAISS (a vector-search timing sketch follows the Qualifications below).

Qualifications

Bachelor's degree in Computer Science, Information Technology, or a related field. 5+ years of experience in performance testing, with at least 2 years in AI/ML or GenAI projects. Experience in testing cloud-native, microservices-based applications. Strong analytical and troubleshooting skills. Excellent communication and stakeholder management skills.
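As one way to exercise the vector-database requirement above, the following sketch times FAISS searches under thread-level concurrency (FAISS releases the GIL during search, so threads give real parallelism). The index size, dimensionality, top-k, and thread count are illustrative assumptions, not figures from this posting.

```python
"""Sketch: measure FAISS vector-search latency under concurrent load.

Index size, dimensionality, top-k, and thread count are illustrative
assumptions chosen for a quick local run.
"""
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import faiss
import numpy as np

DIM, N_VECTORS, N_QUERIES, TOP_K, THREADS = 384, 50_000, 500, 5, 8

rng = np.random.default_rng(0)
index = faiss.IndexFlatL2(DIM)  # exact L2 index, no training required
index.add(rng.random((N_VECTORS, DIM), dtype=np.float32))
queries = rng.random((N_QUERIES, DIM), dtype=np.float32)


def timed_search(q):
    """Run one single-query search and return its latency in seconds."""
    start = time.perf_counter()
    index.search(q.reshape(1, -1), TOP_K)
    return time.perf_counter() - start


with ThreadPoolExecutor(max_workers=THREADS) as pool:
    latencies = sorted(pool.map(timed_search, queries))

p95 = latencies[int(0.95 * (len(latencies) - 1))]
print(
    f"queries={len(latencies)} "
    f"p50={statistics.median(latencies) * 1e3:.2f}ms p95={p95 * 1e3:.2f}ms"
)
```

Against a hosted store such as Pinecone or Weaviate, the same harness would swap the in-process search call for the service's query API, so the measured latency would then include network round-trips.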

How To Apply:

If you would like to apply to this job directly from the source, please click here

Responsibilities
The role involves defining and executing performance testing strategies for Generative AI and LLM-based applications, focusing on identifying bottlenecks across APIs, model inference layers, and cloud infrastructure. Key tasks include designing and running various load tests, validating performance metrics like latency and throughput, and analyzing model inference time and resource utilization.