Prompt Engineer (LLM Systems, Evals & Safety) at Supertech Group

New Delhi, Delhi, India

Full Time

Start Date

Immediate

Expiry Date

03 Jan, 26

Salary

0.0

Posted On

05 Oct, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Prompt Design, Evaluation Datasets, Automated Scoring, Scripting, Clear Writing, Safety Tooling, Red-Teaming Techniques, LangChain, LLM Orchestration, Vector Stores, A/B Testing, Analytics

Industry

Technology; Information and Internet

Description

Do you want to love what you do at work? Do you want to make a difference, have an impact, and transform people's lives? Do you want to work with a team that believes in disrupting the normal, boring, and average? If yes, then this is the job you are looking for.

webook.com is Saudi Arabia's #1 event ticketing and experience booking platform in terms of technology, features, agility, and revenue, serving some of the largest mega events in the Kingdom and surpassing 2 billion in sales. webook.com is part of the Supertech Group, which also includes UXBERT Labs, one of the best digital and user experience design agencies in the GCC, and Kafu Games, the largest esports tournament platform in MENA.

Role Overview

Design high-quality prompts, system instructions, and tooling that make our LLM features accurate, safe, and cost-effective. You'll own evaluation, prompt versioning, and continuous improvement.

Key Responsibilities

Author, refactor, and chain prompts (system/tool/policy) for varied tasks.
Create offline and online evaluation harnesses (rubrics, golden sets, metrics).
Build prompt libraries with versioning, A/B testing, and telemetry.
Reduce hallucinations via verification, constrained decoding, and tool use.
Implement safety measures: jailbreak and prompt-injection tests, content policy checks, and PII handling.
Partner with engineers to integrate prompts into production features.

Requirements

Demonstrated prompt design across multiple task types and models.
Experience building eval datasets and automated scoring (e.g., accuracy, faithfulness, utility, cost/latency).
Familiarity with retrieval-augmented generation concepts and tool/function calling.
Strong scripting skills (Python/TypeScript) for data prep, evals, and analysis.
Clear writing and the ability to translate business goals into measurable prompt specs.

Nice-to-Haves

Experience with LangChain/LLM orchestration, vector stores, and rerankers.
Knowledge of safety tooling and red-teaming techniques.
Experience with experiment platforms (feature flags, A/B tests) and analytics.

Responsibilities

Design high-quality prompts and system instructions for LLM features. Own evaluation, prompt versioning, and continuous improvement.