Linguistic Data Analyst / AI Linguistic Specialist - Diplomatic Domain (STT) at Master-Works
Riyadh, Riyadh Region, Saudi Arabia
Full Time


Start Date

Immediate

Expiry Date

30 May, 26

Salary

0.0

Posted On

01 Mar, 26

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Linguistic Data Collection, Data Cleaning, Terminology Analysis, Glossary Building, STT AI Models, Arabic, English, Data Structuring, Data Curation, Data Governance, NLP Concepts, Data Annotation, Linguistic Guidelines, AI Engineering Collaboration, Data Science Collaboration

Industry

IT Services and IT Consulting

Description
Role Purpose

The Linguistic Data Analyst is responsible for collecting, analyzing, organizing, and cleaning multilingual conversational data, with a strong focus on diplomatic and formal terminology, to prepare high-quality datasets for training Speech-to-Text (STT) AI models. This role is critical to ensuring linguistic accuracy, terminology consistency, and data readiness for AI model development, particularly in government, diplomatic, and formal communication domains.

Key Responsibilities

Linguistic Data Collection:
- Collect and curate conversational audio and text data (meetings, interviews, speeches).
- Work with multilingual datasets, primarily Arabic and English.
- Ensure compliance with privacy and data governance standards.

Data Cleaning & Structuring:
- Clean datasets by removing noise, duplication, and inconsistencies.
- Normalize formal and semi-formal language usage.
- Organize data by speaker, context, and formality.

Linguistic & Terminology Analysis:
- Extract and standardize diplomatic and official terminology.
- Build and maintain a diplomatic glossary.

AI Training Data Preparation:
- Prepare AI-ready datasets with timestamps and metadata.
- Support annotation teams with linguistic guidelines.

Collaboration & Documentation:
- Work with AI Engineers, Data Scientists and PMs.
- Document standards and methodologies.

Required Qualifications

Education:
- Bachelor’s degree in Linguistics, Translation, Arabic/English Studies, or related field.

Core Skills:
- Strong linguistic analysis skills.
- Experience with conversational or textual datasets.
- High attention to detail.

Technical Skills (Preferred):
- Familiarity with STT and NLP concepts.
- Experience with data annotation workflows.

Languages:
- Arabic: Fluent (mandatory)
- English: Fluent (mandatory)
- Additional languages are a plus.

Responsibilities
The Linguistic Data Analyst is tasked with collecting, analyzing, organizing, and cleaning multilingual conversational data, focusing heavily on diplomatic and formal terminology, to create high-quality datasets for Speech-to-Text (STT) AI model training. Key duties include standardizing official terminology, building a diplomatic glossary, and preparing AI-ready datasets with necessary metadata.