Data & AI Specialist - Data Scraping, Enrichment & Quality Assurance at division50
Pakistan - Full Time


Start Date

Immediate

Expiry Date

20 Dec, 2025

Salary

Not specified

Posted On

21 Sep, 2025

Experience

2 years or more

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Python, Web Scraping, Data Enrichment, Data Transformation, Quality Assurance, Machine Learning, NLP, SQL, NoSQL, Cloud Platforms, Data Privacy, Automation, Data Pipelines, ETL Tools, Data Warehousing, Containerization

Industry

Business Consulting and Services

Description
Overview

We’re looking for a data-obsessed explorer who can build and maintain pipelines that collect, clean, and enhance large volumes of data, then apply AI tools to keep it accurate, useful, and ready for analysis. This is initially a project-based role, with the possibility of evolving into a full-time contract based on performance and business needs.

Key Responsibilities

Data Acquisition & Scraping
- Design, develop, and maintain scalable web-scraping systems and APIs to collect structured and unstructured data from diverse sources (a minimal scraper sketch follows this description).
- Ensure compliance with data privacy laws (GDPR, CCPA) and site-specific terms of service.

Data Enrichment & Transformation
- Implement pipelines to clean, normalize, and enrich raw data using third-party datasets, natural language processing (NLP), and machine learning techniques (see the enrichment sketch below).
- Build automated matching and deduplication processes to maintain a unified source of truth (see the deduplication sketch below).

Quality Assurance & Monitoring
- Create automated QA checks to validate data accuracy, completeness, and consistency (see the QA sketch below).
- Set up monitoring and alert systems to catch anomalies or pipeline failures early.

AI & Process Optimization
- Integrate AI models for entity extraction, text classification, and predictive enrichment.
- Work with the data science team to design features that feed analytics and machine learning models.

Collaboration & Documentation
- Partner with product, engineering, and analytics teams to define data requirements and priorities.
- Maintain clear technical documentation and data lineage records.

Requirements
- Strong programming skills in Python (Scrapy, BeautifulSoup, Selenium, Playwright) or equivalent languages.
- Experience with data pipelines and ETL tools (Airflow, Prefect, or similar); see the orchestration sketch below.
- Proficiency in SQL/NoSQL databases and data warehousing (e.g., BigQuery, Snowflake).
- Familiarity with cloud platforms (AWS, GCP, or Azure) and containerization (Docker/Kubernetes).
- Knowledge of machine learning workflows and libraries (scikit-learn, spaCy, Hugging Face) is a big plus.
- Solid understanding of data privacy and ethical data collection practices.

Nice-to-Have
- Experience with large language models (LLMs) for text enrichment.
- Background in data visualization or BI tools (Tableau, Looker, Power BI).
- Familiarity with real-time streaming data (Kafka, Kinesis).

Traits for Success
- Detail-oriented, with a knack for spotting hidden data issues.
- Curious problem solver who loves automation and efficiency.
- Comfortable in a fast-paced environment where requirements evolve quickly.

What We Offer
- Remote work.
- Flexible work schedule.
- Opportunity for a long-term contract.
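For a flavor of the scraping work, here is a minimal, hedged sketch of a polite scraper using requests and BeautifulSoup. The target URL, the div.item/h2/a selectors, and the contact address are hypothetical placeholders; any real target needs its own parsing rules plus a robots.txt and terms-of-service review first.

```python
# Hedged sketch only: selectors and contact address are hypothetical.
import time

import requests
from bs4 import BeautifulSoup

HEADERS = {"User-Agent": "data-pipeline/0.1 (contact: data@example.com)"}

def scrape_listing(url: str) -> list[dict]:
    """Fetch one page and extract records; selectors are illustrative."""
    resp = requests.get(url, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    records = []
    for card in soup.select("div.item"):  # hypothetical selector
        records.append({
            "title": card.select_one("h2").get_text(strip=True),
            "link": card.select_one("a")["href"],
        })
    time.sleep(1.0)  # courtesy rate limit between page fetches
    return records
```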
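NLP-based enrichment such as entity extraction could start from a spaCy sketch like the one below; it assumes spaCy and its small English model are installed (python -m spacy download en_core_web_sm), and the example sentence is illustrative.

```python
# Assumes the en_core_web_sm model has been downloaded beforehand.
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_entities(text: str) -> dict[str, list[str]]:
    """Group recognized entity strings by label (ORG, GPE, PERSON, ...)."""
    entities: dict[str, list[str]] = {}
    for ent in nlp(text).ents:
        entities.setdefault(ent.label_, []).append(ent.text)
    return entities

# e.g. {'GPE': ['Pakistan'], ...} depending on the model's predictions
print(extract_entities("division50 is hiring data specialists in Pakistan."))
```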
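Matching and deduplication usually pair exact keys with a fuzzy fallback. The following is a deliberately simplified, stdlib-only sketch: the "name" field and the 0.92 similarity threshold are assumptions, and the pairwise scan is O(n²), so production dedup would add blocking keys and a dedicated similarity library.

```python
# Simplified sketch: "name" key and 0.92 threshold are assumptions.
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial variants match exactly."""
    return " ".join(name.lower().split())

def dedupe(records: list[dict], threshold: float = 0.92) -> list[dict]:
    kept: list[dict] = []
    for rec in records:
        key = normalize(rec["name"])
        is_dup = any(
            SequenceMatcher(None, key, normalize(k["name"])).ratio() >= threshold
            for k in kept
        )
        if not is_dup:
            kept.append(rec)
    return kept
```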
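Automated QA checks can begin as plain completeness, uniqueness, and format assertions that fail loudly before anything loads downstream. This pandas sketch assumes a hypothetical enriched table with "id" and "email" columns.

```python
# Hypothetical table with "id" and "email" columns; failing checks raise
# before anything is loaded downstream.
import pandas as pd

def run_qa(df: pd.DataFrame) -> None:
    problems = []
    if df["id"].isna().any():
        problems.append("null ids")
    if df["id"].duplicated().any():
        problems.append("duplicate ids")
    if not df["email"].str.contains("@", na=False).all():
        problems.append("malformed emails")
    if problems:
        raise ValueError("QA failed: " + ", ".join(problems))
```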
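Finally, an orchestrator like Airflow might wire those steps into a daily DAG as sketched below. The schedule, retry settings, and no-op callables are placeholders, and the bare schedule argument assumes Airflow 2.4+ (older releases use schedule_interval).

```python
# Placeholder callables and schedule; `schedule` assumes Airflow 2.4+.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

default_args = {"retries": 2, "retry_delay": timedelta(minutes=5)}

with DAG(
    dag_id="scrape_enrich_qa",
    start_date=datetime(2025, 9, 21),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    scrape = PythonOperator(task_id="scrape", python_callable=lambda: None)
    enrich = PythonOperator(task_id="enrich", python_callable=lambda: None)
    qa = PythonOperator(task_id="qa", python_callable=lambda: None)

    scrape >> enrich >> qa  # linear dependency chain
```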
Responsibilities
The role centers on designing and maintaining web-scraping systems that collect data in compliance with privacy laws, implementing pipelines that clean and enrich that data, and creating automated quality assurance checks.