Web Scraping Specialist (Senior) ID30242 at AgileEngine
Buenos Aires, Buenos Aires, Argentina - Full Time
Start Date: Immediate
Expiry Date: 10 May, 25
Salary: 0.0
Posted On: 11 Feb, 25
Experience: 5 year(s) or above
Remote Job: No
Telecommute: No
Sponsor Visa: No
Skills: Good communication skills
Industry: Information Technology/IT
Description
MUST HAVES
5+ years of hands-on experience in web scraping, data extraction, and integration;
Strong proficiency in Python and web scraping frameworks (Scrapy, BeautifulSoup, Selenium);
Expertise in handling dynamic content, browser fingerprinting, and bypassing anti-bot mechanisms (e.g., CAPTCHAs, rate limits, proxy rotation);
Deep understanding of HTML, CSS, XPath, and JavaScript-rendered content;
Experience working with large-scale data storage solutions and optimizing retrieval performance;
Strong grasp of ETL processes, data pipelines, and data warehousing;
Familiarity with APIs for data extraction and integration from public and restricted sources;
Strong problem-solving skills with an ability to debug and adapt to changing web structures;
Solid understanding of web scraping ethics, legal implications, and compliance guidelines;
Upper-Intermediate English level.
Responsibilities
Web Scraping & Data Extraction: design, develop, and optimize web scraping strategies for large-scale data extraction from dynamic websites; identify and assess relevant data sources, ensuring alignment with business objectives; implement automated web scraping solutions using Python and libraries like Scrapy, BeautifulSoup, and Selenium; build resilient and adaptable scrapers that can handle website structure changes, rate limits, and anti-scraping measures.
Data Processing & Integration: cleanse, validate, and transform extracted data to ensure accuracy, consistency, and usability; store and manage large volumes of scraped data using best-in-class storage solutions; develop ETL pipelines to integrate scraped data into data warehouses and analytics platforms; collaborate with cross-functional teams, including data scientists and engineers, to make scraped data actionable.
Web Scraping & Optimization: optimize scraping procedures to improve efficiency, reliability, and scalability across multiple data sources; implement solutions for bypassing CAPTCHAs, rotating user agents, and managing proxy services; continuously monitor, troubleshoot, and maintain scraping scripts to minimize disruptions due to site changes.
Compliance & Documentation: stay up to date with legal, ethical, and compliance considerations related to web scraping and data collection; ensure data collection processes align with best practices and regulatory requirements; maintain clear and detailed documentation of scraping methodologies, data pipelines, and best practices.
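For candidates who want a concrete picture of the "cleanse, validate, and transform extracted data" responsibility listed under Data Processing & Integration, a minimal Python sketch might look like the following. The `Product` fields, the `clean_record` and `clean_records` helper names, and the shape of the input records are all hypothetical, chosen only for illustration; they do not describe the team's actual pipeline.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Product:
    """Hypothetical target schema for one scraped item."""
    name: str
    price: float


def clean_record(raw: dict) -> Optional[Product]:
    """Normalize one scraped record; return None if it fails validation."""
    # Collapse runs of whitespace that often survive HTML extraction.
    name = " ".join(str(raw.get("name", "")).split())
    # Strip a currency symbol and thousands separators before parsing.
    price_text = str(raw.get("price", "")).replace("$", "").replace(",", "").strip()
    if not name:
        return None
    try:
        price = float(price_text)
    except ValueError:
        return None
    # Reject NaN, infinities, and negative prices.
    if not math.isfinite(price) or price < 0:
        return None
    return Product(name=name, price=price)


def clean_records(raw_records: Iterable[dict]) -> List[Product]:
    """Clean all records, drop invalid ones, and de-duplicate in first-seen order."""
    seen = set()
    result = []
    for raw in raw_records:
        product = clean_record(raw)
        if product is not None and product not in seen:
            seen.add(product)
            result.append(product)
    return result
```

In this sketch, records that fail validation are silently dropped; a production pipeline would more likely log or quarantine them so that changes in a site's structure are noticed early.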