Web Scraping Specialist
MLabs
Job Overview
Location: Remote
Salary: USD 75,000 - 150,000 yearly
Employment Type: Full-time
Work Arrangement: Remote
Sector: Information Technology & Software
Experience Level: Senior (5-8 years)
Application Deadline: April 29, 2026
About the Company
MLabs is a consultancy specializing in Haskell, Rust, blockchain, and AI technologies. It also offers recruitment services within these niche sectors.
MLabs is committed to offering equal opportunities to all candidates, ensuring no discrimination and providing accessible job advertisements. Their goal is to foster a diverse and inclusive workplace.
MLabs Ltd collects and processes personal information for recruitment purposes only, managing data securely and in compliance with data protection laws. Data may be shared with clients and trusted partners for recruitment needs.
Job Description
We are seeking a skilled Web Scraping Specialist to join a dedicated technical team focused on building the infrastructure essential for training advanced AI models. This role is pivotal in developing systems that deliver vast quantities of web data.
Your responsibilities will include writing, testing, and refining high-performance code to extract data from diverse online sources, ensuring maximum reliability and efficiency. You will manage complex data retrieval tasks, including handling pagination and dynamic content, and ensure the quality of extracted data through rigorous cleaning and formatting.
The ideal candidate will possess advanced skills in Python or JavaScript, with expertise in libraries such as BeautifulSoup, Scrapy, or Selenium. A strong understanding of asynchronous programming, multithreading, and distributed scraping architectures is crucial. You should also have in-depth knowledge of HTML, CSS, JavaScript, and the DOM, along with experience in NoSQL databases like MongoDB.
This is a remote position requiring a 6-hour overlap with Eastern Standard Time (EST). We offer competitive compensation of USD 75,000 to 150,000 per year, along with a comprehensive benefits and equity package.
Key Responsibilities
- Write, test, and refine high-performance code to extract data from various online sources.
- Manage complex data retrieval tasks, including handling pagination and dynamic content.
- Clean and format extracted data to ensure it meets rigorous quality standards.
- Store and manage scraped data in appropriate databases, optimizing for access speed and data integrity.
- Monitor scraping processes and infrastructure to identify and resolve issues.
Qualifications
- Demonstrated ability to extract data from complex websites with minimal supervision, supported by a portfolio of past projects.
- Advanced skills in Python or JavaScript, specifically with libraries and frameworks such as BeautifulSoup, Scrapy, or Selenium.
- Strong knowledge of asynchronous programming, multithreading, and distributed scraping architectures.
- In-depth knowledge of HTML, CSS, JavaScript, and the Document Object Model (DOM).
- Experience with NoSQL databases (e.g., MongoDB, Cassandra), including the ability to design efficient storage solutions.
- Experience deploying and managing large-scale scraping jobs using cloud services such as AWS, Google Cloud, or Azure.
- Ability to apply machine learning algorithms for data cleaning, categorization, or predictive analysis is preferred.
- Active participation in relevant open-source projects is a plus.
Benefits & Perks
- Competitive Compensation: A salary of USD 75,000 to 150,000 per year.
- Comprehensive benefits and equity package.
- Impactful Work: Opportunity to work at the forefront of AI development and web-scale knowledge graph creation.
- High-Output Culture: A professional environment that prioritizes low ego, technical autonomy, and rapid execution.
- Remote Flexibility: This is a remote position requiring a 6-hour overlap with the core team's schedule.
How to Apply
Please submit your application through the provided link.
The demand for high-quality, web-scraped data is exploding, fueling the development of advanced AI models. This role is central to building the infrastructure that delivers massive datasets for AI training. You will leverage your expertise in Python or JavaScript to extract, clean, and manage data from complex online sources. Your work will directly impact the scaling of public web data accessibility, supporting cutting-edge AI research and development. This is an opportunity to contribute to a lean, technical team focused on rapid execution and innovation in a fast-paced environment.
Posted Date: April 14, 2026