Data Scraping / Jakarta

Key Responsibilities

  • Develop and maintain Python-based scraping scripts (e.g., using requests, BeautifulSoup, Selenium, Playwright, Scrapy).
  • Implement rotating proxies, CAPTCHA bypass, and user-agent randomization to maintain a high scraping success rate.
  • Handle structured and unstructured data from APIs, HTML, JSON, and XML.
  • Schedule and orchestrate scraping jobs (via cron, Airflow, n8n, or Prefect).
  • Integrate pipelines with Snowflake, Google Sheets, or cloud storage (S3, GCS, SharePoint).
  • Monitor, log, and troubleshoot scraping workflows to ensure reliability.
  • Suggest and prototype new scraping targets or data enrichment sources.
  • Stay up to date with changes to website structure and APIs, and adapt scripts accordingly.

Requirements

  • Bachelor's degree in Computer Science, Information Systems, or a related field.
  • 2+ years of experience in web scraping, crawling, or automation scripting.
  • Proficiency in Python and libraries like requests, BeautifulSoup, Selenium, Playwright, or Scrapy.
  • Experience with headless browser automation (e.g., Puppeteer, Playwright).
  • Experience handling proxies, headers, and rate-limiting strategies.
  • Knowledge of containerization (Docker) and Git-based CI/CD.
  • Experience scraping social media or e-commerce