Responsibilities:
- Extract data using web scraping and API integrations.
- Develop and maintain scalable data crawlers for diverse sources.
- Develop and maintain data pipelines for efficient extract, transform, and load (ETL) processes (see the sketch after this list).
- Ensure data quality through validation processes.
- Document technical specifications and workflows.
- Collaborate with data analysts, data architects, and software engineers to understand data requirements and deliver relevant data sets.
- Participate in data governance and compliance efforts to meet regulatory requirements.
- Partner with cross-functional teams to drive data-driven decision-making within the organization.
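To give a sense of the day-to-day work, below is a minimal ETL sketch in Python. The API endpoint, field names, and table schema are hypothetical and purely illustrative, not a description of our actual pipelines.

```python
# Minimal ETL sketch: extract from a hypothetical JSON API, apply basic
# data-quality checks, and load into a local SQLite table.
import sqlite3
import requests

API_URL = "https://api.example.com/products"  # hypothetical endpoint

def extract():
    """Pull raw records from the source API."""
    resp = requests.get(API_URL, timeout=30)
    resp.raise_for_status()
    return resp.json()  # assumed to be a list of dicts

def transform(records):
    """Validate and normalize records; drop rows missing required fields."""
    clean = []
    for rec in records:
        if not rec.get("id") or rec.get("price") is None:
            continue  # basic data-quality gate
        clean.append((int(rec["id"]), rec.get("name", "").strip(), float(rec["price"])))
    return clean

def load(rows, db_path="warehouse.db"):
    """Load transformed rows into a local SQLite table."""
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS products (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
        )
        conn.executemany("INSERT OR REPLACE INTO products VALUES (?, ?, ?)", rows)

if __name__ == "__main__":
    load(transform(extract()))
```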
Requirements:
- More than 2 years of relevant experience, or a significant internship.
- Proficiency in a programming language such as Python or Java, plus strong SQL skills.
- Basic knowledge of data modeling and database management.
- Familiarity with ETL (Extract, Transform, Load) processes.
- Understanding of data warehousing concepts.
- Hands-on experience with web crawling and extracting data from HTML (see the sketch after this list).
- Strong problem-solving and analytical abilities.
- Eagerness to learn and adapt to new data technologies and tools.
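As an illustration of the HTML extraction experience we look for, here is a minimal scraping sketch using the beautifulsoup4 package. The URL, CSS selectors, and page structure are assumptions for the example only.

```python
# Minimal HTML extraction sketch: fetch a hypothetical listings page and pull
# (title, price) pairs from <div class="item"> elements with nested
# <span class="title"> and <span class="price"> tags.
import requests
from bs4 import BeautifulSoup

def scrape_listings(url):
    """Fetch a page and extract (title, price) pairs from its listing markup."""
    resp = requests.get(url, timeout=30, headers={"User-Agent": "data-crawler/0.1"})
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    results = []
    for item in soup.select("div.item"):
        title = item.select_one("span.title")
        price = item.select_one("span.price")
        if title and price:  # skip malformed entries
            results.append((title.get_text(strip=True), price.get_text(strip=True)))
    return results

if __name__ == "__main__":
    for title, price in scrape_listings("https://example.com/listings"):  # hypothetical URL
        print(title, price)
```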