RESPONSIBILITIES:
- Gather requirements for model parameters.
- Build feature extraction scripts to automate data preparation for ML models.
- Collaborate with product owners/technical leaders to integrate ML models into products/services.
- Process streaming and raw data according to user needs.
- Design and maintain scalable and reliable data pipelines to move data across systems.
- Develop and optimize data warehousing solutions, ensuring efficient data delivery and storage for analysis.
- Design distributed systems to apply machine learning and data science techniques.
REQUIREMENTS:
- Bachelor's or higher in Computer Science or a related discipline.
- Minimum 1 year of experience with Apache Spark and SQL/NoSQL databases such as PostgreSQL.
- Familiarity with cloud platforms such as GCP, AWS, or Azure.
- Experience with data and ML tools such as BigQuery or TensorFlow is advantageous.
- Involvement in computer vision projects is strongly valued.
- Experience in building and optimizing data warehousing solutions.
- Proficient in Java or Python programming.
- Proficient in Unix and Linux operating systems.
- Experience writing Apache Spark jobs is preferred.
- Proficiency in designing and maintaining end-to-end data pipelines.
- Strong knowledge of data warehousing concepts, architectures, and processes.
- Passionate about coding, innovation, and solving challenging problems.