At Insignia, we're looking for a Mid-Level Data Engineer who's worked hands-on with Databricks and has solid experience across AWS, GCP, or Azure. You'll design and maintain end-to-end data pipelines that power analytics, machine learning, and business decision-making — from ingestion to transformation, warehousing, and beyond.
You don't need to be an expert in all three cloud platforms, but you should have deep experience in at least one and be comfortable navigating multi-cloud environments where needed. If you've built production ETL/ELT workflows on the Lakehouse, optimized Delta tables, or integrated Databricks with orchestration tools like Airflow, this is your kind of challenge.
This is a hybrid role based in West Jakarta, blending focused collaboration with flexible execution.
What You'll Do:
Design, build, and maintain scalable data pipelines using Databricks (Lakehouse, Delta Lake, Spark)
Work across cloud platforms (AWS preferred, also GCP/Azure), e.g., S3 on AWS, BigQuery on GCP, Blob Storage on Azure
Transform raw data into structured, reliable datasets for analytics and ML teams
Optimize performance, cost, and governance across data workflows
Collaborate with analysts, ML engineers, and software engineers to ensure data readiness
Implement CI/CD, monitoring, and documentation practices for data systems
Who You Are:
2–4 years of experience in data engineering, ideally within tech-driven or digital service environments
Hands-on experience with Databricks — including PySpark, SQL, and workflow automation
Proven track record working with at least one major cloud provider: AWS (S3, Glue, Redshift), GCP (BigQuery, Pub/Sub), or Azure (Data Lake, Synapse)
Proficient in Python, SQL, and data modeling (medallion architecture, star schema, etc.)
Experience with orchestration tools like Airflow, Prefect, or Step Functions
Bonus: Familiarity with Unity Catalog, MLflow, or real-time streaming (Kafka, Kinesis)
Fluent in English — written and spoken
Collaborative, proactive, and passionate about building clean, maintainable data infrastructure
Why Join Us?
Because great data systems aren't just fast: they're trusted, reusable, and built to evolve. If you're ready to work on high-impact projects where your pipelines power AI and business insight, let's talk.
Hybrid role – West Jakarta
Tech stack: Databricks, AWS/GCP/Azure, Python, SQL, Git, CI/CD