Build and optimize ‘big data’ pipelines, architectures, and data sets;
Design, develop, and maintain data pipelines, data warehouses, and data lakes;
Build the data products that technical users will depend on for business intelligence and ad hoc access;
Work side by side with our Data Science team to build and automate data pipelines, ETL jobs, and related workflows on distributed data processing platforms such as Spark;
Prepare data inputs for a generic "model builder" blueprint;
Build production data pipelines for daily ETL and model retraining;
Handle end-to-end data processing, troubleshooting, and problem diagnosis.
Job Requirements
What we expect from you
We are looking for a candidate for a Data Engineer role who has a graduate degree in Computer Science, Informatics, Information Systems, Software Engineering, Statistics, or another quantitative field. The candidate should also have:
Strong logical thinking;
Passion for coding, innovation, and solving challenging problems;
Understanding of computer science fundamentals (data structures and algorithms, operating systems, networks, databases, etc.);
Willingness to work with cross-functional teams in a dynamic, fast-paced environment.
Preferred Qualifications (experience with the following software and tools is a plus)
Big data tools: Kafka, Hadoop, Hive, Spark, Elasticsearch;
SQL and NoSQL databases: MySQL, MongoDB, HBase, Cassandra, ClickHouse;
Data pipeline and workflow management tools: Airflow, NiFi.