About Me

GCP Certified Professional Data Engineer with expertise in designing and implementing scalable, cloud-native solutions. A highly efficient full-stack data engineer with end-to-end experience in batch and streaming data processing. Passionate about building optimized data models for large-scale workloads, ensuring performance and reliability.

Data Engineering Experiences

Senior Data Engineer

2024 - Present
Time Dotcom, Malaysia

Tech Stack: Kafka, Debezium, Airflow, dbt, Pub/Sub, Data Modelling, BigQuery.

  • Led the proof of concept (POC) for CDC streaming ingestion from legacy OLTP MySQL and Oracle databases.
  • Designed and implemented in-house high-level declarative pipelines, reducing development time by 70%.
  • Own and maintain critical ELT pipelines supporting multiple business domains.
  • Onboarded the team to dbt, driving adoption of best practices.
  • Spearheaded the migration of 4,000+ lines of legacy stored procedures to dbt models for a critical finance process.

Data Engineer

2022 - 2024
Mindvalley, Malaysia

Tech Stack: Airflow, dbt, Pub/Sub, Data Modelling, BigQuery, Kubernetes.

  • Designed and implemented a webhook infrastructure to enable real-time data ingestion from third-party APIs.
  • Developed and optimized semantic data layers to empower data analysts and business stakeholders with structured insights.
  • Built a near real-time dashboard solution leveraging event-driven data for timely decision-making.
  • Automated manual workflows for performance marketing teams with Supermetrics data, boosting productivity by 70%.

Data Analytics Engineer

2021 - 2022
Persuasion Technologies, Malaysia

Tech Stack: Cloud Run, BigQuery, dbt, Looker Studio, Scoping, Collaboration.

  • Owned production-level ETL/ELT pipelines from various data sources, including on-premise servers and API calls.
  • Led the improvement of data observability for existing and new pipelines with dbt and re_data.
  • Worked with clients to understand business needs and translate them into actionable reports in Looker Studio.
  • Led the migration of legacy SQL data models to dbt.
  • Owned an ML model-serving application built on App Engine and Cloud Run to predict disease risk from 70k+ medical records.

Teaching and Research Experiences

Data Analyst, Quantitative Genomics

2018 - 2021
Deakin University, Australia

Tech Stack: Python, R, Multivariate Statistical Analysis.

  • Analysed quantitative genomic sequence data to identify and predict corrosion-specific gene expression on mild steel. The project was part of a multidisciplinary effort with chemists and engineers to address one of the biggest issues in industrial infrastructure: corrosion of assets such as oil and gas pipelines.
  • Highlight: Winner of 3 Minute Thesis - https://www.youtube.com/watch?v=97okesjynqo

Instructor, Practical Classes

2018 - 2021
Deakin University, Australia
  • Taught face-to-face and online classes of 24-80 first- and second-year students in Molecular Biology units.
  • Handled assessment grading and curriculum design.
  • Achieved a 99.6% approval rating in student evaluations.
  • Proposed and implemented improvements to class delivery to comply with government COVID-19 guidelines, reducing average class duration by 10-15% while preserving the quality of the student experience.

Certifications

Professional Cloud Database Engineer

2023
Google

Professional Data Engineer

2022
Google

Projects

Ganax Social Media Performance Tracking - Scalable Web Scraping Pipeline for Instagram & Facebook - Batch Processing Made Efficient.
Fashion Catalog AI: LLM-Powered Description Generation - AI-Powered Web App for Generating Product Descriptions from Images and Features with GPT-4 Vision.
Real-Time Data Streaming and Analytics Dashboard - Real-Time Analytics Fueled by Event-Driven Data.