Senior Data Engineer

Employment Type: Full-Time

Location: Hybrid (Colombo, Sri Lanka)

About the Role:

We are looking for a Senior Data Engineer to take ownership of an existing data pipeline and evolve it into a more robust, scalable, and maintainable system. You will begin by developing a thorough understanding of the current architecture and workflows, then restructure and improve the pipeline in a systematic, considered way.

This is a hands-on engineering role suited to someone who is comfortable navigating existing codebases, improving what's there, and applying engineering best practices to real production systems.

Key Responsibilities:

  • Develop a working understanding of the existing data pipeline architecture, data flows, and business logic before driving changes.
  • Redesign and extend Apache Airflow DAG structures to properly orchestrate the end-to-end pipeline, replacing manual and ad-hoc processes.
  • Consolidate, harden, and productionise existing Python scripts and notebooks used for ETL processing.
  • Maintain and improve the pipeline's data outputs stored in PostgreSQL and MongoDB.
  • Contribute to pipeline reliability, observability, and documentation.
  • Work within AWS-hosted infrastructure, ensuring pipelines are stable and appropriately monitored.

Required Skills:

  • 3+ years of experience in data engineering or a closely related discipline.
  • Strong Python development skills, including experience writing production-quality ETL scripts and working with data-focused libraries such as Pandas.
  • Hands-on experience with Apache Airflow or a comparable DAG-based orchestration tool (Prefect, Dagster) — including designing and managing DAG workflows in a production environment.
  • Demonstrated experience designing, building, and maintaining ETL/ELT pipelines.
  • Proficiency with PostgreSQL — query optimisation, schema design, and general database management.
  • Working experience with MongoDB for semi-structured data storage and retrieval.
  • Familiarity with AWS services relevant to data pipelines — including but not limited to S3, EC2, Lambda, RDS, and CloudWatch.
  • Ability to read, understand, and incrementally improve an inherited codebase.

Nice to Have:

  • Experience with distributed computing frameworks such as Dask or Apache Spark for parallelised data processing workloads.
  • Familiarity with containerisation and basic DevOps practices — Docker, CI/CD pipelines, environment and dependency management.
  • Experience with pipeline testing, data quality validation, and observability tooling.
  • Exposure to data modelling and warehouse design principles.