Data Engineer

Neurons Lab


Date: 2 days ago
City: Jakarta, Jakarta
Contract type: Full-time

About the project (description, duration, stage)

Join Neurons Lab as a Data Engineer on a new engagement with a regulated UK & Ireland credit and lending company. The client has lifted data from multiple business entities into a newly centralized, anonymized data lake, but lacks the data-engineering depth to make it trustworthy and analytics-ready: current pipelines were assembled quickly (partly AI-assisted), and the descriptive statistics cannot yet be validated or reproduced.

You will put that foundation on solid ground so the Data Science Lead can model on it with confidence: validate and re-engineer the pipelines, build the harmonization / semantic layer across entities, enforce data quality and lineage, and prepare clean, feature-ready datasets.

This is a foundational data-engineering role on a regulated data estate; data protection and reproducibility are the primary constraints on every decision.

A full-time engagement is preferred.

What you'll actually do (example tasks)

  • Reproduce a descriptive-statistics report end-to-end so that every figure traces back to raw source, closing the gap the client has acknowledged (numbers they cannot currently defend).

  • Profile and reconcile differing source schemas across acquired entities: map differing field names, types, encodings and business definitions for the same concept into one conformed model.

  • Build dbt staging, intermediate, and mart models with tests; codify the harmonized definitions the Data Science Lead specifies.

  • Write Great Expectations suites (null / range / uniqueness / referential checks) and wire them into the pipeline so bad data fails loudly rather than silently corrupting analysis.

  • Implement entity / identity resolution (deterministic + fuzzy matching) where there is no clean shared key for the same customer or account across sources.

  • Implement and verify anonymization / pseudonymization (hashing / tokenization / k-anonymity), and provide evidence to the client's IT / compliance team that re-identification risk is controlled.

  • Optimize Spark / Glue jobs over tens of millions of rows — partitioning, file formats (Parquet), incremental loads, cost control.

  • Orchestrate with Airflow / Step Functions; build repeatable, scheduled pipelines rather than one-off scripts.

  • Prepare clean, documented, feature-ready datasets for the PD / delinquency models.

  • Document runbooks so the offshore team can operate the pipelines and handover takes days, not weeks; help scope onboarding of the remaining (Ireland + additional) sources.
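
To give a flavor of the anonymization / pseudonymization task above, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not the client's implementation: the key, the column names (`customer`, `age_band`, `region`), and the helper names `pseudonymize` and `min_group_size` are all hypothetical. It shows keyed hashing (HMAC-SHA256) of an identifier and a simple k-anonymity check over quasi-identifier columns.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key for illustration only; a real key would live in a
# secrets manager and never in source code.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 token (hex) for a normalized identifier."""
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def min_group_size(rows, quasi_identifiers):
    """Smallest group size over the quasi-identifier columns.

    A dataset is k-anonymous for these columns if this value is >= k.
    Assumes `rows` is non-empty.
    """
    counts = Counter(tuple(row[col] for col in quasi_identifiers) for row in rows)
    return min(counts.values())


# Made-up sample rows, not real data.
rows = [
    {"customer": "a@example.com", "age_band": "30-39", "region": "North"},
    {"customer": "b@example.com", "age_band": "30-39", "region": "North"},
    {"customer": "c@example.com", "age_band": "40-49", "region": "South"},
]

tokens = [pseudonymize(r["customer"]) for r in rows]
k = min_group_size(rows, ["age_band", "region"])
```

In this sample the third row is alone in its (age_band, region) group, so `k` is 1 and the data would fail a k >= 2 requirement. This is the kind of evidence a compliance team would typically ask to see before sign-off.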

Skills

  • Strong SQL and Python for large-scale data processing

  • AWS data stack: S3, Glue, Lake Formation, Athena / Redshift, EMR / Spark, Step Functions / Airflow

  • Data modeling & semantic layer (dbt or equivalent); dimensional modeling

  • Entity resolution / record linkage across heterogeneous sources

  • Data-quality & testing frameworks (Great Expectations, dbt tests) and data lineage

  • Anonymization / pseudonymization techniques and their analytical trade-offs

  • Big-data processing (Spark) with performance and cost optimization at scale

  • Clear written and verbal English; able to document for handover and work well with a distributed team

Knowledge

  • GDPR fundamentals as applied to anonymized / pseudonymized financial data and UK / EU data residency

  • AWS Well-Architected (Analytics, Security) for BFSI

  • Awareness of credit / risk data structures and what downstream modeling consumers need — a plus

Experience

  • 4+ years in data engineering, with strong AWS + Spark / SQL at scale

  • Demonstrated experience harmonizing / integrating data across multiple source systems

  • Experience building validated, reproducible pipelines in a regulated environment (BFSI, healthcare, government) — strong plus

  • Comfortable stepping into a messy, partly-built data estate and bringing it up to standard

  • Comfortable as the sole or lead data engineer on a small (3–4 person) delivery pod
