Associate Principal Engineer (India Office)

India · Bangalore · Hybrid · Lead

Other · Principal Engineer

Overview

Data professional with experience in building and validating end-to-end data pipelines across Azure Data Factory, Databricks, and ADLS. Skilled in data quality assurance, automation using Python and PySpark, and integrating tests into CI/CD workflows. Adept at large-scale data validation, monitoring data health, and ensuring reliable batch and streaming processes, with additional expertise in developing scalable BI dashboards and reporting solutions.

Design and implement end-to-end data validation strategies for pipelines built on Azure Data Factory, Databricks (Delta Lake), and ADLS
Perform source-to-target reconciliation across ingestion, transformation, and consumption layers
Implement robust checks for data quality dimensions
Build scalable data testing frameworks using Python and PySpark; good knowledge of automation tools such as Playwright and of PySpark-based automation testing is expected
Integrate automated data tests into CI/CD pipelines (Azure DevOps / GitHub Actions)
Implement data-quality-as-code practices using tools such as Soda and dbt
Validate Spark transformations, Delta Lake tables, and streaming pipelines
Perform large-scale data validation using PySpark and SQL
Optimize validation logic for high-volume datasets
Ensure correctness of batch and streaming jobs, incremental loads (CDC pipelines), and slowly changing dimensions (SCD)
Validate data movement, orchestration workflows, and failure handling in Azure Data Factory and Azure Databricks
Define and implement data quality SLAs and KPIs
Build dashboards to track data health and pipeline reliability
Proactively identify anomalies and data drifts
Implement alerting mechanisms for data failures
Design, develop, and optimize interactive dashboards and reports using BI tools (e.g., Power BI, Tableau, Looker).
Translate business requirements into technical reporting solutions.
Identify bottlenecks and improve report/query performance.
Implement best practices for dashboard design and scalability.
Establish BI standards, frameworks, and best practices.

Requirements

8-10 years of experience in enterprise data modeling and data architecture roles.
Strong hands-on experience with Azure Data Factory and PySpark in large-scale environments is preferred.
Experience with Databricks (Delta Lake, Spark, Unity Catalog).
Advanced SQL and a strong understanding of PySpark scripts, including using PySpark for data validation.
Experience integrating major enterprise applications (ERP, CRM, OMS, MDM, AR systems).
Strong understanding of data governance, data quality, metadata, and lineage.
Excellent communication skills across business and technical audiences.

Technical Tools

Analytics Platform: Databricks (Delta Lake, Unity Catalog), Microsoft Power BI
Languages: SQL, Spark SQL, Python / PySpark
Governance / Data Quality: Unity Catalog, Informatica DG/DQ
