Job Summary:
We are looking for a highly skilled Data Engineer with 5+ years of experience to join our growing data team. The ideal candidate will have strong expertise in building scalable data pipelines, managing large datasets, and optimizing data workflows. You will be instrumental in enabling data-driven decision-making across the organization by ensuring robust data infrastructure and reliable data availability.
Key Responsibilities:
- Design, develop, and maintain scalable ETL/ELT pipelines to support data ingestion, transformation, and integration.
- Work with structured and unstructured data from various internal and external sources.
- Collaborate with data analysts, data scientists, and software engineers to understand data needs and ensure data quality and integrity.
- Build and manage data models, data lakes, and data warehouses (e.g., Azure Databricks, Azure Data Factory, Snowflake).
- Optimize performance of large-scale batch and real-time data processing systems.
- Implement data governance, metadata management, and data lineage tracking.
- Monitor data pipeline performance, conduct root cause analysis for data issues, and resolve them proactively.
- Automate data validation and quality checks.
- Ensure compliance with data privacy and security policies.
- Document architecture, processes, and data workflows.
Mandatory Qualifications:
- 5+ years of hands-on experience with Azure Blob Storage, Azure Data Lake Storage, Azure SQL Database, Azure Logic Apps, Azure Data Factory, Azure Databricks, Azure Machine Learning, Azure DevOps Services, Azure API Management, and webhooks.
- Basic understanding of Power BI and its functionalities.
- Intermediate proficiency in Python scripting.
Technical Skills & Experience Required:
- 5+ years of experience as a Data Engineer or in a similar role.
- Strong proficiency in SQL and experience working with relational and non-relational databases (e.g., SQL Server, PostgreSQL, MySQL, MongoDB, Cassandra).
- Hands-on experience with data pipeline frameworks such as Azure Data Factory.
- Experience with big data technologies such as Apache Spark, Hadoop, and Hive.
- Proficiency in Python and PySpark for data processing.
- Experience with the Azure cloud platform and cloud-native data tools such as Azure Data Factory, Azure ML, Azure Databricks, and Delta Lake.
- Strong understanding of data warehousing concepts, data modeling, and OLAP/OLTP systems.
- Experience with CI/CD pipelines, Git, and version control using Azure Repos, with the ability to build scalable DevOps pipelines.
- Solid understanding of data privacy, compliance, and security best practices.
- Strong analytical and problem-solving skills.