Job Title: Lead Data Engineer
Location: Remote
Experience Required: 10+ Years

Job Description
We are seeking an experienced Lead Data Engineer on behalf of one of our clients, specializing in the development and optimization of ETL processes. The ideal candidate will have deep expertise in Spark-based data engineering, particularly with PySpark Notebooks and Microsoft Fabric, and a strong command of SQL. In this role, you will lead data engineering initiatives, collaborate with cross-functional teams, and support the modernization of legacy SQL Server environments.

Key Responsibilities

ETL Pipeline Development: Design, develop, and maintain robust ETL pipelines using PySpark Notebooks and Microsoft Fabric.

Stakeholder Collaboration: Work with data scientists, analysts, and other stakeholders to understand data requirements and deliver effective solutions.

Data Migration: Migrate and integrate data from legacy SQL Server environments to modern data platforms.

Pipeline Optimization: Ensure data pipelines are optimized for scalability, efficiency, and reliability.

Technical Leadership: Provide technical guidance and mentorship to junior developers and team members.

Troubleshooting: Identify and resolve complex data engineering issues related to performance, data quality, and scalability.

Best Practices: Establish and maintain data engineering best practices, coding standards, and documentation.

Code Review: Conduct code reviews and provide constructive feedback to improve team productivity and code quality.

Data Integrity: Support data-driven decision-making by ensuring data integrity, availability, and consistency across platforms.

Requirements

Education: Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field.

Experience: 10+ years in data engineering, with a strong focus on ETL development using PySpark or other Spark-based tools.

SQL Expertise: Proficiency in SQL, including complex queries, performance tuning, and data modeling.

Microsoft Fabric: Hands-on experience with Microsoft Fabric or similar cloud-based data integration platforms.

Data Lakes & Warehouses: Strong knowledge of Data Lake, Data Warehouse, and Delta Lake technologies.

Azure Data Services: Experience with Azure Data Factory, Azure Synapse, or similar data services.

Scripting Proficiency: Skilled in scripting languages such as Python or Scala for data manipulation and automation.

Big Data & ETL Frameworks: Solid understanding of data warehousing concepts, ETL frameworks, and big data processing.

Bonus Skills: Familiarity with Hadoop, Hive, Kafka, and DevOps practices, including CI/CD pipelines and containerization tools like Docker and Kubernetes.

Structured & Unstructured Data: Experience handling both structured and unstructured data sources.

Key Qualities

Strong problem-solving skills with the ability to troubleshoot complex data engineering issues.

Proven ability to work independently, as part of a team, and in leadership roles.

Excellent communication skills to translate complex technical concepts for business stakeholders.
