Job Title: Lead Data Engineer
Location: Remote
Experience Required: 10+ Years

Job Description
We are seeking an experienced Lead Data Engineer on behalf of one of our clients, specializing in the development and optimization of ETL processes. The ideal candidate will have deep expertise in Spark-based data engineering, particularly with PySpark Notebooks and Microsoft Fabric, and a strong command of SQL. In this role, you will lead data engineering initiatives, collaborate with cross-functional teams, and support the modernization of legacy SQL Server environments.

Key Responsibilities

ETL Pipeline Development: Design, develop, and maintain robust ETL pipelines using PySpark Notebooks and Microsoft Fabric.

Stakeholder Collaboration: Work with data scientists, analysts, and other stakeholders to understand data requirements and deliver effective solutions.

Data Migration: Migrate and integrate data from legacy SQL Server environments to modern data platforms.

Pipeline Optimization: Ensure data pipelines are optimized for scalability, efficiency, and reliability.

Technical Leadership: Provide technical guidance and mentorship to junior developers and team members.

Troubleshooting: Identify and resolve complex data engineering issues related to performance, data quality, and scalability.

Best Practices: Establish and maintain data engineering best practices, coding standards, and documentation.

Code Review: Conduct code reviews and provide constructive feedback to improve team productivity and code quality.

Data Integrity: Support data-driven decision-making by ensuring data integrity, availability, and consistency across platforms.

Requirements

Education: Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field.

Experience: 10+ years in data engineering, with a strong focus on ETL development using PySpark or other Spark-based tools.

SQL Expertise: Proficiency in SQL, including complex queries, performance tuning, and data modeling.

Microsoft Fabric: Hands-on experience with Microsoft Fabric or similar cloud-based data integration platforms.

Data Lakes & Warehouses: Strong knowledge of Data Lake, Data Warehouse, and Delta Lake technologies.

Azure Data Services: Experience with Azure Data Factory, Azure Synapse, or similar data services.

Scripting Proficiency: Skilled in scripting languages such as Python or Scala for data manipulation and automation.

Big Data & ETL Frameworks: Solid understanding of data warehousing concepts, ETL frameworks, and big data processing.

Bonus Skills: Familiarity with Hadoop, Hive, Kafka, and DevOps practices, including CI/CD pipelines and containerization tools like Docker and Kubernetes.

Structured & Unstructured Data: Experience handling both structured and unstructured data sources.

Key Qualities

Strong problem-solving skills with the ability to troubleshoot complex data engineering issues.

Proven ability to work independently, as part of a team, and in leadership roles.

Excellent communication skills to translate complex technical concepts for business stakeholders.
