Job Title: Lead Data Engineer
Location: Remote
Experience Required: 10+ Years
Job Description
We are seeking an experienced Lead Data Engineer on behalf of one of our clients, specializing in the development and optimization of ETL processes. The successful candidate will have a deep understanding of Spark-based data engineering, particularly with PySpark Notebooks and Microsoft Fabric, and a strong command of SQL. In this role, you will lead data engineering initiatives, work closely with cross-functional teams, and support the modernization of legacy SQL Server environments.
Key Responsibilities
ETL Pipeline Development: Design, develop, and maintain robust ETL pipelines using PySpark Notebooks and Microsoft Fabric.
Stakeholder Collaboration: Work with data scientists, analysts, and other stakeholders to understand data requirements and deliver effective data solutions.
Data Migration: Migrate and integrate data from legacy SQL Server environments to modern data platforms.
Pipeline Optimization: Ensure data pipelines are optimized for scalability, efficiency, and reliability.
Technical Leadership: Provide technical guidance and mentorship to junior developers and team members.
Troubleshooting: Identify and resolve complex data engineering issues related to performance, data quality, and scalability.
Best Practices: Establish and maintain data engineering best practices, coding standards, and documentation.
Code Review: Conduct code reviews and provide constructive feedback to improve team productivity and code quality.
Data Integrity: Support data-driven decision-making processes by ensuring data integrity, availability, and consistency across platforms.
Requirements
Education: Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field.
Experience: 10+ years in data engineering, with a strong focus on ETL development using PySpark or other Spark-based tools.
SQL Expertise: Proficiency in SQL, including complex queries, performance tuning, and data modeling.
Microsoft Fabric: Hands-on experience with Microsoft Fabric or similar cloud-based data integration platforms.
Data Lakes & Warehouses: Strong knowledge of Data Lake, Data Warehouse, and Delta Lake technologies.
Azure Data Services: Experience with Azure Data Factory, Azure Synapse, or similar data services.
Scripting Proficiency: Skilled in scripting languages such as Python or Scala for data manipulation and automation.
Big Data & ETL Frameworks: Solid understanding of data warehousing concepts, ETL frameworks, and big data processing.
Bonus Skills: Familiarity with Hadoop, Hive, Kafka, and DevOps practices, including CI/CD pipelines and containerization tools like Docker and Kubernetes.
Structured & Unstructured Data: Experience handling both structured and unstructured data sources.
Key Qualities
Strong problem-solving skills with the ability to troubleshoot complex data engineering issues.
Proven ability to work independently, as part of a team, and in leadership roles.
Excellent communication skills to translate complex technical concepts for business stakeholders.