Job Summary
The Senior Data Engineer will design, develop, and optimize scalable cloud-based data platforms that support enterprise analytics, regulatory reporting, business intelligence, and machine learning initiatives. This role is responsible for building high-performance ETL/ELT pipelines, implementing data engineering best practices, and delivering reliable, high-quality datasets across cloud environments. The ideal candidate will have strong expertise in Python, SQL, cloud data platforms, distributed data processing, and modern data engineering technologies.
Key Responsibilities
• Design, develop, and maintain scalable ETL/ELT pipelines to ingest data from multiple internal and external sources.
• Build and optimize cloud-based data pipelines using modern data engineering frameworks and orchestration tools.
• Develop data transformation workflows for structured, semi-structured, and unstructured data.
• Design, develop, and maintain data models supporting analytics, reporting, and machine learning initiatives.
• Collaborate with data architects, business analysts, data scientists, and cross-functional teams to deliver high-quality data solutions.
• Implement data quality validation, metadata management, and data governance standards.
• Optimize SQL queries and distributed processing jobs for performance and scalability.
• Develop reusable data engineering frameworks and automation scripts, and establish engineering best practices.
• Support batch and near-real-time data processing requirements.
• Monitor production data pipelines, troubleshoot failures, and participate in production support and on-call rotations.
• Participate in Agile ceremonies, architecture discussions, code reviews, and technical design sessions.
• Ensure compliance with enterprise security, privacy, and regulatory standards.
• Create and maintain technical documentation for data pipelines, processes, and architecture.
Required Qualifications
• Bachelor's degree in Computer Science, Information Systems, or a related field.
• 6+ years of experience in data engineering or ETL development.
• Strong SQL skills with experience designing, optimizing, and troubleshooting complex queries.
• Hands-on experience with Python for data processing, automation, and pipeline development.
• Experience with cloud platforms such as AWS, Azure, or GCP.
• Experience with ETL tools such as Informatica, Talend, DataStage, AWS Glue, Azure Data Factory, or similar technologies.
• Experience working with analytical databases such as Snowflake, Amazon Redshift, Azure Synapse, BigQuery, or equivalent.
• Experience with Apache Spark or PySpark for distributed data processing.
• Experience using Git and CI/CD practices for source control and deployment automation.
• Strong understanding of data warehousing concepts, dimensional modeling, and data architecture.
• Strong analytical, troubleshooting, and problem-solving skills.
• Excellent communication and collaboration skills.
Preferred Qualifications
• Experience with Apache Airflow, Control-M, or other workflow orchestration platforms.
• Experience with Kafka or other streaming technologies.
• Experience with Databricks and Delta Lake.
• Experience working in financial services, banking, insurance, risk management, or related industries.
• Familiarity with data governance, data lineage, and metadata management tools.
• Experience supporting machine learning or AI data pipelines.
• Experience working in Agile/Scrum development environments.