Job Summary
We are seeking a Data Engineer – AI to design and build scalable data pipelines that support AI/ML workloads and advanced analytics. This role focuses on developing robust data ingestion, transformation, and storage solutions, including support for Retrieval-Augmented Generation (RAG) systems. The ideal candidate will have strong experience in Python, SQL, Azure cloud technologies, and AI-driven data platforms.
Key Responsibilities
- Design and build scalable data pipelines for ingestion, transformation, and processing of AI/ML data
- Develop and manage data storage solutions, including vector databases for RAG systems
- Implement data lineage tracking, validation checks, and security controls to ensure data quality and compliance
- Clean, structure, and optimize complex datasets for analytical and operational use
- Support AI-driven analytics and data mapping initiatives
- Deploy and manage data solutions on Azure cloud platforms
- Collaborate with cross-functional teams to enable AI and data-driven capabilities
- Optimize data workflows for performance, scalability, and reliability
Required Qualifications
- 5+ years of hands-on experience with SQL and Python
- 5+ years of experience working in Azure cloud environments
- 5+ years of experience with AI-driven analytics or data platforms
- Familiarity with AI tooling, including vector databases (e.g., Pinecone, Milvus) and frameworks such as Hugging Face or LangChain
- Strong expertise in ETL/ELT pipeline design and data warehousing
- Bachelor’s degree in Computer Science or equivalent work experience
Preferred Qualifications
- Experience with Retrieval-Augmented Generation (RAG) architectures
- Strong understanding of data governance, lineage, and security best practices
- Experience working with large-scale, enterprise data platforms
- Strong problem-solving and analytical skills