Architect, build, and operate real-time/batch ETL pipelines, agentic orchestration flows, and AI/ML endpoints for autonomous, multi-agent production systems. Contribute actively to team processes, documentation, and operational quality.
Responsibilities
- Build event-driven data workflows (Snowflake, S3, Kafka, EventBridge, Celery, AWS Batch), integrate them with FactSet, SharePoint, and proxy connectors, and expose agentic features (LangChain, LangGraph, LlamaIndex, Pinecone); a minimal event-consumer sketch follows this list.
- Develop, maintain, and monitor vector-database (Pinecone) pipelines and LLM/ML endpoints, ensuring agent memory and state are managed for retrieval-augmented generation workflows (see the memory sketch below).
- Automate schema and version change management, event-contract validation, lineage tracking, and observability tooling (see the contract-validation sketch below).
- Write, run, and document QA/test coverage for ETL jobs, agentic triggers, and GenAI model events (a sample test is sketched below), and participate in incident response and postmortems.
- Collaborate in agile ceremonies: propose improvements, troubleshoot delivery bottlenecks, and share technical knowledge via docs and training sessions.
- Implement and monitor security, compliance, and resiliency standards at all stages of the data/model workflows.
- Help onboard new team members; mentor engineers in agentic and event-driven data engineering best practices.
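
The event-driven workflow bullet above, in practice, often reduces to a consumer handing validated work to a task queue. A minimal sketch, assuming kafka-python, Celery with a Redis broker, and boto3; the topic, bucket, broker URL, and partitioning scheme are illustrative assumptions, not actual project values:

```python
"""Minimal sketch of one event-driven ETL hop: Kafka -> Celery -> S3."""
import json

import boto3
from celery import Celery
from kafka import KafkaConsumer

app = Celery("etl", broker="redis://localhost:6379/0")  # assumed broker URL

@app.task(bind=True, max_retries=3)
def land_event(self, payload: dict) -> None:
    """Persist one event to the S3 landing zone."""
    key = f"landing/{payload['event_id']}.json"  # assumed partitioning scheme
    boto3.client("s3").put_object(
        Bucket="example-data-lake",  # hypothetical bucket
        Key=key,
        Body=json.dumps(payload).encode(),
    )

def consume() -> None:
    consumer = KafkaConsumer(
        "market-data.events",  # hypothetical topic
        bootstrap_servers=["localhost:9092"],
        value_deserializer=lambda v: json.loads(v),
        enable_auto_commit=False,
    )
    for msg in consumer:
        land_event.delay(msg.value)  # hand off to the worker pool
        consumer.commit()            # commit only after enqueueing

if __name__ == "__main__":
    consume()
```

Manual offset commits after enqueueing trade a small amount of throughput for at-least-once delivery, which is usually the right default for a landing zone.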
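For the Pinecone memory/state bullet, a minimal sketch of per-agent memory writes and retrieval, assuming the v3+ `pinecone` SDK; the index name, the one-namespace-per-agent layout, and the `embed()` stub are illustrative assumptions:

```python
"""Minimal sketch of agent-memory writes/reads against Pinecone."""
import os
import uuid

from pinecone import Pinecone

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("agent-memory")  # hypothetical index

def embed(text: str) -> list[float]:
    """Placeholder: swap in the team's real embedding endpoint."""
    raise NotImplementedError

def remember(agent_id: str, text: str) -> None:
    """Write one memory record, keyed by agent, for later retrieval."""
    index.upsert(
        vectors=[{
            "id": str(uuid.uuid4()),
            "values": embed(text),
            "metadata": {"agent_id": agent_id, "text": text},
        }],
        namespace=agent_id,  # one namespace per agent keeps state isolated
    )

def recall(agent_id: str, query: str, k: int = 5) -> list[str]:
    """Fetch the k most relevant memories for a RAG prompt."""
    res = index.query(
        vector=embed(query), top_k=k,
        include_metadata=True, namespace=agent_id,
    )
    return [m.metadata["text"] for m in res.matches]
```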
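Event-contract validation at the pipeline edge can be as simple as a typed model that rejects unknown schema versions before load. A sketch assuming pydantic v2; the field names, the version gate, and the dead-letter hand-off are illustrative assumptions about the event shape:

```python
"""Minimal sketch of event-contract validation at the pipeline edge."""
from datetime import datetime

from pydantic import BaseModel, ValidationError, field_validator

class PriceEvent(BaseModel):
    schema_version: int
    event_id: str
    symbol: str
    price: float
    observed_at: datetime

    @field_validator("schema_version")
    @classmethod
    def supported_version(cls, v: int) -> int:
        if v not in (1, 2):  # reject unknown contract versions early
            raise ValueError(f"unsupported schema_version {v}")
        return v

def validate_or_dead_letter(raw: dict) -> PriceEvent | None:
    """Return a typed event, or None so the caller can dead-letter it."""
    try:
        return PriceEvent.model_validate(raw)
    except ValidationError:
        return None  # route the raw payload to a dead-letter queue here
```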
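For the QA/test-coverage bullet, one common pattern is keeping transforms pure so they can be unit-tested without standing up infrastructure. A pytest sketch; `normalize_price()` is a hypothetical stand-in for a real pipeline step:

```python
"""Minimal sketch of unit-test coverage for a pure ETL transform."""
import pytest

def normalize_price(raw: dict) -> dict:
    """Toy transform standing in for a real pipeline step."""
    return {"symbol": raw["symbol"].upper(), "price": float(raw["price"])}

def test_normalize_price_happy_path():
    out = normalize_price({"symbol": "aapl", "price": "187.25"})
    assert out["symbol"] == "AAPL"
    assert out["price"] == pytest.approx(187.25)

def test_normalize_price_rejects_missing_field():
    with pytest.raises(KeyError):
        normalize_price({"price": "1.0"})
```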
Qualifications
- Experience building event-driven ETL/data pipelines (Kafka, EventBridge, Snowflake, S3, Python/Celery), plus hands-on work with LangChain, LangGraph, LlamaIndex, and Pinecone for multi-agent orchestration.
- QA, documentation, monitoring, and collaborative agile process skills.
- Experience with cross-cloud, multi-source integration and incident resolution.