While it has recently become easy to build a proof of concept for Retrieval Augmented Generation (RAG) using OpenAI and one of the many open-source projects on the topic, moving to a production-ready pipeline presents its own set of challenges. In this talk, Noé shares best practices for building a RAG product, based on his experience developing Hikari, a bot used by hundreds of people within the Theodo Group that continuously ingests the group's documents, such as those from Notion or HubSpot. These practices include (but are not limited to) using a Directed Acyclic Graph (DAG) for continuous document ingestion (e.g., with Airflow), iterating on prompts, chunks, models, and more (e.g., with DVC), and understanding when, why, and how to switch to open-source models.
Noé Achache of Sicara gave this talk at GenAI Days, presented by Aleios. Find out more about GenAI Days here: www.genaidays....
More about Sicara here: www.sicara.fr/en/
To learn more about Iterative's open-source and SaaS tools please visit:
🧑🏽💻 Our free online course: learn.iterativ...
✍🏼 Our docs: dvc.org/doc (Data Version Control, Pipelines, Experiments)
cml.dev/doc (CI/CD for Machine Learning)
mlem.ai/doc (Package and Serve your models)
studio.iterati... (Team Collaboration, Experiments, Model Registry)
Try out the DVC Extension for VS Code here: marketplace.vi...
Join the Community on our Discord server: / discord
#dvc #machinelearning #datascience #generativeai
Sep 12, 2024