docs: update rag

This commit is contained in:
Stefan Rotsch
2024-06-23 12:03:48 +02:00
committed by Stefan Rotsch
parent 0523bcacf9
commit 85bef49afe

View File

@@ -5,21 +5,21 @@ quadrant: methods-and-patterns
tags: [ai] tags: [ai]
--- ---
[Retrieval Augmented Generation (RAG)](https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/) is a method that enhances generative AI models by incorporating external information retrieval. RAG combines the strengths of retrieval-based and generation-based models, enabling AI to generate more accurate and contextually relevant responses by accessing and integrating external data sources during the generation process. This approach significantly improves the quality and reliability of AI-generated content, making it particularly useful in applications that require up-to-date information or domain-specific knowledge. [Retrieval Augmented Generation (RAG)](https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/) is a method that enhances generative AI models by incorporating external information retrieval. RAG combines the strengths of retrieval-based and generation-based models, enabling AI to generate more accurate and contextually relevant responses by accessing and integrating external data sources during the generation process. This approach often uses embeddings to convert text into high-dimensional vectors and vector databases to efficiently store and retrieve these vectors based on similarity searches. This significantly improves the quality and reliability of AI-generated content, making it particularly useful in applications that require up-to-date information or domain-specific knowledge.
### Benefits ### Benefits
- **Enhanced contextual accuracy**: By integrating external data, RAG models provide more accurate and contextually relevant answers than traditional generative models. - **Enhanced contextual accuracy**: By integrating external data, RAG models provide more accurate and contextually relevant answers than traditional generative models.
- **Dynamic information access**: Allows models to retrieve and use the most relevant information available at the time of the query, ensuring up-to-date responses. - **Dynamic information access**: Allows models to retrieve and use the most relevant information available at the time of the query, ensuring up-to-date responses.
- **Versatile applications**: Suitable for a wide range of use cases, from customer support and virtual assistants to content creation and research, where accurate information is critical. - **Versatile applications**: Suitable for a wide range of use cases, from customer support and virtual assistants to content creation and research, where accurate information is critical.
- **Scalability**: Efficiently scales with growing data sources and knowledge bases, improving the model's ability to provide accurate and relevant information over time. - **Sustainable resource utilization**: By leveraging external, up-to-date data sets for information retrieval and combining them with generative capabilities, RAG models reduce the need for extensive data processing and computing power, resulting in lower energy consumption.
- **Reduced hallucination**: Minimizes the instances of AI generating plausible but incorrect information by grounding responses in verifiable external data. - **Reduced hallucination**: Minimizes the instances of AI generating plausible but incorrect information by grounding responses in verifiable external data.
### Risks ### Challenges
- **Latency**: Retrieving and processing external data in real-time can introduce latency, potentially slowing down response times compared to purely generative models. - **Latency**: Retrieving and processing external data in real-time can introduce latency, potentially slowing down response times compared to purely generative models.
- **Dependence on external data quality**: The accuracy and relevance of the generated content is highly dependent on the quality and reliability of the external data sources. Poor quality or outdated data can negatively impact the results. - **Dependence on external data quality**: The accuracy and relevance of the generated content is highly dependent on the quality and reliability of the external data sources. Poor quality or outdated data can negatively impact the results.
- **Security and privacy risks**: Accessing external data sources can expose the system to security vulnerabilities and privacy concerns, especially when sensitive or proprietary information is involved. - **Security and privacy**: Accessing *external* data sources can expose the system to security vulnerabilities and privacy concerns. Mitigation strategies include careful data access management and robust security measures in the retrieval process.
- **Maintenance overhead**: Regular updates and maintenance of the retrieval system and data sources are required to ensure continued accuracy and relevance, adding to the operational burden. - **Maintenance overhead**: Regular updates and maintenance of the retrieval system and data sources are required to ensure continued accuracy and relevance, adding to the operational burden.
- **Potential for bias**: The retrieval process can introduce biases present in the external data, which can be propagated into the generated content, affecting the fairness and objectivity of the outputs. - **Potential for bias**: The retrieval process can introduce biases present in the external data, which can be propagated into the generated content, affecting the fairness and objectivity of the outputs.