DriveToRAG - Turn Your Google Drive into an AI Agent Ready RAG Knowledge Base - n8n workflow
Transform your Google Drive files into intelligent, searchable knowledge with this powerful n8n workflow.
DriveToRAG automatically monitors your Google Drive folders, converts documents into vector embeddings for retrieval-augmented generation (RAG) applications, and keeps your vector store in sync with your Drive content.
Key Features
- New File Processing: Automatically detects and processes new files added to Google Drive folders, converting them into vector embeddings
- Intelligent Update Detection: Compares content hashes to identify genuine content changes, removes the outdated embeddings, and regenerates embeddings only for files whose content has actually changed, which avoids unnecessary re-embedding
- Smart Deletion Handling: Automatically removes embeddings when files are moved to the Google Drive trash
- Multi-Format Support: Processes Google Docs, PDFs, HTML, Google Sheets, Excel files, and images
- Advanced OCR Integration: Extracts text from images and scanned PDFs using Mistral OCR
- Intelligent Vector Database: Stores embeddings in Supabase for semantic search and RAG applications
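The new-file, update, and deletion handling described above can be pictured as one comparison between the current Drive contents and the hashes recorded at the last embedding run. The following is a minimal sketch of that idea, not the workflow's actual implementation: the function names, the input shapes, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a SHA-256 hex digest of a file's raw bytes (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def plan_sync(drive_files: dict[str, bytes], stored_hashes: dict[str, str]) -> dict[str, list[str]]:
    """Decide what to do with each file ID.

    drive_files   maps file ID -> current file bytes (hypothetical input shape)
    stored_hashes maps file ID -> hash saved when the file was last embedded
    """
    plan: dict[str, list[str]] = {"embed": [], "re_embed": [], "delete": [], "skip": []}
    for file_id, data in drive_files.items():
        if file_id not in stored_hashes:
            plan["embed"].append(file_id)      # new file: create embeddings
        elif stored_hashes[file_id] != content_hash(data):
            plan["re_embed"].append(file_id)   # content changed: drop old vectors, embed again
        else:
            plan["skip"].append(file_id)       # touched but unchanged: nothing to do
    for file_id in stored_hashes:
        if file_id not in drive_files:
            plan["delete"].append(file_id)     # no longer in Drive: remove its vectors
    return plan


stored = {"a": content_hash(b"hello"), "b": content_hash(b"old"), "c": content_hash(b"gone")}
current = {"a": b"hello", "b": b"new", "d": b"fresh"}
print(plan_sync(current, stored))
```

Comparing hashes rather than modification timestamps means a file that was merely opened or renamed does not trigger a fresh round of embedding calls.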
Use Cases
- Customer Support Excellence: Create AI-powered help desks that instantly find relevant information from knowledge bases, reduce support tickets, and provide multilingual support
- Enterprise Knowledge Management: Transform company documents into intelligent assistants for onboarding, compliance searches, and internal wiki capabilities
- Research & Development: Convert research papers and technical documentation into queryable knowledge bases for literature reviews, patent analysis, and market intelligence
- Education & Training: Create AI tutors and study assistants from course materials, training manuals, and certification preparation resources
- Healthcare & Professional Services: Process medical literature, legal documents, and financial research into intelligent, searchable systems (ensure compliance with privacy regulations)
- Technical Documentation: Convert API docs, product specifications, and maintenance procedures into AI assistants for development teams and customer support
Requirements
- n8n instance (cloud or self-hosted)
- Google Drive account
- OpenAI API key
- Supabase account
- Mistral AI API key (for advanced OCR)
Easy Setup
The workflow comes with comprehensive documentation that guides you through:
- Setting up n8n (both local and cloud deployment options)
- Installing and configuring required credentials (Google Drive, OpenAI, Supabase, Mistral AI)
- Configuring Google Drive folder monitoring
- Creating the necessary Supabase tables and database schema
- Testing and verifying the workflow functionality
FAQ
Q: Can this workflow handle large documents?
A: Yes! The workflow uses recursive character splitting to break large documents into manageable chunks while preserving context.
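To illustrate what recursive character splitting means, here is a simplified sketch of the technique. It is not the splitter node the workflow uses: the function name, the separator list, and the chunk size are assumptions, and chunk overlap (which production splitters usually support) is left out to keep the example short.

```python
def recursive_split(text: str, chunk_size: int, separators=("\n\n", "\n", " ", "")) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Tries the coarsest separator first (paragraphs), and recurses with
    finer separators only for pieces that are still too long.
    The separator list must end with "" so that splitting always terminates.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # Last resort: hard cut at a fixed width.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks: list[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate          # keep merging pieces while they fit
            continue
        if current:
            chunks.append(current)       # flush what has been merged so far
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            chunks.extend(recursive_split(piece, chunk_size, rest))
    if current:
        chunks.append(current)
    return chunks


sample = "alpha beta\n\ngamma delta epsilon\n\nzeta"
print(recursive_split(sample, chunk_size=12))
# ['alpha beta', 'gamma delta', 'epsilon', 'zeta']
```

Because paragraph breaks are tried before line breaks and spaces, related sentences tend to stay in the same chunk, which is what "preserving context" refers to.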
Q: How many documents can it process?
A: The workflow can handle thousands of documents, limited only by your Supabase and OpenAI API quotas.
Q: Is the RAG vector database updated in real-time when files are added, deleted, or modified in the Google Drive folder?
A: Yes, the workflow continuously monitors your Google Drive folders and automatically updates the vector database whenever changes occur. The system uses efficient change detection to process only modified content, ensuring your knowledge base stays current with minimal processing overhead.
Q: Can I adapt this workflow for other storage systems like Dropbox, OneDrive, or SharePoint?
A: Absolutely! The workflow's modular design allows you to substitute the Google Drive trigger node with nodes for other storage platforms. You'll need to configure the appropriate credentials and adjust some data mapping, but the core document processing and embedding pipeline remains the same.
Q: Where are the RAG vector embeddings stored?
A: By default, the workflow stores all vector embeddings in Supabase using the provided database schema. However, the modular design allows you to easily integrate with other vector databases (like Pinecone, Weaviate, or Milvus) by modifying the database connection nodes.
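Whichever store holds the vectors, retrieval comes down to ranking stored chunks by their similarity to a query embedding. The sketch below shows that ranking in plain Python with toy two-dimensional vectors. It assumes cosine similarity as the metric, which is a common choice but is not stated by the workflow; in practice the vector database performs this search itself, and the function names here are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], chunks: list[tuple[str, list[float]]], k: int = 3) -> list[tuple[str, float]]:
    """Return the k chunk texts most similar to the query, best match first."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in chunks]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy embeddings for illustration only; real embeddings have hundreds of dimensions or more.
stored_chunks = [
    ("refund policy", [1.0, 0.0]),
    ("office hours", [0.0, 1.0]),
    ("returns and exchanges", [1.0, 1.0]),
]
for text, score in top_k([1.0, 0.0], stored_chunks, k=2):
    print(f"{score:.3f}  {text}")
```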
Q: How does the OCR functionality work?
A: The workflow leverages Mistral AI's advanced OCR capabilities to extract text from images and scanned PDFs. This enables semantic search across all your visual documents, making previously inaccessible information retrievable.
Q: Do I have to use all the functionality provided in the workflow?
A: No, you don't have to use everything! The workflow is designed to be fully modular, allowing you to customize it to your specific needs. You can easily:
- Remove components you don't need (e.g., disable OCR processing by removing the Mistral AI nodes)
- Replace components with alternatives (e.g., swap OpenAI with another embedding provider)
- Add new components to extend functionality
- Configure existing components through the intuitive n8n interface
Support
Need assistance? Connect with me on Twitter/X @victor_explore and send a DM for:
- Personalized setup assistance with this workflow
- Custom workflow development for your specific needs
Transform your documents into intelligent knowledge with DriveToRAG today!