Each week, I'll bring you the most relevant and insightful tech stories, saving you time and keeping you informed.
Here is the system I'm currently using for taking notes:
My note-taking setup revolves around four elements: type, folder, tags, and links.
Type
First is the note type:
Fleeting Notes are raw notes that I can jot down at any time.
Permanent Notes capture ideas, written up in a clear, readable format.
Index Notes organize notes within the same topic; they exist purely for indexing.
Project Notes track project-related material: ideas, progress, and so on.
Concept Notes define a single concept, smaller in scope than an index note.
Question Notes gather material around a single question, e.g. "How to ....?"
Folder
Folders are used purely for workflow, e.g. moving raw notes through processing into permanent notes.
00_Inbox # raw notes
01_Journal # daily notes
10_LiteratureNotes
20_PermanentNotes
21_IndexNotes
22_Weekly # folder for weekly newsletter
30_Projects # tracking project notes
40_Resources # images etc
50_Archive # notes not used any more
Tags
I maintain a pre-generated set of 20+ topic tags and apply the relevant ones to each note.
Links
When two notes are related, I simply add a link between them, making sure the knowledge stays connected.
Once links are in place, a graph of the notes can be generated automatically.
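Assuming Obsidian-style [[wikilink]] syntax (the note app isn't named in this post), building that graph from the links is a one-liner per note, a minimal sketch:

```python
import re

def note_graph(notes: dict[str, str]) -> dict[str, list[str]]:
    """Build an adjacency list from [[wikilink]] references in note bodies."""
    return {name: re.findall(r"\[\[([^\]]+)\]\]", body)
            for name, body in notes.items()}
```

Each note maps to the list of notes it links out to; graph tools can then render this adjacency list directly.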
AI…
A bird's-eye view of RAG techniques:
Foundational RAG Techniques (🌱):
Basic Implementation: Setting up the fundamental RAG pipeline using LangChain and LlamaIndex.
Specific Data Sources: How to use CSV files as a data source for building RAG systems.
Reliability Enhancement (Reliable RAG): Adding validation and refinement steps to basic RAG to ensure information accuracy.
Chunk Size Optimization: Exploring how to choose appropriate text chunk sizes to balance context preservation and retrieval efficiency.
Proposition Chunking: Breaking down text into smaller, factual propositional units for more precise query matching.
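As a toy illustration of that fundamental pipeline, here is a pure-Python sketch: fixed-size chunking, a word-overlap retriever standing in for embedding search, and a stubbed prompt-building step. LangChain or LlamaIndex would replace all three pieces in practice:

```python
def chunk_text(text: str, chunk_size: int = 50) -> list[str]:
    """Split text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Rank chunks by word overlap with the query (embedding stand-in)."""
    q = set(query.lower().split())
    return sorted(chunks,
                  key=lambda c: len(q & set(c.lower().split())),
                  reverse=True)[:k]

def answer(query: str, chunks: list[str]) -> str:
    """Build the prompt an LLM would receive; the LLM call itself is stubbed."""
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The chunk size parameter is exactly the knob the chunk-size-optimization point above is about: larger chunks preserve context, smaller ones retrieve more precisely.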
Query Enhancement Techniques (🔍):
Query Transformations: Optimizing the user's original query through methods like rewriting, decomposition (sub-queries), or step-back prompting to better match relevant documents.
Hypothetical Questions (HyDE): Generating potential questions that a document chunk might answer, transforming retrieval into a "question-question" matching task to improve relevance.
Hypothetical Prompt Embeddings (HyPE): Pre-computing embeddings for hypothetical questions during indexing, allowing for faster query-time matching.
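The decomposition variant of query transformation can be sketched like this; a real system would ask an LLM to split the question, while here a naive split on " and " stands in:

```python
def decompose(query: str) -> list[str]:
    """Naive decomposition: split a compound query on ' and '."""
    parts = [p.strip() for p in query.split(" and ")]
    return [p if p.endswith("?") else p + "?" for p in parts]

def retrieve_each(sub_queries: list[str], search_fn, k: int = 2) -> list[str]:
    """Run retrieval per sub-query and deduplicate the merged results."""
    seen, merged = set(), []
    for sq in sub_queries:
        for doc in search_fn(sq)[:k]:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged
```

Each sub-query gets its own retrieval pass, so documents relevant to only half of a compound question still surface.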
Context and Content Enrichment Techniques (📚):
Contextual Chunk Headers: Prepending summary information about the source document or section to each chunk to enrich its embedding context.
Relevant Segment Extraction: Dynamically merging adjacent relevant chunks after retrieval to provide more complete context.
Sentence Window Retrieval: Retrieving the most relevant single sentence and automatically including its preceding and succeeding sentences to expand local context.
Semantic Chunking: Dividing documents based on semantic coherence rather than fixed sizes.
Contextual Compression: Using an LLM to compress retrieved content, preserving the most query-relevant information.
Document Augmentation (via Question Generation): Generating various potential questions for documents and adding them to the index to increase the chances of retrieval.
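Sentence Window Retrieval, for instance, reduces to a few lines once you strip away the framework; the word-overlap scorer below is a stand-in for embedding similarity:

```python
def sentence_window(sentences: list[str], best: int, window: int = 1) -> str:
    """Return the best sentence plus its neighbors as expanded context."""
    lo = max(0, best - window)
    hi = min(len(sentences), best + window + 1)
    return " ".join(sentences[lo:hi])

def retrieve_with_window(query: str, sentences: list[str]) -> str:
    """Find the single most relevant sentence, then widen to its window."""
    q = set(query.lower().split())
    best = max(range(len(sentences)),
               key=lambda i: len(q & set(sentences[i].lower().split())))
    return sentence_window(sentences, best)
```

Retrieval matches at sentence granularity for precision, but the generator still sees the surrounding sentences for context.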
Advanced Retrieval Methods (🚀):
Fusion Retrieval: Combining the strengths of multiple retrieval methods (e.g., keyword search + vector search).
Intelligent Reranking: Using more sophisticated models (like Cross-Encoders or LLMs) to re-order initial retrieval results, boosting the relevance of top results.
Multi-faceted Filtering: Applying various filters based on metadata (date, source), similarity thresholds, content keywords, etc.
Hierarchical Indices: Creating multi-level index structures (e.g., document summaries and detailed chunks) for efficiency.
Ensemble Retrieval: Combining results from multiple different retrieval models or algorithms.
Dartboard Retrieval: Optimizing retrieval for both relevance and diversity simultaneously.
Multi-modal Retrieval: Techniques for handling and retrieving data involving multiple types like text and images.
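Fusion retrieval is commonly implemented with Reciprocal Rank Fusion (RRF), which merges ranked lists from different retrievers without needing their scores to be comparable; a minimal version:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: merge ranked lists from several retrievers.
    Each document scores 1/(k + rank) per list it appears in; k=60 is the
    constant from the original RRF paper."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked well by both keyword and vector search outscores one that only a single retriever liked, which is exactly the behavior fusion retrieval is after.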
Iterative and Adaptive Techniques (🔁):
Retrieval with Feedback Loops: Using user feedback to continuously improve retrieval and generation models.
Adaptive Retrieval: Dynamically adjusting retrieval strategies based on query type or user context.
Iterative Retrieval: Performing multiple rounds of retrieval, using results from previous rounds to refine subsequent queries.
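Iterative retrieval can be sketched as a loop that folds words from earlier hits back into the query; the keyword search factory here is a toy stand-in for a vector retriever:

```python
def iterative_retrieve(query: str, search_fn, rounds: int = 2) -> list[str]:
    """Run several retrieval rounds, folding the top hit's words into the
    query so later rounds can surface documents the first round missed."""
    results: list[str] = []
    for _ in range(rounds):
        hits = search_fn(query)
        results.extend(h for h in hits if h not in results)
        if hits:
            query = query + " " + hits[0]  # toy query refinement
    return results

def make_search(docs: list[str]):
    """Toy keyword search: return docs sharing any word with the query."""
    def search(query: str) -> list[str]:
        q = set(query.lower().split())
        scored = [(len(q & set(d.lower().split())), d) for d in docs]
        return [d for s, d in sorted(scored, reverse=True) if s > 0]
    return search
```

Note how the second round reaches a document that shares no words with the original query, only with the first round's hit.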
Evaluation (📊):
DeepEval / GroUSE Evaluation: Providing methods and metrics using specific frameworks (like DeepEval, GroUSE) for comprehensive RAG system performance evaluation.
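Frameworks like DeepEval and GroUSE ship much richer, LLM-judged metrics, but the spirit of a retrieval metric such as context recall fits in a few lines; this toy substring check is not either framework's actual implementation:

```python
def context_recall(expected_facts: list[str], retrieved: list[str]) -> float:
    """Fraction of expected facts found in any retrieved chunk (toy metric)."""
    hits = sum(any(fact.lower() in chunk.lower() for chunk in retrieved)
               for fact in expected_facts)
    return hits / len(expected_facts) if expected_facts else 0.0
```

Scoring retrieval separately from generation like this makes it much easier to tell which stage of the pipeline is failing.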
Explainability (🔬):
Explainable Retrieval: Offering methods to explain why specific pieces of information were retrieved, increasing system transparency.
Advanced Architectures (🏗️):
Knowledge Graph Integration (Graph RAG): Incorporating structured information from knowledge graphs to enhance retrieval and generation.
GraphRAG (Microsoft): Microsoft's open-source advanced RAG system utilizing knowledge graphs.
RAPTOR: A method involving recursive processing and summarization of information, organized in a tree structure for retrieval.
Self RAG / Corrective RAG: More intelligent RAG frameworks that can autonomously decide whether to retrieve, assess retrieval quality, and even use web search to correct or supplement information.
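The core decision loop of Corrective RAG can be sketched as follows; the keyword grader and the web-search fallback are hypothetical stand-ins for the LLM judge and real search tool a production system would use:

```python
def corrective_rag(query: str, retrieve_fn, web_search_fn,
                   threshold: int = 1) -> list[str]:
    """Corrective-RAG sketch: grade the initial retrieval and fall back to
    web search when it looks too weak (toy keyword grader, not an LLM)."""
    docs = retrieve_fn(query)
    grade = sum(1 for d in docs
                if any(w in d.lower() for w in query.lower().split()))
    if grade < threshold:
        docs = web_search_fn(query)  # hypothetical fallback retriever
    return docs
```

The key idea is that retrieval is no longer unconditional: the system assesses its own results and switches strategy when they fail the check.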
Special Advanced Technique (🌟):
Sophisticated Controllable Agent: An advanced agent solution using a deterministic graph, designed to tackle complex questions that simple semantic similarity retrieval cannot solve.
New Products…
TheLibrarian.io - Your WhatsApp AI Assistant
A WhatsApp AI assistant designed to master your inbox, control your schedule, and find anything you need, so you can focus on what truly matters. It integrates seamlessly with your Google apps (Gmail, Drive, Calendar, Contacts), Slack, and Notion.