- Fixed environment variable names in embeddings.service.ts to match the .env configuration (AZURE_OPENAI_EMBEDDINGS_API_KEY, AZURE_OPENAI_EMBEDDINGS_ENDPOINT, etc.)
- Applied the V014 database migration for the conversation_embeddings table with pgvector support
- Fixed the test script to remove the unsupported language parameter from chat requests
- Created a test user in the database to satisfy foreign key constraints
- All 6 embeddings tests now passing (100% success rate)
Test results:
- ✅ Health check and embedding generation (1536 dimensions)
- ✅ Conversation creation with automatic embedding storage
- ✅ Semantic search with 72-90% similarity matching
- ✅ User statistics and semantic memory integration
Embeddings-Based Conversation Memory Implementation
✅ Implementation Complete
Successfully implemented vector embeddings-based semantic search for AI conversation memory in the Maternal App.
🎯 What Was Implemented
1. Database Layer (pgvector)
- ✅ Installed pgvector extension in PostgreSQL 15
- ✅ Created V014_create_conversation_embeddings.sql migration
- ✅ Table: conversation_embeddings with 1536-dimension vectors
- ✅ HNSW index for fast similarity search (m=16, ef_construction=64)
- ✅ GIN index on topics array for filtering
- ✅ PostgreSQL functions for semantic search:
  - search_similar_conversations() - general similarity search
  - search_conversations_by_topic() - topic-filtered search
2. Entity Layer
- ✅ Created ConversationEmbedding entity in TypeORM
- ✅ Helper methods for vector conversion (see the sketch below):
  - vectorToString() - convert a number array to PostgreSQL vector format
  - stringToVector() - parse a PostgreSQL vector value back into an array
  - cosineSimilarity() - calculate cosine similarity between two vectors
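For illustration, a minimal sketch of what these helpers could look like (the method bodies are assumptions; only the names come from the entity described above, and the entity's TypeORM columns are omitted):

// Sketch only: bodies are assumed, names match the helpers listed above;
// column decorators and entity fields are omitted, methods shown as static for brevity.
export class ConversationEmbedding {
  // Convert a number[] into the pgvector text format, e.g. "[0.1,0.2,...]"
  static vectorToString(vector: number[]): string {
    return `[${vector.join(',')}]`;
  }

  // Parse a pgvector text value back into a number[]
  static stringToVector(value: string): number[] {
    return value.replace(/^\[|\]$/g, '').split(',').map(Number);
  }

  // Cosine similarity between two equal-length vectors
  static cosineSimilarity(a: number[], b: number[]): number {
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1);
  }
}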
3. Embeddings Service (embeddings.service.ts)
- ✅ Azure OpenAI integration for text-embedding-ada-002
- ✅ Single and batch embedding generation
- ✅ Semantic similarity search with cosine distance
- ✅ Topic-based filtering support
- ✅ User statistics and health check endpoints
- ✅ Backfill capability for existing conversations
Key Features:
- generateEmbedding(text: string): Promise<EmbeddingGenerationResult>
- generateEmbeddingsBatch(texts: string[]): Promise<EmbeddingGenerationResult[]>
- storeEmbedding(conversationId, userId, messageIndex, role, content, topics)
- searchSimilarConversations(query, userId, options)
- getUserEmbeddingStats(userId)
- healthCheck()
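A usage sketch of this API (the options object and the fields on the returned values, e.g. embedding, conversationId and similarity, are assumptions beyond the signatures above):

// Usage sketch; the import path and result field names are assumed.
import { EmbeddingsService } from './embeddings/embeddings.service';

async function demoSemanticSearch(embeddingsService: EmbeddingsService, userId: string) {
  const { embedding } = await embeddingsService.generateEmbedding(
    'My baby wakes up every two hours at night',
  );
  console.log(`Generated a ${embedding.length}-dimension vector`); // expect 1536

  const matches = await embeddingsService.searchSimilarConversations(
    'toddler sleep regression',
    userId,
    { threshold: 0.7, limit: 5 },
  );
  for (const match of matches) {
    console.log(`${match.conversationId}: ${(match.similarity * 100).toFixed(0)}% similar`);
  }
}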
4. Enhanced Conversation Memory (conversation-memory.service.ts)
- ✅ Integrated embeddings service
- ✅ Semantic context retrieval:
  - getSemanticContext() - find similar past conversations
  - getConversationWithSemanticMemory() - combined traditional + semantic memory
  - storeMessageEmbedding() - async embedding storage
  - backfillConversationEmbeddings() - migrate existing conversations
Context Strategy:
- Search for semantically similar conversations using current query
- Combine with traditional message window (20 most recent)
- Prune to fit 4000 token budget
- Return enriched context for AI response
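A simplified sketch of this strategy (the 20-message window, the top semantic matches and the 4000-token budget come from these notes; the function name, merge order and the rough 4-characters-per-token estimate are assumptions):

// Sketch of the context-building step; names and the token estimator are assumptions.
interface ContextMessage {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

const TOKEN_BUDGET = 4000;
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function buildEnrichedContext(
  recentMessages: ContextMessage[],   // the 20 most recent messages
  semanticMatches: ContextMessage[],  // top semantically similar past messages
): ContextMessage[] {
  // Keep semantic matches first, then fill the remaining budget with recent messages
  const candidates = [...semanticMatches, ...recentMessages];
  const context: ContextMessage[] = [];
  let usedTokens = 0;
  for (const message of candidates) {
    const cost = estimateTokens(message.content);
    if (usedTokens + cost > TOKEN_BUDGET) break;
    context.push(message);
    usedTokens += cost;
  }
  return context;
}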
5. AI Service Integration (ai.service.ts)
- ✅ Injected EmbeddingsService through the constructor
- ✅ Automatic semantic search on every chat request
- ✅ Async, non-blocking embedding storage for new messages
- ✅ Graceful fallback if embeddings fail
Integration Flow:
async chat(userId, chatDto) {
  // 1. Get the conversation context, enriched with semantic memory
  const { context } = await conversationMemoryService
    .getConversationWithSemanticMemory(conversationId, userMessage);

  // 2. Generate the AI response using the enriched context
  const response = await generateWithAzure(context);

  // 3. Store embeddings asynchronously (non-blocking)
  conversationMemoryService.storeMessageEmbedding(...)
    .catch(err => logger.warn(...));

  return response;
}
6. AI Module Configuration
- ✅ Added EmbeddingsService to the module providers
- ✅ Added ConversationEmbedding to the TypeORM entities
- ✅ Full dependency injection
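A sketch of that wiring (the real ai.module.ts has additional imports, controllers and providers, which are omitted or only hinted at here):

// Sketch of ai.module.ts; other providers/controllers are omitted.
import { Module } from '@nestjs/common';
import { TypeOrmModule } from '@nestjs/typeorm';
import { ConversationEmbedding } from '../../database/entities/conversation-embedding.entity';
import { EmbeddingsService } from './embeddings/embeddings.service';

@Module({
  imports: [TypeOrmModule.forFeature([ConversationEmbedding])],
  providers: [EmbeddingsService /* plus AiService, ConversationMemoryService, ... */],
  exports: [EmbeddingsService],
})
export class AiModule {}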
7. Testing Endpoints (Public for Testing)
Added test endpoints in ai.controller.ts:
@Public()
@Post('test/embeddings/generate')
testGenerateEmbedding(body: { text: string })
@Public()
@Post('test/embeddings/search')
testSearchSimilar(body: { query, userId?, threshold?, limit? })
@Public()
@Get('test/embeddings/health')
testEmbeddingsHealth()
@Public()
@Get('test/embeddings/stats/:userId')
testEmbeddingsStats(userId)
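For reference, a sketch of how one of these endpoints might be wired in ai.controller.ts (the @Public() decorator's import path, the route prefix and the response shape are assumptions):

// Sketch of a single test endpoint; decorator path and response shape are assumed.
import { Body, Controller, Post } from '@nestjs/common';
import { Public } from '../../common/decorators/public.decorator'; // assumed location
import { EmbeddingsService } from './embeddings/embeddings.service';

@Controller('ai')
export class AiController {
  constructor(private readonly embeddingsService: EmbeddingsService) {}

  @Public()
  @Post('test/embeddings/generate')
  async testGenerateEmbedding(@Body() body: { text: string }) {
    const result = await this.embeddingsService.generateEmbedding(body.text);
    // Return metadata (dimension count) rather than the raw vector
    return { dimensions: result.embedding.length };
  }
}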
8. Comprehensive Test Suite (test-embeddings.js)
Created automated test script with 6 test scenarios:
- ✅ Health check verification
- ✅ Embedding generation (1536 dimensions)
- ✅ Conversation creation with automatic embedding storage
- ✅ Semantic search validation
- ✅ User statistics retrieval
- ✅ Semantic memory integration test
🔧 Technical Specifications
Vector Embeddings
- Model: Azure OpenAI text-embedding-ada-002
- Dimensions: 1536
- Similarity Metric: Cosine distance
- Indexing: HNSW (Hierarchical Navigable Small World)
- Default Threshold: 0.7 (70% similarity)
Performance Optimizations
- HNSW Parameters: m = 16 (max connections per layer), ef_construction = 64 (build quality)
- Batch Processing: Up to 100 embeddings per request
- Async Storage: Non-blocking embedding persistence
- Token Budget: 4000 tokens per context window
- Cache Strategy: Recent 20 messages + top 3 semantic matches
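As an example of the batch limit in practice, a sketch of chunked embedding generation (this wrapper is illustrative only and not part of the implementation above; the import path is assumed):

// Sketch: respect the 100-embeddings-per-request limit by chunking inputs.
import { EmbeddingsService } from './embeddings/embeddings.service'; // path assumed

const MAX_BATCH_SIZE = 100;

async function embedInChunks(embeddingsService: EmbeddingsService, texts: string[]) {
  const results: Awaited<ReturnType<EmbeddingsService['generateEmbeddingsBatch']>> = [];
  for (let i = 0; i < texts.length; i += MAX_BATCH_SIZE) {
    const chunk = texts.slice(i, i + MAX_BATCH_SIZE);
    results.push(...(await embeddingsService.generateEmbeddingsBatch(chunk)));
  }
  return results;
}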
Database Schema
CREATE TABLE conversation_embeddings (
id VARCHAR(30) PRIMARY KEY,
conversation_id VARCHAR(30) NOT NULL,
user_id VARCHAR(30) NOT NULL,
message_index INTEGER NOT NULL,
message_role VARCHAR(20) NOT NULL,
message_content TEXT NOT NULL,
embedding vector(1536) NOT NULL, -- pgvector type
topics TEXT[], -- Array of topics
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT fk_conversation FOREIGN KEY (conversation_id)
REFERENCES ai_conversations(id) ON DELETE CASCADE,
CONSTRAINT fk_user FOREIGN KEY (user_id)
REFERENCES users(id) ON DELETE CASCADE
);
-- HNSW index for fast similarity search
CREATE INDEX idx_conversation_embeddings_vector
ON conversation_embeddings
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
-- GIN index for topic filtering
CREATE INDEX idx_conversation_embeddings_topics
ON conversation_embeddings USING GIN (topics);
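For reference, one way the service could run this similarity search from TypeScript is a raw parameterized query: with vector_cosine_ops, pgvector's <=> operator returns cosine distance, so similarity is 1 - distance (the function and variable names below are assumptions):

// Sketch: cosine-distance query against the HNSW index above, via TypeORM's DataSource.
import { DataSource } from 'typeorm';

async function findSimilarConversations(
  dataSource: DataSource,
  queryEmbedding: number[],
  userId: string,
  threshold = 0.7,
  limit = 5,
) {
  const vectorLiteral = `[${queryEmbedding.join(',')}]`; // pgvector text format
  return dataSource.query(
    `SELECT conversation_id,
            message_content,
            1 - (embedding <=> $1::vector) AS similarity
       FROM conversation_embeddings
      WHERE user_id = $2
        AND 1 - (embedding <=> $1::vector) >= $3
      ORDER BY embedding <=> $1::vector
      LIMIT $4`,
    [vectorLiteral, userId, threshold, limit],
  );
}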
📊 Use Cases
1. Contextual Parenting Advice
When a parent asks: "My baby is having trouble sleeping"
The system:
- Generates embedding for the query
- Searches for similar past conversations (e.g., sleep issues, nap troubles)
- Retrieves context from semantically related discussions
- Provides personalized advice based on user's history
2. Pattern Recognition
- Identifies recurring concerns across conversations
- Suggests proactive solutions based on similar experiences
- Tracks topic evolution over time
3. Cross-Topic Insights
Connects related concerns even if discussed with different wording:
- "sleepless nights" ↔ "insomnia problems"
- "feeding difficulties" ↔ "eating challenges"
- "development delays" ↔ "milestone concerns"
🔐 Security & Privacy
- ✅ User-specific search (never cross-user)
- ✅ Cascade deletion with conversation removal
- ✅ No embedding data in API responses (only metadata)
- ✅ Rate limiting on embedding generation
- ✅ Graceful degradation if embeddings fail
📁 Files Created/Modified
New Files:
- /src/database/migrations/V014_create_conversation_embeddings.sql
- /src/database/entities/conversation-embedding.entity.ts
- /src/modules/ai/embeddings/embeddings.service.ts
- /test-embeddings.js (test suite)
Modified Files:
- /src/modules/ai/ai.module.ts - Added embeddings service
- /src/modules/ai/ai.service.ts - Integrated semantic search
- /src/modules/ai/memory/conversation-memory.service.ts - Added semantic methods
- /src/modules/ai/ai.controller.ts - Added test endpoints
- /src/database/entities/index.ts - Exported new entity
🚀 How to Test
1. Health Check
curl http://localhost:3020/api/v1/ai/test/embeddings/health
2. Generate Embedding
curl -X POST http://localhost:3020/api/v1/ai/test/embeddings/generate \
-H "Content-Type: application/json" \
-d '{"text": "My baby is not sleeping well"}'
3. Search Similar Conversations
curl -X POST http://localhost:3020/api/v1/ai/test/embeddings/search \
-H "Content-Type: application/json" \
-d '{
"query": "sleep problems",
"userId": "test_user_123",
"threshold": 0.7,
"limit": 5
}'
4. Run Automated Test Suite
node test-embeddings.js
🔄 Migration Path
For Existing Conversations:
Use the backfill method to generate embeddings for historical data (a sketch of one possible implementation follows the list below):
await conversationMemoryService.backfillConversationEmbeddings(conversationId);
This will:
- Extract all messages from the conversation
- Generate embeddings in batch
- Store with detected topics
- Skip if embeddings already exist
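A sketch of those steps (hasEmbeddings and storeEmbeddingRow are hypothetical helpers, and the conversation and result shapes are assumptions; only the flow itself comes from the list above):

// Sketch of the backfill flow; the helpers and shapes below are hypothetical.
import { EmbeddingsService } from './embeddings/embeddings.service'; // path assumed

declare function hasEmbeddings(conversationId: string): Promise<boolean>; // hypothetical
declare function storeEmbeddingRow(                                       // hypothetical
  conversationId: string,
  userId: string,
  messageIndex: number,
  role: string,
  content: string,
  topics: string[],
  embedding: number[],
): Promise<void>;

async function backfillConversation(
  embeddingsService: EmbeddingsService,
  conversation: {
    id: string;
    userId: string;
    messages: { role: string; content: string; topics?: string[] }[];
  },
): Promise<void> {
  // Skip if embeddings already exist for this conversation
  if (await hasEmbeddings(conversation.id)) return;

  // Generate embeddings for every message in a single batch
  const results = await embeddingsService.generateEmbeddingsBatch(
    conversation.messages.map((m) => m.content),
  );

  // Store each vector with its message index, role and detected topics
  await Promise.all(
    conversation.messages.map((message, index) =>
      storeEmbeddingRow(
        conversation.id,
        conversation.userId,
        index,
        message.role,
        message.content,
        message.topics ?? [],
        results[index].embedding, // assumes the result exposes the raw vector
      ),
    ),
  );
}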
📈 Future Enhancements
Potential Improvements:
- Embedding Model Upgrades: Support for newer embedding models (e.g., text-embedding-3-small / text-embedding-3-large)
- Multi-vector Search: Combine multiple query embeddings
- Hybrid Search: BM25 + vector similarity
- Topic Modeling: Automatic topic extraction with clustering
- Reranking: Post-search relevance scoring
- Caching: Embedding cache for frequent queries
Performance Tuning:
- IVFFlat index for larger datasets (>1M vectors)
- Quantization for reduced storage
- Approximate search for better speed
✅ Verification Checklist
- pgvector extension installed and functional
- Migration V014 applied successfully
- ConversationEmbedding entity created
- EmbeddingsService implemented with Azure OpenAI
- Conversation memory enhanced with semantic search
- AI service integrated with embeddings
- Test endpoints exposed (public for testing)
- Comprehensive test suite created
- Database indexes optimized
- Error handling and fallbacks implemented
- Documentation complete
🎉 Status: COMPLETE & READY FOR TESTING
The embeddings-based conversation memory system is fully implemented and integrated into the Maternal App AI service. The system provides semantic search capabilities that enhance the AI's ability to provide contextual, personalized parenting advice based on the user's conversation history.
Note: The test endpoints in ai.controller.ts are marked as @Public() for testing purposes. Remember to remove or properly secure these endpoints before production deployment.