# AI Chat Vector Search Implementation Plan
## Biblical Guide - Vector Database Integration & Multi-Language Search

**Document Version:** 1.0
**Date:** January 2025
**Status:** Implementation Plan

---

## Table of Contents

1. [Executive Summary](#1-executive-summary)
2. [Current State Analysis](#2-current-state-analysis)
3. [Requirements & Specifications](#3-requirements--specifications)
4. [Architecture Overview](#4-architecture-overview)
5. [Database Schema Analysis](#5-database-schema-analysis)
6. [Implementation Plan](#6-implementation-plan)
7. [Vector Search Logic](#7-vector-search-logic)
8. [API Integration](#8-api-integration)
9. [Testing Strategy](#9-testing-strategy)
10. [Deployment & Monitoring](#10-deployment--monitoring)

---

## 1. Executive Summary

### 1.1 Objective

Implement a robust vector search system for the AI chat that:
- ✅ Searches through all Bible versions in the user's language
- ✅ Returns comprehensive answers with Bible version citations
- ✅ Falls back to English versions when language-specific content is unavailable
- ✅ Maintains conversation context and accuracy
- ✅ Provides multilingual responses (English, Romanian, Spanish, Italian)

### 1.2 Current Status

- **Completed:** Vector embeddings for ALL Bible versions in dedicated vector database
- **Pending:** Integration of vector search into AI chat system
- **Pending:** Multi-language search logic implementation
- **Pending:** Fallback mechanism for incomplete translations

### 1.3 Success Criteria

1. **Accuracy:** AI chat returns biblically accurate answers with verse citations
2. **Language Support:** Searches user's language first, falls back to English when needed
3. **Performance:** Vector search completes in < 2 seconds
4. **Coverage:** All 4 languages supported (en, ro, es, it)
5. **Citations:** Always provides Bible version and verse references

---

## 2. Current State Analysis

### 2.1 Codebase Investigation Tasks

**TASK 1: Identify Current AI Chat Implementation**
- [ ] Locate AI chat API routes (`/app/api/chat/`)
- [ ] Find chat component files (`/components/chat/`)
- [ ] Identify current AI provider (OpenAI, Anthropic, etc.)
- [ ] Review conversation storage mechanism
- [ ] Check current embedding/vector search usage (if any)

**TASK 2: Database Schema Review**
- [ ] Analyze vector database structure
- [ ] Document all Bible version tables
- [ ] Verify embedding dimensions and metadata
- [ ] Check indexing configuration
- [ ] Review language-to-version mapping

**TASK 3: API Connection Audit**
- [ ] Verify AI API credentials (OpenAI, Anthropic, etc.)
- [ ] Check vector database connection (Supabase, Pinecone, pgvector, etc.)
- [ ] Test API rate limits and quotas
- [ ] Validate environment variables
- [ ] Test connection stability

**TASK 4: Current Search Flow Analysis**
- [ ] Document existing question → answer flow
- [ ] Identify where vector search should be integrated
- [ ] Review prompt engineering approach
- [ ] Check context window management
- [ ] Analyze response formatting

---

## 3. Requirements & Specifications

### 3.1 Functional Requirements

#### FR-1: Multi-Language Vector Search
**Description:** System must search Bible versions in user's language first

**User Languages → Bible Versions Mapping:**
```
English (en) → Bible versions:
  - KJV (King James Version)
  - NIV (New International Version)
  - ESV (English Standard Version)
  - NASB (New American Standard Bible)
  - [other English versions]

Romanian (ro) → Bible versions:
  - Cornilescu 1924
  - Cornilescu 2024
  - Dumitru Cornilescu
  - [other Romanian versions]

Spanish (es) → Bible versions:
  - Reina-Valera 1960
  - Nueva Versión Internacional (NVI)
  - Biblia de las Américas
  - [other Spanish versions]

Italian (it) → Bible versions:
  - Nuova Riveduta 2006
  - La Sacra Bibbia (Diodati)
  - Conferenza Episcopale Italiana
  - [other Italian versions]
```

**Acceptance Criteria:**
- System identifies user language from chat context
- Searches all versions available in that language
- Returns results from multiple versions when applicable
- Cites version name with each reference

#### FR-2: Fallback to English
**Description:** When no language-specific content found, search English versions

**Fallback Conditions:**
1. No Bible versions available in user's language
2. Bible incomplete in user's language (missing books/verses)
3. Vector search returns no results in user's language
4. Confidence score below threshold

**Acceptance Criteria:**
- Automatic fallback when conditions met
- User informed about language switch (transparent)
- Response translated to user's language
- English version citations included

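The four fallback conditions in FR-2 can be read as one decision function. The sketch below is one possible reading, not code from the repository: the `VersionInfo` and `ScoredVerse` shapes, the `fallbackReason` name, and the `0.7` default are assumptions for illustration, and condition 2 is interpreted as "every version in the user's language is flagged incomplete".

```typescript
// Hypothetical shapes for illustration; the real types may differ.
interface VersionInfo {
  language: string;
  isComplete: boolean;
}

interface ScoredVerse {
  similarity: number; // cosine similarity in [-1, 1]
}

type FallbackReason =
  | 'no-versions'
  | 'incomplete-translation'
  | 'no-results'
  | 'low-confidence';

// Returns why an English fallback is needed, or null when it is not.
// Conditions are checked in the order FR-2 lists them.
export function fallbackReason(
  language: string,
  versions: VersionInfo[],
  results: ScoredVerse[],
  minTopSimilarity: number = 0.7, // assumed threshold
): FallbackReason | null {
  if (language === 'en') return null; // English has nothing to fall back to

  const own = versions.filter(v => v.language === language);
  if (own.length === 0) return 'no-versions';
  if (own.every(v => !v.isComplete)) return 'incomplete-translation';
  if (results.length === 0) return 'no-results';

  const best = Math.max(...results.map(r => r.similarity));
  if (best < minTopSimilarity) return 'low-confidence';

  return null;
}
```

Returning the reason rather than a boolean makes it easier to tell the user why the answer cites English versions, which the "User informed about language switch" criterion asks for.
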
#### FR-3: Citation Requirements
**Description:** All answers must include proper Bible citations

**Citation Format:**
```
[Version Abbreviation] [Book] [Chapter]:[Verse]

Examples:
- "According to KJV John 3:16..."
- "Cornilescu Genesis 1:1 states..."
- "As written in NIV Romans 8:28..."
```

**Acceptance Criteria:**
- Every Bible reference includes version name
- Book, chapter, verse always specified
- Multiple versions cited when providing comprehensive answer
- Citations are clickable links to Bible reader

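To make citations clickable, the response text has to be scanned for the format above. The following is a minimal sketch under stated assumptions: the list of known version abbreviations is passed in by the caller, book names are a single word with an optional leading digit (so "1 Corinteni" works but "Song of Solomon" does not), and the `/bible/...` route in `toReaderHref` is hypothetical, since the plan does not specify the Bible reader's URL scheme.

```typescript
export interface Citation {
  version: string;
  book: string;
  chapter: number;
  verse: number;
}

// Escape regex metacharacters so version names are matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Finds citations such as "KJV John 3:16" or "[Cornilescu] 1 Corinteni 13:4".
export function findCitations(text: string, knownVersions: string[]): Citation[] {
  if (knownVersions.length === 0) return [];
  const alternatives = knownVersions.map(escapeRegExp).join('|');
  const pattern = new RegExp(
    `\\[?\\b(${alternatives})\\b\\]?\\s+((?:[1-3]\\s)?[^\\s\\d\\[\\]:,.]+)\\s+(\\d+):(\\d+)`,
    'g',
  );

  const found: Citation[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    found.push({
      version: m[1],
      book: m[2],
      chapter: Number(m[3]),
      verse: Number(m[4]),
    });
  }
  return found;
}

// Hypothetical route; adjust to the Bible reader's real URL scheme.
export function toReaderHref(c: Citation): string {
  return `/bible/${encodeURIComponent(c.version)}/${encodeURIComponent(c.book)}/${c.chapter}#v${c.verse}`;
}
```

Restricting matches to known abbreviations avoids treating ordinary words before a verse reference as a version name.
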
#### FR-4: Comprehensive Answers
**Description:** Search multiple versions and synthesize comprehensive response

**Answer Structure:**
1. Direct answer to user's question
2. Supporting verses from multiple versions
3. Cross-references when relevant
4. Contextual explanation
5. Links to full passages in Bible reader

**Acceptance Criteria:**
- Uses 2-4 Bible versions in typical response
- Provides context around cited verses
- Highlights version differences when significant
- Offers to show full passage

### 3.2 Technical Requirements

#### TR-1: Vector Database Structure
**Database Type:** PostgreSQL with pgvector extension (or Supabase Vector)

**Table Schema (per Bible version):**
```sql
-- Example: bible_vectors_kjv (King James Version)
CREATE TABLE bible_vectors_kjv (
    id SERIAL PRIMARY KEY,
    book VARCHAR(50) NOT NULL,        -- e.g., "Genesis"
    chapter INTEGER NOT NULL,
    verse INTEGER NOT NULL,
    text TEXT NOT NULL,               -- The verse text
    embedding vector(1536),           -- OpenAI ada-002 embeddings
    metadata JSONB,                   -- Additional data
    version VARCHAR(20) DEFAULT 'KJV',
    language VARCHAR(5) DEFAULT 'en',

    UNIQUE(book, chapter, verse)
);

-- Vector similarity index
CREATE INDEX ON bible_vectors_kjv
USING ivfflat (embedding vector_cosine_ops);
```

**Version Tables:**
- `bible_vectors_kjv` (English)
- `bible_vectors_niv` (English)
- `bible_vectors_esv` (English)
- `bible_vectors_cornilescu` (Romanian)
- `bible_vectors_rvr1960` (Spanish)
- `bible_vectors_nvi_spanish` (Spanish)
- `bible_vectors_nuova_riveduta` (Italian)
- [additional tables for each version]

#### TR-2: Vector Search Parameters
```typescript
interface VectorSearchParams {
  query: string;                // User's question (embedded)
  languages: string[];          // e.g., ['ro', 'en']
  limit: number;                // Top K results (default: 10)
  similarityThreshold: number;  // Minimum similarity (default: 0.7)
  versions?: string[];          // Specific versions to search
}

interface SearchResult {
  book: string;
  chapter: number;
  verse: number;
  text: string;
  version: string;
  language: string;
  similarity: number;           // Cosine similarity score
}
```

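The `similarity` field and the `similarityThreshold` parameter both refer to cosine similarity. pgvector's `<=>` operator returns cosine *distance*, so the score used in this plan is `1 - distance`. As a reference for what the database computes, here is a small in-memory sketch; the function names are illustrative and nothing here is meant to replace the SQL-side calculation.

```typescript
// Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|).
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Mirrors the SQL filter `1 - (embedding <=> query) > threshold`.
export function passesThreshold(
  a: number[],
  b: number[],
  threshold: number = 0.7,
): boolean {
  return cosineSimilarity(a, b) > threshold;
}
```

Identical directions score 1, orthogonal vectors score 0, and opposite directions score -1, which is why a threshold such as 0.7 keeps only closely related verses.
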
#### TR-3: AI Model Configuration
```typescript
interface AIConfig {
  provider: 'openai' | 'anthropic';
  model: string;                // e.g., 'gpt-4-turbo' or 'claude-3-5-sonnet'
  temperature: number;          // 0.3 for factual responses
  maxTokens: number;            // 1500 for typical responses
  systemPrompt: string;         // Biblical AI assistant instructions
}
```

#### TR-4: Performance Requirements
- Vector search: < 2 seconds
- Full AI response: < 5 seconds
- Concurrent users: Support 100+
- Embedding cache: 24 hours
- Database connection pool: 10-20 connections

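TR-4 asks for a 24-hour embedding cache. One way to get both an expiry and a size cap is sketched below; the `TtlCache` name, the injectable `now` clock (there to make expiry testable), and the eviction policy (drop the oldest inserted entry) are assumptions for illustration, not part of the existing code.

```typescript
interface Entry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

export class TtlCache<V> {
  private readonly entries = new Map<string, Entry<V>>();
  private readonly ttlMs: number;
  private readonly maxSize: number;
  private readonly now: () => number;

  constructor(ttlMs: number, maxSize: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.maxSize = maxSize;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    if (!this.entries.has(key) && this.entries.size >= this.maxSize) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.delete(key); // re-insert so a refreshed key counts as newest
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A cache for query embeddings could then be created as `new TtlCache<number[]>(24 * 60 * 60 * 1000, 1000)`, where the 1000-entry cap is an arbitrary example value.
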
---

## 4. Architecture Overview

### 4.1 System Components

```
┌─────────────────────────────────────────────────────────────┐
│                       User Interface                        │
│                      (Chat Component)                       │
└────────────────────────┬────────────────────────────────────┘
                         │
                         │ POST /api/chat
                         │ { message, locale, conversationId }
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                       Chat API Route                        │
│  1. Identify user language                                  │
│  2. Generate embedding for question                         │
│  3. Trigger multi-language vector search                    │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                    Vector Search Service                    │
│  1. Search primary language versions                        │
│  2. Check result quality/completeness                       │
│  3. Fallback to English if needed                           │
│  4. Return top results with metadata                        │
└────────────────────────┬────────────────────────────────────┘
                         │
                         │ Query multiple tables
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                Vector Database (PostgreSQL)                 │
│  ┌──────────────────┐  ┌──────────────────┐                 │
│  │ bible_vectors_kjv│  │bible_vectors_niv │  [English]      │
│  └──────────────────┘  └──────────────────┘                 │
│  ┌──────────────────┐  ┌──────────────────┐                 │
│  │bible_vectors_    │  │bible_vectors_    │  [Romanian]     │
│  │  cornilescu      │  │  cornilescu2024  │                 │
│  └──────────────────┘  └──────────────────┘                 │
│  ┌──────────────────┐  ┌──────────────────┐                 │
│  │bible_vectors_    │  │bible_vectors_nvi_│  [Spanish]      │
│  │  rvr1960         │  │  spanish         │                 │
│  └──────────────────┘  └──────────────────┘                 │
│  ┌──────────────────┐                                       │
│  │bible_vectors_    │                        [Italian]      │
│  │  nuova_riveduta  │                                       │
│  └──────────────────┘                                       │
└────────────────────────┬────────────────────────────────────┘
                         │ Return results
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                     AI Context Builder                      │
│  1. Format search results into context                      │
│  2. Build system prompt with instructions                   │
│  3. Include conversation history                            │
│  4. Add citation requirements                               │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│               AI Provider (OpenAI/Anthropic)                │
│  Generate comprehensive answer with citations               │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                     Response Formatter                      │
│  1. Parse citations                                         │
│  2. Create links to Bible reader                            │
│  3. Format markdown                                         │
│  4. Return to user                                          │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
                User receives answer
```

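The "AI Context Builder" box in the diagram combines four inputs into the message list sent to the AI provider. A minimal sketch of that step follows. It is an illustration only: the `buildMessages` name, the choice to append the verse context to the system message, and the `maxHistory` cap of 6 turns are assumptions, and the `ChatMessage` shape is defined locally so the block stands alone.

```typescript
export interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Assembles: system prompt + verse context, recent history, then the question.
export function buildMessages(
  systemPrompt: string,
  verseContext: string,
  history: ChatMessage[],
  question: string,
  maxHistory: number = 6, // assumed cap to keep the context window small
): ChatMessage[] {
  // Drop earlier system messages so only the current instructions apply.
  const turns = history.filter(m => m.role !== 'system');
  // slice(-0) would return everything, so handle a zero cap explicitly.
  const recent = maxHistory > 0 ? turns.slice(-maxHistory) : [];

  return [
    { role: 'system', content: `${systemPrompt}\n\n${verseContext}` },
    ...recent,
    { role: 'user', content: question },
  ];
}
```

Keeping this step as a pure function makes it straightforward to unit-test prompt assembly without calling the AI provider.
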
### 4.2 Data Flow

**Scenario 1: Romanian User Asks Question**
```
User (ro): "Ce spune Biblia despre iubire?" (What does the Bible say about love?)
    ↓
1. Detect language: Romanian (ro)
2. Generate embedding of question
3. Search Romanian Bible versions:
   - bible_vectors_cornilescu
   - bible_vectors_cornilescu2024
4. Find top 10 results about love (1 Corinteni 13, Ioan 3:16, etc.)
5. Build AI prompt with Romanian verses
6. AI generates answer in Romanian
7. Response: "Biblia vorbește mult despre iubire. Cornilescu 1 Corinteni 13:4
   spune: 'Dragostea este îndelung răbdătoare...' Și Ioan 3:16..."
```

**Scenario 2: Italian User - Incomplete Bible**
```
User (it): "Chi era Giobbe?" (Who was Job?)
    ↓
1. Detect language: Italian (it)
2. Generate embedding
3. Search Italian Bible versions:
   - bible_vectors_nuova_riveduta
4. Results found: 3 verses (incomplete Book of Job)
5. Confidence score: LOW (0.5 < 0.7 threshold)
6. FALLBACK: Search English versions (KJV, NIV, ESV)
7. Find comprehensive results in English
8. AI translates context to Italian
9. Response: "Giobbe era un uomo giusto... (Job was a righteous man...)
   [Da KJV Job 1:1] 'There was a man in the land of Uz, whose name was Job...'"
```

---

## 5. Database Schema Analysis

### 5.1 Discovery Tasks

**TASK 5: Enumerate All Vector Tables**
```sql
-- Query to find all Bible vector tables
SELECT tablename
FROM pg_tables
WHERE tablename LIKE 'bible_vectors_%'
ORDER BY tablename;
```

**Expected Output:**
```
bible_vectors_kjv
bible_vectors_niv
bible_vectors_esv
bible_vectors_cornilescu
bible_vectors_rvr1960
bible_vectors_nvi_spanish
bible_vectors_nuova_riveduta
[... additional tables]
```

**TASK 6: Analyze Table Structure**
```sql
-- Check schema of a sample table
SELECT column_name, data_type, character_maximum_length
FROM information_schema.columns
WHERE table_name = 'bible_vectors_kjv';

-- Check indexing
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'bible_vectors_kjv';

-- Check embedding dimensions.
-- vector_dims() is pgvector's function for this; dividing pg_column_size()
-- by 4 overcounts because the stored value also includes header bytes.
SELECT vector_dims(embedding) AS dimensions
FROM bible_vectors_kjv
LIMIT 1;
```

**TASK 7: Create Language-to-Version Mapping**
```sql
-- Create configuration table
CREATE TABLE IF NOT EXISTS bible_version_config (
    id SERIAL PRIMARY KEY,
    table_name VARCHAR(100) UNIQUE NOT NULL,
    version_name VARCHAR(100) NOT NULL,
    version_abbreviation VARCHAR(20) NOT NULL,
    language VARCHAR(5) NOT NULL,
    is_complete BOOLEAN DEFAULT true,
    books_count INTEGER,
    verses_count INTEGER,
    created_at TIMESTAMP DEFAULT NOW(),
    metadata JSONB
);

-- Insert mappings
INSERT INTO bible_version_config
    (table_name, version_name, version_abbreviation, language, is_complete)
VALUES
    ('bible_vectors_kjv', 'King James Version', 'KJV', 'en', true),
    ('bible_vectors_niv', 'New International Version', 'NIV', 'en', true),
    ('bible_vectors_esv', 'English Standard Version', 'ESV', 'en', true),
    ('bible_vectors_cornilescu', 'Dumitru Cornilescu 1924', 'Cornilescu', 'ro', true),
    ('bible_vectors_rvr1960', 'Reina-Valera 1960', 'RVR1960', 'es', true),
    ('bible_vectors_nvi_spanish', 'Nueva Versión Internacional', 'NVI', 'es', true),
    ('bible_vectors_nuova_riveduta', 'Nuova Riveduta 2006', 'NR2006', 'it', false);
    -- Add all your versions...
```

### 5.2 Version Statistics Query

```sql
-- Get statistics for one version.
-- {table_name} is a placeholder, not valid SQL: a table name cannot be read
-- from a column, so substitute each table_name from bible_version_config and
-- run this query once per version.
SELECT
    bvc.version_abbreviation,
    bvc.language,
    bvc.is_complete,
    COUNT(DISTINCT bv.book) as books_count,
    COUNT(*) as verses_count
FROM bible_version_config bvc
JOIN LATERAL (
    SELECT book FROM {table_name}  -- Dynamic table name
) bv ON true
WHERE bvc.table_name = '{table_name}'  -- Match only the config row for this table
GROUP BY bvc.version_abbreviation, bvc.language, bvc.is_complete
ORDER BY bvc.language, bvc.version_abbreviation;
```

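Because the statistics query needs a table name substituted into the SQL text, that name should be validated before it is interpolated. The sketch below builds the per-table counting query and rejects anything that does not look like a vector table. The function name, the allowed-name pattern, and the simplified query (counts only, without the join to `bible_version_config`) are assumptions for illustration.

```typescript
// Only lowercase names of the form bible_vectors_<suffix> are accepted.
const VECTOR_TABLE_PATTERN = /^bible_vectors_[a-z0-9_]+$/;

export function isVectorTableName(name: string): boolean {
  return VECTOR_TABLE_PATTERN.test(name);
}

// Returns SQL that counts distinct books and total verses in one table.
// Throws instead of interpolating a name that fails validation.
export function buildVersionStatsSql(tableName: string): string {
  if (!isVectorTableName(tableName)) {
    throw new Error(`Refusing to query unexpected table name: ${tableName}`);
  }
  return [
    'SELECT COUNT(DISTINCT book) AS books_count,',
    '       COUNT(*) AS verses_count',
    `FROM ${tableName}`,
  ].join('\n');
}
```

A caller would loop over the `table_name` values from `bible_version_config`, run the returned SQL for each, and write the counts back into `books_count` and `verses_count`.
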
---

## 6. Implementation Plan

### Phase 1: Environment & Connection Verification (Day 1)

#### Step 1.1: Check AI API Credentials

**File to Check:** `/root/biblical-guide/.env`

```bash
# Required environment variables
OPENAI_API_KEY=sk-...                    # OpenAI API key
ANTHROPIC_API_KEY=sk-ant-...             # Anthropic API key (if using Claude)

# Vector Database Connection
VECTOR_DB_URL=postgresql://...           # Supabase or PostgreSQL connection
VECTOR_DB_PASSWORD=...                   # Database password

# Optional: Embedding API
EMBEDDING_API_KEY=...                    # If separate from main AI key
```

**Verification Script:**
```typescript
// scripts/verify-ai-connection.ts
import OpenAI from 'openai';

async function verifyConnections() {
  console.log('🔍 Verifying AI API connections...\n');

  // 1. Check OpenAI
  try {
    const openai = new OpenAI({
      apiKey: process.env.OPENAI_API_KEY,
    });

    const response = await openai.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: 'Test' }],
      max_tokens: 5,
    });

    console.log('✅ OpenAI API: Connected');
    console.log(`   Model: ${response.model}`);
  } catch (error: any) {
    console.log('❌ OpenAI API: Failed');
    console.log(`   Error: ${error.message}`);
  }

  // 2. Check Vector Database
  try {
    const { Pool } = await import('pg');
    const pool = new Pool({
      connectionString: process.env.VECTOR_DB_URL,
    });

    const result = await pool.query('SELECT version()');
    console.log('✅ Vector Database: Connected');
    console.log(`   PostgreSQL: ${result.rows[0].version.split(' ')[1]}`);

    // Check for pgvector extension
    const extResult = await pool.query(
      "SELECT * FROM pg_extension WHERE extname = 'vector'"
    );

    if (extResult.rows.length > 0) {
      console.log('✅ pgvector extension: Installed');
    } else {
      console.log('⚠️  pgvector extension: Not found');
    }

    await pool.end();
  } catch (error: any) {
    console.log('❌ Vector Database: Failed');
    console.log(`   Error: ${error.message}`);
  }

  // 3. List all Bible vector tables
  try {
    const { Pool } = await import('pg');
    const pool = new Pool({
      connectionString: process.env.VECTOR_DB_URL,
    });

    const tables = await pool.query(`
      SELECT tablename
      FROM pg_tables
      WHERE tablename LIKE 'bible_vectors_%'
      ORDER BY tablename
    `);

    console.log(`\n📚 Found ${tables.rows.length} Bible version tables:`);
    tables.rows.forEach(row => {
      console.log(`   - ${row.tablename}`);
    });

    await pool.end();
  } catch (error: any) {
    console.log('❌ Table enumeration failed');
    console.log(`   Error: ${error.message}`);
  }
}

verifyConnections();
```

**Run verification:**
```bash
npx tsx scripts/verify-ai-connection.ts
```

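Before making any network calls, it can help to fail fast when a required variable is missing or blank. The sketch below is an assumption-laden illustration: it treats `OPENAI_API_KEY` and `VECTOR_DB_URL` as the required pair because those are the two variables the verification script reads, and the function name is hypothetical.

```typescript
// Variables the verification script above reads directly (assumed required).
const REQUIRED_ENV_VARS: string[] = ['OPENAI_API_KEY', 'VECTOR_DB_URL'];

// Returns the names of required variables that are unset or blank.
export function findMissingEnvVars(
  env: Record<string, string | undefined>,
  required: string[] = REQUIRED_ENV_VARS,
): string[] {
  return required.filter(name => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}
```

In a script this would typically be called as `findMissingEnvVars(process.env)`, printing the returned names and exiting early when the list is not empty.
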
#### Step 1.2: Identify Current Chat Implementation

**Files to Locate:**
- [ ] `/app/api/chat/route.ts` - Main chat API endpoint
- [ ] `/components/chat/floating-chat.tsx` - Chat UI component
- [ ] `/lib/ai/` - AI utility functions (if exists)
- [ ] `/lib/embeddings/` - Embedding functions (if exists)

**Analysis Checklist:**
- [ ] What AI provider is currently used? (OpenAI/Anthropic)
- [ ] Is vector search currently implemented?
- [ ] How are embeddings generated?
- [ ] Where is conversation history stored?
- [ ] How are responses formatted?

---

### Phase 2: Vector Search Service Implementation (Day 2-3)

#### Step 2.1: Create Vector Database Connection

**File:** `/lib/vector-db/index.ts`

```typescript
import { Pool } from 'pg';

// Singleton pool instance
let pool: Pool | null = null;

export function getVectorDbPool(): Pool {
  if (!pool) {
    pool = new Pool({
      connectionString: process.env.VECTOR_DB_URL,
      max: 20,
      idleTimeoutMillis: 30000,
      connectionTimeoutMillis: 2000,
    });
  }
  return pool;
}

export async function testConnection(): Promise<boolean> {
  try {
    const client = await getVectorDbPool().connect();
    await client.query('SELECT 1');
    client.release();
    return true;
  } catch (error) {
    console.error('Vector DB connection failed:', error);
    return false;
  }
}
```

#### Step 2.2: Create Bible Version Configuration Service

**File:** `/lib/vector-db/version-config.ts`

```typescript
import { getVectorDbPool } from './index';

export interface BibleVersionConfig {
  tableName: string;
  versionName: string;
  versionAbbreviation: string;
  language: string;
  isComplete: boolean;
  booksCount?: number;
  versesCount?: number;
}

// In-memory cache (refreshed every 24h)
let versionCache: BibleVersionConfig[] | null = null;
let cacheTimestamp: number = 0;
const CACHE_DURATION = 24 * 60 * 60 * 1000; // 24 hours

export async function getBibleVersionsByLanguage(
  language: string
): Promise<BibleVersionConfig[]> {
  const now = Date.now();

  // Return cached data if valid
  if (versionCache && (now - cacheTimestamp) < CACHE_DURATION) {
    return versionCache.filter(v => v.language === language);
  }

  // Fetch from database
  const pool = getVectorDbPool();
  const result = await pool.query<BibleVersionConfig>(`
    SELECT
      table_name as "tableName",
      version_name as "versionName",
      version_abbreviation as "versionAbbreviation",
      language,
      is_complete as "isComplete",
      books_count as "booksCount",
      verses_count as "versesCount"
    FROM bible_version_config
    ORDER BY language, version_abbreviation
  `);

  versionCache = result.rows;
  cacheTimestamp = now;

  return versionCache.filter(v => v.language === language);
}

export async function getAllVersions(): Promise<BibleVersionConfig[]> {
  const now = Date.now();

  if (versionCache && (now - cacheTimestamp) < CACHE_DURATION) {
    return versionCache;
  }

  const pool = getVectorDbPool();
  const result = await pool.query<BibleVersionConfig>(`
    SELECT
      table_name as "tableName",
      version_name as "versionName",
      version_abbreviation as "versionAbbreviation",
      language,
      is_complete as "isComplete",
      books_count as "booksCount",
      verses_count as "versesCount"
    FROM bible_version_config
    ORDER BY language, version_abbreviation
  `);

  versionCache = result.rows;
  cacheTimestamp = now;

  return versionCache;
}

export function clearCache(): void {
  versionCache = null;
  cacheTimestamp = 0;
}
```

#### Step 2.3: Create Embedding Service

**File:** `/lib/ai/embeddings.ts`

```typescript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Embedding cache (simple in-memory cache, size-limited, no expiry)
const embeddingCache = new Map<string, number[]>();
const CACHE_MAX_SIZE = 1000;

export async function generateEmbedding(text: string): Promise<number[]> {
  // Check cache first
  const cached = embeddingCache.get(text);
  if (cached) {
    return cached;
  }

  try {
    const response = await openai.embeddings.create({
      model: 'text-embedding-ada-002',
      input: text,
    });

    const embedding = response.data[0].embedding;

    // Add to cache (with size limit)
    if (embeddingCache.size >= CACHE_MAX_SIZE) {
      // Remove oldest entry (a Map iterates in insertion order)
      const firstKey = embeddingCache.keys().next().value;
      if (firstKey !== undefined) {
        embeddingCache.delete(firstKey);
      }
    }
    embeddingCache.set(text, embedding);

    return embedding;
  } catch (error) {
    console.error('Embedding generation failed:', error);
    throw new Error('Failed to generate embedding');
  }
}

export function clearEmbeddingCache(): void {
  embeddingCache.clear();
}
```

#### Step 2.4: Create Multi-Language Vector Search Service

**File:** `/lib/vector-db/search.ts`

```typescript
import { getVectorDbPool } from './index';
import { getBibleVersionsByLanguage } from './version-config';
import { generateEmbedding } from '../ai/embeddings';

export interface VectorSearchResult {
  book: string;
  chapter: number;
  verse: number;
  text: string;
  version: string;
  versionName: string;
  language: string;
  similarity: number;
}

export interface VectorSearchOptions {
  limit?: number;               // Default: 10
  similarityThreshold?: number; // Default: 0.7
  includeMetadata?: boolean;    // Default: true
}

/**
 * Search Bible verses across all versions in specified languages
 *
 * @param query - User's question
 * @param languages - Languages to search (e.g., ['ro', 'en'])
 * @param options - Search options
 * @returns Array of search results sorted by similarity
 */
export async function searchBibleVectors(
  query: string,
  languages: string[],
  options: VectorSearchOptions = {}
): Promise<VectorSearchResult[]> {
  const {
    limit = 10,
    similarityThreshold = 0.7,
  } = options;

  console.log(`🔍 Searching vectors for languages: ${languages.join(', ')}`);

  // 1. Generate embedding for query
  const queryEmbedding = await generateEmbedding(query);
  const embeddingString = `[${queryEmbedding.join(',')}]`;

  const pool = getVectorDbPool();
  const allResults: VectorSearchResult[] = [];

  // 2. Search each language
  for (const language of languages) {
    console.log(`  Searching ${language} versions...`);

    // Get all versions for this language
    const versions = await getBibleVersionsByLanguage(language);

    if (versions.length === 0) {
      console.log(`  ⚠️  No versions found for ${language}`);
      continue;
    }

    // 3. Search each version table
    for (const version of versions) {
      try {
        // One query per version table. Values are bound as parameters; a
        // table name cannot be, so it is interpolated and must only ever
        // come from bible_version_config, never from user input.
        const sql = `
          SELECT
            book,
            chapter,
            verse,
            text,
            version,
            1 - (embedding <=> $1::vector) as similarity
          FROM ${version.tableName}
          WHERE 1 - (embedding <=> $1::vector) > $2
          ORDER BY embedding <=> $1::vector
          LIMIT $3
        `;

        const result = await pool.query(sql, [
          embeddingString,
          similarityThreshold,
          limit,
        ]);

        // Add results with version metadata
        const versionResults: VectorSearchResult[] = result.rows.map(row => ({
          book: row.book,
          chapter: row.chapter,
          verse: row.verse,
          text: row.text,
          version: version.versionAbbreviation,
          versionName: version.versionName,
          language: language,
          similarity: row.similarity,
        }));

        allResults.push(...versionResults);

        console.log(`    ✓ ${version.versionAbbreviation}: ${versionResults.length} results`);
      } catch (error: any) {
        console.error(`    ✗ Error searching ${version.tableName}:`, error.message);
      }
    }
  }

  // 4. Sort all results by similarity and limit
  allResults.sort((a, b) => b.similarity - a.similarity);
  const topResults = allResults.slice(0, limit);

  console.log(`✅ Total results: ${topResults.length}`);
  return topResults;
}

/**
 * Search with fallback logic:
 * 1. Search primary language
 * 2. If insufficient results, fallback to English
 *
 * @param query - User's question
 * @param primaryLanguage - User's language (e.g., 'ro')
 * @param options - Search options
 * @returns Search results with fallback indicator
 */
export async function searchWithFallback(
  query: string,
  primaryLanguage: string,
  options: VectorSearchOptions = {}
): Promise<{
  results: VectorSearchResult[];
  usedFallback: boolean;
  searchedLanguages: string[];
}> {
  const { limit = 10, similarityThreshold = 0.7 } = options;

  // Search primary language first
  const primaryResults = await searchBibleVectors(
    query,
    [primaryLanguage],
    { limit, similarityThreshold }
  );

  // Check if we need fallback: fewer than 3 results (including none),
  // or a best match scoring below 0.75
  const needsFallback = (
    primaryResults.length < 3 ||
    (primaryResults[0]?.similarity || 0) < 0.75
  );

  if (!needsFallback || primaryLanguage === 'en') {
    return {
      results: primaryResults,
      usedFallback: false,
      searchedLanguages: [primaryLanguage],
    };
  }

  // Fallback to English
  console.log('⚠️  Insufficient results, falling back to English...');
  const englishResults = await searchBibleVectors(
    query,
    ['en'],
    { limit, similarityThreshold }
  );

  // Combine results, ranked purely by similarity. Primary-language verses
  // get no extra weight, so English results can outrank them.
  const combinedResults = [...primaryResults, ...englishResults]
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, limit);

  return {
    results: combinedResults,
    usedFallback: true,
    searchedLanguages: [primaryLanguage, 'en'],
  };
}
```

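Ranking the merged results purely by similarity can return ten verses from a single version, while FR-4 expects a typical answer to draw on 2-4 versions. One possible adjustment, sketched here as an assumption rather than as part of the plan, is to cap how many results any one version may contribute before taking the top K. The `selectAcrossVersions` name and the `maxPerVersion` parameter are hypothetical.

```typescript
export interface RankedVerse {
  version: string;
  similarity: number;
}

// Picks up to `limit` results in descending similarity order, taking at most
// `maxPerVersion` from any single Bible version.
export function selectAcrossVersions<T extends RankedVerse>(
  results: T[],
  limit: number,
  maxPerVersion: number,
): T[] {
  const perVersion = new Map<string, number>();
  const picked: T[] = [];

  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...results].sort((a, b) => b.similarity - a.similarity);

  for (const result of sorted) {
    if (picked.length >= limit) break;
    const used = perVersion.get(result.version) ?? 0;
    if (used >= maxPerVersion) continue; // this version has had its share
    perVersion.set(result.version, used + 1);
    picked.push(result);
  }
  return picked;
}
```

With `limit = 10` and `maxPerVersion = 4`, a language with three versions would contribute at least two of them to any full result set, at the cost of occasionally dropping a slightly higher-scoring verse.
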
---

### Phase 3: AI Chat Integration (Day 4-5)

#### Step 3.1: Create AI Response Generator

**File:** `/lib/ai/chat.ts`

```typescript
import OpenAI from 'openai';
import { VectorSearchResult } from '../vector-db/search';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export interface ChatMessage {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

export interface GenerateResponseOptions {
  searchResults: VectorSearchResult[];
  userQuestion: string;
  userLanguage: string;
  conversationHistory?: ChatMessage[];
  usedFallback: boolean;
}

/**
 * Build system prompt for Biblical AI assistant
 */
function buildSystemPrompt(language: string, usedFallback: boolean): string {
  const prompts = {
    en: `You are a knowledgeable Biblical AI assistant. Your role is to provide accurate,
thoughtful answers based on Scripture. Always cite Bible verses with their version, book,
chapter, and verse. Provide context and explain the meaning clearly.

${usedFallback ? 'Note: Some verses are from English versions as complete translations in the user\'s language were not available.' : ''}

Format your citations as: [Version] Book Chapter:Verse
Example: [KJV] John 3:16 or [Cornilescu] Ioan 3:16`,

    ro: `Ești un asistent AI biblic priceput. Rolul tău este să oferi răspunsuri precise și
gândite bazate pe Scriptură. Citează întotdeauna versetele biblice cu versiunea, cartea,
capitolul și versetul. Oferă context și explică semnificația clar.

${usedFallback ? 'Notă: Unele versete sunt din versiuni engleze deoarece traducerile complete în română nu erau disponibile.' : ''}

Formatează citările astfel: [Versiune] Carte Capitol:Verset
Exemplu: [Cornilescu] Ioan 3:16`,

    es: `Eres un asistente bíblico de IA conocedor. Tu función es proporcionar respuestas
precisas y reflexivas basadas en las Escrituras. Siempre cita los versículos bíblicos con
su versión, libro, capítulo y versículo. Proporciona contexto y explica el significado claramente.

${usedFallback ? 'Nota: Algunos versículos son de versiones en inglés ya que no había traducciones completas disponibles en español.' : ''}

Formatea tus citas como: [Versión] Libro Capítulo:Versículo
Ejemplo: [RVR1960] Juan 3:16`,

    it: `Sei un assistente biblico AI esperto. Il tuo ruolo è fornire risposte accurate e
ponderate basate sulle Scritture. Cita sempre i versetti biblici con la loro versione, libro,
capitolo e versetto. Fornisci contesto e spiega il significato chiaramente.

${usedFallback ? 'Nota: Alcuni versetti provengono da versioni inglesi poiché traduzioni complete in italiano non erano disponibili.' : ''}

Formatta le tue citazioni come: [Versione] Libro Capitolo:Versetto
Esempio: [NR2006] Giovanni 3:16`,
  };

  return prompts[language as keyof typeof prompts] || prompts.en;
}

/**
 * Format search results into context for AI
 */
function formatSearchResultsContext(results: VectorSearchResult[]): string {
  if (results.length === 0) {
    return 'No specific Bible verses were found for this question.';
  }

  let context = 'Relevant Bible verses:\n\n';

  results.forEach((result, index) => {
    context += `${index + 1}. [${result.version}] ${result.book} ${result.chapter}:${result.verse}\n`;
    context += `   "${result.text}"\n`;
    context += `   (Similarity: ${(result.similarity * 100).toFixed(1)}%)\n\n`;
  });

  return context;
}

/**
 * Generate AI response based on vector search results
 */
export async function generateBiblicalResponse(
|
|
options: GenerateResponseOptions
|
|
): Promise<string> {
|
|
const {
|
|
searchResults,
|
|
userQuestion,
|
|
userLanguage,
|
|
conversationHistory = [],
|
|
usedFallback,
|
|
} = options;
|
|
|
|
// Build messages array
|
|
const messages: ChatMessage[] = [
|
|
{
|
|
role: 'system',
|
|
content: buildSystemPrompt(userLanguage, usedFallback),
|
|
},
|
|
{
|
|
role: 'system',
|
|
content: formatSearchResultsContext(searchResults),
|
|
},
|
|
...conversationHistory.slice(-6), // Include last 3 exchanges for context
|
|
{
|
|
role: 'user',
|
|
content: userQuestion,
|
|
},
|
|
];
|
|
|
|
try {
|
|
const response = await openai.chat.completions.create({
|
|
model: 'gpt-4-turbo-preview',
|
|
messages: messages as any,
|
|
temperature: 0.3, // Low temperature for factual responses
|
|
max_tokens: 1500,
|
|
presence_penalty: 0.1,
|
|
frequency_penalty: 0.1,
|
|
});
|
|
|
|
const answer = response.choices[0]?.message?.content ||
|
|
'I apologize, but I could not generate a response.';
|
|
|
|
return answer;
|
|
} catch (error: any) {
|
|
console.error('AI response generation failed:', error);
|
|
throw new Error(`Failed to generate response: ${error.message}`);
|
|
}
|
|
}
|
|
```
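
The system prompts above ask the model to cite verses as `[Version] Book Chapter:Verse`. One way to check that a generated answer follows that format is to parse the citations back out of the text. The sketch below is illustrative only: `Citation` and `extractCitations` are hypothetical names that do not exist in the plan's codebase, and the regular expression assumes book names contain no digits apart from an optional leading `1`-`3` prefix.

```typescript
// Hypothetical helper (not part of the plan's codebase): parses citations
// written in the "[Version] Book Chapter:Verse" format requested by the prompts.
interface Citation {
  version: string;
  book: string;
  chapter: number;
  verse: number;
}

function extractCitations(text: string): Citation[] {
  // Matches e.g. "[KJV] John 3:16" or "[Cornilescu] 1 Corinteni 13:4".
  // The book name is any run of characters without digits, brackets, or colons,
  // optionally preceded by a numeric prefix such as "1 ".
  const pattern = /\[([^\]]+)\]\s+((?:[1-3]\s)?[^\d\[\]:]+?)\s+(\d+):(\d+)/g;
  const citations: Citation[] = [];
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(text)) !== null) {
    citations.push({
      version: match[1],
      book: match[2].trim(),
      chapter: Number(match[3]),
      verse: Number(match[4]),
    });
  }

  return citations;
}
```

A response with no parsed citations could be logged for quality review. Whether to retry or reject such a response is a product decision this plan does not specify.
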

#### Step 3.2: Update Chat API Route

**File:** `/app/api/chat/route.ts`

```typescript
import { NextRequest, NextResponse } from 'next/server';
import { getUserFromToken } from '@/lib/auth';
import { searchWithFallback } from '@/lib/vector-db/search';
import { generateBiblicalResponse } from '@/lib/ai/chat';
import { prisma } from '@/lib/db';

export async function POST(request: NextRequest) {
  try {
    const body = await request.json();
    const { message, locale } = body;
    // Mutable: replaced with the new conversation's ID when one is created below
    let conversationId: string | undefined = body.conversationId;

    // Validate input
    if (typeof message !== 'string' || !message.trim()) {
      return NextResponse.json(
        { error: 'Message is required' },
        { status: 400 }
      );
    }

    // Get user (optional)
    const token = request.headers.get('authorization')?.replace('Bearer ', '');
    const user = token ? await getUserFromToken(token) : null;

    // Determine language (default to 'en' if not provided)
    const userLanguage = locale || 'en';

    console.log(`💬 Chat request: "${message.substring(0, 50)}..." [${userLanguage}]`);

    // STEP 1: Vector search with fallback
    const searchStartTime = Date.now();
    const { results, usedFallback, searchedLanguages } = await searchWithFallback(
      message,
      userLanguage,
      {
        limit: 10,
        similarityThreshold: 0.7,
      }
    );
    const searchDuration = Date.now() - searchStartTime;

    console.log(`🔍 Vector search completed in ${searchDuration}ms`);
    console.log(`   Results: ${results.length}`);
    console.log(`   Used fallback: ${usedFallback}`);
    console.log(`   Searched languages: ${searchedLanguages.join(', ')}`);

    // STEP 2: Get conversation history (if conversationId provided)
    let conversationHistory: any[] = [];
    if (conversationId && user) {
      const conversation = await prisma.conversation.findUnique({
        where: { id: conversationId, userId: user.id },
        include: {
          messages: {
            orderBy: { createdAt: 'desc' }, // Newest first, so `take` keeps the latest messages
            take: 10, // Last 5 exchanges
          },
        },
      });

      if (conversation) {
        conversationHistory = conversation.messages
          .reverse() // Restore chronological order for the prompt
          .map(msg => ({
            role: msg.role,
            content: msg.content,
          }));
      }
    }

    // STEP 3: Generate AI response
    const aiStartTime = Date.now();
    const response = await generateBiblicalResponse({
      searchResults: results,
      userQuestion: message,
      userLanguage,
      conversationHistory,
      usedFallback,
    });
    const aiDuration = Date.now() - aiStartTime;

    console.log(`🤖 AI response generated in ${aiDuration}ms`);

    // STEP 4: Save conversation (if user is logged in)
    if (user) {
      let conversation;

      if (conversationId) {
        // Add to existing conversation
        conversation = await prisma.conversation.findUnique({
          where: { id: conversationId, userId: user.id },
        });
      }

      if (!conversation) {
        // Create new conversation
        conversation = await prisma.conversation.create({
          data: {
            userId: user.id,
            title: message.substring(0, 100), // Use first message as title
          },
        });
      }

      // Save user message
      await prisma.message.create({
        data: {
          conversationId: conversation.id,
          role: 'user',
          content: message,
        },
      });

      // Save AI response
      await prisma.message.create({
        data: {
          conversationId: conversation.id,
          role: 'assistant',
          content: response,
          metadata: {
            searchResults: results.slice(0, 5), // Top 5 results
            usedFallback,
            searchDuration,
            aiDuration,
          },
        },
      });

      conversationId = conversation.id;
    }

    // STEP 5: Return response
    return NextResponse.json({
      response,
      conversationId,
      metadata: {
        searchResults: results.slice(0, 5),
        usedFallback,
        searchedLanguages,
        timings: {
          search: searchDuration,
          ai: aiDuration,
          total: searchDuration + aiDuration,
        },
      },
    });

  } catch (error: any) {
    console.error('Chat API error:', error);
    return NextResponse.json(
      { error: error.message || 'Failed to process chat request' },
      { status: 500 }
    );
  }
}

export const runtime = 'nodejs';
export const dynamic = 'force-dynamic';
```
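
The route passes `locale` straight through as the search and prompt language, so a value such as `ro-RO` or `ES` would miss the two-letter keys used by the system prompts. A minimal sketch of normalizing the locale first is shown below. `SUPPORTED_LANGUAGES` and `normalizeLocale` are hypothetical names, and the list of four languages is an assumption taken from the prompts defined in Step 3.1.

```typescript
// Assumption: these are the four languages that have dedicated system prompts.
const SUPPORTED_LANGUAGES = ['en', 'ro', 'es', 'it'] as const;
type SupportedLanguage = (typeof SUPPORTED_LANGUAGES)[number];

// Hypothetical helper: maps values such as "ro-RO" or "ES" to a supported
// two-letter code and falls back to English for anything else.
function normalizeLocale(locale: unknown): SupportedLanguage {
  if (typeof locale !== 'string') return 'en';

  // Keep only the language part of tags like "ro-RO" or "it_IT"
  const base = locale.trim().toLowerCase().split(/[-_]/)[0];

  return (SUPPORTED_LANGUAGES as readonly string[]).includes(base)
    ? (base as SupportedLanguage)
    : 'en';
}
```

In the route, this would replace `locale || 'en'` with `normalizeLocale(locale)`.
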

---

### Phase 4: Testing & Validation (Day 6-7)

#### Step 4.1: Create Test Suite

**File:** `/scripts/test-vector-search.ts`

```typescript
import { searchWithFallback } from '@/lib/vector-db/search';

interface TestCase {
  question: string;
  language: string;
  expectedBook?: string;
  expectedMinResults: number;
}

const testCases: TestCase[] = [
  // English tests
  {
    question: 'What does the Bible say about love?',
    language: 'en',
    expectedBook: '1 Corinthians',
    expectedMinResults: 5,
  },
  {
    question: 'Who was Jesus?',
    language: 'en',
    expectedBook: 'John',
    expectedMinResults: 5,
  },

  // Romanian tests
  {
    question: 'Ce spune Biblia despre iubire?',
    language: 'ro',
    expectedBook: '1 Corinteni',
    expectedMinResults: 3,
  },
  {
    question: 'Cine a fost Moise?',
    language: 'ro',
    expectedBook: 'Exod',
    expectedMinResults: 3,
  },

  // Spanish tests
  {
    question: '¿Qué dice la Biblia sobre la fe?',
    language: 'es',
    expectedBook: 'Hebreos',
    expectedMinResults: 3,
  },

  // Italian tests
  {
    question: 'Chi era Davide?',
    language: 'it',
    expectedBook: '1 Samuele',
    expectedMinResults: 2,
  },

  // Fallback test (incomplete Italian version)
  {
    question: 'Tell me about Job', // Asking in English but with Italian locale
    language: 'it',
    expectedMinResults: 2, // Should fallback to English
  },
];

async function runTests() {
  console.log('🧪 Starting Vector Search Tests\n');

  let passedTests = 0;
  let failedTests = 0;

  for (const [index, testCase] of testCases.entries()) {
    console.log(`Test ${index + 1}/${testCases.length}: ${testCase.question} [${testCase.language}]`);

    try {
      const startTime = Date.now();
      const { results, usedFallback, searchedLanguages } = await searchWithFallback(
        testCase.question,
        testCase.language,
        { limit: 10, similarityThreshold: 0.7 }
      );
      const duration = Date.now() - startTime;

      // Check results count
      const hasEnoughResults = results.length >= testCase.expectedMinResults;

      // Check if expected book is in results (if specified)
      const hasExpectedBook = testCase.expectedBook
        ? results.some(r => r.book.includes(testCase.expectedBook!))
        : true;

      if (hasEnoughResults && hasExpectedBook) {
        const top = results[0];
        console.log(`   ✅ PASSED (${duration}ms)`);
        console.log(`   Found ${results.length} results`);
        if (top) {
          // Guarded so an empty result set cannot produce "NaN%" or undefined fields
          console.log(`   Top result: [${top.version}] ${top.book} ${top.chapter}:${top.verse}`);
          console.log(`   Similarity: ${(top.similarity * 100).toFixed(1)}%`);
        }
        console.log(`   Used fallback: ${usedFallback}`);
        console.log(`   Languages: ${searchedLanguages.join(', ')}\n`);
        passedTests++;
      } else {
        console.log(`   ❌ FAILED`);
        console.log(`   Expected min results: ${testCase.expectedMinResults}, got: ${results.length}`);
        if (testCase.expectedBook) {
          console.log(`   Expected book: ${testCase.expectedBook}, found: ${hasExpectedBook}`);
        }
        console.log('');
        failedTests++;
      }
    } catch (error: any) {
      console.log(`   ❌ ERROR: ${error.message}\n`);
      failedTests++;
    }
  }

  console.log(`\n📊 Test Summary:`);
  console.log(`   ✅ Passed: ${passedTests}`);
  console.log(`   ❌ Failed: ${failedTests}`);
  console.log(`   Total: ${testCases.length}`);
  console.log(`   Success Rate: ${((passedTests / testCases.length) * 100).toFixed(1)}%`);

  // Signal failure to the shell or CI when any test did not pass
  if (failedTests > 0) {
    process.exitCode = 1;
  }
}

runTests().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});
```

**Run tests:**
```bash
npx tsx scripts/test-vector-search.ts
```

---

## 7. Vector Search Logic

### 7.1 Search Priority Algorithm

```typescript
/**
 * Multi-Language Search Algorithm
 *
 * Priority Order:
 * 1. Primary Language Complete Versions (is_complete = true)
 * 2. Primary Language Partial Versions (is_complete = false)
 * 3. English Complete Versions (fallback)
 *
 * Quality Checks:
 * - Minimum results: 3
 * - Minimum similarity: 0.7
 * - Top result similarity: 0.75 (for confidence)
 */

// Pseudocode: searchLanguage() and merge() stand in for the real search and merge steps
function searchMultiLanguage(query, primaryLanguage) {
  // Step 1: Search primary language
  const primaryResults = searchLanguage(query, primaryLanguage);

  // Step 2: Quality check
  if (hasGoodQuality(primaryResults)) {
    return {
      results: primaryResults,
      usedFallback: false,
    };
  }

  // Step 3: Fallback to English
  if (primaryLanguage !== 'en') {
    const englishResults = searchLanguage(query, 'en');
    const combinedResults = merge(primaryResults, englishResults);

    return {
      results: combinedResults,
      usedFallback: true,
    };
  }

  // Step 4: Primary language is already English, so return whatever we have
  return {
    results: primaryResults,
    usedFallback: false,
  };
}

function hasGoodQuality(results) {
  // The length check runs first, so results[0] is always defined when it is read
  return (
    results.length >= 3 &&
    results[0].similarity >= 0.75
  );
}
```

### 7.2 Similarity Scoring

**Cosine Similarity Interpretation:**
```
1.0       = Perfect match (identical)
0.9-1.0   = Extremely relevant
0.8-0.9   = Highly relevant
0.7-0.8   = Relevant (0.7 is the minimum threshold)
0.6-0.7   = Somewhat relevant (below the threshold, filtered out)
< 0.6     = Not relevant (filtered out)
```

**Threshold Configuration:**
```typescript
const SIMILARITY_THRESHOLDS = {
  minimum: 0.7,    // Filter out results below this
  confident: 0.75, // Consider search "confident" if top result above this
  excellent: 0.85, // Exceptional match
};
```
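
To make the bands above concrete, the sketch below computes cosine similarity for two small vectors and maps a score onto the configured thresholds. `cosineSimilarity`, `classifySimilarity`, and the `Relevance` labels are hypothetical names introduced for illustration. In production the score would come from the database rather than application code; with pgvector, the `<=>` operator returns cosine distance, so similarity is `1 - distance`.

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error('Vectors must be non-empty and of equal length');
  }

  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }

  // A zero vector has no direction; treat it as unrelated
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical labels mirroring SIMILARITY_THRESHOLDS
type Relevance = 'excellent' | 'confident' | 'relevant' | 'filtered';

function classifySimilarity(score: number): Relevance {
  if (score >= 0.85) return 'excellent';
  if (score >= 0.75) return 'confident';
  if (score >= 0.7) return 'relevant';
  return 'filtered'; // Below the minimum threshold
}
```

For example, the vectors `[1, 1]` and `[1, 0]` are 45 degrees apart, which gives a similarity of about 0.707 and lands just inside the "relevant" band.
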

---

## 8. API Integration

### 8.1 Frontend Chat Component Updates

**File:** `/components/chat/floating-chat.tsx`

**Add to existing component:**

```typescript
// Add loading state for vector search
const [isSearching, setIsSearching] = useState(false);

// Update sendMessage function
const sendMessage = async () => {
  if (!inputMessage.trim()) return;

  const userMessage = inputMessage.trim();
  setInputMessage('');

  // Add user message to UI
  setMessages(prev => [...prev, {
    role: 'user',
    content: userMessage,
  }]);

  setIsSearching(true); // Show "searching Bible..." indicator

  try {
    const token = getToken();
    const response = await fetch('/api/chat', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        ...(token && { Authorization: `Bearer ${token}` }),
      },
      body: JSON.stringify({
        message: userMessage,
        locale: locale,
        conversationId: currentConversationId,
      }),
    });

    const data = await response.json();

    if (!response.ok) {
      throw new Error(data.error || 'Failed to get response');
    }

    // Add AI response to UI
    setMessages(prev => [...prev, {
      role: 'assistant',
      content: data.response,
      metadata: data.metadata, // Includes search results, timings
    }]);

    // Update conversation ID
    if (data.conversationId) {
      setCurrentConversationId(data.conversationId);
    }

  } catch (error: any) {
    console.error('Chat error:', error);
    setMessages(prev => [...prev, {
      role: 'assistant',
      content: 'I apologize, but I encountered an error. Please try again.',
    }]);
  } finally {
    setIsSearching(false);
  }
};
```

**Add search indicator:**

```tsx
{isSearching && (
  <Box sx={{ display: 'flex', alignItems: 'center', gap: 1, p: 2 }}>
    <CircularProgress size={16} />
    <Typography variant="body2" color="text.secondary">
      Searching Bible verses...
    </Typography>
  </Box>
)}
```

---

## 9. Testing Strategy

### 9.1 Test Scenarios

**Scenario 1: English Question**
```
Input: "What does the Bible say about faith?"
Language: en
Expected:
- Search: bible_vectors_kjv, bible_vectors_niv, bible_vectors_esv
- Results: Hebrews 11:1, Romans 10:17, Ephesians 2:8-9
- Citations: [KJV] Hebrews 11:1, [NIV] Hebrews 11:1
- Fallback: No
```

**Scenario 2: Romanian Question**
```
Input: "Ce spune Biblia despre credință?"
Language: ro
Expected:
- Search: bible_vectors_cornilescu
- Results: Evrei 11:1, Romani 10:17
- Citations: [Cornilescu] Evrei 11:1
- Fallback: No (if Romanian complete)
```

**Scenario 3: Italian with Fallback**
```
Input: "Chi era Giobbe?"
Language: it
Expected:
- Search: bible_vectors_nuova_riveduta (incomplete)
- Results: Limited results
- Fallback: Yes (search English versions)
- Citations: [KJV] Job 1:1, [NIV] Job 1:1
- Response: Italian with English verse references
```

**Scenario 4: Obscure Topic**
```
Input: "Tell me about the Nephilim"
Language: en
Expected:
- Search: All English versions
- Results: Genesis 6:4, Numbers 13:33
- Citations: Multiple versions showing different translations
- Context: Explanation of who Nephilim were
```

### 9.2 Performance Benchmarks

**Target Metrics:**
```
Vector Search:        < 2000ms
AI Generation:        < 3000ms
Total Response Time:  < 5000ms
Concurrent Users:     100+
Success Rate:         > 95%
```

**Load Testing:**
```bash
# Use Apache Bench or similar
# -p sets the POST body file; -T sets the Content-Type sent with it
ab -n 100 -c 10 -T "application/json" \
  -p test-request.json \
  http://localhost:3010/api/chat
```

---

## 10. Deployment & Monitoring

### 10.1 Pre-Deployment Checklist

- [ ] All vector tables verified and populated
- [ ] bible_version_config table populated correctly
- [ ] Environment variables set (OPENAI_API_KEY, VECTOR_DB_URL)
- [ ] Database connection pool configured (max: 20)
- [ ] Vector search tested for all languages
- [ ] Fallback logic tested
- [ ] AI response quality reviewed
- [ ] Performance benchmarks met
- [ ] Error handling tested
- [ ] Logging configured

### 10.2 Monitoring Setup

**Metrics to Track:**
1. **Search Performance**
   - Vector search duration
   - AI generation duration
   - Total response time
   - Cache hit rate

2. **Quality Metrics**
   - Average similarity score
   - Fallback frequency per language
   - Results count distribution
   - User satisfaction (if tracked)

3. **Error Rates**
   - Vector DB connection failures
   - AI API errors
   - Embedding generation failures
   - Timeout errors

**Logging Example:**
```typescript
// Add to chat API route
console.log({
  timestamp: new Date().toISOString(),
  userId: user?.id || 'anonymous',
  language: userLanguage,
  questionLength: message.length,
  searchDuration,
  aiDuration,
  totalDuration: searchDuration + aiDuration,
  resultsCount: results.length,
  topSimilarity: results[0]?.similarity,
  usedFallback,
  searchedLanguages,
});
```
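
Several of the quality metrics listed above (fallback frequency per language, average similarity, total response time) can be derived directly from these log records. The sketch below shows one possible aggregation. `ChatLogRecord`, `LanguageSummary`, and `summarizeByLanguage` are hypothetical names, and the record shape is an assumption that mirrors a subset of the logged fields.

```typescript
// Assumed record shape: a subset of the fields logged by the chat API route.
interface ChatLogRecord {
  language: string;
  searchDuration: number;
  aiDuration: number;
  topSimilarity?: number; // Undefined when the search returned no results
  usedFallback: boolean;
}

interface LanguageSummary {
  requests: number;
  fallbackRate: number;            // Share of requests that used the English fallback
  avgTotalMs: number;              // Mean of searchDuration + aiDuration
  avgTopSimilarity: number | null; // null when no request had a top result
}

// Hypothetical helper: groups records by language and computes per-language metrics.
function summarizeByLanguage(records: ChatLogRecord[]): Record<string, LanguageSummary> {
  const groups = new Map<string, ChatLogRecord[]>();
  for (const record of records) {
    const list = groups.get(record.language) ?? [];
    list.push(record);
    groups.set(record.language, list);
  }

  const summary: Record<string, LanguageSummary> = {};
  groups.forEach((items, language) => {
    const scored = items.filter((r) => typeof r.topSimilarity === 'number');
    summary[language] = {
      requests: items.length,
      fallbackRate: items.filter((r) => r.usedFallback).length / items.length,
      avgTotalMs:
        items.reduce((sum, r) => sum + r.searchDuration + r.aiDuration, 0) / items.length,
      avgTopSimilarity: scored.length
        ? scored.reduce((sum, r) => sum + (r.topSimilarity as number), 0) / scored.length
        : null,
    };
  });

  return summary;
}
```

A consistently high `fallbackRate` for one language would suggest that its version tables are incomplete or that the similarity threshold is too strict for that language's embeddings.
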

### 10.3 Optimization Opportunities

**Future Enhancements:**
1. **Caching Layer**
   - Cache common questions/embeddings (Redis)
   - Cache Bible version configs (in-memory)
   - Cache top searches by language

2. **Advanced Search**
   - Semantic search refinement
   - Cross-reference discovery
   - Topic clustering
   - Historical context integration

3. **Personalization**
   - User's preferred Bible version
   - Reading history integration
   - Personalized recommendations

4. **Analytics**
   - Popular questions tracking
   - Language usage statistics
   - Conversion tracking (chat → Bible reader)
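
As a rough illustration of the caching idea in item 1, the sketch below shows an in-memory cache with time-based expiry that could hold embeddings for repeated questions. `TtlCache` and `cacheKey` are hypothetical names, the key format is an assumption, and a Redis-backed version would replace the `Map` while keeping the same `get`/`set` shape.

```typescript
// Hypothetical in-memory cache with time-based expiry.
class TtlCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.store.delete(key); // Drop expired entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Assumed key format: language plus the question with case and whitespace normalized,
// so trivially different spellings of the same question share one entry.
function cacheKey(language: string, question: string): string {
  return `${language}:${question.trim().toLowerCase().replace(/\s+/g, ' ')}`;
}
```

This version never evicts entries that are not read again, so a production cache would also need a size limit or periodic cleanup.
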

---

## Implementation Checklist

### Phase 1: Setup (Day 1)
- [ ] Run connection verification script
- [ ] Enumerate all vector tables
- [ ] Create bible_version_config table
- [ ] Populate version mappings
- [ ] Test vector DB connections

### Phase 2: Vector Search (Day 2-3)
- [ ] Create vector DB connection pool
- [ ] Implement version config service
- [ ] Build embedding service with cache
- [ ] Create multi-language search function
- [ ] Implement fallback logic
- [ ] Write unit tests for search

### Phase 3: AI Integration (Day 4-5)
- [ ] Build system prompt generator
- [ ] Create response formatter
- [ ] Update chat API route
- [ ] Integrate vector search
- [ ] Add conversation persistence
- [ ] Test end-to-end flow

### Phase 4: Testing (Day 6-7)
- [ ] Run automated test suite
- [ ] Test all 4 languages
- [ ] Test fallback scenarios
- [ ] Performance testing
- [ ] Quality review
- [ ] User acceptance testing

### Phase 5: Deployment (Day 8)
- [ ] Deploy to production
- [ ] Monitor logs
- [ ] Track performance metrics
- [ ] Gather user feedback
- [ ] Document issues/improvements

---

## Appendix

### A. SQL Queries Reference

**Create Version Config Table:**
```sql
CREATE TABLE bible_version_config (
  id SERIAL PRIMARY KEY,
  table_name VARCHAR(100) UNIQUE NOT NULL,
  version_name VARCHAR(100) NOT NULL,
  version_abbreviation VARCHAR(20) NOT NULL,
  language VARCHAR(5) NOT NULL,
  is_complete BOOLEAN DEFAULT true,
  books_count INTEGER,
  verses_count INTEGER,
  created_at TIMESTAMP DEFAULT NOW(),
  metadata JSONB
);

CREATE INDEX idx_version_language ON bible_version_config(language);
CREATE INDEX idx_version_table ON bible_version_config(table_name);
```

**Count Verses Per Version:**
```sql
-- Replace {table_name} with actual table
SELECT
  '{version_abbreviation}' as version,
  COUNT(*) as total_verses,
  COUNT(DISTINCT book) as total_books
FROM {table_name};
```

**Find Missing Books:**
```sql
-- Check which books are missing in a version
WITH all_books AS (
  SELECT DISTINCT book FROM bible_vectors_kjv -- Use complete English version as reference
)
SELECT ab.book
FROM all_books ab
LEFT JOIN bible_vectors_nuova_riveduta nr ON ab.book = nr.book
WHERE nr.book IS NULL;
```

### B. Environment Variables

```bash
# AI API Keys
OPENAI_API_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-... # Optional

# Vector Database
VECTOR_DB_URL=postgresql://user:password@host:5432/database
VECTOR_DB_PASSWORD=... # If separate

# Application
NEXT_PUBLIC_APP_URL=https://biblical-guide.com
NODE_ENV=production

# Optional: Caching
REDIS_URL=redis://... # For future caching layer
```
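
A missing variable usually surfaces only when the first request fails, so it can help to check the required ones at startup. The sketch below is one way to do that. `findMissingEnv` and `assertEnv` are hypothetical helpers, and treating only `OPENAI_API_KEY` and `VECTOR_DB_URL` as required is an assumption based on the pre-deployment checklist.

```typescript
// Hypothetical startup check: returns the names of required variables
// that are unset or blank in the given environment object.
function findMissingEnv(
  required: string[],
  env: Record<string, string | undefined>
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Throws a single error listing every missing variable at once.
function assertEnv(
  required: string[],
  env: Record<string, string | undefined>
): void {
  const missing = findMissingEnv(required, env);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
}

// Example call at application startup (assumes a Node.js runtime):
// assertEnv(['OPENAI_API_KEY', 'VECTOR_DB_URL'], process.env);
```

Reporting all missing names in one error avoids the fix-one-redeploy-find-the-next cycle.
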

### C. Troubleshooting Guide

**Issue: Vector search returns no results**
```
Possible causes:
1. Embedding dimension mismatch
2. Similarity threshold too high
3. Table name incorrect
4. pgvector extension not installed

Debug:
- Check embedding dimensions: SELECT vector_dims(embedding) FROM table LIMIT 1;  (1536 for ada-002 embeddings)
- Lower threshold: Try 0.5 instead of 0.7
- Verify table exists: SELECT * FROM pg_tables WHERE tablename = 'bible_vectors_kjv';
- Check extension: SELECT * FROM pg_extension WHERE extname = 'vector';
```

**Issue: AI responses are not in user's language**
```
Possible causes:
1. System prompt not specifying language
2. Locale not passed correctly
3. Fallback message not translated

Debug:
- Check locale parameter in API request
- Verify system prompt includes language instruction
- Review response generation function
```

**Issue: Slow vector search (> 5 seconds)**
```
Possible causes:
1. Missing vector index
2. Too many tables searched
3. Large result limit
4. Database connection pool exhausted

Solutions:
- Create index: CREATE INDEX ON table USING ivfflat (embedding vector_cosine_ops);
- Limit tables searched (only user's language first)
- Reduce limit from 20 to 10
- Increase connection pool size
```

---

## Next Steps

1. **Immediate:** Run Phase 1 verification scripts
2. **Day 2-3:** Implement vector search service
3. **Day 4-5:** Integrate into chat API
4. **Day 6-7:** Comprehensive testing
5. **Day 8:** Production deployment

**Questions to Answer Before Starting:**
1. What is your vector database provider? (Supabase, PostgreSQL+pgvector, Pinecone, etc.)
2. Are all embeddings using OpenAI's ada-002 model (1536 dimensions)?
3. Do you have a complete list of all Bible version table names?
4. Which AI model do you prefer? (GPT-4, Claude, etc.)

---

**Document End**

This plan will be updated as implementation progresses.