Implement Azure OpenAI vector embeddings for Romanian Bible

- Add pgvector support with bible_passages table for vector search
- Create Python ingestion script for Azure OpenAI embed-3 embeddings
- Implement hybrid search combining vector similarity and full-text search
- Update AI chat to use vector search with Azure OpenAI gpt-4o
- Add floating chat component with Material UI design
- Import complete Romanian Bible (FIDELA) with 30K+ verses
- Add vector search library for semantic Bible search
- Create multi-language implementation plan for future expansion

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-20 15:18:00 +03:00
parent 3b375c869b
commit dd5e1102eb
14 changed files with 2082 additions and 68 deletions
--- a/.env.local
+++ b/.env.local
@@ -11,6 +11,12 @@ JWT_SECRET=development-jwt-secret-change-in-production
 AZURE_OPENAI_KEY=4DhkkXVdDOXZ7xX1eOLHTHQQnbCy0jFYdA6RPJtyAdOMtO16nZmFJQQJ99BCACYeBjFXJ3w3AAABACOGHgNC
 AZURE_OPENAI_ENDPOINT=https://azureopenaiinstant.openai.azure.com
 AZURE_OPENAI_DEPLOYMENT=gpt-4o
 AZURE_OPENAI_API_VERSION=2024-05-01-preview
 AZURE_OPENAI_EMBED_DEPLOYMENT=embed-3
 EMBED_DIMS=3072
 BIBLE_MD_PATH=./bibles/Biblia-Fidela-limba-romana.md
 LANG_CODE=ro
 TRANSLATION_CODE=FIDELA
 # API Bible
 API_BIBLE_KEY=7b42606f8f809e155c9b0742c4f1849b
--- a/app/api/chat/route.ts
+++ b/app/api/chat/route.ts
@@ -1,5 +1,6 @@
 import { NextRequest, NextResponse } from 'next/server'
 import { z } from 'zod'
 import { searchBibleHybrid, BibleVerse } from '@/lib/vector-search'
 const chatRequestSchema = z.object({
  message: z.string().min(1),
@@ -49,73 +50,81 @@ export async function POST(request: NextRequest) {
 }
 async function generateBiblicalResponse(message: string, history: any[]): Promise<string> {
-  // Mock biblical responses for common questions
+  try {
-  const lowerMessage = message.toLowerCase()
+    // Search for relevant Bible verses using vector search
    const relevantVerses = await searchBibleHybrid(message, 5)
-  if (lowerMessage.includes('dragoste') || lowerMessage.includes('iubire')) {
+    // Create context from relevant verses
-    return `Întrebarea ta despre dragoste este foarte frumoasă! Biblia ne învață că "Dumnezeu este dragoste" (1 Ioan 4:8). De asemenea, în 1 Corinteni 13:4-7 găsim descrierea perfectă a dragostei: "Dragostea este îndelung răbdătoare, dragostea este binevoitoare; dragostea nu pizmuiește; dragostea nu se fălește, nu se semeață, nu face nimic necuviincios, nu caută ale sale, nu se mânie, nu ține seama de răul făcut..."
+    const versesContext = relevantVerses
      .map(verse => `${verse.ref}: "${verse.text_raw}"`)
      .join('\n\n')
-Isus ne-a dat cea mai mare poruncă: "Să iubești pe Domnul Dumnezeul tău cu toată inima ta, cu tot sufletul tău și cu tot cugetul tău" și "să-ți iubești aproapele ca pe tine însuți" (Matei 22:37-39).`
+    // Create conversation history for context
    const conversationHistory = history
      .slice(-3) // Last 3 messages for context
      .map(msg => `${msg.role}: ${msg.content}`)
      .join('\n')
    // Construct prompt for Azure OpenAI
    const systemPrompt = `Ești un asistent AI pentru întrebări biblice în limba română. Răspunde pe baza Scripturii, fiind respectuos și înțelept.
 Instrucțiuni:
 - Folosește versurile biblice relevante pentru a răspunde la întrebare
 - Citează întotdeauna referințele biblice (ex: Ioan 3:16)
 - Răspunde în română
 - Fii empatic și încurajator
 - Dacă nu ești sigur, încurajează studiul personal și rugăciunea
 Versuri relevante pentru această întrebare:
 ${versesContext}
 Conversația anterioară:
 ${conversationHistory}
 Întrebarea curentă: ${message}`
    // Call Azure OpenAI
    const response = await fetch(
      `${process.env.AZURE_OPENAI_ENDPOINT}/openai/deployments/${process.env.AZURE_OPENAI_DEPLOYMENT}/chat/completions?api-version=${process.env.AZURE_OPENAI_API_VERSION}`,
      {
        method: 'POST',
        headers: {
          'api-key': process.env.AZURE_OPENAI_KEY!,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          messages: [
            {
              role: 'system',
              content: systemPrompt
            },
            {
              role: 'user',
              content: message
            }
          ],
          max_tokens: 800,
          temperature: 0.7,
          top_p: 0.9
        }),
      }
    )
    if (!response.ok) {
      throw new Error(`Azure OpenAI API error: ${response.status}`)
    }
    const data = await response.json()
    return data.choices[0].message.content
  } catch (error) {
    console.error('Error calling Azure OpenAI:', error)
    // Fallback to simple response if AI fails
    return `Îmi pare rău, dar întâmpin o problemă tehnică în acest moment. Te încurajez să cercetezi acest subiect în Scripturi și să te rogi pentru înțelegere.
 "Cercetați Scripturile, pentru că socotiți că în ele aveți viața veșnică, și tocmai ele mărturisesc despre Mine" (Ioan 5:39).
 "Dacă vreunul dintre voi duce lipsă de înțelepciune, să ceară de la Dumnezeu, care dă tuturor cu dărnicie și fără mustrare, și i se va da" (Iacov 1:5).`
  }
  if (lowerMessage.includes('rugăciune') || lowerMessage.includes('rog')) {
    return `Rugăciunea este comunicarea noastră directă cu Dumnezeu! Isus ne-a învățat să ne rugăm prin "Tatăl nostru" (Matei 6:9-13).
 Iată câteva principii importante pentru rugăciune:
 • "Rugați-vă neîncetat" (1 Tesaloniceni 5:17)
 • "Cerceți și veți găsi; bateți și vi se va deschide" (Matei 7:7)
 • "Nu vă îngrijorați de nimic, ci în toate, prin rugăciune și cerere, cu mulțumire, să fie cunoscute cererile voastre înaintea lui Dumnezeu" (Filipeni 4:6)
 Rugăciunea poate include laudă, mulțumire, spovedanie și cereri - Dumnezeu vrea să audă totul din inima ta!`
  }
  if (lowerMessage.includes('credință') || lowerMessage.includes('cred')) {
    return `Credința este fundamentul vieții creștine! "Fără credință este cu neputință să fim plăcuți lui Dumnezeu; căci cine se apropie de Dumnezeu trebuie să creadă că El este și că răsplătește pe cei ce Îl caută" (Evrei 11:6).
 "Credința este o încredere neclintită în lucrurile nădăjduite, o dovadă a lucrurilor care nu se văd" (Evrei 11:1).
 Isus a spus: "Adevărat vă spun că, dacă aveți credință cât un grăunte de muștar, veți zice muntelui acestuia: 'Mută-te de aici acolo!' și se va muta" (Matei 17:20).
 Credința crește prin ascultarea Cuvântului lui Dumnezeu: "Credința vine din ascultare, iar ascultarea vine din Cuvântul lui Hristos" (Romani 10:17).`
  }
  if (lowerMessage.includes('speranță') || lowerMessage.includes('sper')) {
    return `Speranța creștină nu este o dorință vagă, ci o certitudine bazată pe promisiunile lui Dumnezeu!
 "Fie ca Dumnezeul speranței să vă umple de toată bucuria și pacea în credință, pentru ca să prisosiți în speranță, prin puterea Duhului Sfânt!" (Romani 15:13).
 Speranța noastră este ancorata în Isus Hristos: "Hristos în voi, nădejdea slavei" (Coloseni 1:27).
 "Binecuvântat să fie Dumnezeu, Tatăl Domnului nostru Isus Hristos, care, după îndurarea Sa cea mare, ne-a născut din nou, printr-o înviere a lui Isus Hristos din morți, pentru o moștenire care nu se poate strica" (1 Petru 1:3-4).`
  }
  if (lowerMessage.includes('iertare') || lowerMessage.includes('iert')) {
    return `Iertarea este una dintre cele mai puternice învățături ale lui Isus! El ne-a învățat să ne rugăm: "Iartă-ne greșelile noastre, precum și noi iertăm greșiților noștri" (Matei 6:12).
 "Dacă iertați oamenilor greșelile lor, și Tatăl vostru cel ceresc vă va ierta greșelile voastre" (Matei 6:14).
 Petru a întrebat pe Isus: "De câte ori să iert?" Isus a răspuns: "Nu îți zic până la șapte ori, ci până la șaptezeci de ori câte șapte" (Matei 18:21-22) - adică mereu!
 Iertarea nu înseamnă că minimalizăm răul, ci că alegem să nu ținem seama de el, așa cum Dumnezeu face cu noi prin Hristos.`
  }
  if (lowerMessage.includes('pace') || lowerMessage.includes('liniște')) {
    return `Pacea lui Dumnezeu este diferită de pacea lumii! Isus a spus: "Pace vă las, pacea Mea vă dau; nu cum dă lumea, vă dau Eu. Să nu vi se tulbure inima și să nu vă fie frică!" (Ioan 14:27).
 "Pacea lui Dumnezeu, care întrece orice pricepere, vă va păzi inimile și gândurile în Hristos Isus" (Filipeni 4:7).
 Pentru a avea pace:
 • "În toate, prin rugăciune și cerere, cu mulțumire, să fie cunoscute cererile voastre înaintea lui Dumnezeu" (Filipeni 4:6)
 • "Aruncați toată grija voastră asupra Lui, căci El îngrijește de voi" (1 Petru 5:7)
 • "Isus le-a zis: 'Veniți la Mine, toți cei trudiți și împovărați, și Eu vă voi da odihnă'" (Matei 11:28)`
  }
  // Default response for other questions
  return `Mulțumesc pentru întrebarea ta! Aceasta este o întrebare foarte importantă din punct de vedere biblic.
 Te încurajez să cercetezi acest subiect în Scriptură, să te rogi pentru înțelegere și să discuți cu lideri spirituali maturi. "Cercetați Scripturile, pentru că socotiți că în ele aveți viața veșnică, și tocmai ele mărturisesc despre Mine" (Ioan 5:39).
 Dacă ai întrebări mai specifice despre anumite pasaje biblice sau doctrine, voi fi bucuros să te ajut mai detaliat. Dumnezeu să te binecuvânteze în căutarea ta după adevăr!
 "Dacă vreunul dintre voi duce lipsă de înțelepciune, să ceară de la Dumnezeu, care dă tuturor cu dărnicie și fără mustrare, și i se va da" (Iacob 1:5).`
 }
--- a/app/layout.tsx
+++ b/app/layout.tsx
@@ -1,6 +1,7 @@
 import './globals.css'
 import type { Metadata } from 'next'
 import { MuiThemeProvider } from '@/components/providers/theme-provider'
 import FloatingChat from '@/components/chat/floating-chat'
 export const metadata: Metadata = {
  title: 'Ghid Biblic - Biblical Guide',
@@ -17,6 +18,7 @@ export default function RootLayout({
      <body>
        <MuiThemeProvider>
          {children}
          <FloatingChat />
        </MuiThemeProvider>
      </body>
    </html>
--- a/components/chat/floating-chat.tsx
+++ b/components/chat/floating-chat.tsx
@@ -0,0 +1,426 @@
 'use client'
 import {
  Fab,
  Drawer,
  Box,
  Typography,
  TextField,
  Button,
  Paper,
  Avatar,
  Chip,
  IconButton,
  Divider,
  List,
  ListItem,
  ListItemText,
  useTheme,
  Slide,
  Grow,
  Zoom,
 } from '@mui/material'
 import {
  Chat,
  Send,
  Close,
  SmartToy,
  Person,
  ContentCopy,
  ThumbUp,
  ThumbDown,
  Minimize,
  Launch,
 } from '@mui/icons-material'
 import { useState, useRef, useEffect } from 'react'
 interface ChatMessage {
  id: string
  role: 'user' | 'assistant'
  content: string
  timestamp: Date
 }
 export default function FloatingChat() {
  const theme = useTheme()
  const [isOpen, setIsOpen] = useState(false)
  const [isMinimized, setIsMinimized] = useState(false)
  const [messages, setMessages] = useState<ChatMessage[]>([
    {
      id: '1',
      role: 'assistant',
      content: 'Bună ziua! Sunt asistentul tău AI pentru întrebări biblice. Cum te pot ajuta astăzi să înțelegi mai bine Scriptura?',
      timestamp: new Date(),
    }
  ])
  const [inputMessage, setInputMessage] = useState('')
  const [isLoading, setIsLoading] = useState(false)
  const messagesEndRef = useRef<HTMLDivElement>(null)
  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' })
  }
  useEffect(() => {
    scrollToBottom()
  }, [messages])
  const handleSendMessage = async () => {
    if (!inputMessage.trim() || isLoading) return
    const userMessage: ChatMessage = {
      id: Date.now().toString(),
      role: 'user',
      content: inputMessage,
      timestamp: new Date(),
    }
    setMessages(prev => [...prev, userMessage])
    setInputMessage('')
    setIsLoading(true)
    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          message: inputMessage,
          history: messages.slice(-5),
        }),
      })
      if (!response.ok) {
        throw new Error('Failed to get response')
      }
      const data = await response.json()
      const assistantMessage: ChatMessage = {
        id: (Date.now() + 1).toString(),
        role: 'assistant',
        content: data.response || 'Îmi pare rău, nu am putut procesa întrebarea ta. Te rog încearcă din nou.',
        timestamp: new Date(),
      }
      setMessages(prev => [...prev, assistantMessage])
    } catch (error) {
      console.error('Error sending message:', error)
      const errorMessage: ChatMessage = {
        id: (Date.now() + 1).toString(),
        role: 'assistant',
        content: 'Îmi pare rău, a apărut o eroare. Te rog verifică conexiunea și încearcă din nou.',
        timestamp: new Date(),
      }
      setMessages(prev => [...prev, errorMessage])
    } finally {
      setIsLoading(false)
    }
  }
  const handleKeyPress = (event: React.KeyboardEvent) => {
    if (event.key === 'Enter' && !event.shiftKey) {
      event.preventDefault()
      handleSendMessage()
    }
  }
  const copyToClipboard = (text: string) => {
    navigator.clipboard.writeText(text)
  }
  const suggestedQuestions = [
    'Ce spune Biblia despre iubire?',
    'Explică-mi parabola semănătorului',
    'Care sunt fructele Duhului?',
    'Ce înseamnă să fii născut din nou?',
    'Cum pot să mă rog mai bine?',
  ]
  const toggleChat = () => {
    setIsOpen(!isOpen)
    if (isMinimized) setIsMinimized(false)
  }
  const minimizeChat = () => {
    setIsMinimized(!isMinimized)
  }
  const openFullChat = () => {
    window.open('/chat', '_blank')
  }
  return (
    <>
      {/* Floating Action Button */}
      <Zoom in={!isOpen} unmountOnExit>
        <Fab
          color="primary"
          onClick={toggleChat}
          sx={{
            position: 'fixed',
            bottom: 24,
            right: 24,
            zIndex: 1000,
            background: 'linear-gradient(45deg, #2C5F6B 30%, #8B7355 90%)',
            '&:hover': {
              background: 'linear-gradient(45deg, #1e4148 30%, #6d5a43 90%)',
            }
          }}
        >
          <Chat />
        </Fab>
      </Zoom>
      {/* Chat Overlay */}
      <Slide direction="up" in={isOpen} mountOnExit>
        <Paper
          elevation={8}
          sx={{
            position: 'fixed',
            bottom: 0,
            right: 0,
            width: { xs: '100vw', sm: '50vw', md: '40vw' },
            height: isMinimized ? 'auto' : '100vh',
            zIndex: 1200,
            borderRadius: { xs: 0, sm: '12px 0 0 0' },
            overflow: 'hidden',
            display: 'flex',
            flexDirection: 'column',
            background: 'linear-gradient(to bottom, #f8f9fa, #ffffff)',
          }}
        >
          {/* Header */}
          <Box
            sx={{
              p: 2,
              background: 'linear-gradient(45deg, #2C5F6B 30%, #8B7355 90%)',
              color: 'white',
              display: 'flex',
              alignItems: 'center',
              justifyContent: 'space-between',
            }}
          >
            <Box sx={{ display: 'flex', alignItems: 'center', gap: 1 }}>
              <Avatar sx={{ bgcolor: 'rgba(255,255,255,0.2)' }}>
                <SmartToy />
              </Avatar>
              <Box>
                <Typography variant="subtitle1" fontWeight="bold">
                  Chat AI Biblic
                </Typography>
                <Typography variant="caption" sx={{ opacity: 0.9 }}>
                  Asistent pentru întrebări biblice
                </Typography>
              </Box>
            </Box>
            <Box>
              <IconButton
                size="small"
                onClick={minimizeChat}
                sx={{ color: 'white', mr: 0.5 }}
              >
                <Minimize />
              </IconButton>
              <IconButton
                size="small"
                onClick={openFullChat}
                sx={{ color: 'white', mr: 0.5 }}
              >
                <Launch />
              </IconButton>
              <IconButton
                size="small"
                onClick={toggleChat}
                sx={{ color: 'white' }}
              >
                <Close />
              </IconButton>
            </Box>
          </Box>
          {!isMinimized && (
            <>
              {/* Suggested Questions */}
              <Box sx={{ p: 2, borderBottom: 1, borderColor: 'divider' }}>
                <Typography variant="body2" color="text.secondary" sx={{ mb: 1 }}>
                  Întrebări sugerate:
                </Typography>
                <Box sx={{ display: 'flex', flexWrap: 'wrap', gap: 0.5 }}>
                  {suggestedQuestions.slice(0, 3).map((question, index) => (
                    <Chip
                      key={index}
                      label={question}
                      size="small"
                      variant="outlined"
                      onClick={() => setInputMessage(question)}
                      sx={{
                        fontSize: '0.75rem',
                        cursor: 'pointer',
                        '&:hover': {
                          bgcolor: 'primary.light',
                          color: 'white',
                        },
                      }}
                    />
                  ))}
                </Box>
              </Box>
              {/* Messages */}
              <Box
                sx={{
                  flexGrow: 1,
                  overflow: 'auto',
                  p: 1,
                }}
              >
                {messages.map((message) => (
                  <Box
                    key={message.id}
                    sx={{
                      display: 'flex',
                      justifyContent: message.role === 'user' ? 'flex-end' : 'flex-start',
                      mb: 2,
                    }}
                  >
                    <Box
                      sx={{
                        display: 'flex',
                        flexDirection: message.role === 'user' ? 'row-reverse' : 'row',
                        alignItems: 'flex-start',
                        maxWidth: '85%',
                        gap: 1,
                      }}
                    >
                      <Avatar
                        sx={{
                          width: 32,
                          height: 32,
                          bgcolor: message.role === 'user' ? 'primary.main' : 'secondary.main',
                        }}
                      >
                        {message.role === 'user' ? <Person fontSize="small" /> : <SmartToy fontSize="small" />}
                      </Avatar>
                      <Paper
                        elevation={1}
                        sx={{
                          p: 1.5,
                          bgcolor: message.role === 'user' ? 'primary.light' : 'background.paper',
                          color: message.role === 'user' ? 'white' : 'text.primary',
                          borderRadius: 2,
                          maxWidth: '100%',
                        }}
                      >
                        <Typography
                          variant="body2"
                          sx={{
                            whiteSpace: 'pre-wrap',
                            lineHeight: 1.4,
                          }}
                        >
                          {message.content}
                        </Typography>
                        {message.role === 'assistant' && (
                          <Box sx={{ display: 'flex', gap: 0.5, mt: 1, justifyContent: 'flex-end' }}>
                            <IconButton
                              size="small"
                              onClick={() => copyToClipboard(message.content)}
                            >
                              <ContentCopy fontSize="small" />
                            </IconButton>
                            <IconButton size="small">
                              <ThumbUp fontSize="small" />
                            </IconButton>
                            <IconButton size="small">
                              <ThumbDown fontSize="small" />
                            </IconButton>
                          </Box>
                        )}
                        <Typography
                          variant="caption"
                          sx={{
                            display: 'block',
                            textAlign: 'right',
                            mt: 0.5,
                            opacity: 0.7,
                          }}
                        >
                          {message.timestamp.toLocaleTimeString('ro-RO', {
                            hour: '2-digit',
                            minute: '2-digit',
                          })}
                        </Typography>
                      </Paper>
                    </Box>
                  </Box>
                ))}
                {isLoading && (
                  <Box sx={{ display: 'flex', justifyContent: 'flex-start', mb: 2 }}>
                    <Box sx={{ display: 'flex', alignItems: 'flex-start', gap: 1 }}>
                      <Avatar sx={{ width: 32, height: 32, bgcolor: 'secondary.main' }}>
                        <SmartToy fontSize="small" />
                      </Avatar>
                      <Paper elevation={1} sx={{ p: 1.5, borderRadius: 2 }}>
                        <Typography variant="body2">
                          Scriu răspunsul...
                        </Typography>
                      </Paper>
                    </Box>
                  </Box>
                )}
                <div ref={messagesEndRef} />
              </Box>
              <Divider />
              {/* Input */}
              <Box sx={{ p: 2 }}>
                <Box sx={{ display: 'flex', gap: 1 }}>
                  <TextField
                    fullWidth
                    size="small"
                    multiline
                    maxRows={3}
                    placeholder="Scrie întrebarea ta despre Biblie..."
                    value={inputMessage}
                    onChange={(e) => setInputMessage(e.target.value)}
                    onKeyDown={handleKeyPress}
                    disabled={isLoading}
                    variant="outlined"
                    sx={{
                      '& .MuiOutlinedInput-root': {
                        borderRadius: 2,
                      }
                    }}
                  />
                  <Button
                    variant="contained"
                    onClick={handleSendMessage}
                    disabled={!inputMessage.trim() || isLoading}
                    sx={{
                      minWidth: 'auto',
                      px: 2,
                      borderRadius: 2,
                      background: 'linear-gradient(45deg, #2C5F6B 30%, #8B7355 90%)',
                    }}
                  >
                    <Send fontSize="small" />
                  </Button>
                </Box>
                <Typography variant="caption" color="text.secondary" sx={{ mt: 0.5, display: 'block' }}>
                  Enter pentru a trimite • Shift+Enter pentru linie nouă
                </Typography>
              </Box>
            </>
          )}
        </Paper>
      </Slide>
    </>
  )
 }
--- a/components/layout/navigation.tsx
+++ b/components/layout/navigation.tsx
@@ -24,7 +24,6 @@ import {
 import {
  Menu as MenuIcon,
  MenuBook,
  Chat,
  Favorite as Prayer,
  Search,
  AccountCircle,
@@ -37,7 +36,6 @@ import { useRouter } from 'next/navigation'
 const pages = [
  { name: 'Acasă', path: '/', icon: <Home /> },
  { name: 'Biblia', path: '/bible', icon: <MenuBook /> },
  { name: 'Chat AI', path: '/chat', icon: <Chat /> },
  { name: 'Rugăciuni', path: '/prayers', icon: <Prayer /> },
  { name: 'Căutare', path: '/search', icon: <Search /> },
 ]
--- a/lib/vector-search.ts
+++ b/lib/vector-search.ts
@@ -0,0 +1,140 @@
 import { Pool } from 'pg'
 const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
 })
 export interface BibleVerse {
  id: string
  ref: string
  book: string
  chapter: number
  verse: number
  text_raw: string
  similarity?: number
  combined_score?: number
 }
 export async function getEmbedding(text: string): Promise<number[]> {
  const response = await fetch(
    `${process.env.AZURE_OPENAI_ENDPOINT}/openai/deployments/${process.env.AZURE_OPENAI_EMBED_DEPLOYMENT}/embeddings?api-version=${process.env.AZURE_OPENAI_API_VERSION}`,
    {
      method: 'POST',
      headers: {
        'api-key': process.env.AZURE_OPENAI_KEY!,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        input: [text],
      }),
    }
  )
  if (!response.ok) {
    throw new Error(`Embedding API error: ${response.status}`)
  }
  const data = await response.json()
  return data.data[0].embedding
 }
 export async function searchBibleSemantic(
  query: string,
  limit: number = 10
 ): Promise<BibleVerse[]> {
  try {
    const queryEmbedding = await getEmbedding(query)
    const client = await pool.connect()
    try {
      const result = await client.query(
        `
        SELECT id, ref, book, chapter, verse, text_raw,
               1 - (embedding <=> $1) AS similarity
        FROM bible_passages
        WHERE embedding IS NOT NULL
        ORDER BY embedding <=> $1
        LIMIT $2
        `,
        [JSON.stringify(queryEmbedding), limit]
      )
      return result.rows
    } finally {
      client.release()
    }
  } catch (error) {
    console.error('Error in semantic search:', error)
    throw error
  }
 }
 export async function searchBibleHybrid(
  query: string,
  limit: number = 10
 ): Promise<BibleVerse[]> {
  try {
    const queryEmbedding = await getEmbedding(query)
    const client = await pool.connect()
    try {
      const result = await client.query(
        `
        WITH vector_search AS (
          SELECT id, 1 - (embedding <=> $1) AS vector_sim
          FROM bible_passages
          WHERE embedding IS NOT NULL
          ORDER BY embedding <=> $1
          LIMIT 100
        ),
        text_search AS (
          SELECT id, ts_rank(tsv, plainto_tsquery('romanian', $3)) AS text_rank
          FROM bible_passages
          WHERE tsv @@ plainto_tsquery('romanian', $3)
        )
        SELECT bp.id, bp.ref, bp.book, bp.chapter, bp.verse, bp.text_raw,
               COALESCE(vs.vector_sim, 0) * 0.7 + COALESCE(ts.text_rank, 0) * 0.3 AS combined_score
        FROM bible_passages bp
        LEFT JOIN vector_search vs ON vs.id = bp.id
        LEFT JOIN text_search ts ON ts.id = bp.id
        WHERE vs.id IS NOT NULL OR ts.id IS NOT NULL
        ORDER BY combined_score DESC
        LIMIT $2
        `,
        [JSON.stringify(queryEmbedding), limit, query]
      )
      return result.rows
    } finally {
      client.release()
    }
  } catch (error) {
    console.error('Error in hybrid search:', error)
    throw error
  }
 }
 export async function getContextVerses(
  book: string,
  chapter: number,
  verse: number,
  contextSize: number = 2
 ): Promise<BibleVerse[]> {
  const client = await pool.connect()
  try {
    const result = await client.query(
      `
      SELECT id, ref, book, chapter, verse, text_raw
      FROM bible_passages
      WHERE book = $1 AND chapter = $2
      AND verse BETWEEN $3 AND $4
      ORDER BY verse
      `,
      [book, chapter, verse - contextSize, verse + contextSize]
    )
    return result.rows
  } finally {
    client.release()
  }
 }
--- a/multi-language-implementation-plan.md
+++ b/multi-language-implementation-plan.md
@@ -0,0 +1,212 @@
 # Multi-Language Support Implementation Plan
 ## Overview
 Add comprehensive multi-language support to the Ghid Biblic application, starting with English as the second language alongside Romanian.
 ## Current State
 - **Database**: Already supports multiple languages (`lang` field) and translations (`translation` field)
 - **Frontend**: Hardcoded Romanian interface
 - **Vector Search**: Romanian-only search logic
 - **Bible Data**: Only Romanian (FIDELA) version imported
 ## Implementation Phases
 ### Phase 1: Core Infrastructure
 1. **Install i18n Framework**
   - Add `next-intl` for Next.js internationalization
   - Configure locale routing (`/ro/`, `/en/`)
   - Set up translation file structure
 2. **Language Configuration**
   - Create language detection and switching logic
   - Add language persistence (localStorage/cookies)
   - Configure default language fallbacks
 3. **Translation Files Structure**
   ```
   messages/
   ├── ro.json (Romanian - existing content)
   ├── en.json (English translations)
   └── common.json (shared terms)
   ```
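
The language configuration steps above (detection, persistence, default fallback) could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `resolveLocale`, `SUPPORTED_LOCALES`, and `DEFAULT_LOCALE` are hypothetical names, and the saved preference is assumed to arrive as a plain string read from a cookie or localStorage.

```typescript
// Hypothetical helper: names and inputs are assumptions, not part of the codebase.
const SUPPORTED_LOCALES = ['ro', 'en'] as const
type Locale = (typeof SUPPORTED_LOCALES)[number]
const DEFAULT_LOCALE: Locale = 'ro'

function isLocale(value: string): value is Locale {
  return SUPPORTED_LOCALES.some(locale => locale === value)
}

// A saved preference wins, then the Accept-Language header, then the default.
function resolveLocale(
  saved: string | undefined,
  acceptLanguage: string | undefined
): Locale {
  if (saved && isLocale(saved)) return saved

  // Parse entries such as "en-US,en;q=0.9,ro;q=0.8" and sort by quality value.
  const candidates = (acceptLanguage ?? '')
    .split(',')
    .map(part => {
      const [tag = '', q] = part.trim().split(';q=')
      const lang = tag.toLowerCase().split('-')[0] ?? ''
      return { lang, q: q ? Number(q) : 1 }
    })
    .filter(c => c.lang.length > 0 && !Number.isNaN(c.q))
    .sort((a, b) => b.q - a.q)

  for (const candidate of candidates) {
    if (isLocale(candidate.lang)) return candidate.lang
  }
  return DEFAULT_LOCALE
}
```

In a real setup the `next-intl` middleware would likely take over this negotiation; the sketch only shows the intended precedence order.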
 ### Phase 2: UI Internationalization
 1. **Navigation Component**
   - Translate all menu items and labels
   - Add language switcher dropdown
   - Update routing for locale-aware navigation
 2. **Chat Interface**
   - Translate all UI text and prompts
   - Add suggested questions per language
   - Update loading states and error messages
 3. **Page Content**
   - Home page (`/` → `/[locale]/`)
   - Bible browser (`/bible` → `/[locale]/bible`)
   - Search page (`/search` → `/[locale]/search`)
   - Prayer requests (`/prayers` → `/[locale]/prayers`)
 ### Phase 3: Backend Localization
 1. **Vector Search Updates**
   - Modify search functions to filter by language
   - Add language parameter to search APIs
   - Update hybrid search for language-specific full-text search
 2. **Chat API Enhancement**
   - Language-aware Bible verse retrieval
   - Localized AI response prompts
   - Language-specific fallback responses
 3. **API Route Updates**
   - Add locale parameter to all API endpoints
   - Update error responses for each language
   - Configure language-specific search configurations
 ### Phase 4: Bible Data Management
 1. **English Bible Import**
   - Source: API.Bible or a public domain English Bible (e.g. KJV; ESV is copyrighted and would require a license)
   - Adapt existing import script for English
   - Generate English embeddings using Azure OpenAI
 2. **Language-Aware Bible Browser**
   - Add language selector in Bible interface
   - Filter books/chapters/verses by selected language
   - Show parallel verses when both languages available
 ### Phase 5: Enhanced Features
 1. **Parallel Bible View**
   - Side-by-side Romanian/English verse display
   - Cross-reference linking between translations
   - Language comparison in search results
 2. **Smart Language Detection**
   - Auto-detect query language in chat
   - Suggest language switch based on user input
   - Mixed-language search capabilities
 3. **Advanced Search Features**
   - Cross-language semantic search
   - Translation comparison tools
   - Language-specific biblical term glossaries
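
As a rough illustration of the "auto-detect query language" idea, the sketch below uses a simple heuristic: Romanian diacritics first, then a count of common function words. The word lists and the name `detectQueryLanguage` are assumptions made for this example; a production version would probably rely on a dedicated language-detection library.

```typescript
// Hypothetical heuristic: the hint lists are illustrative, not exhaustive.
const RO_HINTS = new Set(['este', 'despre', 'ce', 'cum', 'care', 'pentru', 'spune'])
const EN_HINTS = new Set(['the', 'and', 'is', 'what', 'how', 'about', 'does', 'say'])

function detectQueryLanguage(
  text: string,
  fallback: 'ro' | 'en' = 'ro'
): 'ro' | 'en' {
  const lower = text.toLowerCase()

  // Romanian-specific letters (both comma-below and cedilla forms) are a strong signal.
  if (/[ăâîșțşţ]/.test(lower)) return 'ro'

  // Otherwise count hint words from each language.
  const words = lower.split(/[^a-z]+/).filter(Boolean)
  let ro = 0
  let en = 0
  for (const word of words) {
    if (RO_HINTS.has(word)) ro++
    if (EN_HINTS.has(word)) en++
  }

  // Ties (including no hints at all) fall back to the caller's default.
  if (ro === en) return fallback
  return ro > en ? 'ro' : 'en'
}
```

Short or mixed-language queries will often tie, which is why the caller supplies the fallback (for example, the current UI locale).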
 ## Technical Implementation Details
 ### Routing Structure
 ```
 Current: /page
 New:     /[locale]/page
 Examples:
 - /ro/biblia (Romanian Bible)
 - /en/bible (English Bible)
 - /ro/rugaciuni (Romanian Prayers)
 - /en/prayers (English Prayers)
 ```
 ### Database Schema Changes
 **No changes needed** - current schema already supports:
 - Multiple languages via `lang` field
 - Multiple translations via `translation` field
 - Unique constraints per translation/language
 ### Vector Search Updates
 ```typescript
 // Current
 searchBibleHybrid(query: string, limit: number)
 // Enhanced
 searchBibleHybrid(query: string, language: string, limit: number)
 ```
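
One possible shape for the enhanced search is sketched below as a pure query builder, so it can be read without a database connection. It mirrors the existing hybrid query and adds a filter on the `lang` column. Assumptions: `buildHybridQuery` and `TS_CONFIG` are hypothetical names, the `lang` values are `ro`/`en`, and the `tsv` column for each row was generated with the matching text-search configuration.

```typescript
// Assumption: maps app locale codes to PostgreSQL text-search configurations.
const TS_CONFIG: Record<string, string> = { ro: 'romanian', en: 'english' }

interface HybridQuery {
  text: string
  values: (string | number)[]
}

function buildHybridQuery(
  embedding: number[],
  query: string,
  language: string,
  limit: number = 10
): HybridQuery {
  const tsConfig = TS_CONFIG[language]
  if (!tsConfig) throw new Error(`Unsupported language: ${language}`)

  // tsConfig comes from the fixed map above, so interpolating it is safe;
  // all user-supplied input stays in bound parameters.
  const text = `
    WITH vector_search AS (
      SELECT id, 1 - (embedding <=> $1) AS vector_sim
      FROM bible_passages
      WHERE embedding IS NOT NULL AND lang = $4
      ORDER BY embedding <=> $1
      LIMIT 100
    ),
    text_search AS (
      SELECT id, ts_rank(tsv, plainto_tsquery('${tsConfig}', $3)) AS text_rank
      FROM bible_passages
      WHERE lang = $4 AND tsv @@ plainto_tsquery('${tsConfig}', $3)
    )
    SELECT bp.ref, bp.book, bp.chapter, bp.verse, bp.text_raw,
           COALESCE(vs.vector_sim, 0) * 0.7 + COALESCE(ts.text_rank, 0) * 0.3 AS combined_score
    FROM bible_passages bp
    LEFT JOIN vector_search vs ON vs.id = bp.id
    LEFT JOIN text_search ts ON ts.id = bp.id
    WHERE vs.id IS NOT NULL OR ts.id IS NOT NULL
    ORDER BY combined_score DESC
    LIMIT $2
  `
  return { text, values: [JSON.stringify(embedding), limit, query, language] }
}
```

The returned object could then be passed to `client.query(q.text, q.values)` in the same way the current `searchBibleHybrid` passes its SQL and parameters.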
 ### Translation File Structure
 ```json
 // messages/en.json
 {
  "navigation": {
    "home": "Home",
    "bible": "Bible",
    "prayers": "Prayers",
    "search": "Search"
  },
  "chat": {
    "placeholder": "Ask your biblical question...",
    "suggestions": [
      "What does the Bible say about love?",
      "Explain the parable of the sower",
      "What are the fruits of the Spirit?"
    ]
  }
 }
 ```
 ### Language Switcher Component
 - Dropdown in navigation header
 - Flag icons for visual identification
 - Persist language choice across sessions
 - Redirect to equivalent page in new language
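
The "redirect to equivalent page" behaviour needs a mapping between localized slugs, since `/ro/biblia` and `/en/bible` differ in more than the prefix. A minimal sketch, assuming every path already carries a locale prefix; `LOCALIZED_SLUGS` and `switchLocalePath` are hypothetical names, and the table only covers the example routes listed under Routing Structure.

```typescript
// Hypothetical slug table built from the example routes; extend it per page.
const LOCALIZED_SLUGS: Record<string, Record<string, string>> = {
  bible: { ro: 'biblia', en: 'bible' },
  prayers: { ro: 'rugaciuni', en: 'prayers' },
}

// Rewrites "/ro/biblia/..." to "/en/bible/..." when switching to 'en'.
// Unknown slugs are kept as-is and only the locale prefix is replaced.
function switchLocalePath(pathname: string, targetLocale: string): string {
  const [currentLocale, slug, ...rest] = pathname.split('/').filter(Boolean)
  if (currentLocale === undefined) return `/${targetLocale}`

  let translated = slug
  if (slug !== undefined) {
    for (const page of Object.keys(LOCALIZED_SLUGS)) {
      if (LOCALIZED_SLUGS[page][currentLocale] === slug) {
        translated = LOCALIZED_SLUGS[page][targetLocale] ?? slug
        break
      }
    }
  }

  const segments = [targetLocale, translated, ...rest].filter((s): s is string => Boolean(s))
  return '/' + segments.join('/')
}
```

The language switcher would call this with the current pathname and the selected locale, then navigate to the result; persisting the choice across sessions is a separate concern not covered here.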
 ## Dependencies to Add
 ```json
 {
  "next-intl": "^3.x",
  "@formatjs/intl-localematcher": "^0.x",
  "negotiator": "^0.x"
 }
 ```
 ## File Structure Changes
 ```
 app/
 ├── [locale]/
 │   ├── page.tsx
 │   ├── bible/
 │   ├── prayers/
 │   ├── search/
 │   └── layout.tsx
 ├── api/ (unchanged)
 └── globals.css
 messages/
 ├── en.json
 ├── ro.json
 └── index.ts
 components/
 ├── language-switcher.tsx
 ├── navigation.tsx (updated)
 └── chat/ (updated)
 ```
 ## Testing Strategy
 1. **Unit Tests**: Translation loading and language switching
 2. **Integration Tests**: API endpoints with locale parameters
 3. **E2E Tests**: Complete user flows in both languages
 4. **Performance Tests**: Vector search with language filtering
 ## Rollout Plan
 1. **Development**: Implement Phases 1-3 (core infrastructure and UI)
 2. **Testing**: Deploy to staging with Romanian/English support
 3. **Beta Release**: Limited user testing with feedback collection
 4. **Production**: Full release with both languages
 5. **Future**: Add additional languages based on user demand
 ## Estimated Timeline
 - **Phase 1-2**: 2-3 days (i18n setup and UI translation)
 - **Phase 3**: 1-2 days (backend localization)
 - **Phase 4**: 2-3 days (English Bible import and embeddings)
 - **Phase 5**: 3-4 days (enhanced features)
 - **Total**: 8-12 days for complete implementation
 ## Success Metrics
 - Language switching works seamlessly
 - Vector search returns accurate results in both languages
 - AI chat responses are contextually appropriate per language
 - Users can browse the Bible in their preferred language
 - Performance remains optimal with language filtering
 ## Future Considerations
 - Spanish, French, German language support
 - Regional dialect variations
 - Audio Bible integration per language
 - Collaborative translation features for community contributions
--- a/package-lock.json
+++ b/package-lock.json
@@ -24,6 +24,7 @@
        "@tailwindcss/postcss": "^4.1.13",
        "@types/node": "^24.5.2",
        "@types/pdf-parse": "^1.1.5",
        "@types/pg": "^8.15.5",
        "@types/react": "^19.1.13",
        "@types/react-dom": "^19.1.9",
        "autoprefixer": "^10.4.21",
@@ -35,6 +36,8 @@
        "next": "^15.5.3",
        "openai": "^5.22.0",
        "pdf-parse": "^1.1.1",
        "pg": "^8.16.3",
        "pgvector": "^0.2.1",
        "postcss": "^8.5.6",
        "prisma": "^6.16.2",
        "react": "^19.1.1",
@@ -4182,6 +4185,17 @@
        "@types/node": "*"
      }
    },
    "node_modules/@types/pg": {
      "version": "8.15.5",
      "resolved": "https://registry.npmjs.org/@types/pg/-/pg-8.15.5.tgz",
      "integrity": "sha512-LF7lF6zWEKxuT3/OR8wAZGzkg4ENGXFNyiV/JeOt9z5B+0ZVwbql9McqX5c/WStFq1GaGso7H1AzP/qSzmlCKQ==",
      "license": "MIT",
      "dependencies": {
        "@types/node": "*",
        "pg-protocol": "*",
        "pg-types": "^2.2.0"
      }
    },
    "node_modules/@types/prop-types": {
      "version": "15.7.15",
      "resolved": "https://registry.npmjs.org/@types/prop-types/-/prop-types-15.7.15.tgz",
@@ -9639,6 +9653,104 @@
      "integrity": "sha512-xCy9V055GLEqoFaHoC1SoLIaLmWctgCUaBaWxDZ7/Zx4CTyX7cJQLJOok/orfjZAh9kEYpjJa4d0KcJmCbctZA==",
      "license": "MIT"
    },
    "node_modules/pg": {
      "version": "8.16.3",
      "resolved": "https://registry.npmjs.org/pg/-/pg-8.16.3.tgz",
      "integrity": "sha512-enxc1h0jA/aq5oSDMvqyW3q89ra6XIIDZgCX9vkMrnz5DFTw/Ny3Li2lFQ+pt3L6MCgm/5o2o8HW9hiJji+xvw==",
      "license": "MIT",
      "dependencies": {
        "pg-connection-string": "^2.9.1",
        "pg-pool": "^3.10.1",
        "pg-protocol": "^1.10.3",
        "pg-types": "2.2.0",
        "pgpass": "1.0.5"
      },
      "engines": {
        "node": ">= 16.0.0"
      },
      "optionalDependencies": {
        "pg-cloudflare": "^1.2.7"
      },
      "peerDependencies": {
        "pg-native": ">=3.0.1"
      },
      "peerDependenciesMeta": {
        "pg-native": {
          "optional": true
        }
      }
    },
    "node_modules/pg-cloudflare": {
      "version": "1.2.7",
      "resolved": "https://registry.npmjs.org/pg-cloudflare/-/pg-cloudflare-1.2.7.tgz",
      "integrity": "sha512-YgCtzMH0ptvZJslLM1ffsY4EuGaU0cx4XSdXLRFae8bPP4dS5xL1tNB3k2o/N64cHJpwU7dxKli/nZ2lUa5fLg==",
      "license": "MIT",
      "optional": true
    },
    "node_modules/pg-connection-string": {
      "version": "2.9.1",
      "resolved": "https://registry.npmjs.org/pg-connection-string/-/pg-connection-string-2.9.1.tgz",
      "integrity": "sha512-nkc6NpDcvPVpZXxrreI/FOtX3XemeLl8E0qFr6F2Lrm/I8WOnaWNhIPK2Z7OHpw7gh5XJThi6j6ppgNoaT1w4w==",
      "license": "MIT"
    },
    "node_modules/pg-int8": {
      "version": "1.0.1",
      "resolved": "https://registry.npmjs.org/pg-int8/-/pg-int8-1.0.1.tgz",
      "integrity": "sha512-WCtabS6t3c8SkpDBUlb1kjOs7l66xsGdKpIPZsg4wR+B3+u9UAum2odSsF9tnvxg80h4ZxLWMy4pRjOsFIqQpw==",
      "license": "ISC",
      "engines": {
        "node": ">=4.0.0"
      }
    },
    "node_modules/pg-pool": {
      "version": "3.10.1",
      "resolved": "https://registry.npmjs.org/pg-pool/-/pg-pool-3.10.1.tgz",
      "integrity": "sha512-Tu8jMlcX+9d8+QVzKIvM/uJtp07PKr82IUOYEphaWcoBhIYkoHpLXN3qO59nAI11ripznDsEzEv8nUxBVWajGg==",
      "license": "MIT",
      "peerDependencies": {
        "pg": ">=8.0"
      }
    },
    "node_modules/pg-protocol": {
      "version": "1.10.3",
      "resolved": "https://registry.npmjs.org/pg-protocol/-/pg-protocol-1.10.3.tgz",
      "integrity": "sha512-6DIBgBQaTKDJyxnXaLiLR8wBpQQcGWuAESkRBX/t6OwA8YsqP+iVSiond2EDy6Y/dsGk8rh/jtax3js5NeV7JQ==",
      "license": "MIT"
    },
    "node_modules/pg-types": {
      "version": "2.2.0",
      "resolved": "https://registry.npmjs.org/pg-types/-/pg-types-2.2.0.tgz",
      "integrity": "sha512-qTAAlrEsl8s4OiEQY69wDvcMIdQN6wdz5ojQiOy6YRMuynxenON0O5oCpJI6lshc6scgAY8qvJ2On/p+CXY0GA==",
      "license": "MIT",
      "dependencies": {
        "pg-int8": "1.0.1",
        "postgres-array": "~2.0.0",
        "postgres-bytea": "~1.0.0",
        "postgres-date": "~1.0.4",
        "postgres-interval": "^1.1.0"
      },
      "engines": {
        "node": ">=4"
      }
    },
    "node_modules/pgpass": {
      "version": "1.0.5",
      "resolved": "https://registry.npmjs.org/pgpass/-/pgpass-1.0.5.tgz",
      "integrity": "sha512-FdW9r/jQZhSeohs1Z3sI1yxFQNFvMcnmfuj4WBMUTxOrAyLMaTcE1aAMBiTlbMNaXvBCQuVi0R7hd8udDSP7ug==",
      "license": "MIT",
      "dependencies": {
        "split2": "^4.1.0"
      }
    },
    "node_modules/pgvector": {
      "version": "0.2.1",
      "resolved": "https://registry.npmjs.org/pgvector/-/pgvector-0.2.1.tgz",
      "integrity": "sha512-nKaQY9wtuiidwLMdVIce1O3kL0d+FxrigCVzsShnoqzOSaWWWOvuctb/sYwlai5cTwwzRSNa+a/NtN2kVZGNJw==",
      "license": "MIT",
      "engines": {
        "node": ">= 18"
      }
    },
    "node_modules/picocolors": {
      "version": "1.1.1",
      "resolved": "https://registry.npmjs.org/picocolors/-/picocolors-1.1.1.tgz",
@@ -9726,6 +9838,45 @@
      "integrity": "sha512-1NNCs6uurfkVbeXG4S8JFT9t19m45ICnif8zWLd5oPSZ50QnwMfK+H3jv408d4jw/7Bttv5axS5IiHoLaVNHeQ==",
      "license": "MIT"
    },
    "node_modules/postgres-array": {
      "version": "2.0.0",
      "resolved": "https://registry.npmjs.org/postgres-array/-/postgres-array-2.0.0.tgz",
      "integrity": "sha512-VpZrUqU5A69eQyW2c5CA1jtLecCsN2U/bD6VilrFDWq5+5UIEVO7nazS3TEcHf1zuPYO/sqGvUvW62g86RXZuA==",
      "license": "MIT",
      "engines": {
        "node": ">=4"
      }
    },
    "node_modules/postgres-bytea": {
      "version": "1.0.0",
      "resolved": "https://registry.npmjs.org/postgres-bytea/-/postgres-bytea-1.0.0.tgz",
      "integrity": "sha512-xy3pmLuQqRBZBXDULy7KbaitYqLcmxigw14Q5sj8QBVLqEwXfeybIKVWiqAXTlcvdvb0+xkOtDbfQMOf4lST1w==",
      "license": "MIT",
      "engines": {
        "node": ">=0.10.0"
      }
    },
    "node_modules/postgres-date": {
      "version": "1.0.7",
      "resolved": "https://registry.npmjs.org/postgres-date/-/postgres-date-1.0.7.tgz",
      "integrity": "sha512-suDmjLVQg78nMK2UZ454hAG+OAW+HQPZ6n++TNDUX+L0+uUlLywnoxJKDou51Zm+zTCjrCl0Nq6J9C5hP9vK/Q==",
      "license": "MIT",
      "engines": {
        "node": ">=0.10.0"
      }
    },
    "node_modules/postgres-interval": {
      "version": "1.2.0",
      "resolved": "https://registry.npmjs.org/postgres-interval/-/postgres-interval-1.2.0.tgz",
      "integrity": "sha512-9ZhXKM/rw350N1ovuWHbGxnGh/SNJ4cnxHiM0rxE4VN41wsg8P8zWn9hv/buK00RP4WvlOyr/RBDiptyxVbkZQ==",
      "license": "MIT",
      "dependencies": {
        "xtend": "^4.0.0"
      },
      "engines": {
        "node": ">=0.10.0"
      }
    },
    "node_modules/pretty-format": {
      "version": "27.5.1",
      "resolved": "https://registry.npmjs.org/pretty-format/-/pretty-format-27.5.1.tgz",
@@ -10480,6 +10631,15 @@
        "url": "https://github.com/sponsors/wooorm"
      }
    },
    "node_modules/split2": {
      "version": "4.2.0",
      "resolved": "https://registry.npmjs.org/split2/-/split2-4.2.0.tgz",
      "integrity": "sha512-UcjcJOWknrNkF6PLX83qcHM6KHgVKNkV62Y8a5uYDVv9ydGQVwAHMKqHdJje1VTWpljG0WYpCDhrCdAOYH4TWg==",
      "license": "ISC",
      "engines": {
        "node": ">= 10.x"
      }
    },
    "node_modules/sprintf-js": {
      "version": "1.0.3",
      "resolved": "https://registry.npmjs.org/sprintf-js/-/sprintf-js-1.0.3.tgz",
@@ -11638,6 +11798,15 @@
        "node": ">=0.4.0"
      }
    },
    "node_modules/xtend": {
      "version": "4.0.2",
      "resolved": "https://registry.npmjs.org/xtend/-/xtend-4.0.2.tgz",
      "integrity": "sha512-LKYU1iAXJXUgAXn9URjiu+MWhyUXHsvfp7mcuYm9dSUKK0/CjtrUwFAxD82/mCWbtLsGjFIad0wIsod4zrTAEQ==",
      "license": "MIT",
      "engines": {
        "node": ">=0.4"
      }
    },
    "node_modules/y18n": {
      "version": "5.0.8",
      "resolved": "https://registry.npmjs.org/y18n/-/y18n-5.0.8.tgz",
--- a/package.json
+++ b/package.json
@@ -37,6 +37,7 @@
    "@tailwindcss/postcss": "^4.1.13",
    "@types/node": "^24.5.2",
    "@types/pdf-parse": "^1.1.5",
    "@types/pg": "^8.15.5",
    "@types/react": "^19.1.13",
    "@types/react-dom": "^19.1.9",
    "autoprefixer": "^10.4.21",
@@ -48,6 +49,8 @@
    "next": "^15.5.3",
    "openai": "^5.22.0",
    "pdf-parse": "^1.1.1",
    "pg": "^8.16.3",
    "pgvector": "^0.2.1",
    "postcss": "^8.5.6",
    "prisma": "^6.16.2",
    "react": "^19.1.1",
--- a/prisma/schema.prisma
+++ b/prisma/schema.prisma
@@ -78,6 +78,26 @@ model BibleVerse {
  @@index([version])
 }
 model BiblePassage {
  id          String   @id @default(uuid())
  testament   String   // 'OT' or 'NT'
  book        String
  chapter     Int
  verse       Int
  ref         String   // Generated field: "book chapter:verse"
  lang        String   @default("ro")
  translation String   @default("FIDELA")
  textRaw     String   @db.Text
  textNorm    String   @db.Text // Normalized text for embedding
  embedding   Unsupported("vector(3072)")?
  createdAt   DateTime @default(now())
  updatedAt   DateTime @updatedAt
  @@unique([translation, lang, book, chapter, verse])
  @@index([book, chapter])
  @@index([testament])
 }
 model ChatMessage {
  id          String   @id @default(uuid())
  userId      String
--- a/scripts/bible_search.py
+++ b/scripts/bible_search.py
@@ -0,0 +1,121 @@
 import os
 import asyncio
 from typing import List, Dict
 from dotenv import load_dotenv
 import httpx
 import psycopg
 from psycopg.rows import dict_row
 load_dotenv()
 AZ_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT", "").rstrip("/")
 AZ_API_KEY = os.getenv("AZURE_OPENAI_KEY")
 AZ_API_VER = os.getenv("AZURE_OPENAI_API_VERSION", "2024-05-01-preview")
 AZ_DEPLOYMENT = os.getenv("AZURE_OPENAI_EMBED_DEPLOYMENT", "embed-3")
 DB_URL = os.getenv("DATABASE_URL")
 EMBED_URL = f"{AZ_ENDPOINT}/openai/deployments/{AZ_DEPLOYMENT}/embeddings?api-version={AZ_API_VER}"
 async def get_embedding(text: str) -> List[float]:
    """Get embedding for a text using Azure OpenAI"""
    payload = {"input": [text]}
    headers = {"api-key": AZ_API_KEY, "Content-Type": "application/json"}
    async with httpx.AsyncClient() as client:
        for attempt in range(3):
            try:
                r = await client.post(EMBED_URL, headers=headers, json=payload, timeout=30)
            except httpx.HTTPError:
                # Network-level failure: retry, re-raising on the last attempt
                if attempt == 2:
                    raise
                await asyncio.sleep(2 ** attempt)
                continue
            if r.status_code == 200:
                return r.json()["data"][0]["embedding"]
            if r.status_code not in (429, 500, 503):
                # Non-retryable status: fail immediately instead of retrying
                raise RuntimeError(f"Embedding error {r.status_code}: {r.text}")
            # Retryable status: back off exponentially before the next attempt
            await asyncio.sleep(2 ** attempt)
    # All attempts returned a retryable status; fail loudly instead of returning None
    raise RuntimeError("Failed to get embedding after retries")
 async def search_bible_semantic(query: str, limit: int = 10) -> List[Dict]:
    """Search Bible using semantic similarity"""
    # Get embedding for the query
    query_embedding = await get_embedding(query)
    # Search for similar verses
    with psycopg.connect(DB_URL, row_factory=dict_row) as conn:
        with conn.cursor() as cur:
            cur.execute("""
                SELECT ref, book, chapter, verse, text_raw,
                       1 - (embedding <=> %s::vector) AS similarity
                FROM bible_passages
                WHERE embedding IS NOT NULL
                ORDER BY embedding <=> %s::vector
                LIMIT %s
            """, (query_embedding, query_embedding, limit))
            return cur.fetchall()
 async def search_bible_hybrid(query: str, limit: int = 10) -> List[Dict]:
    """Search Bible using hybrid semantic + lexical search"""
    # Get embedding for the query
    query_embedding = await get_embedding(query)
    # plainto_tsquery parses the raw query itself, so no manual tsquery string is needed
    with psycopg.connect(DB_URL, row_factory=dict_row) as conn:
        with conn.cursor() as cur:
            cur.execute("""
                WITH vector_search AS (
                    SELECT id, 1 - (embedding <=> %s::vector) AS vector_sim
                    FROM bible_passages
                    WHERE embedding IS NOT NULL
                    ORDER BY embedding <=> %s::vector
                    LIMIT 100
                ),
                text_search AS (
                    SELECT id, ts_rank(tsv, plainto_tsquery('romanian', %s)) AS text_rank
                    FROM bible_passages
                    WHERE tsv @@ plainto_tsquery('romanian', %s)
                )
                SELECT bp.ref, bp.book, bp.chapter, bp.verse, bp.text_raw,
                       COALESCE(vs.vector_sim, 0) * 0.7 + COALESCE(ts.text_rank, 0) * 0.3 AS combined_score
                FROM bible_passages bp
                LEFT JOIN vector_search vs ON vs.id = bp.id
                LEFT JOIN text_search ts ON ts.id = bp.id
                WHERE vs.id IS NOT NULL OR ts.id IS NOT NULL
                ORDER BY combined_score DESC
                LIMIT %s
            """, (query_embedding, query_embedding, query, query, limit))
            return cur.fetchall()
 async def get_context_verses(book: str, chapter: int, verse: int, context_size: int = 2) -> List[Dict]:
    """Get surrounding verses for context"""
    with psycopg.connect(DB_URL, row_factory=dict_row) as conn:
        with conn.cursor() as cur:
            cur.execute("""
                SELECT ref, book, chapter, verse, text_raw
                FROM bible_passages
                WHERE book = %s AND chapter = %s
                AND verse BETWEEN %s AND %s
                ORDER BY verse
            """, (book, chapter, verse - context_size, verse + context_size))
            return cur.fetchall()
 if __name__ == "__main__":
    async def test_search():
        results = await search_bible_semantic("dragoste", 5)
        print("Semantic search results for 'dragoste':")
        for result in results:
            print(f"{result['ref']}: {result['text_raw'][:100]}... (similarity: {result['similarity']:.3f})")
        print("\nHybrid search results for 'dragoste':")
        hybrid_results = await search_bible_hybrid("dragoste", 5)
        for result in hybrid_results:
            print(f"{result['ref']}: {result['text_raw'][:100]}... (score: {result['combined_score']:.3f})")
    asyncio.run(test_search())
--- a/scripts/import-romanian-bible-md.ts
+++ b/scripts/import-romanian-bible-md.ts
@@ -0,0 +1,305 @@
 import { PrismaClient } from '@prisma/client'
 import * as fs from 'fs'
 import * as path from 'path'
 const prisma = new PrismaClient()
 // Book name mappings from Romanian to standardized names
 const BOOK_MAPPINGS: Record<string, { name: string; abbreviation: string; testament: string; orderNum: number }> = {
  'Geneza': { name: 'Geneza', abbreviation: 'GEN', testament: 'OT', orderNum: 1 },
  'Exodul': { name: 'Exodul', abbreviation: 'EXO', testament: 'OT', orderNum: 2 },
  'Leviticul': { name: 'Leviticul', abbreviation: 'LEV', testament: 'OT', orderNum: 3 },
  'Numeri': { name: 'Numerii', abbreviation: 'NUM', testament: 'OT', orderNum: 4 },
  'Deuteronom': { name: 'Deuteronomul', abbreviation: 'DEU', testament: 'OT', orderNum: 5 },
  'Iosua': { name: 'Iosua', abbreviation: 'JOS', testament: 'OT', orderNum: 6 },
  'Judecători': { name: 'Judecătorii', abbreviation: 'JDG', testament: 'OT', orderNum: 7 },
  'Rut': { name: 'Rut', abbreviation: 'RUT', testament: 'OT', orderNum: 8 },
  '1 Samuel': { name: '1 Samuel', abbreviation: '1SA', testament: 'OT', orderNum: 9 },
  '2 Samuel': { name: '2 Samuel', abbreviation: '2SA', testament: 'OT', orderNum: 10 },
  '1 Imparati': { name: '1 Împărați', abbreviation: '1KI', testament: 'OT', orderNum: 11 },
  '2 Imparati': { name: '2 Împărați', abbreviation: '2KI', testament: 'OT', orderNum: 12 },
  '1 Cronici': { name: '1 Cronici', abbreviation: '1CH', testament: 'OT', orderNum: 13 },
  '2 Cronici': { name: '2 Cronici', abbreviation: '2CH', testament: 'OT', orderNum: 14 },
  'Ezra': { name: 'Ezra', abbreviation: 'EZR', testament: 'OT', orderNum: 15 },
  'Neemia': { name: 'Neemia', abbreviation: 'NEH', testament: 'OT', orderNum: 16 },
  'Estera': { name: 'Estera', abbreviation: 'EST', testament: 'OT', orderNum: 17 },
  'Iov': { name: 'Iov', abbreviation: 'JOB', testament: 'OT', orderNum: 18 },
  'Psalmii': { name: 'Psalmii', abbreviation: 'PSA', testament: 'OT', orderNum: 19 },
  'Proverbe': { name: 'Proverbele', abbreviation: 'PRO', testament: 'OT', orderNum: 20 },
  'Eclesiastul': { name: 'Eclesiastul', abbreviation: 'ECC', testament: 'OT', orderNum: 21 },
  'Cântarea Cântărilor': { name: 'Cântarea Cântărilor', abbreviation: 'SNG', testament: 'OT', orderNum: 22 },
  'Isaia': { name: 'Isaia', abbreviation: 'ISA', testament: 'OT', orderNum: 23 },
  'Ieremia': { name: 'Ieremia', abbreviation: 'JER', testament: 'OT', orderNum: 24 },
  'Plângerile': { name: 'Plângerile', abbreviation: 'LAM', testament: 'OT', orderNum: 25 },
  'Ezechiel': { name: 'Ezechiel', abbreviation: 'EZK', testament: 'OT', orderNum: 26 },
  'Daniel': { name: 'Daniel', abbreviation: 'DAN', testament: 'OT', orderNum: 27 },
  'Osea': { name: 'Osea', abbreviation: 'HOS', testament: 'OT', orderNum: 28 },
  'Ioel': { name: 'Ioel', abbreviation: 'JOL', testament: 'OT', orderNum: 29 },
  'Amos': { name: 'Amos', abbreviation: 'AMO', testament: 'OT', orderNum: 30 },
  'Obadia': { name: 'Obadia', abbreviation: 'OBA', testament: 'OT', orderNum: 31 },
  'Iona': { name: 'Iona', abbreviation: 'JON', testament: 'OT', orderNum: 32 },
  'Mica': { name: 'Mica', abbreviation: 'MIC', testament: 'OT', orderNum: 33 },
  'Naum': { name: 'Naum', abbreviation: 'NAM', testament: 'OT', orderNum: 34 },
  'Habacuc': { name: 'Habacuc', abbreviation: 'HAB', testament: 'OT', orderNum: 35 },
  'Țefania': { name: 'Țefania', abbreviation: 'ZEP', testament: 'OT', orderNum: 36 },
  'Hagai': { name: 'Hagai', abbreviation: 'HAG', testament: 'OT', orderNum: 37 },
  'Zaharia': { name: 'Zaharia', abbreviation: 'ZEC', testament: 'OT', orderNum: 38 },
  'Maleahi': { name: 'Maleahi', abbreviation: 'MAL', testament: 'OT', orderNum: 39 },
  // New Testament
  'Matei': { name: 'Matei', abbreviation: 'MAT', testament: 'NT', orderNum: 40 },
  'Marcu': { name: 'Marcu', abbreviation: 'MRK', testament: 'NT', orderNum: 41 },
  'Luca': { name: 'Luca', abbreviation: 'LUK', testament: 'NT', orderNum: 42 },
  'Ioan': { name: 'Ioan', abbreviation: 'JHN', testament: 'NT', orderNum: 43 },
  'Faptele Apostolilor': { name: 'Faptele Apostolilor', abbreviation: 'ACT', testament: 'NT', orderNum: 44 },
  'Romani': { name: 'Romani', abbreviation: 'ROM', testament: 'NT', orderNum: 45 },
  '1 Corinteni': { name: '1 Corinteni', abbreviation: '1CO', testament: 'NT', orderNum: 46 },
  '2 Corinteni': { name: '2 Corinteni', abbreviation: '2CO', testament: 'NT', orderNum: 47 },
  'Galateni': { name: 'Galateni', abbreviation: 'GAL', testament: 'NT', orderNum: 48 },
  'Efeseni': { name: 'Efeseni', abbreviation: 'EPH', testament: 'NT', orderNum: 49 },
  'Filipeni': { name: 'Filipeni', abbreviation: 'PHP', testament: 'NT', orderNum: 50 },
  'Coloseni': { name: 'Coloseni', abbreviation: 'COL', testament: 'NT', orderNum: 51 },
  '1 Tesaloniceni': { name: '1 Tesaloniceni', abbreviation: '1TH', testament: 'NT', orderNum: 52 },
  '2 Tesaloniceni': { name: '2 Tesaloniceni', abbreviation: '2TH', testament: 'NT', orderNum: 53 },
  '1 Timotei': { name: '1 Timotei', abbreviation: '1TI', testament: 'NT', orderNum: 54 },
  '2 Timotei': { name: '2 Timotei', abbreviation: '2TI', testament: 'NT', orderNum: 55 },
  'Titus': { name: 'Titus', abbreviation: 'TIT', testament: 'NT', orderNum: 56 },
  'Filimon': { name: 'Filimon', abbreviation: 'PHM', testament: 'NT', orderNum: 57 },
  'Evrei': { name: 'Evrei', abbreviation: 'HEB', testament: 'NT', orderNum: 58 },
  'Iacov': { name: 'Iacov', abbreviation: 'JAS', testament: 'NT', orderNum: 59 },
  '1 Petru': { name: '1 Petru', abbreviation: '1PE', testament: 'NT', orderNum: 60 },
  '2 Petru': { name: '2 Petru', abbreviation: '2PE', testament: 'NT', orderNum: 61 },
  '1 Ioan': { name: '1 Ioan', abbreviation: '1JN', testament: 'NT', orderNum: 62 },
  '2 Ioan': { name: '2 Ioan', abbreviation: '2JN', testament: 'NT', orderNum: 63 },
  '3 Ioan': { name: '3 Ioan', abbreviation: '3JN', testament: 'NT', orderNum: 64 },
  'Iuda': { name: 'Iuda', abbreviation: 'JUD', testament: 'NT', orderNum: 65 },
  'Revelaţia': { name: 'Revelația', abbreviation: 'REV', testament: 'NT', orderNum: 66 },
 }
 interface ParsedVerse {
  verseNum: number
  text: string
 }
 interface ParsedChapter {
  chapterNum: number
  verses: ParsedVerse[]
 }
 interface ParsedBook {
  name: string
  chapters: ParsedChapter[]
 }
 async function parseRomanianBible(filePath: string): Promise<ParsedBook[]> {
  console.log(`Reading Romanian Bible from: ${filePath}`)
  const content = fs.readFileSync(filePath, 'utf-8')
  const lines = content.split('\n')
  const books: ParsedBook[] = []
  let currentBook: ParsedBook | null = null
  let currentChapter: ParsedChapter | null = null
  let isInBibleContent = false
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim()
    // Start processing after "VECHIUL TESTAMENT"
    if (line === 'VECHIUL TESTAMENT' || line === 'TESTAMENT') {
      isInBibleContent = true
      continue
    }
    if (!isInBibleContent) continue
    // Book detection: … BookName …
    const bookMatch = line.match(/^…\s*(.+?)\s*…$/)
    if (bookMatch) {
      // Save previous book if exists
      if (currentBook && currentBook.chapters.length > 0) {
        books.push(currentBook)
      }
      const bookName = bookMatch[1].trim()
      console.log(`Found book: ${bookName}`)
      currentBook = {
        name: bookName,
        chapters: []
      }
      currentChapter = null
      continue
    }
    // Chapter detection: "Capitolul X" in any letter case (e.g. "CApitoLuL X")
    const chapterMatch = line.match(/^capitolul\s+(\d+)$/i)
    if (chapterMatch && currentBook) {
      // Save previous chapter if exists
      if (currentChapter && currentChapter.verses.length > 0) {
        currentBook.chapters.push(currentChapter)
      }
      const chapterNum = parseInt(chapterMatch[1])
      console.log(`  Chapter ${chapterNum}`)
      currentChapter = {
        chapterNum,
        verses: []
      }
      continue
    }
    // Verse detection: starts with number
    const verseMatch = line.match(/^(\d+)\s+(.+)$/)
    if (verseMatch && currentChapter) {
      const verseNum = parseInt(verseMatch[1])
      let verseText = verseMatch[2].trim()
      // Handle paragraph markers
      verseText = verseText.replace(/^¶\s*/, '')
      // Look ahead for continuation lines (lines that don't start with numbers or special markers)
      let j = i + 1
      while (j < lines.length) {
        const nextLine = lines[j].trim()
        // Stop if we hit a new verse, chapter, book, or empty line
        if (!nextLine ||
            nextLine.match(/^\d+\s/) ||           // New verse
            nextLine.match(/^capitolul\s+\d+$/i) || // New chapter
            nextLine.match(/^….*…$/) ||           // New book
            nextLine === 'TESTAMENT') {           // Testament marker
          break
        }
        // Add continuation line
        verseText += ' ' + nextLine
        j++
      }
      // Clean up the text
      verseText = verseText.replace(/\s+/g, ' ').trim()
      currentChapter.verses.push({
        verseNum,
        text: verseText
      })
      // Skip the lines we've processed
      i = j - 1
      continue
    }
  }
  // Save the last book and chapter
  if (currentChapter && currentChapter.verses.length > 0 && currentBook) {
    currentBook.chapters.push(currentChapter)
  }
  if (currentBook && currentBook.chapters.length > 0) {
    books.push(currentBook)
  }
  console.log(`Parsed ${books.length} books`)
  return books
 }
 async function importRomanianBible() {
  try {
    console.log('Starting Romanian Bible import...')
    // Clear existing data
    console.log('Clearing existing data...')
    await prisma.bibleVerse.deleteMany()
    await prisma.bibleChapter.deleteMany()
    await prisma.bibleBook.deleteMany()
    // Parse the markdown file
    const filePath = path.join(process.cwd(), 'bibles', 'Biblia-Fidela-limba-romana.md')
    const books = await parseRomanianBible(filePath)
    console.log(`Importing ${books.length} books into database...`)
    for (const book of books) {
      const bookInfo = BOOK_MAPPINGS[book.name]
      if (!bookInfo) {
        console.warn(`Warning: No mapping found for book "${book.name}", skipping...`)
        continue
      }
      console.log(`Creating book: ${bookInfo.name}`)
      // Create book
      const createdBook = await prisma.bibleBook.create({
        data: {
          id: bookInfo.orderNum,
          name: bookInfo.name,
          testament: bookInfo.testament,
          orderNum: bookInfo.orderNum
        }
      })
      // Create chapters and verses
      for (const chapter of book.chapters) {
        console.log(`  Creating chapter ${chapter.chapterNum} with ${chapter.verses.length} verses`)
        const createdChapter = await prisma.bibleChapter.create({
          data: {
            bookId: createdBook.id,
            chapterNum: chapter.chapterNum
          }
        })
        // Create verses in batch (deduplicate by verse number)
        const uniqueVerses = chapter.verses.reduce((acc, verse) => {
          acc[verse.verseNum] = verse  // This will overwrite duplicates
          return acc
        }, {} as Record<number, ParsedVerse>)
        const versesData = Object.values(uniqueVerses).map(verse => ({
          chapterId: createdChapter.id,
          verseNum: verse.verseNum,
          text: verse.text,
          version: 'FIDELA'
        }))
        if (versesData.length > 0) {
          await prisma.bibleVerse.createMany({
            data: versesData
          })
        }
      }
    }
    // Print summary
    const bookCount = await prisma.bibleBook.count()
    const chapterCount = await prisma.bibleChapter.count()
    const verseCount = await prisma.bibleVerse.count()
    console.log('\n✅ Romanian Bible import completed successfully!')
    console.log(`📚 Books imported: ${bookCount}`)
    console.log(`📖 Chapters imported: ${chapterCount}`)
    console.log(`📝 Verses imported: ${verseCount}`)
  } catch (error) {
    console.error('❌ Error importing Romanian Bible:', error)
    throw error
  } finally {
    await prisma.$disconnect()
  }
 }
 // Run the import
 if (require.main === module) {
  importRomanianBible()
    .then(() => {
      console.log('Import completed successfully!')
      process.exit(0)
    })
    .catch((error) => {
      console.error('Import failed:', error)
      process.exit(1)
    })
 }
 export { importRomanianBible }
--- a/scripts/ingest_bible_pgvector.py
+++ b/scripts/ingest_bible_pgvector.py
@@ -0,0 +1,231 @@
 import os, re, json, math, time, asyncio
 from typing import List, Dict, Tuple, Iterable
 from dataclasses import dataclass
 from pathlib import Path
 from dotenv import load_dotenv
 import httpx
 import psycopg
 from psycopg.rows import dict_row
 load_dotenv()
 AZ_ENDPOINT   = os.getenv("AZURE_OPENAI_ENDPOINT", "").rstrip("/")
 AZ_API_KEY    = os.getenv("AZURE_OPENAI_KEY")
 AZ_API_VER    = os.getenv("AZURE_OPENAI_API_VERSION", "2024-05-01-preview")
 AZ_DEPLOYMENT = os.getenv("AZURE_OPENAI_EMBED_DEPLOYMENT", "embed-3")
 EMBED_DIMS    = int(os.getenv("EMBED_DIMS", "3072"))
 DB_URL        = os.getenv("DATABASE_URL")
 BIBLE_MD_PATH = os.getenv("BIBLE_MD_PATH")
 LANG_CODE     = os.getenv("LANG_CODE", "ro")
 TRANSLATION   = os.getenv("TRANSLATION_CODE", "FIDELA")
 assert AZ_ENDPOINT and AZ_API_KEY and DB_URL and BIBLE_MD_PATH, "Missing required env vars"
 EMBED_URL = f"{AZ_ENDPOINT}/openai/deployments/{AZ_DEPLOYMENT}/embeddings?api-version={AZ_API_VER}"
 BOOKS_OT = [
  "Geneza","Exodul","Leviticul","Numeri","Deuteronom","Iosua","Judecători","Rut",
  "1 Samuel","2 Samuel","1 Imparati","2 Imparati","1 Cronici","2 Cronici","Ezra","Neemia","Estera",
  "Iov","Psalmii","Proverbe","Eclesiastul","Cântarea Cântărilor","Isaia","Ieremia","Plângerile",
  "Ezechiel","Daniel","Osea","Ioel","Amos","Obadia","Iona","Mica","Naum","Habacuc","Țefania","Hagai","Zaharia","Maleahi"
 ]
 BOOKS_NT = [
  "Matei","Marcu","Luca","Ioan","Faptele Apostolilor","Romani","1 Corinteni","2 Corinteni",
  "Galateni","Efeseni","Filipeni","Coloseni","1 Tesaloniceni","2 Tesaloniceni","1 Timotei","2 Timotei",
  "Titus","Filimon","Evrei","Iacov","1 Petru","2 Petru","1 Ioan","2 Ioan","3 Ioan","Iuda","Revelaţia"
 ]
 BOOK_CANON = {b:("OT" if b in BOOKS_OT else "NT") for b in BOOKS_OT + BOOKS_NT}
@dataclass
 class Verse:
    testament: str
    book: str
    chapter: int
    verse: int
    text_raw: str
    text_norm: str
 def normalize_text(s: str) -> str:
    # Collapse every run of whitespace into a single space
    return re.sub(r"\s+", " ", s.strip())
 BOOK_RE   = re.compile(r"^(?P<book>[A-ZĂÂÎȘȚ][^\n]+?)\s*$")
 CH_RE     = re.compile(r"^(?i:capitolul)\s+(?P<ch>\d+)\b")
 VERSE_RE  = re.compile(r"^(?P<v>\d+)\s+(?P<body>.+)$")
 def parse_bible_md(md_text: str):
    cur_book, cur_ch = None, None
    testament = None
    is_in_bible_content = False
    for line in md_text.splitlines():
        line = line.rstrip()
        # Start processing after "VECHIUL TESTAMENT" or when we find book markers
        if line == 'VECHIUL TESTAMENT' or line == 'TESTAMENT' or '…' in line:
            is_in_bible_content = True
        if not is_in_bible_content:
            continue
        # Book detection: … BookName …
        book_match = re.match(r'^…\s*(.+?)\s*…$', line)
        if book_match:
            bname = book_match.group(1).strip()
            if bname in BOOK_CANON:
                cur_book = bname
                testament = BOOK_CANON[bname]
                cur_ch = None
                print(f"Found book: {bname}")
                continue
        # Chapter detection: Capitolul X or CApitoLuL X
        m_ch = CH_RE.match(line)
        if m_ch and cur_book:
            cur_ch = int(m_ch.group("ch"))
            print(f"  Chapter {cur_ch}")
            continue
        # Verse detection: starts with number
        m_v = VERSE_RE.match(line)
        if m_v and cur_book and cur_ch:
            vnum = int(m_v.group("v"))
            body = m_v.group("body").strip()
            # Remove paragraph markers
            body = re.sub(r'^¶\s*', '', body)
            raw = body
            norm = normalize_text(body)
            yield {
                "testament": testament, "book": cur_book, "chapter": cur_ch, "verse": vnum,
                "text_raw": raw, "text_norm": norm
            }
 async def embed_batch(client, inputs):
    payload = {"input": inputs}
    headers = {"api-key": AZ_API_KEY, "Content-Type": "application/json"}
    for attempt in range(6):
        backoff = 2 ** attempt + (0.1 * attempt)
        try:
            r = await client.post(EMBED_URL, headers=headers, json=payload, timeout=60)
        except httpx.RequestError as e:
            # Network-level failure (timeout, connection reset): retry
            print(f"Error on attempt {attempt + 1}: {e}, waiting {backoff:.1f}s...")
            await asyncio.sleep(backoff)
            continue
        if r.status_code == 200:
            data = r.json()
            # Restore input order before returning the vectors
            ordered = sorted(data["data"], key=lambda x: x["index"])
            return [d["embedding"] for d in ordered]
        if r.status_code in (429, 500, 503):
            print(f"Retryable status {r.status_code}, waiting {backoff:.1f}s...")
            await asyncio.sleep(backoff)
            continue
        # Any other status is not retryable: fail now instead of being caught and retried
        raise RuntimeError(f"Embedding error {r.status_code}: {r.text}")
    raise RuntimeError("Failed to embed after retries")
 # First, we need to create the table with proper SQL
 CREATE_TABLE_SQL = """
 CREATE TABLE IF NOT EXISTS bible_passages (
  id               UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  testament        TEXT NOT NULL,
  book             TEXT NOT NULL,
  chapter          INT  NOT NULL,
  verse            INT  NOT NULL,
  ref              TEXT GENERATED ALWAYS AS (book || ' ' || chapter || ':' || verse) STORED,
  lang             TEXT NOT NULL DEFAULT 'ro',
  translation      TEXT NOT NULL DEFAULT 'FIDELA',
  text_raw         TEXT NOT NULL,
  text_norm        TEXT NOT NULL,
  tsv              tsvector,
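  embedding        vector(1536),  -- must match the embedding length: 1536 for text-embedding-3-small; text-embedding-3-large returns 3072 by default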
  created_at       TIMESTAMPTZ DEFAULT now(),
  updated_at       TIMESTAMPTZ DEFAULT now()
 );
 """
 CREATE_INDEXES_SQL = """
 -- Uniqueness by canonical reference within translation/language
 CREATE UNIQUE INDEX IF NOT EXISTS ux_ref_lang ON bible_passages (translation, lang, book, chapter, verse);
 -- Full-text index
 CREATE INDEX IF NOT EXISTS idx_tsv ON bible_passages USING GIN (tsv);
 -- Other indexes
 CREATE INDEX IF NOT EXISTS idx_book_ch ON bible_passages (book, chapter);
 CREATE INDEX IF NOT EXISTS idx_testament ON bible_passages (testament);
 """
 UPSERT_SQL = """
 INSERT INTO bible_passages (testament, book, chapter, verse, lang, translation, text_raw, text_norm, tsv, embedding)
 VALUES (%(testament)s, %(book)s, %(chapter)s, %(verse)s, %(lang)s, %(translation)s, %(text_raw)s, %(text_norm)s,
        to_tsvector(COALESCE(%(ts_lang)s,'simple')::regconfig, %(text_norm)s), %(embedding)s)
 ON CONFLICT (translation, lang, book, chapter, verse) DO UPDATE
 SET text_raw=EXCLUDED.text_raw,
    text_norm=EXCLUDED.text_norm,
    tsv=EXCLUDED.tsv,
    embedding=EXCLUDED.embedding,
    updated_at=now();
 """
 async def main():
    print("Starting Bible embedding ingestion...")
    md_text = Path(BIBLE_MD_PATH).read_text(encoding="utf-8", errors="ignore")
    verses = list(parse_bible_md(md_text))
    print(f"Parsed verses: {len(verses)}")
    batch_size = 128
    # First create the table structure
    with psycopg.connect(DB_URL) as conn:
        with conn.cursor() as cur:
            print("Creating bible_passages table...")
            cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
            cur.execute(CREATE_TABLE_SQL)
            cur.execute(CREATE_INDEXES_SQL)
            conn.commit()
            print("Table created successfully")
    # Now process embeddings
    async with httpx.AsyncClient() as client:
        with psycopg.connect(DB_URL, autocommit=False) as conn:
            with conn.cursor() as cur:
                for i in range(0, len(verses), batch_size):
                    batch = verses[i:i+batch_size]
                    inputs = [v["text_norm"] for v in batch]
                    print(f"Generating embeddings for batch {i//batch_size + 1}/{(len(verses) + batch_size - 1)//batch_size}")
                    embs = await embed_batch(client, inputs)
                    rows = []
                    for v, e in zip(batch, embs):
                        rows.append({
                            **v,
                            "lang": LANG_CODE,
                            "translation": TRANSLATION,
                            "ts_lang": "romanian",
                            "embedding": e
                        })
                    cur.executemany(UPSERT_SQL, rows)
                    conn.commit()
                    print(f"Upserted {len(rows)} verses... {i+len(rows)}/{len(verses)}")
    # Create IVFFLAT index after data is loaded
    print("Creating IVFFLAT index...")
    with psycopg.connect(DB_URL, autocommit=True) as conn:
        with conn.cursor() as cur:
            cur.execute("VACUUM ANALYZE bible_passages;")
            cur.execute("""
                CREATE INDEX IF NOT EXISTS idx_vec_ivfflat
                ON bible_passages USING ivfflat (embedding vector_cosine_ops)
                WITH (lists = 200);
            """)
    print("✅ Bible embedding ingestion completed successfully!")
 if __name__ == "__main__":
    asyncio.run(main())
--- a/temp/azure-embed3-bible-pgvector-guide.md
+++ b/temp/azure-embed3-bible-pgvector-guide.md
@@ -0,0 +1,372 @@
 # Azure OpenAI **embed-3** → Postgres + pgvector Ingestion Guide (Bible Corpus)
 **Goal**: Create a production‑ready Python script that ingests the full Bible (Markdown source) into **Postgres** with **pgvector** and **full‑text** metadata, using **Azure OpenAI `embed-3`** embeddings. The vectors will power a consumer chat assistant (Q&A & conversations about the Bible) and a backend agent that generates custom prayers.
 > Sample corpus used here: Romanian *Biblia Fidela* (Markdown). Structure contains books, chapters, verses (e.g., *Geneza 1:1…*) and a TOC in the file.
 ---
 ## 0) Architecture at a glance
 - **Input**: Bible in Markdown (`*.md`) → parser → normalized records: *(book, chapter, verse, text, lang=ro)*
 - **Embedding**: Azure OpenAI **embed-3** (prefer `text-embedding-3-large`, 3072‑D). Batch inputs to cut cost/latency.
 - **Storage**: Postgres with:
  - `pgvector` column `embedding vector(3072)`
  - `tsvector` column for hybrid lexical search (Romanian or English config as needed)
  - metadata columns for fast filtering (book, chapter, verse, testament, translation, language)
 - **Indexes**: `ivfflat` over `embedding`, GIN over `tsv` (and btree over metadata)
 - **Retrieval**:
  - Dense vector kNN
  - Hybrid: combine kNN score + lexical rank (`ts_rank` over the `tsv` column; Postgres full-text search does not implement BM25)
  - Windowed context stitching (neighbor verses) for chat
 - **Consumers**: 
  - Chat assistant: answer + cite (book:chapter:verse).
  - Prayer agent: prompt‑compose with retrieved passages & user intents.
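
The windowed context stitching step listed above can be sketched as a small helper that expands a hit to its ±N neighbours inside the same chapter. This is only an illustration: `stitch_window` and the in-memory `chapter` dict are hypothetical, and a real implementation would more likely fetch the neighbours from `bible_passages` by `(book, chapter, verse)`.

```python
from typing import Dict, List, Tuple

def stitch_window(chapter: Dict[int, str], hit_verse: int, n: int = 2) -> Tuple[str, str]:
    """Return (verse_range, joined_text) for hit_verse plus up to n neighbours per side.

    `chapter` maps verse number -> verse text for ONE chapter (hypothetical input shape),
    so the window can never cross a chapter boundary.
    """
    if hit_verse not in chapter:
        raise KeyError(f"verse {hit_verse} not found in chapter")
    wanted = range(max(1, hit_verse - n), hit_verse + n + 1)
    present: List[int] = [v for v in wanted if v in chapter]
    text = " ".join(f"[{v}] {chapter[v]}" for v in present)
    verse_range = f"{present[0]}-{present[-1]}" if len(present) > 1 else str(present[0])
    return verse_range, text

demo = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}
print(stitch_window(demo, 1, n=2))
```

Keeping the verse numbers inline (`[3] ...`) lets the chat layer still cite the exact verse after the passages have been merged.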
 ---
 ## 1) Prerequisites
 ### Postgres + pgvector
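 ```bash
 # Install Postgres, then pgvector from the PostgreSQL APT repository (on Ubuntu).
 # Replace 16 with your server's major version.
 sudo apt-get update && sudo apt-get install -y postgresql postgresql-contrib
 sudo apt-get install -y postgresql-16-pgvector
 # Enable the extension as a superuser (database name as in DATABASE_URL below):
 sudo -u postgres psql -d bible -c "CREATE EXTENSION IF NOT EXISTS vector;"
 ```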
 ### Python deps
 ```bash
 python -m venv .venv && source .venv/bin/activate
 pip install "psycopg[binary]" pgvector pydantic python-dotenv httpx tqdm rapidfuzz
 ```
 > `httpx` for HTTP (async‑capable), `pgvector` adapter, `rapidfuzz` for optional de‑dup or heuristic joins, `tqdm` for progress.
 ### Azure OpenAI
 - Create **Embeddings** deployment for **`text-embedding-3-large`** (or `-small` if cost sensitive). Name it (e.g.) `embeddings`.
 - Collect:
  - `AZURE_OPENAI_ENDPOINT=https://<your>.openai.azure.com/`
  - `AZURE_OPENAI_API_KEY=...`
  - `AZURE_OPENAI_API_VERSION=2024-05-01-preview` *(or your current stable)*
  - `AZURE_OPENAI_EMBED_DEPLOYMENT=embeddings` *(your deployment name)*
 Create `.env`:
 ```env
 DATABASE_URL=postgresql://user:pass@localhost:5432/bible
 AZURE_OPENAI_ENDPOINT=https://YOUR_RESOURCE.openai.azure.com/
 AZURE_OPENAI_API_KEY=YOUR_KEY
 AZURE_OPENAI_API_VERSION=2024-05-01-preview
 AZURE_OPENAI_EMBED_DEPLOYMENT=embeddings
 EMBED_DIMS=3072
 BIBLE_MD_PATH=./Biblia-Fidela-limba-romana.md
 LANG_CODE=ro
 TRANSLATION_CODE=FIDELA
 ```
 ---
 ## 2) Database schema
 ```sql
 -- One-time setup in your database
 CREATE EXTENSION IF NOT EXISTS vector;
 CREATE TABLE IF NOT EXISTS bible_passages (
  id               BIGSERIAL PRIMARY KEY,
  testament        TEXT NOT NULL,           -- 'OT' or 'NT'
  book             TEXT NOT NULL,
  chapter          INT  NOT NULL,
  verse            INT  NOT NULL,
  ref              TEXT GENERATED ALWAYS AS (book || ' ' || chapter || ':' || verse) STORED,
  lang             TEXT NOT NULL DEFAULT 'ro',
  translation      TEXT NOT NULL DEFAULT 'FIDELA',
  text_raw         TEXT NOT NULL,           -- exact verse text
  text_norm        TEXT NOT NULL,           -- normalized/cleaned text (embedding input)
  tsv              tsvector,
  embedding        vector(3072),            -- 1536 if using embed-3-small
  created_at       TIMESTAMPTZ DEFAULT now(),
  updated_at       TIMESTAMPTZ DEFAULT now()
 );
 -- Uniqueness by canonical reference within translation/language
 CREATE UNIQUE INDEX IF NOT EXISTS ux_ref_lang ON bible_passages (translation, lang, book, chapter, verse);
 -- Full-text index. PostgreSQL ships a built-in 'romanian' (Snowball) configuration;
 -- confirm with: SELECT cfgname FROM pg_ts_config;  Fall back to 'simple' if it is missing.
 CREATE INDEX IF NOT EXISTS idx_tsv ON bible_passages USING GIN (tsv);
 -- Vector index: build it after populating the table, because IVFFLAT derives its lists from existing rows.
 -- At ~31k rows an exact scan (no vector index) is also a reasonable starting point.
 -- At query time, trade recall for speed with SET ivfflat.probes = <n>;
 ```
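 After loading, build the IVFFLAT index (populate the table first, since the index clusters existing rows). Note that pgvector only indexes `vector` columns of up to 2,000 dimensions, so the statement below applies to `vector(1536)`; a `vector(3072)` column would need a `halfvec` expression index (pgvector 0.7.0+) or no vector index at all: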
 ```sql
 -- Example: around 31k verses. pgvector's starting guideline is lists = rows / 1000 (about 31 here); 100–200 also works, tune per EXPLAIN ANALYZE
 CREATE INDEX IF NOT EXISTS idx_vec_ivfflat
 ON bible_passages USING ivfflat (embedding vector_cosine_ops)
 WITH (lists = 200);
 ```
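
As a rough aid for picking `lists`, the starting points suggested in the pgvector documentation (rows / 1000 up to 1M rows, sqrt(rows) above that, and probes ≈ sqrt(lists)) can be written down as a helper. The function name `suggest_ivfflat_params` is hypothetical, and the values are starting points to tune, not guarantees.

```python
import math
from typing import Tuple

def suggest_ivfflat_params(row_count: int) -> Tuple[int, int]:
    """Return (lists, probes) starting points from pgvector's rule of thumb."""
    if row_count <= 1_000_000:
        lists = max(1, row_count // 1000)
    else:
        lists = int(math.sqrt(row_count))
    probes = max(1, round(math.sqrt(lists)))
    return lists, probes

print(suggest_ivfflat_params(31_000))
```

For a corpus of about 31k verses this suggests far fewer lists than 200, which is worth comparing with `EXPLAIN ANALYZE` before settling on a value.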
 Trigger to keep `updated_at` fresh:
 ```sql
 CREATE OR REPLACE FUNCTION touch_updated_at() RETURNS TRIGGER AS $$
 BEGIN NEW.updated_at = now(); RETURN NEW; END; $$ LANGUAGE plpgsql;
 DROP TRIGGER IF EXISTS trg_bible_updated ON bible_passages;
 CREATE TRIGGER trg_bible_updated BEFORE UPDATE ON bible_passages
 FOR EACH ROW EXECUTE PROCEDURE touch_updated_at();
 ```
 ---
 ## 3) Parsing & Chunking strategy (large, high‑quality)
 **Why verse‑level?** It’s the canonical granular unit for Bible QA.  
 **Context‑stitching**: during retrieval, fetch neighbor verses (±N) to maintain narrative continuity.  
 **Normalization** steps (for `text_norm`):
 - Strip verse numbers and sidenotes if present in raw lines.
 - Collapse whitespace, unify quotes, remove page headers/footers and TOC artifacts.
 - Preserve punctuation; avoid stemming before embeddings.
 - Lowercasing optional (OpenAI embeddings are case-robust).
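
A minimal sketch of these normalization steps, assuming each raw line starts with its verse number and may carry a `¶` paragraph marker. The name `normalize_verse` and the quote mapping are illustrative choices, not part of the ingestion script.

```python
import re
import unicodedata

# Typographic quote variants mapped to plain ASCII quotes (illustrative, not exhaustive).
_QUOTES = str.maketrans({"„": '"', "“": '"', "”": '"', "«": '"', "»": '"', "’": "'", "‘": "'"})

def normalize_verse(line: str) -> str:
    """Produce a `text_norm` candidate from one raw verse line."""
    s = unicodedata.normalize("NFC", line)   # one canonical form for diacritics
    s = re.sub(r"^\s*\d+\s+", "", s)          # strip the leading verse number
    s = s.replace("¶", "")                    # drop paragraph markers
    s = s.translate(_QUOTES)                  # unify quotes
    return re.sub(r"\s+", " ", s).strip()     # collapse whitespace, keep punctuation

print(normalize_verse("12   ¶ In  the   „end”;"))
```

Punctuation and case are left untouched, in line with the list above; only layout noise is removed.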
 **Testament/book detection**: From headings and TOC present in the Markdown; detect Book → Chapter → Verse boundaries via regex.  
 Example regex heuristics (tune to your file):  
 - Book headers: `^(?P<book>[A-ZĂÂÎȘȚ].+?)\s*$` (bounded by known canon order)  
 - Chapter headers: `^Capitolul\s+(?P<ch>\d+)` or `^CApitoLuL\s+(?P<ch>\d+)` (case variations)  
 - Verse lines: `^(?P<verse>\d+)\s+(.+)$`
 > The provided Markdown clearly shows book order (e.g., *Geneza*, *Exodul*, …; NT: *Matei*, *Marcu*, …) and verse lines like “**1** LA început…”.
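
To see how the heuristics interact, the sketch below classifies single lines. It checks book names before verse lines because a heading such as `1 Samuel` also matches the verse pattern. `KNOWN_BOOKS` is a small illustrative subset and `classify` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

CH_RE = re.compile(r"^capitolul\s+(?P<ch>\d+)\b", re.IGNORECASE)
VERSE_RE = re.compile(r"^(?P<verse>\d+)\s+(?P<body>.+)$")
KNOWN_BOOKS = {"Geneza", "Exodul", "1 Samuel", "Matei"}  # illustrative subset of the canon

def classify(line: str) -> Tuple[str, Optional[str]]:
    """Return (kind, value) where kind is 'book', 'chapter', 'verse' or 'other'."""
    line = line.strip()
    if line in KNOWN_BOOKS:           # books first: "1 Samuel" also looks like a verse
        return "book", line
    m = CH_RE.match(line)
    if m:
        return "chapter", m.group("ch")
    m = VERSE_RE.match(line)
    if m:
        return "verse", m.group("verse")
    return "other", None

for sample in ["Geneza", "CApitoLuL 1", "1 LA început", "1 Samuel", "Cuprins"]:
    print(sample, "->", classify(sample))
```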
 ---
 ## 4) Python ingestion script
 > **Save as** `ingest_bible_pgvector.py`
 ```python
 import os, re, asyncio
 from dataclasses import dataclass
 from pathlib import Path
 from dotenv import load_dotenv
 import httpx
 import psycopg
 load_dotenv()
 AZ_ENDPOINT   = os.getenv("AZURE_OPENAI_ENDPOINT", "").rstrip("/")
 AZ_API_KEY    = os.getenv("AZURE_OPENAI_API_KEY")
 AZ_API_VER    = os.getenv("AZURE_OPENAI_API_VERSION", "2024-05-01-preview")
 AZ_DEPLOYMENT = os.getenv("AZURE_OPENAI_EMBED_DEPLOYMENT", "embeddings")
 EMBED_DIMS    = int(os.getenv("EMBED_DIMS", "3072"))
 DB_URL        = os.getenv("DATABASE_URL")
 BIBLE_MD_PATH = os.getenv("BIBLE_MD_PATH")
 LANG_CODE     = os.getenv("LANG_CODE", "ro")
 TRANSLATION   = os.getenv("TRANSLATION_CODE", "FIDELA")
 assert AZ_ENDPOINT and AZ_API_KEY and DB_URL and BIBLE_MD_PATH, "Missing required env vars"
 EMBED_URL = f"{AZ_ENDPOINT}/openai/deployments/{AZ_DEPLOYMENT}/embeddings?api-version={AZ_API_VER}"
 BOOKS_OT = [
  "Geneza","Exodul","Leviticul","Numeri","Deuteronom","Iosua","Judecători","Rut",
  "1 Samuel","2 Samuel","1 Imparati","2 Imparati","1 Cronici","2 Cronici","Ezra","Neemia","Estera",
  "Iov","Psalmii","Proverbe","Eclesiastul","Cântarea Cântărilor","Isaia","Ieremia","Plângerile",
  "Ezechiel","Daniel","Osea","Ioel","Amos","Obadia","Iona","Mica","Naum","Habacuc","Țefania","Hagai","Zaharia","Maleahi"
 ]
 BOOKS_NT = [
  "Matei","Marcu","Luca","Ioan","Faptele Apostolilor","Romani","1 Corinteni","2 Corinteni",
  "Galateni","Efeseni","Filipeni","Coloseni","1 Tesaloniceni","2 Tesaloniceni","1 Timotei","2 Timotei",
  "Titus","Filimon","Evrei","Iacov","1 Petru","2 Petru","1 Ioan","2 Ioan","3 Ioan","Iuda","Revelaţia"
 ]
 BOOK_CANON = {b:("OT" if b in BOOKS_OT else "NT") for b in BOOKS_OT + BOOKS_NT}
@dataclass
 class Verse:
    testament: str
    book: str
    chapter: int
    verse: int
    text_raw: str
    text_norm: str
 def normalize_text(s: str) -> str:
    # \s+ already collapses every whitespace run (double spaces included) into one space
    return re.sub(r"\s+", " ", s.strip())
 BOOK_RE   = re.compile(r"^(?P<book>[A-ZĂÂÎȘȚ][^\n]+?)\s*$")
 CH_RE     = re.compile(r"^(?i:Capitolul|CApitoLuL)\s+(?P<ch>\d+)\b")
 VERSE_RE  = re.compile(r"^(?P<v>\d+)\s+(?P<body>.+)$")
 def parse_bible_md(md_text: str):
    cur_book, cur_ch = None, None
    testament = None
    for line in md_text.splitlines():
        line = line.rstrip()
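        # Book detection: compare the whole line against the canon list.
        # (BOOK_RE requires a leading capital letter, so it would miss "1 Samuel" etc.,
        # and VERSE_RE would then misread such a heading as verse 1.)
        bname = line.strip()
        if bname in BOOK_CANON:
            cur_book = bname
            testament = BOOK_CANON[bname]
            cur_ch = None
            continue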
        m_ch = CH_RE.match(line)
        if m_ch and cur_book:
            cur_ch = int(m_ch.group("ch"))
            continue
        m_v = VERSE_RE.match(line)
        if m_v and cur_book and cur_ch:
            vnum = int(m_v.group("v"))
            body = m_v.group("body").strip()
            raw = body
            norm = normalize_text(body)
            yield {
                "testament": testament, "book": cur_book, "chapter": cur_ch, "verse": vnum,
                "text_raw": raw, "text_norm": norm
            }
 async def embed_batch(client, inputs):
    payload = {"input": inputs}
    headers = {"api-key": AZ_API_KEY, "Content-Type": "application/json"}
    for attempt in range(6):
        backoff = 2 ** attempt + (0.1 * attempt)
        try:
            r = await client.post(EMBED_URL, headers=headers, json=payload, timeout=60)
        except httpx.RequestError:
            # Network-level failure (timeout, connection reset): retry
            await asyncio.sleep(backoff)
            continue
        if r.status_code == 200:
            data = r.json()
            # Restore input order before returning the vectors
            ordered = sorted(data["data"], key=lambda x: x["index"])
            return [d["embedding"] for d in ordered]
        if r.status_code in (429, 500, 503):
            await asyncio.sleep(backoff)
            continue
        # Any other status is not retryable: fail now instead of being caught and retried
        raise RuntimeError(f"Embedding error {r.status_code}: {r.text}")
    raise RuntimeError("Failed to embed after retries")
 UPSERT_SQL = """
 INSERT INTO bible_passages (testament, book, chapter, verse, lang, translation, text_raw, text_norm, tsv, embedding)
 VALUES (%(testament)s, %(book)s, %(chapter)s, %(verse)s, %(lang)s, %(translation)s, %(text_raw)s, %(text_norm)s,
        to_tsvector(COALESCE(%(ts_lang)s,'simple')::regconfig, %(text_norm)s), %(embedding)s)
 ON CONFLICT (translation, lang, book, chapter, verse) DO UPDATE
 SET text_raw=EXCLUDED.text_raw,
    text_norm=EXCLUDED.text_norm,
    tsv=EXCLUDED.tsv,
    embedding=EXCLUDED.embedding,
    updated_at=now();
 """
 async def main():
    md_text = Path(BIBLE_MD_PATH).read_text(encoding="utf-8", errors="ignore")
    verses = list(parse_bible_md(md_text))
    print(f"Parsed verses: {len(verses)}")
    batch_size = 128
    async with httpx.AsyncClient() as client:
        # psycopg.connect() returns a synchronous connection, so it needs a plain `with`
        with psycopg.connect(DB_URL, autocommit=False) as conn:
            with conn.cursor() as cur:
                for i in range(0, len(verses), batch_size):
                    batch = verses[i:i+batch_size]
                    inputs = [v["text_norm"] for v in batch]
                    embs = await embed_batch(client, inputs)
                    rows = []
                    for v, e in zip(batch, embs):
                        rows.append({
                            **v,
                            "lang": LANG_CODE,
                            "translation": TRANSLATION,
                            "ts_lang": "romanian",
                            "embedding": e
                        })
                    cur.executemany(UPSERT_SQL, rows)
                    conn.commit()  # commit per batch so a crash loses at most one batch
                    print(f"Upserted {len(rows)} … {i+len(rows)}/{len(verses)}")
    print("Done. Build IVFFLAT index after ANALYZE.")
 if __name__ == "__main__":
    asyncio.run(main())
 ```
 **Notes**
 - If `romanian` text search config is unavailable, set `ts_lang='simple'`.
 - For `embed-3-small`, set `EMBED_DIMS=1536` and change column type to `vector(1536)`.
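
The script reads `EMBED_DIMS` but never compares it with what the API returns, so a mismatch would only surface as an insert error. A small guard such as the following could be called on each batch before the upsert; `check_dims` is a hypothetical helper, not part of the script above.

```python
from typing import Sequence

def check_dims(embeddings: Sequence[Sequence[float]], expected: int) -> None:
    """Raise ValueError if any embedding length differs from `expected` (e.g. EMBED_DIMS)."""
    for i, emb in enumerate(embeddings):
        if len(emb) != expected:
            raise ValueError(f"embedding {i} has {len(emb)} dims, expected {expected}")

check_dims([[0.0] * 1536, [0.1] * 1536], expected=1536)  # passes silently
```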
 ---
 ## 5) Post‑ingestion steps
 ```sql
 VACUUM ANALYZE bible_passages;
 CREATE INDEX IF NOT EXISTS idx_vec_ivfflat
 ON bible_passages USING ivfflat (embedding vector_cosine_ops)
 WITH (lists = 200);
 CREATE INDEX IF NOT EXISTS idx_book_ch ON bible_passages (book, chapter);
 ```
 ---
 ## 6) Retrieval patterns
 ### A) Pure vector kNN (cosine)
 ```sql
 SELECT ref, book, chapter, verse, text_raw,
       1 - (embedding <=> $1) AS cosine_sim
 FROM bible_passages
 ORDER BY embedding <=> $1
 LIMIT $2;
 ```
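
From Python, one way to run this query with psycopg 3 is to pass the query embedding in pgvector's text form and cast it in SQL. The sketch below uses `%s` placeholders instead of `$1`/`$2`; `to_vector_literal` and `knn_params` are hypothetical helpers, and the database call is left commented out because it needs a live connection.

```python
from typing import Sequence, Tuple

KNN_SQL = """
SELECT ref, text_raw, 1 - (embedding <=> %s::vector) AS cosine_sim
FROM bible_passages
ORDER BY embedding <=> %s::vector
LIMIT %s
"""

def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format floats in pgvector's text input form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

def knn_params(embedding: Sequence[float], k: int) -> Tuple[str, str, int]:
    """Parameters for KNN_SQL: the vector literal twice, then the limit."""
    lit = to_vector_literal(embedding)
    return (lit, lit, k)

# Usage (assumes DB_URL and a query_embedding produced by the same model as the corpus):
# with psycopg.connect(DB_URL) as conn:
#     rows = conn.execute(KNN_SQL, knn_params(query_embedding, 5)).fetchall()
print(knn_params([0.1, 0.2, 0.3], 5))
```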
 ### B) Hybrid lexical + vector (weighted)
 ```sql
 -- $1: query embedding (cast to vector); $2: a tsquery, e.g. plainto_tsquery('romanian', 'user text')
 WITH v AS (
  SELECT id, 1 - (embedding <=> $1) AS vsim
  FROM bible_passages
  ORDER BY embedding <=> $1
  LIMIT 100
 ),
 l AS (
  SELECT id, ts_rank(tsv, $2) AS lrank
  FROM bible_passages
  WHERE tsv @@ $2
  ORDER BY lrank DESC
  LIMIT 100
 )
 -- Join only the candidates from either list instead of scoring every row in the table
 SELECT bp.ref, bp.book, bp.chapter, bp.verse, bp.text_raw,
       COALESCE(v.vsim, 0) * 0.7 + COALESCE(l.lrank, 0) * 0.3 AS score
 FROM v
 FULL OUTER JOIN l USING (id)
 JOIN bible_passages bp USING (id)
 ORDER BY score DESC
 LIMIT 20;
 ```
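
The same 0.7 / 0.3 weighting can also be applied in application code when the two candidate lists are fetched separately. Because `ts_rank` values are not on the same scale as cosine similarity, this sketch divides them by the best lexical score first; that normalization is an added assumption, not something the SQL above does, and `fuse_scores` is a hypothetical name.

```python
from typing import Dict, List, Tuple

def fuse_scores(vsim: Dict[int, float], lrank: Dict[int, float],
                w_vec: float = 0.7, w_lex: float = 0.3,
                top_k: int = 20) -> List[Tuple[int, float]]:
    """Weighted fusion of cosine similarity and ts_rank, keyed by passage id."""
    max_l = max(lrank.values(), default=0.0)
    fused: Dict[int, float] = {}
    for pid in set(vsim) | set(lrank):
        # Scale lexical rank into [0, 1] relative to the best lexical hit
        lex = (lrank.get(pid, 0.0) / max_l) if max_l > 0 else 0.0
        fused[pid] = w_vec * vsim.get(pid, 0.0) + w_lex * lex
    # Highest score first; ties broken by id for deterministic output
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

print(fuse_scores({1: 0.9, 2: 0.5}, {2: 0.08, 3: 0.04}))
```

In this toy input, passage 2 overtakes passage 1 because it appears in both lists, which is the behaviour hybrid search is meant to reward.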
 ---
 ## 7) Chat & Prayer agent tips
 - **Answer grounding**: always cite `ref` (e.g., *Ioan 3:16*).
 - **Multilingual output**: keep quotes in Romanian; explain in the user’s language.
 - **Prayer agent**: constrain tone & doctrine; inject retrieved verses as anchors.
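
One possible way to enforce answer grounding is to build the prompt so that every retrieved verse carries its `ref`. The helper below is a sketch with a hypothetical name and prompt wording; the passage dicts are assumed to have the `ref` and `text_raw` keys returned by the queries above.

```python
from typing import Dict, List

def build_grounded_prompt(question: str, passages: List[Dict[str, str]]) -> str:
    """Assemble a prompt that quotes retrieved verses with their `ref` for citation."""
    lines = ["Answer using only the passages below and cite each one as (ref).", "Passages:"]
    for p in passages:
        lines.append(f"- ({p['ref']}) {p['text_raw']}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)

prompt = build_grounded_prompt(
    "What does this verse say about love?",
    [{"ref": "Ioan 3:16", "text_raw": "<verse text from the database>"}],
)
print(prompt)
```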
 ---
 ## 8) Ops
 - Idempotent `UPSERT`.
 - Backoff on 429/5xx.
 - Consider keeping both `embed-3-large` and `-small` columns when migrating.
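
For the backoff item, a variant worth considering honours a `Retry-After` header when the response carries one and otherwise uses capped exponential backoff with jitter. This differs from the fixed `2 ** attempt` schedule in the script; `backoff_delay` and its parameters are hypothetical.

```python
import random
from typing import Callable, Optional

def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0,
                  rand: Callable[[], float] = random.random) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Numeric Retry-After (seconds) takes priority, bounded by the cap
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # HTTP-date or malformed header: fall back to exponential backoff
    ceiling = min(cap, base * (2 ** attempt))
    return ceiling * rand()  # jitter spreads retries from concurrent workers

print([round(backoff_delay(a, rand=lambda: 1.0), 1) for a in range(6)])
```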
 ---
 ## 9) License & attribution
 This guide references the structure of *Biblia Fidela* Markdown for ingestion demonstration.