Fix Next.js 15 Suspense boundary errors and remove Docker files

- Add Suspense boundaries for useSearchParams in bible and search pages
- Extract search page content to separate component for proper client-side rendering
- Remove all Docker configuration files and temp documentation
- Build now passes successfully without TypeScript or Suspense errors

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
# Azure OpenAI **embed-3** → Postgres + pgvector Ingestion Guide (Bible Corpus)

**Goal**: Create a production‑ready Python script that ingests the full Bible (Markdown source) into **Postgres** with **pgvector** and **full‑text** metadata, using **Azure OpenAI `embed-3`** embeddings. The vectors will power a consumer chat assistant (Q&A & conversations about the Bible) and a backend agent that generates custom prayers.

> Sample corpus used here: Romanian *Biblia Fidela* (Markdown). The structure contains books, chapters, verses (e.g., *Geneza 1:1…*) and a TOC in the file.

---

## 0) Architecture at a glance

- **Input**: Bible in Markdown (`*.md`) → parser → normalized records: *(book, chapter, verse, text, lang=ro)*
- **Embedding**: Azure OpenAI **embed-3** (prefer `text-embedding-3-large`, 3072‑D). Batch inputs to cut cost/latency.
- **Storage**: Postgres with:
  - `pgvector` column `embedding vector(3072)`
  - `tsvector` column for hybrid lexical search (Romanian or English config as needed)
  - metadata columns for fast filtering (book, chapter, verse, testament, translation, language)
- **Indexes**: `ivfflat` over `embedding`, GIN over `tsv` (and btree over metadata)
- **Retrieval**:
  - Dense vector kNN
  - Hybrid: combine the kNN score with a lexical `ts_rank` score over `tsv`
  - Windowed context stitching (neighbor verses) for chat
- **Consumers**:
  - Chat assistant: answer + cite (book:chapter:verse).
  - Prayer agent: prompt‑compose with retrieved passages & user intents.

---

## 1) Prerequisites

### Postgres + pgvector
```bash
# Install PostgreSQL (Ubuntu). Note: the stock postgresql packages do NOT
# include pgvector — add the version-matched package from the PGDG apt repo,
# or build pgvector from source.
sudo apt-get update && sudo apt-get install -y postgresql postgresql-contrib
sudo apt-get install -y postgresql-16-pgvector  # adjust "16" to your server version
# Enable the extension (run in your target database as superuser):
sudo -u postgres psql -c "CREATE EXTENSION IF NOT EXISTS vector;"
```

### Python deps
```bash
python -m venv .venv && source .venv/bin/activate
pip install psycopg[binary] pgvector pydantic python-dotenv httpx tqdm rapidfuzz
```

> `httpx` for HTTP (async‑capable), `pgvector` adapter, `rapidfuzz` for optional de‑dup or heuristic joins, `tqdm` for progress.

### Azure OpenAI
- Create an **Embeddings** deployment for **`text-embedding-3-large`** (or `-small` if cost‑sensitive). Name it (e.g.) `embeddings`.
- Collect:
  - `AZURE_OPENAI_ENDPOINT=https://<your>.openai.azure.com/`
  - `AZURE_OPENAI_API_KEY=...`
  - `AZURE_OPENAI_API_VERSION=2024-05-01-preview` *(or your current stable)*
  - `AZURE_OPENAI_EMBED_DEPLOYMENT=embeddings` *(your deployment name)*

Create `.env`:
```env
DATABASE_URL=postgresql://user:pass@localhost:5432/bible
AZURE_OPENAI_ENDPOINT=https://YOUR_RESOURCE.openai.azure.com/
AZURE_OPENAI_API_KEY=YOUR_KEY
AZURE_OPENAI_API_VERSION=2024-05-01-preview
AZURE_OPENAI_EMBED_DEPLOYMENT=embeddings
EMBED_DIMS=3072
BIBLE_MD_PATH=./Biblia-Fidela-limba-romana.md
LANG_CODE=ro
TRANSLATION_CODE=FIDELA
```
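
Before running the full ingestion, it helps to smoke‑test the deployment and env wiring with a single embeddings call. A minimal sketch (assumes the `.env` above; the URL shape matches the ingestion script in §4):

```python
# smoke_test_embeddings.py — one-off call to the embeddings deployment
import os
import httpx
from dotenv import load_dotenv

load_dotenv()
url = (
    f"{os.environ['AZURE_OPENAI_ENDPOINT'].rstrip('/')}/openai/deployments/"
    f"{os.environ['AZURE_OPENAI_EMBED_DEPLOYMENT']}/embeddings"
    f"?api-version={os.environ['AZURE_OPENAI_API_VERSION']}"
)
r = httpx.post(
    url,
    headers={"api-key": os.environ["AZURE_OPENAI_API_KEY"]},
    json={"input": ["La început"]},
    timeout=30,
)
r.raise_for_status()
print(len(r.json()["data"][0]["embedding"]))  # expect 3072 for embed-3-large
```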
---

## 2) Database schema

```sql
-- One-time setup in your database
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS bible_passages (
  id          BIGSERIAL PRIMARY KEY,
  testament   TEXT NOT NULL,            -- 'OT' or 'NT'
  book        TEXT NOT NULL,
  chapter     INT NOT NULL,
  verse       INT NOT NULL,
  -- explicit ::text casts keep the generation expression immutable
  ref         TEXT GENERATED ALWAYS AS (book || ' ' || chapter::text || ':' || verse::text) STORED,
  lang        TEXT NOT NULL DEFAULT 'ro',
  translation TEXT NOT NULL DEFAULT 'FIDELA',
  text_raw    TEXT NOT NULL,            -- exact verse text
  text_norm   TEXT NOT NULL,            -- normalized/cleaned text (embedding input)
  tsv         tsvector,
  embedding   vector(3072),             -- 1536 if using embed-3-small
  created_at  TIMESTAMPTZ DEFAULT now(),
  updated_at  TIMESTAMPTZ DEFAULT now()
);

-- Uniqueness by canonical reference within translation/language
CREATE UNIQUE INDEX IF NOT EXISTS ux_ref_lang
  ON bible_passages (translation, lang, book, chapter, verse);

-- Full-text index. Use the 'romanian' config if installed (e.g. via ISPELL);
-- else fall back to 'simple' (keeps lexemes unstemmed) or 'english'.
CREATE INDEX IF NOT EXISTS idx_tsv ON bible_passages USING GIN (tsv);

-- Vector index: build it AFTER the table is populated (IVFFLAT picks its
-- centroids from existing rows), and run ANALYZE first. See below.
```

After loading, build the IVFFLAT index (the table must be populated first). One caveat: pgvector can only index `vector` columns up to 2,000 dimensions, so 3,072‑D embeddings need either a half-precision expression index (e.g. `USING hnsw ((embedding::halfvec(3072)) halfvec_cosine_ops)` on pgvector ≥ 0.7) or shorter vectors requested from the model (see the note after §4):

```sql
-- pgvector's rule of thumb: lists ≈ rows/1000 (up to ~1M rows). ~31k verses
-- ⇒ start around lists = 30–100; tune per EXPLAIN ANALYZE.
CREATE INDEX IF NOT EXISTS idx_vec_ivfflat
  ON bible_passages USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
```

Trigger to keep `updated_at` fresh:

```sql
CREATE OR REPLACE FUNCTION touch_updated_at() RETURNS TRIGGER AS $$
BEGIN NEW.updated_at = now(); RETURN NEW; END; $$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS trg_bible_updated ON bible_passages;
CREATE TRIGGER trg_bible_updated BEFORE UPDATE ON bible_passages
FOR EACH ROW EXECUTE FUNCTION touch_updated_at();
```

---

## 3) Parsing & Chunking strategy (large, high‑quality)

**Why verse‑level?** It’s the canonical granular unit for Bible QA.
**Context‑stitching**: during retrieval, fetch neighbor verses (±N) to maintain narrative continuity.
**Normalization** steps (for `text_norm`):
- Strip verse numbers and sidenotes if present in raw lines.
- Collapse whitespace, unify quotes, remove page headers/footers and TOC artifacts.
- Preserve punctuation; avoid stemming before embeddings.
- Lowercasing is optional (OpenAI embeddings are robust to case).

**Testament/book detection**: from the headings and TOC present in the Markdown; detect Book → Chapter → Verse boundaries via regex.
Example regex heuristics (tune to your file; a quick sanity check follows this list):
- Book headers: `^(?P<book>(?:\d\s)?[A-ZĂÂÎȘȚ].+?)\s*$` (bounded by known canon order; the optional leading digit admits “1 Samuel” etc.)
- Chapter headers: `^Capitolul\s+(?P<ch>\d+)` matched case-insensitively (the source mixes casings such as “CApitoLuL”)
- Verse lines: `^(?P<verse>\d+)\s+(.+)$`
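
A quick way to sanity‑check these heuristics against lines quoted from the corpus (a sketch; extend with lines from your own file):

```python
import re

CH_RE = re.compile(r"^Capitolul\s+(?P<ch>\d+)\b", re.IGNORECASE)
VERSE_RE = re.compile(r"^(?P<verse>\d+)\s+(.+)$")

assert CH_RE.match("Capitolul 1")
assert CH_RE.match("CApitoLuL 12")  # tolerate the corpus's mixed casing
m = VERSE_RE.match("1 LA început…")  # verse-line shape from the sample file
assert m and m.group("verse") == "1"
print("regex heuristics OK")
```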

> The provided Markdown clearly shows book order (e.g., *Geneza*, *Exodul*, …; NT: *Matei*, *Marcu*, …) and verse lines like “**1** LA început…”.

---

## 4) Python ingestion script

> **Save as** `ingest_bible_pgvector.py`

```python
import os
import re
import asyncio
from pathlib import Path

import httpx
import psycopg
from dotenv import load_dotenv
from tqdm import tqdm

load_dotenv()

AZ_ENDPOINT   = os.getenv("AZURE_OPENAI_ENDPOINT", "").rstrip("/")
AZ_API_KEY    = os.getenv("AZURE_OPENAI_API_KEY")
AZ_API_VER    = os.getenv("AZURE_OPENAI_API_VERSION", "2024-05-01-preview")
AZ_DEPLOYMENT = os.getenv("AZURE_OPENAI_EMBED_DEPLOYMENT", "embeddings")
EMBED_DIMS    = int(os.getenv("EMBED_DIMS", "3072"))
DB_URL        = os.getenv("DATABASE_URL")
BIBLE_MD_PATH = os.getenv("BIBLE_MD_PATH")
LANG_CODE     = os.getenv("LANG_CODE", "ro")
TRANSLATION   = os.getenv("TRANSLATION_CODE", "FIDELA")

assert AZ_ENDPOINT and AZ_API_KEY and DB_URL and BIBLE_MD_PATH, "Missing required env vars"

EMBED_URL = f"{AZ_ENDPOINT}/openai/deployments/{AZ_DEPLOYMENT}/embeddings?api-version={AZ_API_VER}"

BOOKS_OT = [
    "Geneza","Exodul","Leviticul","Numeri","Deuteronom","Iosua","Judecători","Rut",
    "1 Samuel","2 Samuel","1 Imparati","2 Imparati","1 Cronici","2 Cronici","Ezra","Neemia","Estera",
    "Iov","Psalmii","Proverbe","Eclesiastul","Cântarea Cântărilor","Isaia","Ieremia","Plângerile",
    "Ezechiel","Daniel","Osea","Ioel","Amos","Obadia","Iona","Mica","Naum","Habacuc","Țefania","Hagai","Zaharia","Maleahi"
]
BOOKS_NT = [
    "Matei","Marcu","Luca","Ioan","Faptele Apostolilor","Romani","1 Corinteni","2 Corinteni",
    "Galateni","Efeseni","Filipeni","Coloseni","1 Tesaloniceni","2 Tesaloniceni","1 Timotei","2 Timotei",
    "Titus","Filimon","Evrei","Iacov","1 Petru","2 Petru","1 Ioan","2 Ioan","3 Ioan","Iuda","Revelaţia"
]
BOOK_CANON = {b: ("OT" if b in BOOKS_OT else "NT") for b in BOOKS_OT + BOOKS_NT}

# The optional leading digit lets "1 Samuel", "2 Petru" etc. match.
BOOK_RE  = re.compile(r"^(?P<book>(?:\d\s)?[A-ZĂÂÎȘȚ][^\n]+?)\s*$")
CH_RE    = re.compile(r"^Capitolul\s+(?P<ch>\d+)\b", re.IGNORECASE)  # source mixes casings
VERSE_RE = re.compile(r"^(?P<v>\d+)\s+(?P<body>.+)$")

def normalize_text(s: str) -> str:
    s = s.replace("\u00a0", " ")           # non-breaking spaces → plain spaces
    return re.sub(r"\s+", " ", s.strip())  # collapse whitespace

def parse_bible_md(md_text: str):
    """Walk Book → Chapter → Verse boundaries; yield one dict per verse."""
    cur_book, cur_ch, testament = None, None, None
    for line in md_text.splitlines():
        line = line.rstrip()

        # Book detection: only lines matching a known canon name switch books
        m_book = BOOK_RE.match(line)
        if m_book and m_book.group("book").strip() in BOOK_CANON:
            cur_book = m_book.group("book").strip()
            testament = BOOK_CANON[cur_book]
            cur_ch = None
            continue

        m_ch = CH_RE.match(line)
        if m_ch and cur_book:
            cur_ch = int(m_ch.group("ch"))
            continue

        m_v = VERSE_RE.match(line)
        if m_v and cur_book and cur_ch:
            body = m_v.group("body").strip()
            yield {
                "testament": testament, "book": cur_book,
                "chapter": cur_ch, "verse": int(m_v.group("v")),
                "text_raw": body, "text_norm": normalize_text(body),
            }

async def embed_batch(client: httpx.AsyncClient, inputs: list[str]) -> list[list[float]]:
    payload = {"input": inputs}
    headers = {"api-key": AZ_API_KEY, "Content-Type": "application/json"}
    for attempt in range(6):
        try:
            r = await client.post(EMBED_URL, headers=headers, json=payload, timeout=60)
        except httpx.HTTPError:
            await asyncio.sleep(2 ** attempt + 0.1 * attempt)  # network error: back off, retry
            continue
        if r.status_code == 200:
            ordered = sorted(r.json()["data"], key=lambda x: x["index"])
            return [d["embedding"] for d in ordered]
        if r.status_code in (429, 500, 503):
            await asyncio.sleep(2 ** attempt + 0.1 * attempt)  # throttled/transient: back off, retry
            continue
        raise RuntimeError(f"Embedding error {r.status_code}: {r.text}")  # non-retryable
    raise RuntimeError("Failed to embed after retries")

def to_vector_literal(emb: list[float]) -> str:
    # pgvector's text input form '[x,y,...]'; combined with the ::vector cast
    # in the SQL below, this avoids needing a client-side vector adapter.
    return "[" + ",".join(map(str, emb)) + "]"

UPSERT_SQL = """
INSERT INTO bible_passages (testament, book, chapter, verse, lang, translation, text_raw, text_norm, tsv, embedding)
VALUES (%(testament)s, %(book)s, %(chapter)s, %(verse)s, %(lang)s, %(translation)s, %(text_raw)s, %(text_norm)s,
        to_tsvector(COALESCE(%(ts_lang)s, 'simple')::regconfig, %(text_norm)s), %(embedding)s::vector)
ON CONFLICT (translation, lang, book, chapter, verse) DO UPDATE
  SET text_raw   = EXCLUDED.text_raw,
      text_norm  = EXCLUDED.text_norm,
      tsv        = EXCLUDED.tsv,
      embedding  = EXCLUDED.embedding,
      updated_at = now();
"""

async def main():
    md_text = Path(BIBLE_MD_PATH).read_text(encoding="utf-8", errors="ignore")
    verses = list(parse_bible_md(md_text))
    print(f"Parsed verses: {len(verses)}")

    batch_size = 128
    async with httpx.AsyncClient() as client:
        # psycopg connections are sync context managers — keep them out of `async with`.
        with psycopg.connect(DB_URL, autocommit=False) as conn, conn.cursor() as cur:
            for i in tqdm(range(0, len(verses), batch_size), unit="batch"):
                batch = verses[i:i + batch_size]
                embs = await embed_batch(client, [v["text_norm"] for v in batch])
                rows = [{
                    **v,
                    "lang": LANG_CODE,
                    "translation": TRANSLATION,
                    "ts_lang": "romanian",  # pass None to fall back to 'simple'
                    "embedding": to_vector_literal(e),
                } for v, e in zip(batch, embs)]
                cur.executemany(UPSERT_SQL, rows)
                conn.commit()  # per-batch commits: a crash loses at most one batch
    print("Done. Run ANALYZE, then build the IVFFLAT index (see §5).")

if __name__ == "__main__":
    asyncio.run(main())
```

**Notes**
- If the `romanian` text search config is unavailable, set `ts_lang='simple'`.
- For `embed-3-small`, set `EMBED_DIMS=1536` and change the column type to `vector(1536)`.
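
If you opt for shorter vectors (e.g., to stay under the 2,000‑dimension index limit noted in §2), the embed-3 models accept a `dimensions` request field for Matryoshka‑style truncation — assuming an Azure API version recent enough to pass it through. The change in `embed_batch` is one line:

```python
# Request truncated embeddings; keep EMBED_DIMS and the column type in sync
# (e.g. EMBED_DIMS=1536 with vector(1536)). Older API versions may reject
# the "dimensions" field.
payload = {"input": inputs, "dimensions": EMBED_DIMS}
```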
---

## 5) Post‑ingestion steps

```sql
VACUUM ANALYZE bible_passages;
CREATE INDEX IF NOT EXISTS idx_vec_ivfflat
  ON bible_passages USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
CREATE INDEX IF NOT EXISTS idx_book_ch ON bible_passages (book, chapter);
-- At query time, trade latency for recall with e.g.: SET ivfflat.probes = 10;
```

---

## 6) Retrieval patterns

### A) Pure vector kNN (cosine)
```sql
SELECT ref, book, chapter, verse, text_raw,
       1 - (embedding <=> $1) AS cosine_sim
FROM bible_passages
ORDER BY embedding <=> $1
LIMIT $2;
```
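
From Python, the same query can reuse `embed_batch` and `to_vector_literal` from the ingestion script. A sketch, assuming those helpers and `DB_URL` are importable from `ingest_bible_pgvector.py`:

```python
import asyncio
import httpx
import psycopg

from ingest_bible_pgvector import DB_URL, embed_batch, to_vector_literal

async def knn_search(query: str, k: int = 5):
    # Embed the query with the same deployment used at ingestion time.
    async with httpx.AsyncClient() as client:
        [qvec] = await embed_batch(client, [query])
    qlit = to_vector_literal(qvec)
    with psycopg.connect(DB_URL) as conn:
        return conn.execute(
            """SELECT ref, text_raw, 1 - (embedding <=> %s::vector) AS cosine_sim
               FROM bible_passages
               ORDER BY embedding <=> %s::vector
               LIMIT %s""",
            (qlit, qlit, k),
        ).fetchall()

# for ref, text, sim in asyncio.run(knn_search("dragostea lui Dumnezeu")):
#     print(f"{sim:.3f}  {ref}  {text}")
```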

### B) Hybrid lexical + vector (weighted)
```sql
-- $1: query embedding (vector); $2: tsquery, e.g. plainto_tsquery('romanian', 'dragoste')
WITH v AS (
  SELECT id, 1 - (embedding <=> $1) AS vsim
  FROM bible_passages
  ORDER BY embedding <=> $1
  LIMIT 100
),
l AS (
  SELECT id, ts_rank(tsv, $2) AS lrank
  FROM bible_passages
  WHERE tsv @@ $2
)
SELECT bp.ref, bp.book, bp.chapter, bp.verse, bp.text_raw,
       COALESCE(v.vsim, 0) * 0.7 + COALESCE(l.lrank, 0) * 0.3 AS score
FROM bible_passages bp
LEFT JOIN v ON v.id = bp.id
LEFT JOIN l ON l.id = bp.id
ORDER BY score DESC
LIMIT 20;
```
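
### C) Windowed context stitching

For the neighbor‑verse stitching mentioned in §0 and §3, expand each hit to a ±N window within its chapter before prompting the chat model. A sketch over a psycopg connection (window edges are clamped by the `BETWEEN`):

```python
def stitch_window(conn, book: str, chapter: int, verse: int, n: int = 2) -> str:
    """Return the hit verse plus up to n neighbors on each side, in order."""
    rows = conn.execute(
        """SELECT verse, text_raw
           FROM bible_passages
           WHERE book = %s AND chapter = %s AND verse BETWEEN %s AND %s
           ORDER BY verse""",
        (book, chapter, verse - n, verse + n),
    ).fetchall()
    return " ".join(f"[{v}] {t}" for v, t in rows)
```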

---

## 7) Chat & Prayer agent tips

- **Answer grounding**: always cite `ref` (e.g., *Ioan 3:16*).
- **Multilingual output**: keep quotes in Romanian; explain in the user’s language.
- **Prayer agent**: constrain tone & doctrine; inject retrieved verses as anchors (a prompt‑composition sketch follows this list).
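
A hedged sketch of prompt composition for the prayer agent — the function and message shapes are illustrative, not a fixed API:

```python
def compose_prayer_prompt(user_intent: str, passages: list[dict]) -> list[dict]:
    """Build chat messages that anchor the prayer in retrieved verses."""
    anchors = "\n".join(f"- {p['ref']}: {p['text_raw']}" for p in passages)
    system = (
        "You write short prayers in Romanian, grounded only in the verses "
        "provided. Keep the tone reverent; cite each verse you draw on by ref."
    )
    user = f"Intent: {user_intent}\n\nAnchor verses:\n{anchors}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```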

---

## 8) Ops

- Idempotent `UPSERT` (safe to re-run the ingestion).
- Backoff on 429/5xx.
- Consider keeping both `embed-3-large` and `-small` columns when migrating (e.g., a second column such as `embedding_small vector(1536)`, backfilled before cutover).

---

## 9) License & attribution

This guide references the structure of the *Biblia Fidela* Markdown for ingestion demonstration.