# Azure OpenAI Integration - Implementation Summary

## Overview

The AI service has been updated to support both OpenAI and Azure OpenAI with automatic fallback, proper environment configuration, and full support for GPT-5 models including reasoning tokens.

---

## Environment Configuration

### ✅ Complete Environment Variables (.env)

```bash
# AI Services Configuration
# Primary provider: 'openai' or 'azure'
AI_PROVIDER=azure

# OpenAI Configuration (Primary - if AI_PROVIDER=openai)
OPENAI_API_KEY=sk-your-openai-api-key-here
OPENAI_MODEL=gpt-4o-mini
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_MAX_TOKENS=1000

# Azure OpenAI Configuration (if AI_PROVIDER=azure)
AZURE_OPENAI_ENABLED=true

# Azure OpenAI - Chat/Completion Endpoint (GPT-5)
# Each deployment has its own API key for better security and quota management
AZURE_OPENAI_CHAT_ENDPOINT=https://footprints-open-ai.openai.azure.com
AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-5-mini
AZURE_OPENAI_CHAT_API_VERSION=2025-04-01-preview
AZURE_OPENAI_CHAT_API_KEY=your-chat-api-key-here
AZURE_OPENAI_CHAT_MAX_TOKENS=1000
AZURE_OPENAI_REASONING_EFFORT=medium

# Azure OpenAI - Whisper/Voice Endpoint
AZURE_OPENAI_WHISPER_ENDPOINT=https://footprints-open-ai.openai.azure.com
AZURE_OPENAI_WHISPER_DEPLOYMENT=whisper
AZURE_OPENAI_WHISPER_API_VERSION=2025-04-01-preview
AZURE_OPENAI_WHISPER_API_KEY=your-whisper-api-key-here

# Azure OpenAI - Embeddings Endpoint
AZURE_OPENAI_EMBEDDINGS_ENDPOINT=https://footprints-ai.openai.azure.com
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT=Text-Embedding-ada-002-V2
AZURE_OPENAI_EMBEDDINGS_API_VERSION=2023-05-15
AZURE_OPENAI_EMBEDDINGS_API_KEY=your-embeddings-api-key-here
```
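
As a sanity check at startup, the required chat variables above can be validated before any request is made. A minimal sketch in plain TypeScript (no framework; `missingAzureChatVars` and its `env` parameter are illustrative names, not part of the service):

```typescript
// Illustrative startup check for the chat deployment's variables.
// `env` stands in for process.env; names mirror the .env file above.
type Env = Record<string, string | undefined>;

const REQUIRED_CHAT_VARS = [
  'AZURE_OPENAI_CHAT_ENDPOINT',
  'AZURE_OPENAI_CHAT_DEPLOYMENT',
  'AZURE_OPENAI_CHAT_API_VERSION',
  'AZURE_OPENAI_CHAT_API_KEY',
];

// Returns the names of missing variables; an empty array means the
// chat deployment is fully configured (or Azure is not the provider).
function missingAzureChatVars(env: Env): string[] {
  if (env.AI_PROVIDER !== 'azure') return [];
  return REQUIRED_CHAT_VARS.filter((name) => !env[name]);
}
```

Failing fast here gives a clearer error than a 401 from Azure at request time.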

### Configuration for Your Setup

Based on your requirements:

```bash
AI_PROVIDER=azure
AZURE_OPENAI_ENABLED=true

# Chat (GPT-5 Mini) - Separate API key
AZURE_OPENAI_CHAT_ENDPOINT=https://footprints-open-ai.openai.azure.com
AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-5-mini
AZURE_OPENAI_CHAT_API_VERSION=2025-04-01-preview
AZURE_OPENAI_CHAT_API_KEY=[your_chat_key]
AZURE_OPENAI_REASONING_EFFORT=medium  # or 'minimal', 'low', 'high'

# Voice (Whisper) - Separate API key
AZURE_OPENAI_WHISPER_ENDPOINT=https://footprints-open-ai.openai.azure.com
AZURE_OPENAI_WHISPER_DEPLOYMENT=whisper
AZURE_OPENAI_WHISPER_API_VERSION=2025-04-01-preview
AZURE_OPENAI_WHISPER_API_KEY=[your_whisper_key]

# Embeddings - Separate API key
AZURE_OPENAI_EMBEDDINGS_ENDPOINT=https://footprints-ai.openai.azure.com
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT=Text-Embedding-ada-002-V2
AZURE_OPENAI_EMBEDDINGS_API_VERSION=2023-05-15
AZURE_OPENAI_EMBEDDINGS_API_KEY=[your_embeddings_key]
```

### Why Separate API Keys?

Each Azure OpenAI deployment can have its own API key for:

- **Security**: Limit the blast radius if a key is compromised
- **Quota Management**: Separate rate limits per service
- **Cost Tracking**: Monitor usage per deployment
- **Access Control**: Different team members can have access to different services
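
Since the variables follow an `AZURE_OPENAI_<SERVICE>_*` naming convention, the per-deployment credentials can be resolved in one place so no service ever reads another service's key. A sketch (the helper name and `env` parameter are illustrative):

```typescript
// Illustrative lookup of the endpoint/key pair for one service.
// Variable names follow the AZURE_OPENAI_<SERVICE>_* convention above.
type AzureService = 'chat' | 'whisper' | 'embeddings';

function azureCredentialsFor(
  service: AzureService,
  env: Record<string, string | undefined>,
): { endpoint: string; apiKey: string } {
  const prefix = `AZURE_OPENAI_${service.toUpperCase()}`;
  const endpoint = env[`${prefix}_ENDPOINT`];
  const apiKey = env[`${prefix}_API_KEY`];
  if (!endpoint || !apiKey) {
    throw new Error(`Missing Azure OpenAI configuration for '${service}'`);
  }
  return { endpoint, apiKey };
}
```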

---

## AI Service Implementation

### ✅ Key Features

**1. Multi-Provider Support**
- Primary: Azure OpenAI (GPT-5)
- Fallback: OpenAI (GPT-4o-mini)
- Automatic failover if Azure is unavailable

**2. GPT-5 Specific Features**
- ✅ Reasoning token tracking
- ✅ Configurable reasoning effort (minimal, low, medium, high)
- ✅ Extended context (272K input + 128K output = 400K total)
- ✅ Response metadata with token counts

**3. Response Format**
```typescript
interface ChatResponseDto {
  conversationId: string;
  message: string;
  timestamp: Date;
  metadata?: {
    model?: string; // 'gpt-5-mini' or 'gpt-4o-mini'
    provider?: 'openai' | 'azure';
    reasoningTokens?: number; // GPT-5 only
    totalTokens?: number;
  };
}
```

**4. Azure GPT-5 Request**
```typescript
const requestBody = {
  messages: azureMessages,
  temperature: 0.7,
  max_tokens: 1000,
  stream: false,
  reasoning_effort: 'medium', // GPT-5 specific
};
```

**5. Azure GPT-5 Response**
```typescript
{
  choices: [{
    message: { content: string },
    reasoning_tokens: number, // NEW in GPT-5
  }],
  usage: {
    prompt_tokens: number,
    completion_tokens: number,
    reasoning_tokens: number, // NEW in GPT-5
    total_tokens: number,
  }
}
```

---

## GPT-5 vs GPT-4 Differences

### Reasoning Tokens

**GPT-5 introduces `reasoning_tokens`**:
- Hidden tokens used for internal reasoning
- Not part of the message content
- Configurable via the `reasoning_effort` parameter
- Higher effort = more reasoning tokens = better quality

**Reasoning Effort Levels**:
```typescript
'minimal' // Fastest, fewest reasoning tokens
'low'     // Quick responses with basic reasoning
'medium'  // Balanced (default)
'high'    // Most thorough, most reasoning tokens
```

### Context Length

**GPT-5**:
- Input: 272,000 tokens (vs GPT-4's 128K)
- Output: 128,000 tokens
- Total context: 400,000 tokens

**GPT-4o**:
- Input: 128,000 tokens
- Total context: 128,000 tokens
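
The limits above translate into a simple budget guard before dispatching a request. A sketch of the arithmetic only; actual token counting would come from a tokenizer library and is out of scope here:

```typescript
// Published GPT-5 window sizes from the list above.
const GPT5_MAX_INPUT_TOKENS = 272_000;
const GPT5_MAX_OUTPUT_TOKENS = 128_000;

// True if a request with this many prompt tokens and this output
// budget fits inside the GPT-5 context window.
function fitsGpt5Window(inputTokens: number, maxOutputTokens: number): boolean {
  return (
    inputTokens <= GPT5_MAX_INPUT_TOKENS &&
    maxOutputTokens <= GPT5_MAX_OUTPUT_TOKENS
  );
}
```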

### Token Efficiency

**GPT-5 Benefits**:
- 22% fewer output tokens vs o3
- 45% fewer tool calls
- Better performance per dollar despite reasoning overhead

### Pricing

**Azure OpenAI GPT-5**:
- Input: $1.25 / 1M tokens
- Output: $10.00 / 1M tokens
- Cached input: $0.125 / 1M (90% discount for repeated prompts)
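
These rates make per-request cost easy to estimate from the usage block. A sketch, assuming reasoning tokens are billed at the output rate (as with other reasoning models; verify against your Azure invoice):

```typescript
// Rates from the list above, expressed per token.
const INPUT_USD = 1.25 / 1_000_000;
const CACHED_INPUT_USD = 0.125 / 1_000_000;
const OUTPUT_USD = 10.0 / 1_000_000;

// Estimated request cost in USD. `cachedInputTokens` is the cached
// portion of `inputTokens`; reasoning tokens should be included in
// `outputTokens` (assumption: they bill at the output rate).
function estimateGpt5CostUsd(
  inputTokens: number,
  cachedInputTokens: number,
  outputTokens: number,
): number {
  return (
    (inputTokens - cachedInputTokens) * INPUT_USD +
    cachedInputTokens * CACHED_INPUT_USD +
    outputTokens * OUTPUT_USD
  );
}
```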

---

## Implementation Details

### Service Initialization

The AI service now:
1. Checks the `AI_PROVIDER` environment variable
2. Configures Azure OpenAI if the provider is 'azure'
3. Falls back to OpenAI if Azure is not configured
4. Logs which provider is active

```typescript
constructor() {
  this.aiProvider = this.configService.get('AI_PROVIDER', 'openai');

  if (this.aiProvider === 'azure') {
    // Load Azure configuration from environment
    this.azureChatEndpoint = this.configService.get('AZURE_OPENAI_CHAT_ENDPOINT');
    this.azureChatDeployment = this.configService.get('AZURE_OPENAI_CHAT_DEPLOYMENT');
    // ... more configuration
  } else {
    // Load OpenAI configuration
    this.chatModel = new ChatOpenAI({ ... });
  }
}
```

### Chat Method Flow

```typescript
async chat(userId, chatDto) {
  // 1. Validate configuration
  // 2. Get/create conversation
  // 3. Build context with user data
  // 4. Generate response based on provider:

  if (this.aiProvider === 'azure') {
    const response = await this.generateWithAzure(messages);
    // Returns: { content, reasoningTokens, totalTokens }
  } else {
    const response = await this.generateWithOpenAI(messages);
    // Returns: content string
  }

  // 5. Save conversation with token tracking
  // 6. Return response with metadata
}
```

### Azure Generation Method

```typescript
private async generateWithAzure(messages) {
  const url = `${endpoint}/openai/deployments/${deployment}/chat/completions?api-version=${apiVersion}`;

  const requestBody = {
    messages: azureMessages,
    temperature: 0.7,
    max_tokens: 1000,
    reasoning_effort: 'medium', // GPT-5 parameter
  };

  const response = await axios.post(url, requestBody, {
    headers: {
      'api-key': this.azureApiKey,
      'Content-Type': 'application/json',
    },
  });

  return {
    content: response.data.choices[0].message.content,
    reasoningTokens: response.data.usage.reasoning_tokens,
    totalTokens: response.data.usage.total_tokens,
  };
}
```

### Automatic Fallback

If Azure fails, the service automatically retries with OpenAI:

```typescript
catch (error) {
  // Fall back to OpenAI if Azure fails
  if (this.aiProvider === 'azure' && this.chatModel) {
    this.logger.warn('Azure OpenAI failed, attempting OpenAI fallback...');
    // Note: the provider stays switched to OpenAI for subsequent requests
    this.aiProvider = 'openai';
    return this.chat(userId, chatDto); // Recursive call, now using OpenAI
  }
  throw new BadRequestException('Failed to generate AI response');
}
```

---

## Testing the Integration

### 1. Check Provider Status

```bash
GET /api/v1/ai/provider-status
```

Response:
```json
{
  "provider": "azure",
  "model": "gpt-5-mini",
  "configured": true,
  "endpoint": "https://footprints-open-ai.openai.azure.com"
}
```

### 2. Test Chat with GPT-5

```bash
POST /api/v1/ai/chat
Authorization: Bearer {token}

{
  "message": "How much should a 3-month-old eat per feeding?"
}
```

Response:
```json
{
  "conversationId": "conv_123",
  "message": "A 3-month-old typically eats...",
  "timestamp": "2025-01-15T10:30:00Z",
  "metadata": {
    "model": "gpt-5-mini",
    "provider": "azure",
    "reasoningTokens": 145,
    "totalTokens": 523
  }
}
```

### 3. Monitor Reasoning Tokens

Check the logs for GPT-5 reasoning token usage:

```
[AIService] Azure OpenAI response: {
  model: 'gpt-5-mini',
  finish_reason: 'stop',
  prompt_tokens: 256,
  completion_tokens: 122,
  reasoning_tokens: 145, // GPT-5 reasoning overhead
  total_tokens: 523
}
```

---

## Optimizing Reasoning Effort

### When to Use Each Level

**Minimal** (`reasoning_effort: 'minimal'`):
- Simple queries
- Quick responses needed
- Cost optimization
- Use case: "What time is it?"

**Low** (`reasoning_effort: 'low'`):
- Straightforward questions
- Fast turnaround required
- Use case: "How many oz in 120ml?"

**Medium** (`reasoning_effort: 'medium'`) - **Default**:
- Balanced performance
- Most common use cases
- Use case: "Is my baby's sleep pattern normal?"

**High** (`reasoning_effort: 'high'`):
- Complex reasoning required
- Premium features
- Use case: "Analyze my baby's feeding patterns over the last month and suggest optimizations"

### Dynamic Reasoning Effort

You can adjust the effort based on query complexity:

```typescript
// Future enhancement: analyze query complexity
const effort = this.determineReasoningEffort(chatDto.message);

const requestBody = {
  messages: azureMessages,
  reasoning_effort: effort, // Dynamic, based on the query
};
```
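
`determineReasoningEffort` is not implemented yet; one possible heuristic is sketched below. The keyword list and length thresholds are purely illustrative:

```typescript
type ReasoningEffort = 'minimal' | 'low' | 'medium' | 'high';

// Illustrative heuristic: analytical queries get maximum effort,
// otherwise effort scales with message length.
function determineReasoningEffort(message: string): ReasoningEffort {
  const analytical = /\b(analyze|compare|suggest|why|trend)\b/i.test(message);
  if (analytical) return 'high';
  if (message.length > 200) return 'medium';
  if (message.length > 50) return 'low';
  return 'minimal';
}
```

A real implementation might also consider conversation history length or let premium tiers force `'high'`.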

---

## Future Enhancements

### 1. Voice Service (Whisper)

Implement a similar pattern for voice transcription:

```typescript
export class WhisperService {
  async transcribeAudio(audioBuffer: Buffer): Promise<string> {
    if (this.aiProvider === 'azure') {
      return this.transcribeWithAzure(audioBuffer);
    }
    return this.transcribeWithOpenAI(audioBuffer);
  }

  private async transcribeWithAzure(audioBuffer: Buffer) {
    const url = `${this.azureWhisperEndpoint}/openai/deployments/${this.azureWhisperDeployment}/audio/transcriptions?api-version=${this.azureWhisperApiVersion}`;

    const formData = new FormData();
    formData.append('file', new Blob([audioBuffer]), 'audio.wav');

    const response = await axios.post(url, formData, {
      headers: {
        'api-key': this.azureWhisperApiKey, // Separate key for Whisper
      },
    });

    return response.data.text;
  }
}
```

### 2. Embeddings Service

For pattern recognition and similarity search:

```typescript
export class EmbeddingsService {
  async createEmbedding(text: string): Promise<number[]> {
    if (this.aiProvider === 'azure') {
      return this.createEmbeddingWithAzure(text);
    }
    return this.createEmbeddingWithOpenAI(text);
  }

  private async createEmbeddingWithAzure(text: string) {
    const url = `${this.azureEmbeddingsEndpoint}/openai/deployments/${this.azureEmbeddingsDeployment}/embeddings?api-version=${this.azureEmbeddingsApiVersion}`;

    const response = await axios.post(url, { input: text }, {
      headers: {
        'api-key': this.azureEmbeddingsApiKey, // Separate key for Embeddings
      },
    });

    return response.data.data[0].embedding;
  }
}
```
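
Once embeddings are available, similarity search reduces to comparing vectors; cosine similarity is the usual choice. A self-contained sketch:

```typescript
// Cosine similarity between two embedding vectors: 1 = same direction,
// 0 = orthogonal (unrelated), -1 = opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vector length mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Ranking stored activity embeddings by similarity to a query embedding is the core of the pattern-recognition use case above.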

### 3. Prompt Caching

Leverage Azure's cached input pricing (90% discount):

```typescript
// Reuse identical system prompts for cost savings
const systemPrompt = `You are a helpful parenting assistant...`; // Cache this
```
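
Caching discounts apply to a repeated prompt prefix, so the message array should keep the long, stable system prompt first and put per-user context after it. A sketch of that ordering (message shape simplified; names illustrative):

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Stable prompt first => identical prefix across users and requests,
// which is what the cached-input discount keys on.
function buildMessages(
  stableSystemPrompt: string,
  userContext: string,
  userMessage: string,
): ChatMessage[] {
  return [
    { role: 'system', content: stableSystemPrompt }, // cacheable prefix
    { role: 'system', content: userContext },        // varies per user
    { role: 'user', content: userMessage },
  ];
}
```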

### 4. Streaming Responses

For better UX with long responses:

```typescript
const requestBody = {
  messages: azureMessages,
  stream: true, // Enable streaming
  reasoning_effort: 'medium',
};

// Handle the streamed response
```
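
With `stream: true`, the API returns server-sent events: `data: {json}` lines terminated by `data: [DONE]`. A sketch of pulling content deltas out of one received chunk (buffering a JSON object split across chunk boundaries is omitted):

```typescript
// Extract content deltas from one SSE chunk of a streamed chat
// completion ("data: {...}" lines, ending with "data: [DONE]").
function extractDeltas(chunk: string): string[] {
  const deltas: string[] = [];
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data: ') || trimmed === 'data: [DONE]') continue;
    const payload = JSON.parse(trimmed.slice('data: '.length));
    const delta = payload.choices?.[0]?.delta?.content;
    if (typeof delta === 'string') deltas.push(delta);
  }
  return deltas;
}
```

With axios, the raw chunks would come from a response requested with `responseType: 'stream'`.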

---

## Troubleshooting

### Common Issues

**1. "AI service not configured"**
- Check that `AI_PROVIDER` is set to 'azure'
- Verify `AZURE_OPENAI_CHAT_API_KEY` is set (not the old `AZURE_OPENAI_API_KEY`)
- Confirm `AZURE_OPENAI_CHAT_ENDPOINT` is correct

**2. "Invalid API version"**
- GPT-5 requires `2025-04-01-preview` or later
- Update `AZURE_OPENAI_CHAT_API_VERSION`

**3. "Deployment not found"**
- Verify `AZURE_OPENAI_CHAT_DEPLOYMENT` matches the Azure deployment name
- Check that the deployment is in the same region as the endpoint

**4. High token usage**
- GPT-5 reasoning tokens add overhead on top of visible output
- Reduce `reasoning_effort` if cost is a concern
- Use `'minimal'` for simple queries

**5. Slow responses**
- Higher `reasoning_effort` means slower responses
- Use `'low'` or `'minimal'` for time-sensitive queries
- Consider caching common responses

### Debug Logging

Enable debug logs to see requests and responses:

```typescript
this.logger.debug('Azure OpenAI request:', {
  url,
  deployment,
  reasoning_effort,
  messageCount,
});

this.logger.debug('Azure OpenAI response:', {
  model,
  finish_reason,
  prompt_tokens,
  completion_tokens,
  reasoning_tokens,
  total_tokens,
});
```

---

## Summary

✅ **Fully Configured**:
- Environment variables for all Azure endpoints
- Chat (GPT-5), Whisper, and Embeddings separately configurable
- No hardcoded values

✅ **GPT-5 Support**:
- Reasoning tokens tracked and returned
- Configurable reasoning effort (minimal/low/medium/high)
- Extended 400K context window ready

✅ **Automatic Fallback**:
- Azure → OpenAI if Azure fails
- Graceful degradation

✅ **Monitoring**:
- Detailed logging for debugging
- Token usage tracking (including reasoning tokens)
- Provider status endpoint

✅ **Production Ready**:
- Proper error handling
- Timeout configuration (30s)
- Metadata in responses

---

## Next Steps

1. **Add your actual API keys** to `.env`:
   ```bash
   AZURE_OPENAI_CHAT_API_KEY=[your_chat_key]
   AZURE_OPENAI_WHISPER_API_KEY=[your_whisper_key]
   AZURE_OPENAI_EMBEDDINGS_API_KEY=[your_embeddings_key]
   ```

2. **Restart the backend** to pick up the configuration:
   ```bash
   npm run start:dev
   ```

3. **Test the integration**:
   - Check the provider status endpoint
   - Send a test chat message
   - Verify reasoning tokens in the response

4. **Monitor token usage**:
   - Review logs for reasoning token counts
   - Adjust `reasoning_effort` based on usage patterns
   - Consider cost optimization strategies

5. **Implement Voice & Embeddings** (optional):
   - Follow the same patterns as the chat service
   - Use the separate Azure endpoints already configured