feat(ai-safety): Implement comprehensive AI Safety features

- Create AISafetyService with keyword detection for emergency, medical, crisis, developmental, and stress triggers
- Add emergency response templates (911, poison control, medical disclaimer)
- Add crisis hotline integration (988, Postpartum Support, Crisis Text Line, Childhelp)
- Add medical disclaimer and developmental disclaimer templates
- Add stress support resources for overwhelmed parents
- Implement output safety checking for unsafe patterns (dosages, diagnoses)
- Add safety response injection based on trigger type
- Integrate safety checks into AI chat flow with immediate overrides for emergencies/crises
- Add base safety prompt with critical safety rules and guardrails
- Add medical and crisis safety override prompts
- Enhance system prompt with safety guardrails dynamically based on query triggers
- Export AISafetyService from AIModule for use in other modules
- All safety metrics logged for monitoring dashboard (TODO: database storage)

Safety coverage:
✅ Emergency keyword detection (not breathing, choking, seizure, etc.)
✅ Medical concern keywords (fever, vomiting, rash, medication, etc.)
✅ Crisis keywords (suicide, self-harm, PPD, abuse, etc.)
✅ Parental stress keywords (overwhelmed, burned out, isolated, etc.)
✅ Developmental concern keywords (delay, autism, ADHD, regression, etc.)
✅ Output moderation patterns (dosages, diagnoses, definitive medical statements)
✅ Crisis hotline templates with 4 major US resources
✅ Medical disclaimers with red flags and when to seek care
✅ Stress support with self-care reminders

Tested: Backend compiles and runs successfully with 0 errors
2025-10-02 19:05:45 +00:00
parent b2f3551ccd
commit 9246d4b00d
4 changed files with 1103 additions and 5 deletions
--- a/AI_SAFETY_STRATEGY.md
+++ b/AI_SAFETY_STRATEGY.md
@@ -0,0 +1,517 @@
+# AI Safety Strategy - Maternal App
+
+**Purpose:** Ensure safe, responsible, and helpful AI interactions for parents seeking childcare guidance.
+
+**Last Updated:** October 2, 2025
+
+---
+
+## 1. Safety Principles
+
+### 1.1 Core Safety Values
+1. **Medical Disclaimer First** - Never provide medical diagnoses or emergency advice
+2. **Crisis Detection** - Recognize mental health crises and provide resources
+3. **Age-Appropriate** - Responses suitable for children aged 0-6 years
+4. **Evidence-Based** - Reference pediatric guidelines when possible
+5. **Non-Judgmental** - Support all parenting styles without criticism
+6. **Privacy-Focused** - Never request or store unnecessary medical information
+
+### 1.2 What AI SHOULD Do
+- ✅ Provide general parenting information and tips
+- ✅ Suggest routines and organizational strategies
+- ✅ Explain developmental milestones
+- ✅ Offer emotional support and encouragement
+- ✅ Direct to professional resources when appropriate
+- ✅ Help track and interpret patterns in child's data
+
+### 1.3 What AI MUST NOT Do
+- ❌ Diagnose medical conditions
+- ❌ Prescribe medications or dosages
+- ❌ Handle medical emergencies
+- ❌ Replace professional medical advice
+- ❌ Make definitive statements about child development delays
+- ❌ Encourage unsafe practices
+
+---
+
+## 2. Medical Disclaimer Triggers
+
+### 2.1 Emergency Keywords (Immediate Disclaimer)
+Any of these keywords triggers an immediate medical disclaimer with emergency guidance:
+
+**Critical Symptoms:**
+- `emergency`, `911`, `ambulance`
+- `not breathing`, `can't breathe`, `choking`
+- `unconscious`, `unresponsive`, `passed out`
+- `seizure`, `convulsion`, `shaking uncontrollably`
+- `severe bleeding`, `blood loss`, `won't stop bleeding`
+- `severe burn`, `burned`, `scalded`
+- `poisoning`, `swallowed`, `ingested`
+- `head injury`, `fell`, `hit head`
+- `allergic reaction`, `anaphylaxis`, `swelling`
+
+**Response Template:**
+```
+⚠️ EMERGENCY DISCLAIMER ⚠️
+
+This appears to be a medical emergency. Please:
+
+1. Call emergency services immediately (911 in US)
+2. If the child is not breathing, start CPR if you are trained
+3. Stay calm and follow dispatcher instructions
+
+I'm an AI assistant and cannot provide emergency medical guidance.
+Please seek immediate professional medical help.
+
+Emergency Resources:
+- US: 911
+- Poison Control: 1-800-222-1222
+- Nurse Hotline: [Local number]
+```
+
+### 2.2 Medical Concern Keywords (Soft Disclaimer)
+Prepend a medical disclaimer, then allow a general-information response:
+
+**Symptoms:**
+- `fever`, `temperature`, `hot`, `feverish`
+- `vomiting`, `throwing up`, `vomit`
+- `diarrhea`, `loose stools`, `watery stool`
+- `rash`, `spots`, `bumps`, `hives`
+- `cough`, `coughing`, `wheezing`
+- `ear infection`, `ear pain`, `ear ache`
+- `cold`, `flu`, `sick`, `illness`
+- `constipation`, `not pooping`, `hard stool`
+- `dehydration`, `not drinking`, `dry`
+- `injury`, `hurt`, `pain`, `ache`
+- `medication`, `medicine`, `dosage`, `dose`
+
+**Response Pattern:**
+```
+⚠️ Medical Disclaimer: I'm an AI assistant, not a medical professional.
+For medical concerns, always consult your pediatrician or healthcare provider.
+
+[General information response, if appropriate]
+
+When to seek immediate care:
+- [Specific red flags for the symptom]
+
+Consider calling your pediatrician if:
+- [Warning signs]
+```
+
+### 2.3 Developmental Concern Keywords
+- `delay`, `behind`, `not talking`, `not walking`
+- `autism`, `ADHD`, `disability`, `special needs`
+- `regression`, `lost skills`, `stopped doing`
+
+**Response Pattern:**
+```
+⚠️ Developmental Disclaimer: Every child develops at their own pace.
+If you have concerns, consult your pediatrician or early intervention specialist.
+
+[General milestone information]
+
+Resources:
+- CDC Milestone Tracker: [link]
+- Early Intervention Services: [link]
+```
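The keyword lists in 2.1 to 2.3 could be matched roughly as follows. This is a minimal sketch, assuming one flat keyword list per trigger type and whole-word matching; `TRIGGER_KEYWORDS`, `escapeRegExp`, and `detectTrigger` are hypothetical names, the lists are abbreviated, and none of this is taken from the actual `AISafetyService`.

```typescript
// Hypothetical sketch: names and abbreviated keyword lists are illustrative only.
type TriggerType = "emergency" | "medical" | "developmental";

// Ordered from most to least urgent so the first match wins.
const TRIGGER_KEYWORDS: Array<[TriggerType, string[]]> = [
  ["emergency", ["not breathing", "choking", "seizure", "unconscious"]],
  ["medical", ["fever", "vomiting", "rash", "dosage"]],
  ["developmental", ["not talking", "regression", "autism"]],
];

// Escape regex metacharacters so every keyword is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the most urgent trigger whose keyword appears as a whole word or phrase.
function detectTrigger(query: string): TriggerType | null {
  for (const [trigger, keywords] of TRIGGER_KEYWORDS) {
    for (const keyword of keywords) {
      const pattern = new RegExp(`\\b${escapeRegExp(keyword)}\\b`, "i");
      if (pattern.test(query)) return trigger;
    }
  }
  return null;
}
```

Word boundaries keep `rash` from firing on "crash", but very general keywords from the lists above (`hot`, `cold`, `fell`, `dry`) would still produce false positives, which is one reason the false positive rate is worth tracking.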
+
+---
+
+## 3. Crisis Hotline Integration
+
+### 3.1 Mental Health Crisis Detection
+
+**Crisis Keywords:**
+- `suicide`, `kill myself`, `want to die`, `end it all`
+- `hurt myself`, `self-harm`, `cutting`
+- `can't cope`, `can't do this`, `giving up`
+- `hopeless`, `worthless`, `burden`
+- `postpartum depression`, `PPD`, `severe depression`
+- `psychosis`, `hallucinations`, `voices`
+- `abuse`, `hitting`, `hurting`, `violent`
+
+**Crisis Response Template:**
+```
+🆘 CRISIS SUPPORT 🆘
+
+It sounds like you're going through an incredibly difficult time.
+Your feelings are valid, and help is available.
+
+IMMEDIATE CRISIS RESOURCES:
+━━━━━━━━━━━━━━━━━━━━━━━━━
+🇺🇸 988 Suicide & Crisis Lifeline (formerly National Suicide Prevention Lifeline)
+   📞 988 (call or text)
+   💬 Chat: 988lifeline.org
+
+🤱 Postpartum Support International
+   📞 1-800-944-4773
+   💬 Text "HELP" to 800-944-4773
+
+🆘 Crisis Text Line
+   💬 Text "HOME" to 741741
+
+👶 Childhelp National Child Abuse Hotline
+   📞 1-800-422-4453
+
+These services are:
+✓ Free and confidential
+✓ Available 24/7
+✓ Staffed by trained counselors
+✓ No judgment, only support
+
+You don't have to go through this alone. Please reach out.
+```
+
+### 3.2 Parental Stress Detection
+
+**Stress Keywords:**
+- `overwhelmed`, `stressed`, `exhausted`, `burned out`
+- `crying`, `breaking down`, `can't handle`
+- `no support`, `alone`, `isolated`
+- `angry at baby`, `resentful`, `regret`
+
+**Support Response:**
+```
+💙 Parenting is Hard 💙
+
+You're not alone in feeling this way. Many parents experience:
+- Exhaustion and burnout
+- Overwhelming emotions
+- Moments of frustration
+
+This doesn't make you a bad parent. It makes you human.
+
+Support Resources:
+🤱 Postpartum Support International: 1-800-944-4773
+📞 Parents Anonymous: 1-855-427-2736
+💬 Crisis Text Line: Text "PARENT" to 741741
+
+Self-Care Reminders:
+✓ Taking breaks is necessary, not selfish
+✓ Asking for help is a sign of strength
+✓ Your mental health matters too
+
+[Followed by relevant coping strategies if appropriate]
+```
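The commit message describes immediate overrides for emergencies and crises, while the other triggers still allow a model response. One way to express that split is sketched below; `SafetyTrigger`, `SafetyAction`, and `resolveSafetyAction` are hypothetical names, and the template strings are abbreviated placeholders for the full templates in 2.1 to 3.2.

```typescript
// Hypothetical sketch: type and function names are illustrative, templates are abbreviated.
type SafetyTrigger = "emergency" | "crisis" | "medical" | "developmental" | "stress" | "none";

type SafetyAction =
  | { kind: "override"; template: string } // send the template and skip the model call
  | { kind: "augment"; prefix: string | null }; // call the model, optionally prepend text

function resolveSafetyAction(trigger: SafetyTrigger): SafetyAction {
  switch (trigger) {
    case "emergency":
      return { kind: "override", template: "EMERGENCY DISCLAIMER (see 2.1)" };
    case "crisis":
      return { kind: "override", template: "CRISIS SUPPORT (see 3.1)" };
    case "medical":
      return { kind: "augment", prefix: "Medical Disclaimer (see 2.2)" };
    case "developmental":
      return { kind: "augment", prefix: "Developmental Disclaimer (see 2.3)" };
    case "stress":
      return { kind: "augment", prefix: "Parenting is Hard (see 3.2)" };
    case "none":
      return { kind: "augment", prefix: null };
  }
}
```

Modelling the result as a discriminated union means a caller has to check `kind` before it can reach the model call, so an override cannot be skipped by accident.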
+
+---
+
+## 4. Content Moderation & Filtering
+
+### 4.1 Input Moderation
+
+**Filter Out:**
+- Inappropriate sexual content
+- Violent or abusive language
+- Requests for illegal activities
+- Spam or commercial solicitation
+- Personal information requests (addresses, SSN, etc.)
+
+**Moderation Response:**
+```
+I'm here to provide parenting support and childcare guidance.
+This type of content falls outside my intended use.
+
+If you have questions about childcare for children ages 0-6,
+I'm happy to help!
+```
+
+### 4.2 Output Moderation
+
+**Before Sending AI Response, Check For:**
+- Medical advice (trigger disclaimer if present)
+- Potentially harmful suggestions
+- Contradictory information
+- Inappropriate content
+- Dosage recommendations (remove/disclaim)
+
+**Content Safety Checks:**
+```typescript
+const unsafePatterns = [
+  /\d+\s*(mg|ml|oz|tbsp|tsp)\s*(of|every|per)/i, // Dosage patterns
+  /give\s+(him|her|them|baby|child)\s+\d+/i,     // Specific instructions with amounts
+  /diagnose|diagnosis|you have|they have/i,      // Diagnostic language
+  /(definitely|certainly)\s+(is|has)/i,          // Definitive medical statements (grouped so both adverbs require "is"/"has")
+];
+```
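A sketch of how such patterns might be applied before a response is sent. `checkOutputSafety`, `OutputSafetyResult`, and the labels are hypothetical; the patterns are a local copy of the ones listed in this section, with the final alternation grouped so that both adverbs require a following "is" or "has".

```typescript
// Hypothetical sketch: `checkOutputSafety` and `OutputSafetyResult` are illustrative names.
interface OutputSafetyResult {
  safe: boolean;
  matched: string[]; // labels of the patterns that fired
}

const LABELED_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "dosage", pattern: /\d+\s*(mg|ml|oz|tbsp|tsp)\s*(of|every|per)/i },
  { label: "amount-instruction", pattern: /give\s+(him|her|them|baby|child)\s+\d+/i },
  { label: "diagnostic", pattern: /diagnose|diagnosis|you have|they have/i },
  { label: "definitive", pattern: /(definitely|certainly)\s+(is|has)/i },
];

// Patterns carry no `g` flag, so `test` keeps no state between calls.
function checkOutputSafety(response: string): OutputSafetyResult {
  const matched = LABELED_PATTERNS
    .filter(({ pattern }) => pattern.test(response))
    .map(({ label }) => label);
  return { safe: matched.length === 0, matched };
}
```

Returning the labels rather than a bare boolean lets the caller decide whether to strip the text, add a disclaimer, or only log the match. The `diagnostic` pattern is broad ("you have" occurs in plenty of harmless sentences), so it is probably better suited to adding a disclaimer than to blocking.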
+
+---
+
+## 5. Rate Limiting & Abuse Prevention
+
+### 5.1 Current Rate Limits
+- **Free Tier:** 10 AI queries per day
+- **Premium Tier:** No fixed query quota, subject to the fair use policy in 5.3 (maximum 200 queries per day)
+
+### 5.2 Enhanced Rate Limiting
+
+**Suspicious Pattern Detection:**
+- Same question repeated >3 times in 1 hour
+- Medical emergency keywords repeated >5 times in 1 day
+- Queries from multiple IPs for the same user (possible account sharing)
+- Unusual query volume (>100/day even for premium)
+
+**Action on Suspicious Activity:**
+1. Temporary rate limit (1 query per hour for 24 hours)
+2. Flag account for manual review
+3. Send email notification to user
+4. Log to security monitoring
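The first pattern above (same question more than 3 times in 1 hour) could be detected with a sliding window. This is a sketch under assumptions: `RepeatedQueryDetector` is a hypothetical class, questions count as "the same" after simple whitespace and case normalization, and state lives in memory instead of a shared store.

```typescript
// Hypothetical sketch: class and method names are illustrative; state is in memory only.
const ONE_HOUR_MS = 60 * 60 * 1000;
const MAX_REPEATS_PER_HOUR = 3;

class RepeatedQueryDetector {
  // Key: user id plus normalized question; value: timestamps (ms) inside the window.
  private readonly seen = new Map<string, number[]>();

  // Record a query and report whether it has now been asked more than 3 times in 1 hour.
  record(userId: string, question: string, nowMs: number): boolean {
    const normalized = question.trim().toLowerCase().replace(/\s+/g, " ");
    const key = `${userId}:${normalized}`;
    const recent = (this.seen.get(key) ?? []).filter((t) => nowMs - t < ONE_HOUR_MS);
    recent.push(nowMs);
    this.seen.set(key, recent);
    return recent.length > MAX_REPEATS_PER_HOUR;
  }
}
```

Passing `nowMs` in explicitly keeps the detector deterministic in tests. A production version would also need to evict idle keys and would likely keep the timestamps in a shared store.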
+
+### 5.3 Fair Use Policy (Premium)
+- Maximum 200 queries per day (reasonable for active use)
+- Queries should be related to childcare (0-6 years)
+- No commercial use or data scraping
+- No sharing of account credentials
+
+---
+
+## 6. System Prompt Guardrails
+
+### 6.1 Base System Prompt (Always Included)
+
+```
+You are a helpful AI assistant for the Maternal App, designed to support
+parents of children aged 0-6 years with childcare organization and guidance.
+
+CRITICAL SAFETY RULES:
+1. You are NOT a medical professional. Never diagnose or prescribe.
+2. For medical emergencies, ALWAYS direct to call 911 immediately.
+3. For medical concerns, ALWAYS recommend consulting a pediatrician.
+4. Recognize mental health crises and provide crisis hotline resources.
+5. Be supportive and non-judgmental of all parenting approaches.
+6. Focus on evidence-based information from reputable sources (AAP, CDC, WHO).
+7. Never provide specific medication dosages.
+8. If asked about serious developmental delays, refer to professionals.
+
+TONE:
+- Warm, empathetic, and encouraging
+- Clear and concise
+- Non-judgmental and inclusive
+- Supportive but honest
+
+SCOPE:
+- Childcare for ages 0-6 years
+- Routines, tracking, organization
+- General parenting tips and strategies
+- Emotional support for parents
+- Interpretation of tracked activity patterns
+
+OUT OF SCOPE:
+- Medical diagnosis or treatment
+- Legal advice
+- Financial planning
+- Relationship counseling (beyond parenting partnership)
+- Children outside 0-6 age range
+```
+
+### 6.2 Context-Specific Safety Additions
+
+**When Medical Keywords Detected:**
+```
+MEDICAL SAFETY OVERRIDE:
+This query contains medical concerns. Your response MUST:
+1. Start with a medical disclaimer
+2. Avoid definitive statements
+3. Recommend professional consultation
+4. Provide general information only
+5. Include "when to seek immediate care" guidance
+```
+
+**When Crisis Keywords Detected:**
+```
+CRISIS RESPONSE OVERRIDE:
+This query indicates a potential crisis. Your response MUST:
+1. Acknowledge their feelings without judgment
+2. Provide immediate crisis hotline numbers
+3. Encourage reaching out for professional help
+4. Be brief and focus on resources
+5. Do NOT provide coping strategies that could be misinterpreted
+```
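The commit message mentions enhancing the system prompt dynamically based on query triggers. A minimal sketch of that assembly step follows; `buildSystemPrompt` and `PromptTrigger` are hypothetical names, the prompt strings are abbreviated placeholders for the full texts in 6.1 and 6.2, and placing the crisis override before the medical one is an assumption.

```typescript
// Hypothetical sketch: prompt texts are abbreviated placeholders for the full prompts in 6.1 and 6.2.
type PromptTrigger = "medical" | "crisis";

const BASE_SAFETY_PROMPT = "CRITICAL SAFETY RULES: (base prompt from 6.1)";
const PROMPT_OVERRIDES: Record<PromptTrigger, string> = {
  medical: "MEDICAL SAFETY OVERRIDE: (text from 6.2)",
  crisis: "CRISIS RESPONSE OVERRIDE: (text from 6.2)",
};

// The base prompt is always included; each override is appended at most once.
// Assumption: crisis guidance is placed before medical guidance.
function buildSystemPrompt(triggers: PromptTrigger[]): string {
  const ordered: PromptTrigger[] = ["crisis", "medical"];
  const parts = [BASE_SAFETY_PROMPT];
  for (const trigger of ordered) {
    if (triggers.includes(trigger)) parts.push(PROMPT_OVERRIDES[trigger]);
  }
  return parts.join("\n\n");
}
```

Iterating over a fixed order instead of the input array makes the output independent of detection order and drops duplicate triggers.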
+
+---
+
+## 7. Monitoring & Analytics
+
+### 7.1 Safety Metrics to Track
+
+**Daily Monitoring:**
+- Number of medical disclaimer triggers
+- Number of crisis hotline responses sent
+- Number of emergency keyword detections
+- Number of queries blocked by content filter
+- Average response moderation score
+
+**Weekly Review:**
+- Trending medical concerns
+- Common crisis keywords
+- False positive rate for disclaimers
+- User feedback on safety responses
+
+### 7.2 Alert Thresholds
+
+**Immediate Alerts:**
+- >10 emergency keywords in 1 hour (potential abuse or real emergency pattern)
+- >5 crisis keywords from single user in 24 hours (high-risk user)
+- Content filter blocking >50 queries/day (attack or misconfiguration)
+
+**Daily Review Alerts:**
+- >100 medical disclaimers triggered in 24 hours
+- >20 crisis responses in 24 hours
+- Pattern of similar medical queries (potential misinformation spread)
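The numeric thresholds above can be checked mechanically once the counts are available. The sketch below assumes the counts have already been aggregated elsewhere; `SafetyCounts`, `SafetyAlert`, and `evaluateAlerts` are hypothetical names, and the non-numeric "pattern of similar medical queries" check is left out.

```typescript
// Hypothetical sketch: metric names and `evaluateAlerts` are illustrative.
interface SafetyCounts {
  emergencyKeywordsLastHour: number;
  maxCrisisKeywordsSingleUser24h: number;
  blockedQueries24h: number;
  medicalDisclaimers24h: number;
  crisisResponses24h: number;
}

type AlertLevel = "immediate" | "daily-review";

interface SafetyAlert {
  level: AlertLevel;
  reason: string;
}

// Each threshold is exclusive (">"), matching the wording in section 7.2.
function evaluateAlerts(counts: SafetyCounts): SafetyAlert[] {
  const alerts: SafetyAlert[] = [];
  const check = (value: number, limit: number, level: AlertLevel, reason: string): void => {
    if (value > limit) alerts.push({ level, reason });
  };
  check(counts.emergencyKeywordsLastHour, 10, "immediate", "emergency keywords in 1 hour");
  check(counts.maxCrisisKeywordsSingleUser24h, 5, "immediate", "crisis keywords from a single user");
  check(counts.blockedQueries24h, 50, "immediate", "content filter blocks");
  check(counts.medicalDisclaimers24h, 100, "daily-review", "medical disclaimers");
  check(counts.crisisResponses24h, 20, "daily-review", "crisis responses");
  return alerts;
}
```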
+
+---
+
+## 8. User Education
+
+### 8.1 First-Time User Guidance
+
+**On First AI Query, Show:**
+```
+👋 Welcome to Your AI Parenting Assistant!
+
+I'm here to help with:
+✓ Childcare tips and organization
+✓ Understanding your baby's patterns
+✓ Answering general parenting questions
+✓ Providing encouragement and support
+
+Important to Know:
+⚠️ I'm NOT a medical professional
+⚠️ Always consult your pediatrician for medical concerns
+⚠️ Call 911 for emergencies
+
+For the best experience:
+💡 Ask specific questions about routines, tracking, or parenting tips
+💡 Share your child's age for age-appropriate guidance
+💡 Use your activity tracking data for personalized insights
+
+Let's get started! How can I help you today?
+```
+
+### 8.2 In-App Safety Reminders
+
+**Random Safety Tips (1 in 20 responses):**
+- "Remember: I'm an AI assistant, not a doctor. Always consult your pediatrician for medical advice."
+- "Having a tough day? It's okay to ask for help. Check out our Resources section."
+- "Emergency? Call 911 immediately. I cannot provide emergency medical guidance."
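The 1-in-20 reminder could be implemented as a probabilistic append. This is a sketch: `maybeAppendSafetyTip` is a hypothetical helper, and the injectable `random` parameter is an assumption added so the behavior can be tested deterministically.

```typescript
// Hypothetical sketch: `maybeAppendSafetyTip` is illustrative; `random` is injectable for testing.
const SAFETY_TIPS: string[] = [
  "Remember: I'm an AI assistant, not a doctor. Always consult your pediatrician for medical advice.",
  "Having a tough day? It's okay to ask for help. Check out our Resources section.",
  "Emergency? Call 911 immediately. I cannot provide emergency medical guidance.",
];
const TIP_PROBABILITY = 1 / 20;

// Append one randomly chosen tip to roughly 1 in 20 responses.
function maybeAppendSafetyTip(response: string, random: () => number = Math.random): string {
  if (random() >= TIP_PROBABILITY) return response;
  const index = Math.floor(random() * SAFETY_TIPS.length);
  return `${response}\n\n${SAFETY_TIPS[index]}`;
}
```

In production the default `Math.random` is used; tests pass a fixed function to force either branch.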
+
+---
+
+## 9. Implementation Checklist
+
+### 9.1 Backend Implementation
+- [ ] Create `ai-safety.service.ts` with keyword detection
+- [ ] Implement medical disclaimer injection
+- [ ] Add crisis hotline response generator
+- [ ] Enhance content moderation filters
+- [ ] Update system prompts with safety guardrails
+- [ ] Add safety metrics logging
+- [ ] Create safety monitoring dashboard
+
+### 9.2 Frontend Implementation
+- [ ] Add safety disclaimer modal on first AI use
+- [ ] Display crisis hotlines prominently when triggered
+- [ ] Show medical disclaimer badges
+- [ ] Add "Report Unsafe Response" button
+- [ ] Create AI Safety FAQ page
+
+### 9.3 Testing
+- [ ] Unit tests for keyword detection (100+ test cases)
+- [ ] Integration tests for crisis response flow
+- [ ] Manual testing of all disclaimer triggers
+- [ ] User acceptance testing with parents
+- [ ] Safety audit by medical professional (recommended)
+
+### 9.4 Documentation
+- [ ] User-facing AI Safety policy
+- [ ] Developer documentation for safety service
+- [ ] Incident response playbook
+- [ ] Safety metric reporting template
+
+---
+
+## 10. Incident Response
+
+### 10.1 If Harmful Response is Reported
+
+**Immediate Actions (Within 1 hour):**
+1. Flag conversation in database
+2. Review full conversation history
+3. Identify failure point (keyword detection, content filter, prompt)
+4. Temporarily disable AI for affected user if serious
+5. Notify safety team
+
+**Follow-up Actions (Within 24 hours):**
+1. Root cause analysis
+2. Update keyword lists or filters
+3. Test fix with similar queries
+4. Document incident and resolution
+5. Update user with resolution
+
+**Communication:**
+- Acknowledge report within 1 hour
+- Provide resolution timeline
+- Follow up when fixed
+- Offer support resources if appropriate
+
+### 10.2 Emergency Escalation
+
+**Escalate to Management If:**
+- A user reports following harmful AI advice and experiencing a negative outcome
+- Multiple similar harmful responses in short timeframe
+- Media or regulatory inquiry about AI safety
+- Suicidal ideation not properly handled by crisis detection
+
+---
+
+## 11. Regular Safety Audits
+
+### 11.1 Monthly Audits
+- Review 100 random AI conversations
+- Test all crisis and medical keyword triggers
+- Analyze false positive/negative rates
+- Update keyword lists based on new patterns
+- Review user reports and feedback
+
+### 11.2 Quarterly Reviews
+- External safety audit (if possible)
+- Update crisis hotline numbers (verify still active)
+- Review and update safety policies
+- Train team on new safety features
+- Benchmark against industry standards
+
+---
+
+## 12. Compliance & Legal
+
+### 12.1 Disclaimers (Required)
+- [ ] Terms of Service mention AI limitations
+- [ ] Privacy Policy covers AI data usage
+- [ ] Medical disclaimer in AI interface
+- [ ] Crisis resources easily accessible
+
+### 12.2 Recommended Legal Review
+- AI response liability
+- Medical advice disclaimers sufficiency
+- Crisis response appropriateness
+- Age-appropriate content standards
+
+---
+
+## Status: READY FOR IMPLEMENTATION
+
+- **Priority:** HIGH - User Safety Critical
+- **Estimated Implementation Time:** 3-5 days
+- **Testing Time:** 2 days
+- **Total:** 5-7 days with testing
+
+**Next Steps:**
+1. Implement `AISafetyService` with keyword detection
+2. Update `AIService` to integrate safety checks
+3. Add crisis hotline database/configuration
+4. Create frontend safety components
+5. Comprehensive testing with real scenarios
+6. User documentation
+
+This strategy ensures the Maternal App AI is helpful, supportive, and above all, SAFE for parents seeking guidance.