# AI Safety Strategy - Maternal App

**Purpose:** Ensure safe, responsible, and helpful AI interactions for parents seeking childcare guidance.

**Last Updated:** October 2, 2025

---

## 1. Safety Principles

### 1.1 Core Safety Values

1. **Medical Disclaimer First** - Never provide medical diagnoses or emergency advice
2. **Crisis Detection** - Recognize mental health crises and provide resources
3. **Age-Appropriate** - Guidance suitable for children aged 0-6 years
4. **Evidence-Based** - Reference pediatric guidelines when possible
5. **Non-Judgmental** - Support all parenting styles without criticism
6. **Privacy-Focused** - Never request or store unnecessary medical information

### 1.2 What AI SHOULD Do

- ✅ Provide general parenting information and tips
- ✅ Suggest routines and organizational strategies
- ✅ Explain developmental milestones
- ✅ Offer emotional support and encouragement
- ✅ Direct to professional resources when appropriate
- ✅ Help track and interpret patterns in child's data

### 1.3 What AI MUST NOT Do

- ❌ Diagnose medical conditions
- ❌ Prescribe medications or dosages
- ❌ Handle medical emergencies
- ❌ Replace professional medical advice
- ❌ Make definitive statements about child development delays
- ❌ Encourage unsafe practices

---

## 2. Medical Disclaimer Triggers

### 2.1 Emergency Keywords (Immediate Disclaimer)

Trigger an immediate medical disclaimer and emergency guidance:

**Critical Symptoms:**

- `emergency`, `911`, `ambulance`
- `not breathing`, `can't breathe`, `choking`
- `unconscious`, `unresponsive`, `passed out`
- `seizure`, `convulsion`, `shaking uncontrollably`
- `severe bleeding`, `blood loss`, `won't stop bleeding`
- `severe burn`, `burned`, `scalded`
- `poisoning`, `swallowed`, `ingested`
- `head injury`, `fell`, `hit head`
- `allergic reaction`, `anaphylaxis`, `swelling`

**Response Template:**

```
⚠️ EMERGENCY DISCLAIMER ⚠️

This appears to be a medical emergency. Please:

1. Call emergency services immediately (911 in US)
2. If the child is not breathing, start CPR if you are trained
3. Stay calm and follow dispatcher instructions

I'm an AI assistant and cannot provide emergency medical guidance.
Please seek immediate professional medical help.

Emergency Resources:
- US: 911
- Poison Control: 1-800-222-1222
- Nurse Hotline: [Local number]
```

### 2.2 Medical Concern Keywords (Soft Disclaimer)

Trigger a medical disclaimer, but still allow a general response that carries the disclaimer:

**Symptoms:**

- `fever`, `temperature`, `hot`, `feverish`
- `vomiting`, `throwing up`, `vomit`
- `diarrhea`, `loose stools`, `watery stool`
- `rash`, `spots`, `bumps`, `hives`
- `cough`, `coughing`, `wheezing`
- `ear infection`, `ear pain`, `ear ache`
- `cold`, `flu`, `sick`, `illness`
- `constipation`, `not pooping`, `hard stool`
- `dehydration`, `not drinking`, `dry`
- `injury`, `hurt`, `pain`, `ache`
- `medication`, `medicine`, `dosage`, `dose`

**Response Pattern:**

```
⚠️ Medical Disclaimer: I'm an AI assistant, not a medical professional.
For medical concerns, always consult your pediatrician or healthcare provider.

[General information response, if appropriate]

When to seek immediate care:
- [Specific red flags for the symptom]

Consider calling your pediatrician if:
- [Warning signs]
```

### 2.3 Developmental Concern Keywords

- `delay`, `behind`, `not talking`, `not walking`
- `autism`, `ADHD`, `disability`, `special needs`
- `regression`, `lost skills`, `stopped doing`

**Response Pattern:**

```
⚠️ Developmental Disclaimer: Every child develops at their own pace.
If you have concerns, consult your pediatrician or early intervention specialist.

[General milestone information]

Resources:
- CDC Milestone Tracker: [link]
- Early Intervention Services: [link]
```

---

## 3. Crisis Hotline Integration

### 3.1 Mental Health Crisis Detection

**Crisis Keywords:**

- `suicide`, `kill myself`, `want to die`, `end it all`
- `hurt myself`, `self-harm`, `cutting`
- `can't cope`, `can't do this`, `giving up`
- `hopeless`, `worthless`, `burden`
- `postpartum depression`, `PPD`, `severe depression`
- `psychosis`, `hallucinations`, `voices`
- `abuse`, `hitting`, `hurting`, `violent`

**Crisis Response Template:**

```
🆘 CRISIS SUPPORT 🆘

It sounds like you're going through an incredibly difficult time.
Your feelings are valid, and help is available.

IMMEDIATE CRISIS RESOURCES:
━━━━━━━━━━━━━━━━━━━━━━━━━

🇺🇸 988 Suicide & Crisis Lifeline
📞 988 (call or text)
💬 Chat: 988lifeline.org

🤱 Postpartum Support International
📞 1-800-944-4773
💬 Text "HELP" to 800-944-4773

🆘 Crisis Text Line
💬 Text "HOME" to 741741

👶 Childhelp National Child Abuse Hotline
📞 1-800-422-4453

These services are:
✓ Free and confidential
✓ Available 24/7
✓ Staffed by trained counselors
✓ No judgment, only support

You don't have to go through this alone. Please reach out.
```

### 3.2 Parental Stress Detection

**Stress Keywords:**

- `overwhelmed`, `stressed`, `exhausted`, `burned out`
- `crying`, `breaking down`, `can't handle`
- `no support`, `alone`, `isolated`
- `angry at baby`, `resentful`, `regret`

**Support Response:**

```
💙 Parenting is Hard 💙

You're not alone in feeling this way. Many parents experience:
- Exhaustion and burnout
- Overwhelming emotions
- Moments of frustration

This doesn't make you a bad parent. It makes you human.

Support Resources:
🤱 Postpartum Support International: 1-800-944-4773
📞 Parents Anonymous: 1-855-427-2736
💬 Crisis Text Line: Text "PARENT" to 741741

Self-Care Reminders:
✓ Taking breaks is necessary, not selfish
✓ Asking for help is a sign of strength
✓ Your mental health matters too

[Followed by relevant coping strategies if appropriate]
```

---

## 4. Content Moderation & Filtering

### 4.1 Input Moderation

**Filter Out:**

- Inappropriate sexual content
- Violent or abusive language
- Requests for illegal activities
- Spam or commercial solicitation
- Personal information requests (addresses, SSN, etc.)

**Moderation Response:**

```
I'm here to provide parenting support and childcare guidance.
This type of content falls outside my intended use.

If you have questions about childcare for children ages 0-6, I'm happy to help!
```

### 4.2 Output Moderation

**Before Sending AI Response, Check For:**

- Medical advice (trigger disclaimer if present)
- Potentially harmful suggestions
- Contradictory information
- Inappropriate content
- Dosage recommendations (remove/disclaim)

**Content Safety Checks:**

```typescript
// Patterns that flag an outgoing response for removal or a disclaimer.
const unsafePatterns: RegExp[] = [
  /\d+\s*(mg|ml|oz|tbsp|tsp)\s*(of|every|per)/i, // Dosage patterns
  /give\s+(him|her|them|baby|child)\s+\d+/i, // Specific instructions with amounts
  /diagnose|diagnosis|you have|they have/i, // Diagnostic language (broad; expect false positives)
  // Grouped so that both adverbs require a following "is"/"has";
  // without the group, a bare "definitely" would match on its own.
  /(definitely|certainly)\s+(is|has)/i, // Definitive medical statements
];
```

---

## 5. Rate Limiting & Abuse Prevention

### 5.1 Current Rate Limits

- **Free Tier:** 10 AI queries per day
- **Premium Tier:** Unlimited queries (subject to the fair use policy in 5.3)

### 5.2 Enhanced Rate Limiting

**Suspicious Pattern Detection:**

- Same question repeated more than 3 times in 1 hour
- Medical emergency keywords repeated more than 5 times in 1 day
- Queries from multiple IPs for same user (account sharing)
- Unusual query volume (more than 100/day, even for premium)

**Action on Suspicious Activity:**

1. Temporary rate limit (1 query per hour for 24 hours)
2. Flag account for manual review
3. Send email notification to user
4. Log to security monitoring

### 5.3 Fair Use Policy (Premium)

- Maximum 200 queries per day (reasonable for active use)
- Queries should be related to childcare (0-6 years)
- No commercial use or data scraping
- No sharing of account credentials

---

## 6. System Prompt Guardrails

### 6.1 Base System Prompt (Always Included)

```
You are a helpful AI assistant for the Maternal App, designed to support parents
of children aged 0-6 years with childcare organization and guidance.

CRITICAL SAFETY RULES:
1. You are NOT a medical professional. Never diagnose or prescribe.
2. For medical emergencies, ALWAYS direct to call 911 immediately.
3. For medical concerns, ALWAYS recommend consulting a pediatrician.
4. Recognize mental health crises and provide crisis hotline resources.
5. Be supportive and non-judgmental of all parenting approaches.
6. Focus on evidence-based information from reputable sources (AAP, CDC, WHO).
7. Never provide specific medication dosages.
8. If asked about serious developmental delays, refer to professionals.

TONE:
- Warm, empathetic, and encouraging
- Clear and concise
- Non-judgmental and inclusive
- Supportive but honest

SCOPE:
- Childcare for ages 0-6 years
- Routines, tracking, organization
- General parenting tips and strategies
- Emotional support for parents
- Interpretation of tracked activity patterns

OUT OF SCOPE:
- Medical diagnosis or treatment
- Legal advice
- Financial planning
- Relationship counseling (beyond parenting partnership)
- Children outside 0-6 age range
```

### 6.2 Context-Specific Safety Additions

**When Medical Keywords Detected:**

```
MEDICAL SAFETY OVERRIDE:
This query contains medical concerns. Your response MUST:
1. Start with a medical disclaimer
2. Avoid definitive statements
3. Recommend professional consultation
4. Provide general information only
5. Include "when to seek immediate care" guidance
```

**When Crisis Keywords Detected:**

```
CRISIS RESPONSE OVERRIDE:
This query indicates a potential crisis. Your response MUST:
1. Acknowledge their feelings without judgment
2. Provide immediate crisis hotline numbers
3. Encourage reaching out for professional help
4. Be brief and focus on resources
5. Do NOT provide coping strategies that could be misinterpreted
```

---

## 7. Monitoring & Analytics

### 7.1 Safety Metrics to Track

**Daily Monitoring:**

- Number of medical disclaimer triggers
- Number of crisis hotline responses sent
- Number of emergency keyword detections
- Number of queries blocked by content filter
- Average response moderation score

**Weekly Review:**

- Trending medical concerns
- Common crisis keywords
- False positive rate for disclaimers
- User feedback on safety responses

### 7.2 Alert Thresholds

**Immediate Alerts:**

- More than 10 emergency keyword detections in 1 hour (potential abuse or real emergency pattern)
- More than 5 crisis keyword detections from a single user in 24 hours (high-risk user)
- Content filter blocking more than 50 queries/day (attack or misconfiguration)

**Daily Review Alerts:**

- More than 100 medical disclaimers triggered in 24 hours
- More than 20 crisis responses in 24 hours
- Pattern of similar medical queries (potential misinformation spread)

---

## 8. User Education

### 8.1 First-Time User Guidance

**On First AI Query, Show:**

```
👋 Welcome to Your AI Parenting Assistant!

I'm here to help with:
✓ Childcare tips and organization
✓ Understanding your baby's patterns
✓ Answering general parenting questions
✓ Providing encouragement and support

Important to Know:
⚠️ I'm NOT a medical professional
⚠️ Always consult your pediatrician for medical concerns
⚠️ Call 911 for emergencies

For the best experience:
💡 Ask specific questions about routines, tracking, or parenting tips
💡 Share your child's age for age-appropriate guidance
💡 Use your activity tracking data for personalized insights

Let's get started! How can I help you today?
```

### 8.2 In-App Safety Reminders

**Random Safety Tips (1 in 20 responses):**

- "Remember: I'm an AI assistant, not a doctor. Always consult your pediatrician for medical advice."
- "Having a tough day? It's okay to ask for help. Check out our Resources section."
- "Emergency? Call 911 immediately. I cannot provide emergency medical guidance."
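
One possible way to apply the 1-in-20 reminder rule from section 8.2 is sketched below. The names `SAFETY_TIPS`, `TIP_PROBABILITY`, `pickSafetyTip`, and `withSafetyTip` are hypothetical and do not come from the app's codebase; the random source is injectable only so that the behavior can be tested deterministically.

```typescript
// Hypothetical helper for section 8.2; names are illustrative assumptions.
const SAFETY_TIPS: readonly string[] = [
  "Remember: I'm an AI assistant, not a doctor. Always consult your pediatrician for medical advice.",
  "Having a tough day? It's okay to ask for help. Check out our Resources section.",
  "Emergency? Call 911 immediately. I cannot provide emergency medical guidance.",
];

// "1 in 20 responses", expressed as a per-response probability.
const TIP_PROBABILITY = 1 / 20;

// Returns a tip to show, or null when this response should carry no tip.
// `rng` must return a number in [0, 1); it defaults to Math.random.
function pickSafetyTip(rng: () => number = Math.random): string | null {
  if (rng() >= TIP_PROBABILITY) {
    return null;
  }
  // Clamp the index so an rng that returns exactly 1 cannot run off the array.
  const index = Math.min(
    SAFETY_TIPS.length - 1,
    Math.floor(rng() * SAFETY_TIPS.length),
  );
  return SAFETY_TIPS[index] ?? null;
}

// Appends a tip to a response when one is selected; otherwise returns it unchanged.
function withSafetyTip(response: string, rng: () => number = Math.random): string {
  const tip = pickSafetyTip(rng);
  return tip === null ? response : `${response}\n\n${tip}`;
}
```

Sampling per response gives the 1-in-20 rate only on average. If reminders must appear at least once every 20 responses for each user, a per-user counter would be a better fit than random sampling.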
---

## 9. Implementation Checklist

### 9.1 Backend Implementation

- [ ] Create `ai-safety.service.ts` with keyword detection
- [ ] Implement medical disclaimer injection
- [ ] Add crisis hotline response generator
- [ ] Enhance content moderation filters
- [ ] Update system prompts with safety guardrails
- [ ] Add safety metrics logging
- [ ] Create safety monitoring dashboard

### 9.2 Frontend Implementation

- [ ] Add safety disclaimer modal on first AI use
- [ ] Display crisis hotlines prominently when triggered
- [ ] Show medical disclaimer badges
- [ ] Add "Report Unsafe Response" button
- [ ] Create AI Safety FAQ page

### 9.3 Testing

- [ ] Unit tests for keyword detection (100+ test cases)
- [ ] Integration tests for crisis response flow
- [ ] Manual testing of all disclaimer triggers
- [ ] User acceptance testing with parents
- [ ] Safety audit by medical professional (recommended)

### 9.4 Documentation

- [ ] User-facing AI Safety policy
- [ ] Developer documentation for safety service
- [ ] Incident response playbook
- [ ] Safety metric reporting template

---

## 10. Incident Response

### 10.1 If Harmful Response is Reported

**Immediate Actions (Within 1 hour):**

1. Flag conversation in database
2. Review full conversation history
3. Identify failure point (keyword detection, content filter, prompt)
4. Temporarily disable AI for affected user if serious
5. Notify safety team

**Follow-up Actions (Within 24 hours):**

1. Root cause analysis
2. Update keyword lists or filters
3. Test fix with similar queries
4. Document incident and resolution
5. Update user with resolution

**Communication:**

- Acknowledge report within 1 hour
- Provide resolution timeline
- Follow up when fixed
- Offer support resources if appropriate

### 10.2 Emergency Escalation

**Escalate to Management If:**

- User reports following harmful AI advice and experiencing negative outcome
- Multiple similar harmful responses in short timeframe
- Media or regulatory inquiry about AI safety
- Suicidal ideation not properly handled by crisis detection

---

## 11. Regular Safety Audits

### 11.1 Monthly Audits

- Review 100 random AI conversations
- Test all crisis and medical keyword triggers
- Analyze false positive/negative rates
- Update keyword lists based on new patterns
- Review user reports and feedback

### 11.2 Quarterly Reviews

- External safety audit (if possible)
- Update crisis hotline numbers (verify still active)
- Review and update safety policies
- Train team on new safety features
- Benchmark against industry standards

---

## 12. Compliance & Legal

### 12.1 Disclaimers (Required)

- [ ] Terms of Service mention AI limitations
- [ ] Privacy Policy covers AI data usage
- [ ] Medical disclaimer in AI interface
- [ ] Crisis resources easily accessible

### 12.2 Recommended Legal Review

- AI response liability
- Medical advice disclaimers sufficiency
- Crisis response appropriateness
- Age-appropriate content standards

---

## Status: READY FOR IMPLEMENTATION

**Priority:** HIGH - User Safety Critical

**Estimated Implementation Time:** 3-5 days

**Testing Time:** 2 days

**Total:** 5-7 days with testing

**Next Steps:**

1. Implement `AISafetyService` with keyword detection
2. Update `AIService` to integrate safety checks
3. Add crisis hotline database/configuration
4. Create frontend safety components
5. Comprehensive testing with real scenarios
6. User documentation

This strategy ensures the Maternal App AI is helpful, supportive, and above all, SAFE for parents seeking guidance.
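
As a starting point for step 1 of the next steps, the keyword detection in sections 2 and 3 could be sketched roughly as follows. This is a minimal illustration under several assumptions, not the `AISafetyService` implementation itself. The names `SafetyCategory`, `KEYWORDS`, `PRIORITY`, and `classifyQuery` are hypothetical, and the keyword arrays are small subsets of the lists in this document. The priority order (emergency, then crisis, then medical, developmental, stress) is also an assumption, since the strategy does not say which template wins when a message matches several categories.

```typescript
// Hypothetical sketch of keyword-based triage; all names are illustrative.
type TriggeredCategory = "emergency" | "crisis" | "medical" | "developmental" | "stress";
type SafetyCategory = TriggeredCategory | "none";

// Subsets of the keyword lists in sections 2 and 3 (not the full lists).
const KEYWORDS: Record<TriggeredCategory, string[]> = {
  emergency: ["not breathing", "can't breathe", "choking", "unconscious", "seizure", "anaphylaxis"],
  crisis: ["suicide", "kill myself", "want to die", "hurt myself", "self-harm"],
  medical: ["fever", "vomiting", "rash", "cough", "dosage"],
  developmental: ["not talking", "not walking", "regression", "lost skills"],
  stress: ["overwhelmed", "exhausted", "burned out", "no support"],
};

// Assumed priority: the most urgent matching category wins.
const PRIORITY: TriggeredCategory[] = ["emergency", "crisis", "medical", "developmental", "stress"];

// Lowercase the text and map curly apostrophes to straight ones so that
// "can’t breathe" typed on a phone still matches "can't breathe".
function normalize(message: string): string {
  return message.toLowerCase().replace(/[\u2018\u2019]/g, "'");
}

// Escape regex metacharacters so keywords are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Word boundaries keep "rash" from matching inside "trash".
function containsKeyword(normalizedText: string, keyword: string): boolean {
  return new RegExp(`\\b${escapeRegExp(keyword)}\\b`, "i").test(normalizedText);
}

// Returns the highest-priority category whose keywords appear in the message.
function classifyQuery(message: string): SafetyCategory {
  const text = normalize(message);
  for (const category of PRIORITY) {
    if (KEYWORDS[category].some((keyword) => containsKeyword(text, keyword))) {
      return category;
    }
  }
  return "none";
}
```

With these subsets, `classifyQuery("She has a fever and I feel overwhelmed")` resolves to `"medical"` because medical outranks stress in the assumed ordering. Plain keyword matching will still misfire on benign phrasing and miss paraphrases, which is why section 11.1 calls for a regular review of false positive/negative rates.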