- Create AISafetyService with keyword detection for emergency, medical, crisis, developmental, and stress triggers
- Add emergency response templates (911, poison control, medical disclaimer)
- Add crisis hotline integration (988, Postpartum Support, Crisis Text Line, Childhelp)
- Add medical disclaimer and developmental disclaimer templates
- Add stress support resources for overwhelmed parents
- Implement output safety checking for unsafe patterns (dosages, diagnoses)
- Add safety response injection based on trigger type
- Integrate safety checks into AI chat flow with immediate overrides for emergencies/crises
- Add base safety prompt with critical safety rules and guardrails
- Add medical and crisis safety override prompts
- Enhance system prompt with safety guardrails dynamically based on query triggers
- Export AISafetyService from AIModule for use in other modules
- All safety metrics logged for monitoring dashboard (TODO: database storage)
Safety coverage:
✅ Emergency keyword detection (not breathing, choking, seizure, etc.)
✅ Medical concern keywords (fever, vomiting, rash, medication, etc.)
✅ Crisis keywords (suicide, self-harm, PPD, abuse, etc.)
✅ Parental stress keywords (overwhelmed, burned out, isolated, etc.)
✅ Developmental concern keywords (delay, autism, ADHD, regression, etc.)
✅ Output moderation patterns (dosages, diagnoses, definitive medical statements)
✅ Crisis hotline templates with 4 major US resources
✅ Medical disclaimers with red flags and when to seek care
✅ Stress support with self-care reminders
Tested: Backend compiles and runs successfully with 0 errors
AI Safety Strategy - Maternal App
Purpose: Ensure safe, responsible, and helpful AI interactions for parents seeking childcare guidance.
Last Updated: October 2, 2025
1. Safety Principles
1.1 Core Safety Values
- Medical Disclaimer First - Never provide medical diagnoses or emergency advice
- Crisis Detection - Recognize mental health crises and provide resources
- Age-Appropriate - Responses suitable for child ages 0-6 years
- Evidence-Based - Reference pediatric guidelines when possible
- Non-Judgmental - Support all parenting styles without criticism
- Privacy-Focused - Never request or store unnecessary medical information
1.2 What AI SHOULD Do
- ✅ Provide general parenting information and tips
- ✅ Suggest routines and organizational strategies
- ✅ Explain developmental milestones
- ✅ Offer emotional support and encouragement
- ✅ Direct to professional resources when appropriate
- ✅ Help track and interpret patterns in child's data
1.3 What AI MUST NOT Do
- ❌ Diagnose medical conditions
- ❌ Prescribe medications or dosages
- ❌ Handle medical emergencies
- ❌ Replace professional medical advice
- ❌ Make definitive statements about child development delays
- ❌ Encourage unsafe practices
2. Medical Disclaimer Triggers
2.1 Emergency Keywords (Immediate Disclaimer)
Trigger immediate medical disclaimer and emergency guidance:
Critical Symptoms:
emergency, 911, ambulance
not breathing, can't breathe, choking
unconscious, unresponsive, passed out
seizure, convulsion, shaking uncontrollably
severe bleeding, blood loss, won't stop bleeding
severe burn, burned, scalded
poisoning, swallowed, ingested
head injury, fell, hit head
allergic reaction, anaphylaxis, swelling
Response Template:
⚠️ EMERGENCY DISCLAIMER ⚠️
This appears to be a medical emergency. Please:
1. Call emergency services immediately (911 in US)
2. If child is not breathing, start CPR if trained
3. Stay calm and follow dispatcher instructions
I'm an AI assistant and cannot provide emergency medical guidance.
Please seek immediate professional medical help.
Emergency Resources:
- US: 911
- Poison Control: 1-800-222-1222
- Nurse Hotline: [Local number]
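The immediate override described in this section could be wired in ahead of the model call. The following is a minimal sketch, not the actual service: `getEmergencyOverride`, `answerQuery`, and the synchronous `callModel` callback are hypothetical names, the keyword list is abbreviated, and plain substring matching is an assumption that would need tuning against false positives.

```typescript
// Abbreviated copy of the emergency template above.
const EMERGENCY_RESPONSE = [
  "⚠️ EMERGENCY DISCLAIMER ⚠️",
  "This appears to be a medical emergency. Please:",
  "1. Call emergency services immediately (911 in US)",
].join("\n");

// Abbreviated subset of the critical-symptom keywords.
const EMERGENCY_KEYWORDS: string[] = [
  "not breathing",
  "can't breathe",
  "choking",
  "unconscious",
  "seizure",
];

// Returns the canned emergency response, or null when the query may go to the model.
function getEmergencyOverride(query: string): string | null {
  // Lowercase and normalize curly apostrophes so "can’t breathe" still matches.
  const text = query.toLowerCase().replace(/[\u2018\u2019]/g, "'");
  const matched = EMERGENCY_KEYWORDS.some((keyword) => text.includes(keyword));
  return matched ? EMERGENCY_RESPONSE : null;
}

// The model callback is only invoked when no override applies.
// (Hypothetical synchronous callback; a real model call would likely be async.)
function answerQuery(query: string, callModel: (q: string) => string): string {
  return getEmergencyOverride(query) ?? callModel(query);
}
```

Returning the template before the model is consulted keeps the emergency path independent of model latency or failures.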
2.2 Medical Concern Keywords (Soft Disclaimer)
Trigger medical disclaimer but allow response with disclaimer:
Symptoms:
fever, temperature, hot, feverish
vomiting, throwing up, vomit
diarrhea, loose stools, watery stool
rash, spots, bumps, hives
cough, coughing, wheezing
ear infection, ear pain, ear ache
cold, flu, sick, illness
constipation, not pooping, hard stool
dehydration, not drinking, dry
injury, hurt, pain, ache
medication, medicine, dosage, dose
Response Pattern:
⚠️ Medical Disclaimer: I'm an AI assistant, not a medical professional.
For medical concerns, always consult your pediatrician or healthcare provider.
[General information response, if appropriate]
When to seek immediate care:
- [Specific red flags for the symptom]
Consider calling your pediatrician if:
- [Warning signs]
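The response pattern above could be assembled by a small helper. This is a sketch under assumptions: `MedicalGuidance` and `buildMedicalResponse` are hypothetical names, and the red-flag and warning-sign lists are assumed to come from reviewed content supplied by the caller rather than from the model.

```typescript
// Hypothetical shape for per-symptom guidance; not part of the original design.
interface MedicalGuidance {
  generalInfo?: string; // general information response, if appropriate
  redFlags: string[]; // "when to seek immediate care" items
  warningSigns: string[]; // "consider calling your pediatrician" items
}

const MEDICAL_DISCLAIMER =
  "⚠️ Medical Disclaimer: I'm an AI assistant, not a medical professional.\n" +
  "For medical concerns, always consult your pediatrician or healthcare provider.";

// Render a titled bullet list in the same "- item" style as the template.
function renderList(title: string, items: string[]): string {
  return [title, ...items.map((item) => `- ${item}`)].join("\n");
}

// Assemble the soft-disclaimer response in the order given by the template:
// disclaimer first, then optional information, then the two guidance lists.
function buildMedicalResponse(guidance: MedicalGuidance): string {
  const sections: string[] = [MEDICAL_DISCLAIMER];
  if (guidance.generalInfo) {
    sections.push(guidance.generalInfo);
  }
  if (guidance.redFlags.length > 0) {
    sections.push(renderList("When to seek immediate care:", guidance.redFlags));
  }
  if (guidance.warningSigns.length > 0) {
    sections.push(renderList("Consider calling your pediatrician if:", guidance.warningSigns));
  }
  return sections.join("\n\n");
}
```

Keeping the disclaimer as the first section means it is always present, even when no general information is returned.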
2.3 Developmental Concern Keywords
delay, behind, not talking, not walking
autism, ADHD, disability, special needs
regression, lost skills, stopped doing
Response Pattern:
⚠️ Developmental Disclaimer: Every child develops at their own pace.
If you have concerns, consult your pediatrician or early intervention specialist.
[General milestone information]
Resources:
- CDC Milestone Tracker: [link]
- Early Intervention Services: [link]
3. Crisis Hotline Integration
3.1 Mental Health Crisis Detection
Crisis Keywords:
suicide, kill myself, want to die, end it all
hurt myself, self-harm, cutting
can't cope, can't do this, giving up
hopeless, worthless, burden
postpartum depression, PPD, severe depression
psychosis, hallucinations, voices
abuse, hitting, hurting, violent
Crisis Response Template:
🆘 CRISIS SUPPORT 🆘
It sounds like you're going through an incredibly difficult time.
Your feelings are valid, and help is available.
IMMEDIATE CRISIS RESOURCES:
━━━━━━━━━━━━━━━━━━━━━━━━━
🇺🇸 988 Suicide & Crisis Lifeline
📞 988 (call or text)
💬 Chat: 988lifeline.org
🤱 Postpartum Support International
📞 1-800-944-4773
💬 Text "HELP" to 800-944-4773
🆘 Crisis Text Line
💬 Text "HOME" to 741741
👶 Childhelp National Child Abuse Hotline
📞 1-800-422-4453
These services are:
✓ Free and confidential
✓ Available 24/7
✓ Staffed by trained counselors
✓ No judgment, only support
You don't have to go through this alone. Please reach out.
3.2 Parental Stress Detection
Stress Keywords:
overwhelmed, stressed, exhausted, burned out
crying, breaking down, can't handle
no support, alone, isolated
angry at baby, resentful, regret
Support Response:
💙 Parenting is Hard 💙
You're not alone in feeling this way. Many parents experience:
- Exhaustion and burnout
- Overwhelming emotions
- Moments of frustration
This doesn't make you a bad parent. It makes you human.
Support Resources:
🤱 Postpartum Support International: 1-800-944-4773
📞 Parents Anonymous: 1-855-427-2736
💬 Crisis Text Line: Text "PARENT" to 741741
Self-Care Reminders:
✓ Taking breaks is necessary, not selfish
✓ Asking for help is a sign of strength
✓ Your mental health matters too
[Followed by relevant coping strategies if appropriate]
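With five keyword groups defined across Sections 2 and 3, a query can match more than one. A possible way to resolve this is to classify by priority. The sketch below is illustrative only: `detectTrigger`, the abbreviated keyword lists, and the ordering after emergency and crisis are assumptions, and whole-word matching is one option among several.

```typescript
type TriggerType = "emergency" | "crisis" | "medical" | "developmental" | "stress";

// Abbreviated keyword lists taken from the sections above.
const KEYWORDS: Record<TriggerType, string[]> = {
  emergency: ["not breathing", "can't breathe", "choking", "seizure", "unconscious"],
  crisis: ["suicide", "kill myself", "want to die", "self-harm", "hopeless"],
  medical: ["fever", "vomiting", "rash", "medication", "dosage"],
  developmental: ["delay", "not talking", "autism", "regression"],
  stress: ["overwhelmed", "exhausted", "burned out", "isolated"],
};

// Assumed priority: most urgent first.
const PRIORITY: TriggerType[] = ["emergency", "crisis", "medical", "developmental", "stress"];

// Escape regex metacharacters so keywords are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// One case-insensitive, whole-word pattern per trigger type.
function buildMatcher(keywords: string[]): RegExp {
  return new RegExp(`\\b(?:${keywords.map(escapeRegExp).join("|")})\\b`, "i");
}

const MATCHERS = PRIORITY.map((type) => ({ type, pattern: buildMatcher(KEYWORDS[type]) }));

// Returns the highest-priority trigger found in the query, or null if none match.
function detectTrigger(query: string): TriggerType | null {
  const normalized = query.replace(/[\u2018\u2019]/g, "'"); // curly to straight apostrophes
  for (const { type, pattern } of MATCHERS) {
    if (pattern.test(normalized)) {
      return type;
    }
  }
  return null;
}
```

Whole-word matching avoids hits such as "rash" inside "trash", but it also means "delayed" does not match "delay", so inflected forms would need to be listed explicitly.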
4. Content Moderation & Filtering
4.1 Input Moderation
Filter Out:
- Inappropriate sexual content
- Violent or abusive language
- Requests for illegal activities
- Spam or commercial solicitation
- Personal information requests (addresses, SSN, etc.)
Moderation Response:
I'm here to provide parenting support and childcare guidance.
This type of content falls outside my intended use.
If you have questions about childcare for children ages 0-6,
I'm happy to help!
4.2 Output Moderation
Before Sending AI Response, Check For:
- Medical advice (trigger disclaimer if present)
- Potentially harmful suggestions
- Contradictory information
- Inappropriate content
- Dosage recommendations (remove/disclaim)
Content Safety Checks:
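// Patterns that flag potentially unsafe AI output before it is sent to the user
const unsafePatterns: RegExp[] = [
  /\d+\s*(mg|ml|oz|tbsp|tsp)\s*(of|every|per)/i, // Dosage patterns
  /give\s+(him|her|them|baby|child)\s+\d+/i, // Specific instructions with amounts
  /diagnose|diagnosis|you have|they have/i, // Diagnostic language
  // Grouped so both adverbs require a following "is"/"has"; ungrouped, "definitely" alone matched
  /(definitely|certainly)\s+(is|has)/i, // Definitive medical statements
];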
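One way these patterns might be applied is to label each one, so the moderation step can report which rule fired. This is a hedged sketch: `checkOutputSafety`, `LABELED_PATTERNS`, and the category names are hypothetical. The patterns are the ones listed above, with the last one grouped so that both adverbs require a following `is` or `has`.

```typescript
interface UnsafePattern {
  category: "dosage" | "instruction" | "diagnosis" | "definitive";
  pattern: RegExp;
}

const LABELED_PATTERNS: UnsafePattern[] = [
  { category: "dosage", pattern: /\d+\s*(mg|ml|oz|tbsp|tsp)\s*(of|every|per)/i },
  { category: "instruction", pattern: /give\s+(him|her|them|baby|child)\s+\d+/i },
  { category: "diagnosis", pattern: /diagnose|diagnosis|you have|they have/i },
  { category: "definitive", pattern: /(definitely|certainly)\s+(is|has)/i },
];

interface OutputSafetyResult {
  safe: boolean;
  violations: UnsafePattern["category"][];
}

// Test the AI response against every pattern and collect the categories that matched.
function checkOutputSafety(response: string): OutputSafetyResult {
  const violations = LABELED_PATTERNS
    .filter(({ pattern }) => pattern.test(response))
    .map(({ category }) => category);
  return { safe: violations.length === 0, violations };
}
```

A caller could then remove or disclaim the response when `safe` is false, and log `violations` for the safety metrics in Section 7.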
5. Rate Limiting & Abuse Prevention
5.1 Current Rate Limits
- Free Tier: 10 AI queries per day
- Premium Tier: Unlimited queries (with fair use policy)
5.2 Enhanced Rate Limiting
Suspicious Pattern Detection:
- Same question repeated >3 times in 1 hour
- Medical emergency keywords repeated >5 times in 1 day
- Queries from multiple IPs for same user (account sharing)
- Unusual query volume (>100/day even for premium)
Action on Suspicious Activity:
- Temporary rate limit (1 query per hour for 24 hours)
- Flag account for manual review
- Send email notification to user
- Log to security monitoring
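The first suspicious pattern (same question repeated more than 3 times in 1 hour) could be detected with a per-user sliding window. The sketch below is an assumption-laden illustration: `RepeatedQuestionDetector` is a hypothetical class, storage is in-memory only, and "same question" is approximated by case and whitespace normalization.

```typescript
const ONE_HOUR_MS = 60 * 60 * 1000;
const MAX_REPEATS_PER_HOUR = 3; // "repeated >3 times in 1 hour"

// In-memory store for the sketch; a shared store would be needed across server instances.
class RepeatedQuestionDetector {
  private readonly history = new Map<string, number[]>();

  // Records the query and returns true once the same question exceeds the hourly limit.
  record(userId: string, question: string, nowMs: number): boolean {
    // Treat questions as identical if they differ only in case or spacing.
    const normalized = question.trim().toLowerCase().replace(/\s+/g, " ");
    const key = `${userId}:${normalized}`;

    // Keep only timestamps that fall inside the one-hour window, then add this one.
    const recent = (this.history.get(key) ?? []).filter((t) => nowMs - t < ONE_HOUR_MS);
    recent.push(nowMs);
    this.history.set(key, recent);

    return recent.length > MAX_REPEATS_PER_HOUR;
  }
}
```

When `record` returns true, the caller would apply the actions listed above, such as the temporary rate limit and the manual-review flag.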
5.3 Fair Use Policy (Premium)
- Maximum 200 queries per day (reasonable for active use)
- Queries should be related to childcare (0-6 years)
- No commercial use or data scraping
- No sharing of account credentials
6. System Prompt Guardrails
6.1 Base System Prompt (Always Included)
You are a helpful AI assistant for the Maternal App, designed to support
parents of children aged 0-6 years with childcare organization and guidance.
CRITICAL SAFETY RULES:
1. You are NOT a medical professional. Never diagnose or prescribe.
2. For medical emergencies, ALWAYS direct to call 911 immediately.
3. For medical concerns, ALWAYS recommend consulting a pediatrician.
4. Recognize mental health crises and provide crisis hotline resources.
5. Be supportive and non-judgmental of all parenting approaches.
6. Focus on evidence-based information from reputable sources (AAP, CDC, WHO).
7. Never provide specific medication dosages.
8. If asked about serious developmental delays, refer to professionals.
TONE:
- Warm, empathetic, and encouraging
- Clear and concise
- Non-judgmental and inclusive
- Supportive but honest
SCOPE:
- Childcare for ages 0-6 years
- Routines, tracking, organization
- General parenting tips and strategies
- Emotional support for parents
- Interpretation of tracked activity patterns
OUT OF SCOPE:
- Medical diagnosis or treatment
- Legal advice
- Financial planning
- Relationship counseling (beyond parenting partnership)
- Children outside 0-6 age range
6.2 Context-Specific Safety Additions
When Medical Keywords Detected:
MEDICAL SAFETY OVERRIDE:
This query contains medical concerns. Your response MUST:
1. Start with a medical disclaimer
2. Avoid definitive statements
3. Recommend professional consultation
4. Provide general information only
5. Include "when to seek immediate care" guidance
When Crisis Keywords Detected:
CRISIS RESPONSE OVERRIDE:
This query indicates a potential crisis. Your response MUST:
1. Acknowledge their feelings without judgment
2. Provide immediate crisis hotline numbers
3. Encourage reaching out for professional help
4. Be brief and focus on resources
5. Do NOT provide coping strategies that could be misinterpreted
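The base prompt and the context-specific additions could be combined per query. The following is a minimal sketch: `buildSystemPrompt` is a hypothetical helper, the prompt texts are abbreviated copies of the ones above, and placing the crisis override before the medical one is an assumption.

```typescript
type PromptTrigger = "medical" | "crisis";

// Abbreviated; the full text is the base system prompt in 6.1.
const BASE_SAFETY_PROMPT =
  "You are a helpful AI assistant for the Maternal App, designed to support " +
  "parents of children aged 0-6 years with childcare organization and guidance.";

// Abbreviated copies of the overrides in 6.2.
const SAFETY_OVERRIDES: Record<PromptTrigger, string> = {
  medical:
    "MEDICAL SAFETY OVERRIDE:\n" +
    "This query contains medical concerns. Your response MUST:\n" +
    "1. Start with a medical disclaimer",
  crisis:
    "CRISIS RESPONSE OVERRIDE:\n" +
    "This query indicates a potential crisis. Your response MUST:\n" +
    "1. Acknowledge their feelings without judgment",
};

// Assumed ordering when both triggers are present: crisis guidance first.
const OVERRIDE_ORDER: PromptTrigger[] = ["crisis", "medical"];

// The base prompt is always included; each detected trigger adds its override once.
function buildSystemPrompt(triggers: PromptTrigger[]): string {
  const overrides = OVERRIDE_ORDER
    .filter((type) => triggers.includes(type))
    .map((type) => SAFETY_OVERRIDES[type]);
  return [BASE_SAFETY_PROMPT, ...overrides].join("\n\n");
}
```

Filtering a fixed order, rather than mapping over the input, removes duplicate triggers and keeps the output stable regardless of detection order.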
7. Monitoring & Analytics
7.1 Safety Metrics to Track
Daily Monitoring:
- Number of medical disclaimer triggers
- Number of crisis hotline responses sent
- Number of emergency keyword detections
- Number of queries blocked by content filter
- Average response moderation score
Weekly Review:
- Trending medical concerns
- Common crisis keywords
- False positive rate for disclaimers
- User feedback on safety responses
7.2 Alert Thresholds
Immediate Alerts:
- >10 emergency keywords in 1 hour (potential abuse or real emergency pattern)
- >5 crisis keywords from single user in 24 hours (high-risk user)
- Content filter blocking >50 queries/day (attack or misconfiguration)
Daily Review Alerts:
- >100 medical disclaimers triggered in 24 hours
- >20 crisis responses in 24 hours
- Pattern of similar medical queries (potential misinformation spread)
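The numeric thresholds in this section could be kept in a single table and evaluated against aggregated counters. This sketch is illustrative: `SafetyCounts`, `evaluateAlerts`, and the metric names are hypothetical, and each threshold is treated as strictly greater-than, an assumption consistent with the `>50` figure for the content filter. The non-numeric "pattern of similar medical queries" alert is not covered.

```typescript
// Hypothetical aggregated counters; how they are collected is out of scope here.
interface SafetyCounts {
  emergencyKeywordsLastHour: number;
  maxCrisisKeywordsPerUserLast24h: number;
  contentFilterBlocksToday: number;
  medicalDisclaimersLast24h: number;
  crisisResponsesLast24h: number;
}

type AlertLevel = "immediate" | "daily-review";

interface ThresholdRule {
  metric: keyof SafetyCounts;
  threshold: number;
  level: AlertLevel;
}

interface Alert extends ThresholdRule {
  value: number;
}

const THRESHOLDS: ThresholdRule[] = [
  { metric: "emergencyKeywordsLastHour", threshold: 10, level: "immediate" },
  { metric: "maxCrisisKeywordsPerUserLast24h", threshold: 5, level: "immediate" },
  { metric: "contentFilterBlocksToday", threshold: 50, level: "immediate" },
  { metric: "medicalDisclaimersLast24h", threshold: 100, level: "daily-review" },
  { metric: "crisisResponsesLast24h", threshold: 20, level: "daily-review" },
];

// Returns one alert per rule whose counter strictly exceeds its threshold.
function evaluateAlerts(counts: SafetyCounts): Alert[] {
  return THRESHOLDS
    .filter((rule) => counts[rule.metric] > rule.threshold)
    .map((rule) => ({ ...rule, value: counts[rule.metric] }));
}
```

Keeping thresholds as data makes them easy to adjust during the weekly false-positive review without touching the evaluation logic.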
8. User Education
8.1 First-Time User Guidance
On First AI Query, Show:
👋 Welcome to Your AI Parenting Assistant!
I'm here to help with:
✓ Childcare tips and organization
✓ Understanding your baby's patterns
✓ Answering general parenting questions
✓ Providing encouragement and support
Important to Know:
⚠️ I'm NOT a medical professional
⚠️ Always consult your pediatrician for medical concerns
⚠️ Call 911 for emergencies
For the best experience:
💡 Ask specific questions about routines, tracking, or parenting tips
💡 Share your child's age for age-appropriate guidance
💡 Use your activity tracking data for personalized insights
Let's get started! How can I help you today?
8.2 In-App Safety Reminders
Random Safety Tips (1 in 20 responses):
- "Remember: I'm an AI assistant, not a doctor. Always consult your pediatrician for medical advice."
- "Having a tough day? It's okay to ask for help. Check out our Resources section."
- "Emergency? Call 911 immediately. I cannot provide emergency medical guidance."
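The 1-in-20 reminder could be implemented as a probabilistic append. This is a sketch under assumptions: `maybeAppendSafetyTip` is a hypothetical helper, and the injectable `random` parameter exists only so the behavior can be tested deterministically.

```typescript
const SAFETY_TIPS: string[] = [
  "Remember: I'm an AI assistant, not a doctor. Always consult your pediatrician for medical advice.",
  "Having a tough day? It's okay to ask for help. Check out our Resources section.",
  "Emergency? Call 911 immediately. I cannot provide emergency medical guidance.",
];

const TIP_PROBABILITY = 1 / 20; // "1 in 20 responses"

// Appends a randomly chosen safety tip to roughly one response in twenty.
// `random` must return a number in [0, 1), like Math.random.
function maybeAppendSafetyTip(
  response: string,
  random: () => number = Math.random,
): string {
  if (random() >= TIP_PROBABILITY) {
    return response; // most responses are returned unchanged
  }
  const tip = SAFETY_TIPS[Math.floor(random() * SAFETY_TIPS.length)];
  return `${response}\n\n${tip}`;
}
```

A per-user counter (every 20th response) would give a more even cadence than random sampling; the random approach is shown because it matches the wording above.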
9. Implementation Checklist
9.1 Backend Implementation
- Create ai-safety.service.ts with keyword detection
- Implement medical disclaimer injection
- Add crisis hotline response generator
- Enhance content moderation filters
- Update system prompts with safety guardrails
- Add safety metrics logging
- Create safety monitoring dashboard
9.2 Frontend Implementation
- Add safety disclaimer modal on first AI use
- Display crisis hotlines prominently when triggered
- Show medical disclaimer badges
- Add "Report Unsafe Response" button
- Create AI Safety FAQ page
9.3 Testing
- Unit tests for keyword detection (100+ test cases)
- Integration tests for crisis response flow
- Manual testing of all disclaimer triggers
- User acceptance testing with parents
- Safety audit by medical professional (recommended)
9.4 Documentation
- User-facing AI Safety policy
- Developer documentation for safety service
- Incident response playbook
- Safety metric reporting template
10. Incident Response
10.1 If Harmful Response is Reported
Immediate Actions (Within 1 hour):
- Flag conversation in database
- Review full conversation history
- Identify failure point (keyword detection, content filter, prompt)
- Temporarily disable AI for affected user if serious
- Notify safety team
Follow-up Actions (Within 24 hours):
- Root cause analysis
- Update keyword lists or filters
- Test fix with similar queries
- Document incident and resolution
- Update user with resolution
Communication:
- Acknowledge report within 1 hour
- Provide resolution timeline
- Follow up when fixed
- Offer support resources if appropriate
10.2 Emergency Escalation
Escalate to Management If:
- A user reports following harmful AI advice and experiencing a negative outcome
- Multiple similar harmful responses in short timeframe
- Media or regulatory inquiry about AI safety
- Suicidal ideation not properly handled by crisis detection
11. Regular Safety Audits
11.1 Monthly Audits
- Review 100 random AI conversations
- Test all crisis and medical keyword triggers
- Analyze false positive/negative rates
- Update keyword lists based on new patterns
- Review user reports and feedback
11.2 Quarterly Reviews
- External safety audit (if possible)
- Update crisis hotline numbers (verify still active)
- Review and update safety policies
- Train team on new safety features
- Benchmark against industry standards
12. Compliance & Legal
12.1 Disclaimers (Required)
- Terms of Service mention AI limitations
- Privacy Policy covers AI data usage
- Medical disclaimer in AI interface
- Crisis resources easily accessible
12.2 Recommended Legal Review
- AI response liability
- Sufficiency of medical advice disclaimers
- Crisis response appropriateness
- Age-appropriate content standards
Status: READY FOR IMPLEMENTATION
Priority: HIGH - User Safety Critical
Estimated Implementation Time: 3-5 days
Testing Time: 2 days
Total: 5-7 days with testing
Next Steps:
- Implement AISafetyService with keyword detection
- Update AIService to integrate safety checks
- Add crisis hotline database/configuration
- Create frontend safety components
- Comprehensive testing with real scenarios
- User documentation
This strategy ensures the Maternal App AI is helpful, supportive, and above all, SAFE for parents seeking guidance.