andrei/maternal-app

Andrei 9246d4b00d

feat(ai-safety): Implement comprehensive AI Safety features

- Create AISafetyService with keyword detection for emergency, medical, crisis, developmental, and stress triggers
- Add emergency response templates (911, poison control, medical disclaimer)
- Add crisis hotline integration (988, Postpartum Support, Crisis Text Line, Childhelp)
- Add medical disclaimer and developmental disclaimer templates
- Add stress support resources for overwhelmed parents
- Implement output safety checking for unsafe patterns (dosages, diagnoses)
- Add safety response injection based on trigger type
- Integrate safety checks into AI chat flow with immediate overrides for emergencies/crises
- Add base safety prompt with critical safety rules and guardrails
- Add medical and crisis safety override prompts
- Enhance system prompt with safety guardrails dynamically based on query triggers
- Export AISafetyService from AIModule for use in other modules
- All safety metrics logged for monitoring dashboard (TODO: database storage)

Safety coverage:
✅ Emergency keyword detection (not breathing, choking, seizure, etc.)
✅ Medical concern keywords (fever, vomiting, rash, medication, etc.)
✅ Crisis keywords (suicide, self-harm, PPD, abuse, etc.)
✅ Parental stress keywords (overwhelmed, burned out, isolated, etc.)
✅ Developmental concern keywords (delay, autism, ADHD, regression, etc.)
✅ Output moderation patterns (dosages, diagnoses, definitive medical statements)
✅ Crisis hotline templates with 4 major US resources
✅ Medical disclaimers with red flags and when to seek care
✅ Stress support with self-care reminders

Tested: Backend compiles and runs successfully with 0 errors

2025-10-02 19:05:45 +00:00


AI Safety Strategy - Maternal App

Purpose: Ensure safe, responsible, and helpful AI interactions for parents seeking childcare guidance.

Last Updated: October 2, 2025

1. Safety Principles

1.1 Core Safety Values

Medical Disclaimer First - Never provide medical diagnoses or emergency advice
Crisis Detection - Recognize mental health crises and provide resources
Age-Appropriate - Responses suited to children aged 0-6 years
Evidence-Based - Reference pediatric guidelines when possible
Non-Judgmental - Support all parenting styles without criticism
Privacy-Focused - Never request or store unnecessary medical information

1.2 What AI SHOULD Do

✅ Provide general parenting information and tips
✅ Suggest routines and organizational strategies
✅ Explain developmental milestones
✅ Offer emotional support and encouragement
✅ Direct to professional resources when appropriate
✅ Help track and interpret patterns in child's data

1.3 What AI MUST NOT Do

❌ Diagnose medical conditions
❌ Prescribe medications or dosages
❌ Handle medical emergencies
❌ Replace professional medical advice
❌ Make definitive statements about child development delays
❌ Encourage unsafe practices

2. Medical Disclaimer Triggers

2.1 Emergency Keywords (Immediate Disclaimer)

Any of these keywords triggers an immediate medical disclaimer and emergency guidance:

Critical Symptoms:

emergency, 911, ambulance
not breathing, can't breathe, choking
unconscious, unresponsive, passed out
seizure, convulsion, shaking uncontrollably
severe bleeding, blood loss, won't stop bleeding
severe burn, burned, scalded
poisoning, swallowed, ingested
head injury, fell, hit head
allergic reaction, anaphylaxis, swelling

Response Template:

⚠️ EMERGENCY DISCLAIMER ⚠️

This appears to be a medical emergency. Please:

1. Call emergency services immediately (911 in US)
2. If child is not breathing, start CPR if trained
3. Stay calm and follow dispatcher instructions

I'm an AI assistant and cannot provide emergency medical guidance.
Please seek immediate professional medical help.

Emergency Resources:
- US: 911
- Poison Control: 1-800-222-1222
- Nurse Hotline: [Local number]
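
The emergency trigger above can be sketched as a word-boundary keyword match. This is a minimal illustration, not the project's actual `AISafetyService`: the names `EMERGENCY_KEYWORDS`, `normalize`, `escapeRegExp`, and `findEmergencyKeywords` are hypothetical, and the list is a subset of the keywords in this section.

```typescript
// Hypothetical names throughout; keyword list is a subset of section 2.1.
const EMERGENCY_KEYWORDS: string[] = [
  "emergency", "911", "ambulance",
  "not breathing", "can't breathe", "choking",
  "unconscious", "unresponsive", "passed out",
  "seizure", "convulsion",
  "severe bleeding", "won't stop bleeding",
  "poisoning", "swallowed", "anaphylaxis",
];

// Lowercase and map curly apostrophes to straight ones so that
// "can’t breathe" typed on a phone still matches "can't breathe".
function normalize(text: string): string {
  return text.toLowerCase().replace(/[\u2018\u2019]/g, "'");
}

// Escape regex metacharacters so a keyword is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns every emergency keyword found in the message (whole words/phrases only).
function findEmergencyKeywords(message: string): string[] {
  const text = normalize(message);
  return EMERGENCY_KEYWORDS.filter((kw) =>
    new RegExp(`\\b${escapeRegExp(kw)}\\b`).test(text)
  );
}

// Usage:
// findEmergencyKeywords("My baby is choking!")           // ["choking"]
// findEmergencyKeywords("What time should bedtime be?")  // []
```

Matching on word boundaries rather than raw substrings is a design choice made here to reduce accidental hits; a non-empty result would route the request to the emergency template instead of the model.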

2.2 Medical Concern Keywords (Soft Disclaimer)

These keywords trigger a medical disclaimer, but the AI may still give a general response after it:

Symptoms:

fever, temperature, hot, feverish
vomiting, throwing up, vomit
diarrhea, loose stools, watery stool
rash, spots, bumps, hives
cough, coughing, wheezing
ear infection, ear pain, ear ache
cold, flu, sick, illness
constipation, not pooping, hard stool
dehydration, not drinking, dry
injury, hurt, pain, ache
medication, medicine, dosage, dose

Response Pattern:

⚠️ Medical Disclaimer: I'm an AI assistant, not a medical professional.
For medical concerns, always consult your pediatrician or healthcare provider.

[General information response, if appropriate]

When to seek immediate care:
- [Specific red flags for the symptom]

Consider calling your pediatrician if:
- [Warning signs]
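
One way to assemble the soft-disclaimer pattern above is a small formatter that always puts the disclaimer first and only emits the optional sections when they have content. The interface and function names below are hypothetical, and the red-flag text is left as placeholders because symptom-specific guidance is not defined in this document.

```typescript
// Disclaimer text is taken from the response pattern in section 2.2.
const MEDICAL_DISCLAIMER =
  "⚠️ Medical Disclaimer: I'm an AI assistant, not a medical professional.\n" +
  "For medical concerns, always consult your pediatrician or healthcare provider.";

// Hypothetical input shape for one soft-disclaimer response.
interface SoftDisclaimerInput {
  generalInfo: string;    // general information produced by the model
  redFlags: string[];     // "seek immediate care" items for the symptom
  warningSigns: string[]; // "consider calling your pediatrician" items
}

// Builds the final message: disclaimer, general info, then optional lists.
function formatMedicalResponse(input: SoftDisclaimerInput): string {
  const sections: string[] = [MEDICAL_DISCLAIMER, input.generalInfo];
  if (input.redFlags.length > 0) {
    sections.push(
      "When to seek immediate care:\n" +
        input.redFlags.map((f) => `- ${f}`).join("\n")
    );
  }
  if (input.warningSigns.length > 0) {
    sections.push(
      "Consider calling your pediatrician if:\n" +
        input.warningSigns.map((w) => `- ${w}`).join("\n")
    );
  }
  return sections.join("\n\n");
}

// Usage (placeholder content only):
// formatMedicalResponse({
//   generalInfo: "[General information response]",
//   redFlags: ["[Specific red flag]"],
//   warningSigns: [],
// });
```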

2.3 Developmental Concern Keywords

delay, behind, not talking, not walking
autism, ADHD, disability, special needs
regression, lost skills, stopped doing

Response Pattern:

⚠️ Developmental Disclaimer: Every child develops at their own pace.
If you have concerns, consult your pediatrician or early intervention specialist.

[General milestone information]

Resources:
- CDC Milestone Tracker: [link]
- Early Intervention Services: [link]

3. Crisis Hotline Integration

3.1 Mental Health Crisis Detection

Crisis Keywords:

suicide, kill myself, want to die, end it all
hurt myself, self-harm, cutting
can't cope, can't do this, giving up
hopeless, worthless, burden
postpartum depression, PPD, severe depression
psychosis, hallucinations, voices
abuse, hitting, hurting, violent

Crisis Response Template:

🆘 CRISIS SUPPORT 🆘

It sounds like you're going through an incredibly difficult time.
Your feelings are valid, and help is available.

IMMEDIATE CRISIS RESOURCES:
━━━━━━━━━━━━━━━━━━━━━━━━━
🇺🇸 988 Suicide & Crisis Lifeline
   📞 988 (call or text)
   💬 Chat: 988lifeline.org

🤱 Postpartum Support International
   📞 1-800-944-4773
   💬 Text "HELP" to 800-944-4773

🆘 Crisis Text Line
   💬 Text "HOME" to 741741

👶 Childhelp National Child Abuse Hotline
   📞 1-800-422-4453

These services are:
✓ Free and confidential
✓ Available 24/7
✓ Staffed by trained counselors
✓ No judgment, only support

You don't have to go through this alone. Please reach out.

3.2 Parental Stress Detection

Stress Keywords:

overwhelmed, stressed, exhausted, burned out
crying, breaking down, can't handle
no support, alone, isolated
angry at baby, resentful, regret

Support Response:

💙 Parenting is Hard 💙

You're not alone in feeling this way. Many parents experience:
- Exhaustion and burnout
- Overwhelming emotions
- Moments of frustration

This doesn't make you a bad parent. It makes you human.

Support Resources:
🤱 Postpartum Support International: 1-800-944-4773
📞 Parents Anonymous: 1-855-427-2736
💬 Crisis Text Line: Text "PARENT" to 741741

Self-Care Reminders:
✓ Taking breaks is necessary, not selfish
✓ Asking for help is a sign of strength
✓ Your mental health matters too

[Followed by relevant coping strategies if appropriate]
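
With five trigger categories defined across sections 2 and 3, a message can match more than one of them. The sketch below picks a single category by checking them in a fixed order. The order (emergency, crisis, medical, developmental, stress) and all identifiers are assumptions made for illustration; the keyword lists are abbreviated.

```typescript
type TriggerType = "emergency" | "crisis" | "medical" | "developmental" | "stress";

// Checked top to bottom; the ordering is an assumption, not specified by the strategy.
// Keywords are assumed to be plain text without regex metacharacters.
const TRIGGER_KEYWORDS: Array<[TriggerType, string[]]> = [
  ["emergency", ["not breathing", "choking", "seizure", "unconscious"]],
  ["crisis", ["suicide", "kill myself", "want to die", "self-harm", "hopeless"]],
  ["medical", ["fever", "vomiting", "rash", "medication", "dosage"]],
  ["developmental", ["not talking", "not walking", "autism", "regression"]],
  ["stress", ["overwhelmed", "exhausted", "burned out", "isolated"]],
];

// Returns the highest-priority trigger whose keywords appear in the message, or null.
function classifyTrigger(message: string): TriggerType | null {
  const text = message.toLowerCase();
  for (const [type, keywords] of TRIGGER_KEYWORDS) {
    if (keywords.some((kw) => new RegExp("\\b" + kw + "\\b").test(text))) {
      return type;
    }
  }
  return null;
}

// Emergency and crisis triggers skip the model and return a fixed template.
function requiresImmediateOverride(type: TriggerType | null): boolean {
  return type === "emergency" || type === "crisis";
}

// Usage:
// classifyTrigger("I feel hopeless and exhausted")  // "crisis"
// classifyTrigger("Best nap schedule?")             // null
```

Returning only the most severe category keeps the response path simple; a variant could return all matches if the stress resources should accompany a medical disclaimer.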

4. Content Moderation & Filtering

4.1 Input Moderation

Filter Out:

Inappropriate sexual content
Violent or abusive language
Requests for illegal activities
Spam or commercial solicitation
Personal information requests (addresses, SSN, etc.)

Moderation Response:

I'm here to provide parenting support and childcare guidance.
This type of content falls outside my intended use.

If you have questions about childcare for children ages 0-6,
I'm happy to help!

4.2 Output Moderation

Before Sending AI Response, Check For:

Medical advice (trigger disclaimer if present)
Potentially harmful suggestions
Contradictory information
Inappropriate content
Dosage recommendations (remove/disclaim)

Content Safety Checks:

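// Patterns that flag an AI response for review before it is sent.
const unsafePatterns: RegExp[] = [
  /\d+\s*(mg|ml|oz|tbsp|tsp)\s*(of|every|per)/i, // Dosage patterns
  /give\s+(him|her|them|baby|child)\s+\d+/i,     // Specific instructions with amounts
  /diagnose|diagnosis|you have|they have/i,      // Diagnostic language (broad; expect false positives)
  /(definitely|certainly)\s+(is|has)/i,          // Definitive medical statements; grouped so both adverbs require "is"/"has"
];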
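
A minimal sketch of how such patterns might be applied to a model response before sending. The labels, the `OutputSafetyResult` shape, and `checkOutputSafety` are hypothetical; the last pattern is written with explicit grouping, which is an assumption about the intended behavior.

```typescript
interface OutputSafetyResult {
  safe: boolean;
  issues: string[]; // labels of every pattern that matched
}

// Hypothetical labels paired with the output moderation patterns.
const LABELED_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "dosage", pattern: /\d+\s*(mg|ml|oz|tbsp|tsp)\s*(of|every|per)/i },
  { label: "amount_instruction", pattern: /give\s+(him|her|them|baby|child)\s+\d+/i },
  { label: "diagnostic_language", pattern: /diagnose|diagnosis|you have|they have/i },
  { label: "definitive_statement", pattern: /(definitely|certainly)\s+(is|has)/i },
];

// Tests the response against every pattern and reports which ones matched.
// The patterns carry no "g" flag, so RegExp.test() keeps no state between calls.
function checkOutputSafety(response: string): OutputSafetyResult {
  const issues = LABELED_PATTERNS
    .filter(({ pattern }) => pattern.test(response))
    .map(({ label }) => label);
  return { safe: issues.length === 0, issues };
}

// Usage:
// checkOutputSafety("Give her 5 ml of the syrup every 4 hours")
//   // { safe: false, issues: ["dosage", "amount_instruction"] }
// checkOutputSafety("Many babies nap twice a day at this age.")
//   // { safe: true, issues: [] }
```

Returning labels rather than a bare boolean lets the caller decide between removing the content, adding a disclaimer, or logging the match for the monitoring metrics.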

5. Rate Limiting & Abuse Prevention

5.1 Current Rate Limits

Free Tier: 10 AI queries per day
Premium Tier: Unlimited queries, subject to the fair use policy (see 5.3)
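
The free-tier limit can be sketched as a per-user, per-day counter. This is an in-memory illustration only: the `DailyQuota` class is hypothetical, the day boundary is assumed to be UTC, and premium fair-use enforcement is left out.

```typescript
type Tier = "free" | "premium";

const FREE_DAILY_LIMIT = 10; // from "Free Tier: 10 AI queries per day"

class DailyQuota {
  // key: "<userId>:<YYYY-MM-DD>" -> queries used that day
  private counts = new Map<string, number>();

  // Returns true and records the query if the user is within quota.
  // Premium users always pass here; fair-use checks are out of scope for this sketch.
  tryConsume(userId: string, tier: Tier, now: Date = new Date()): boolean {
    if (tier === "premium") return true;
    const day = now.toISOString().slice(0, 10); // UTC calendar day (assumption)
    const key = `${userId}:${day}`;
    const used = this.counts.get(key) ?? 0;
    if (used >= FREE_DAILY_LIMIT) return false;
    this.counts.set(key, used + 1);
    return true;
  }
}

// Usage:
// const quota = new DailyQuota();
// quota.tryConsume("user-1", "free"); // true for the first 10 calls on a given day
```

A real deployment would more likely keep these counters in a shared store so that limits hold across server instances and old keys expire.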

5.2 Enhanced Rate Limiting

Suspicious Pattern Detection:

Same question repeated >3 times in 1 hour
Medical emergency keywords repeated >5 times in 1 day
Queries from multiple IPs for same user (account sharing)
Unusual query volume (>100/day even for premium)

Action on Suspicious Activity:

Temporary rate limit (1 query per hour for 24 hours)
Flag account for manual review
Send email notification to user
Log to security monitoring
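
As an illustration of the first suspicious pattern (the same question repeated more than 3 times in 1 hour), a sliding-window counter keyed by user and normalized question text could look like the sketch below. `RepeatDetector` is a hypothetical name, and treating questions as "the same" after lowercasing and collapsing whitespace is an assumption.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const MAX_REPEATS_PER_HOUR = 3;

class RepeatDetector {
  // key: "<userId>|<normalized question>" -> timestamps (ms) within the last hour
  private seen = new Map<string, number[]>();

  // Records the query, then returns true once the same question has been
  // asked more than MAX_REPEATS_PER_HOUR times within the trailing hour.
  isSuspicious(userId: string, question: string, nowMs: number = Date.now()): boolean {
    const normalized = question.trim().toLowerCase().replace(/\s+/g, " ");
    const key = `${userId}|${normalized}`;
    const recent = (this.seen.get(key) ?? []).filter((t) => nowMs - t < HOUR_MS);
    recent.push(nowMs);
    this.seen.set(key, recent);
    return recent.length > MAX_REPEATS_PER_HOUR;
  }
}

// Usage:
// const detector = new RepeatDetector();
// detector.isSuspicious("user-1", "How much sleep?"); // false until the 4th repeat within an hour
```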

5.3 Fair Use Policy (Premium)

Maximum 200 queries per day (reasonable for active use)
Queries should be related to childcare (0-6 years)
No commercial use or data scraping
No sharing of account credentials

6. System Prompt Guardrails

6.1 Base System Prompt (Always Included)

You are a helpful AI assistant for the Maternal App, designed to support
parents of children aged 0-6 years with childcare organization and guidance.

CRITICAL SAFETY RULES:
1. You are NOT a medical professional. Never diagnose or prescribe.
2. For medical emergencies, ALWAYS direct to call 911 immediately.
3. For medical concerns, ALWAYS recommend consulting a pediatrician.
4. Recognize mental health crises and provide crisis hotline resources.
5. Be supportive and non-judgmental of all parenting approaches.
6. Focus on evidence-based information from reputable sources (AAP, CDC, WHO).
7. Never provide specific medication dosages.
8. If asked about serious developmental delays, refer to professionals.

TONE:
- Warm, empathetic, and encouraging
- Clear and concise
- Non-judgmental and inclusive
- Supportive but honest

SCOPE:
- Childcare for ages 0-6 years
- Routines, tracking, organization
- General parenting tips and strategies
- Emotional support for parents
- Interpretation of tracked activity patterns

OUT OF SCOPE:
- Medical diagnosis or treatment
- Legal advice
- Financial planning
- Relationship counseling (beyond parenting partnership)
- Children outside 0-6 age range

6.2 Context-Specific Safety Additions

When Medical Keywords Detected:

MEDICAL SAFETY OVERRIDE:
This query contains medical concerns. Your response MUST:
1. Start with a medical disclaimer
2. Avoid definitive statements
3. Recommend professional consultation
4. Provide general information only
5. Include "when to seek immediate care" guidance

When Crisis Keywords Detected:

CRISIS RESPONSE OVERRIDE:
This query indicates a potential crisis. Your response MUST:
1. Acknowledge their feelings without judgment
2. Provide immediate crisis hotline numbers
3. Encourage reaching out for professional help
4. Be brief and focus on resources
5. Do NOT provide coping strategies that could be misinterpreted
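
The base prompt and the context-specific overrides can be combined per request. The sketch below is one possible composition, with hypothetical names and placeholder strings standing in for the full prompt texts from 6.1 and 6.2; placing the crisis override before the medical one is an assumption.

```typescript
type PromptTrigger = "medical" | "crisis";

// Placeholders; the real texts are the prompts listed in sections 6.1 and 6.2.
const BASE_SAFETY_PROMPT = "[base system prompt from section 6.1]";

const OVERRIDES: Record<PromptTrigger, string> = {
  medical: "MEDICAL SAFETY OVERRIDE:\n[medical override text from section 6.2]",
  crisis: "CRISIS RESPONSE OVERRIDE:\n[crisis override text from section 6.2]",
};

// The base prompt is always first. Each active override is appended once,
// in a fixed order, regardless of how the triggers were passed in.
function buildSystemPrompt(triggers: PromptTrigger[]): string {
  const order: PromptTrigger[] = ["crisis", "medical"];
  const active = order.filter((t) => triggers.indexOf(t) !== -1);
  return [BASE_SAFETY_PROMPT].concat(active.map((t) => OVERRIDES[t])).join("\n\n");
}

// Usage:
// buildSystemPrompt([])                      // base prompt only
// buildSystemPrompt(["medical", "crisis"])   // base, then crisis, then medical override
```

Keeping the base prompt unconditional means the critical safety rules apply even when keyword detection misses a query.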

7. Monitoring & Analytics

7.1 Safety Metrics to Track

Daily Monitoring:

Number of medical disclaimer triggers
Number of crisis hotline responses sent
Number of emergency keyword detections
Number of queries blocked by content filter
Average response moderation score

Weekly Review:

Trending medical concerns
Common crisis keywords
False positive rate for disclaimers
User feedback on safety responses

7.2 Alert Thresholds

Immediate Alerts:

10 emergency keywords in 1 hour (potential abuse or real emergency pattern)
5 crisis keywords from single user in 24 hours (high-risk user)
Content filter blocking >50 queries/day (attack or misconfiguration)

Daily Review Alerts:

100 medical disclaimers triggered in 24 hours
20 crisis responses in 24 hours
Pattern of similar medical queries (potential misinformation spread)
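
Two of the immediate alert thresholds can be evaluated over a list of logged safety events, as sketched below. The `SafetyEvent` shape and alert names are hypothetical. The sketch also assumes that the emergency threshold is counted across all users and that both thresholds mean "at least" the stated number.

```typescript
interface SafetyEvent {
  type: "emergency" | "crisis";
  userId: string;
  timestampMs: number;
}

const HOUR = 60 * 60 * 1000;

// Returns the names of the immediate alerts triggered as of `nowMs`.
function evaluateImmediateAlerts(events: SafetyEvent[], nowMs: number): string[] {
  const alerts: string[] = [];

  // 10 emergency keyword detections in the trailing hour (all users combined).
  const emergenciesLastHour = events.filter(
    (e) => e.type === "emergency" && nowMs - e.timestampMs <= HOUR
  ).length;
  if (emergenciesLastHour >= 10) alerts.push("emergency_volume");

  // 5 crisis keyword detections from one user in the trailing 24 hours.
  const crisisByUser = new Map<string, number>();
  events.forEach((e) => {
    if (e.type === "crisis" && nowMs - e.timestampMs <= 24 * HOUR) {
      crisisByUser.set(e.userId, (crisisByUser.get(e.userId) ?? 0) + 1);
    }
  });
  crisisByUser.forEach((count, userId) => {
    if (count >= 5) alerts.push(`high_risk_user:${userId}`);
  });

  return alerts;
}
```

The function is pure (events and the current time are passed in), which makes the thresholds straightforward to unit test before wiring them to real storage.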

8. User Education

8.1 First-Time User Guidance

On First AI Query, Show:

👋 Welcome to Your AI Parenting Assistant!

I'm here to help with:
✓ Childcare tips and organization
✓ Understanding your baby's patterns
✓ Answering general parenting questions
✓ Providing encouragement and support

Important to Know:
⚠️ I'm NOT a medical professional
⚠️ Always consult your pediatrician for medical concerns
⚠️ Call 911 for emergencies

For the best experience:
💡 Ask specific questions about routines, tracking, or parenting tips
💡 Share your child's age for age-appropriate guidance
💡 Use your activity tracking data for personalized insights

Let's get started! How can I help you today?

8.2 In-App Safety Reminders

Random Safety Tips (1 in 20 responses):

"Remember: I'm an AI assistant, not a doctor. Always consult your pediatrician for medical advice."
"Having a tough day? It's okay to ask for help. Check out our Resources section."
"Emergency? Call 911 immediately. I cannot provide emergency medical guidance."

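The "1 in 20 responses" reminder can be sketched as a probabilistic append. The function name is hypothetical, and the random source is injectable (an assumption made here so the behavior can be tested deterministically).

```typescript
// Tip texts are the three reminders listed in section 8.2.
const SAFETY_TIPS: string[] = [
  "Remember: I'm an AI assistant, not a doctor. Always consult your pediatrician for medical advice.",
  "Having a tough day? It's okay to ask for help. Check out our Resources section.",
  "Emergency? Call 911 immediately. I cannot provide emergency medical guidance.",
];

const TIP_PROBABILITY = 1 / 20;

// `rng` must return a number in [0, 1), like Math.random.
// The first draw decides whether to show a tip, the second picks which one.
function maybeAppendSafetyTip(response: string, rng: () => number = Math.random): string {
  if (rng() >= TIP_PROBABILITY) return response;
  const tip = SAFETY_TIPS[Math.floor(rng() * SAFETY_TIPS.length)];
  return `${response}\n\n${tip}`;
}

// Usage:
// maybeAppendSafetyTip("Here is a sample bedtime routine.");
```
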
9. Implementation Checklist

9.1 Backend Implementation

Create ai-safety.service.ts with keyword detection
Implement medical disclaimer injection
Add crisis hotline response generator
Enhance content moderation filters
Update system prompts with safety guardrails
Add safety metrics logging
Create safety monitoring dashboard

9.2 Frontend Implementation

Add safety disclaimer modal on first AI use
Display crisis hotlines prominently when triggered
Show medical disclaimer badges
Add "Report Unsafe Response" button
Create AI Safety FAQ page

9.3 Testing

Unit tests for keyword detection (100+ test cases)
Integration tests for crisis response flow
Manual testing of all disclaimer triggers
User acceptance testing with parents
Safety audit by medical professional (recommended)

9.4 Documentation

User-facing AI Safety policy
Developer documentation for safety service
Incident response playbook
Safety metric reporting template

10. Incident Response

10.1 If Harmful Response is Reported

Immediate Actions (Within 1 hour):

Flag conversation in database
Review full conversation history
Identify failure point (keyword detection, content filter, prompt)
Temporarily disable AI for affected user if serious
Notify safety team

Follow-up Actions (Within 24 hours):

Root cause analysis
Update keyword lists or filters
Test fix with similar queries
Document incident and resolution
Update user with resolution

Communication:

Acknowledge report within 1 hour
Provide resolution timeline
Follow up when fixed
Offer support resources if appropriate

10.2 Emergency Escalation

Escalate to Management If:

A user reports following harmful AI advice and experiencing a negative outcome
Multiple similar harmful responses in short timeframe
Media or regulatory inquiry about AI safety
Suicidal ideation not properly handled by crisis detection

11. Regular Safety Audits

11.1 Monthly Audits

Review 100 random AI conversations
Test all crisis and medical keyword triggers
Analyze false positive/negative rates
Update keyword lists based on new patterns
Review user reports and feedback
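
For the false positive/negative analysis, one common approach is to have a reviewer label each sampled conversation and compute the rates from the resulting counts. The sketch below assumes that setup; the `AuditedConversation` shape and `auditRates` are hypothetical.

```typescript
interface AuditedConversation {
  flagged: boolean;           // did the safety layer fire on this conversation?
  shouldHaveFlagged: boolean; // reviewer's judgment during the audit
}

interface AuditRates {
  falsePositiveRate: number | null; // FP / (FP + TN); null if no negatives in sample
  falseNegativeRate: number | null; // FN / (FN + TP); null if no positives in sample
}

// Counts the four outcomes over the sample and derives the two rates.
function auditRates(sample: AuditedConversation[]): AuditRates {
  let tp = 0, fp = 0, tn = 0, fn = 0;
  for (const c of sample) {
    if (c.flagged && c.shouldHaveFlagged) tp++;
    else if (c.flagged && !c.shouldHaveFlagged) fp++;
    else if (!c.flagged && c.shouldHaveFlagged) fn++;
    else tn++;
  }
  return {
    falsePositiveRate: fp + tn > 0 ? fp / (fp + tn) : null,
    falseNegativeRate: fn + tp > 0 ? fn / (fn + tp) : null,
  };
}
```

For a safety system, the false negative rate (missed crises or emergencies) is usually the more important of the two, although a random sample of 100 conversations may contain too few true positives to estimate it reliably.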

11.2 Quarterly Reviews

External safety audit (if possible)
Update crisis hotline numbers (verify still active)
Review and update safety policies
Train team on new safety features
Benchmark against industry standards

12. Compliance & Legal

12.1 Disclaimers (Required)

Terms of Service mention AI limitations
Privacy Policy covers AI data usage
Medical disclaimer in AI interface
Crisis resources easily accessible

12.2 Recommended Legal Review

AI response liability
Medical advice disclaimers sufficiency
Crisis response appropriateness
Age-appropriate content standards

Status: READY FOR IMPLEMENTATION

Priority: HIGH - User Safety Critical
Estimated Implementation Time: 3-5 days
Testing Time: 2 days
Total: 5-7 days with testing

Next Steps:

Implement AISafetyService with keyword detection
Update AIService to integrate safety checks
Add crisis hotline database/configuration
Create frontend safety components
Comprehensive testing with real scenarios
User documentation

This strategy ensures the Maternal App AI is helpful, supportive, and above all, SAFE for parents seeking guidance.
