Add iOS Safari support for voice commands with MediaRecorder fallback

Frontend changes:
- Add MediaRecorder fallback for iOS Safari (no Web Speech API support)
- Automatically detect browser capabilities and use appropriate method
- Add usesFallback flag to track which method is being used
- Update UI to show "Recording..." vs "Listening..." based on method
- Add iOS-specific indicator text
- Handle microphone permissions and errors properly

Backend changes:
- Update /api/v1/voice/transcribe to accept both audio files and text
- Support text-based classification (from Web Speech API)
- Support audio file transcription + classification (from MediaRecorder)
- Return unified response format with transcript and classification
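
A minimal sketch of how one endpoint might accept either input and return the same response shape. Only the `transcript` and `classification` response fields come from this commit message; the request field names (`text`, `audio`), the type names, and both helper functions are illustrative assumptions, not the actual backend code.

```typescript
// Hypothetical request shapes: field names are assumptions for illustration.
type TranscribeInput =
  | { kind: 'text'; text: string }         // transcript already produced by the Web Speech API
  | { kind: 'audio'; audio: Uint8Array };  // raw recording captured with MediaRecorder

// Unified response: both paths end with a transcript and a classification.
interface TranscribeResponse<C> {
  transcript: string;
  classification: C;
}

// Decide which path a request takes. Audio wins if both are present.
function parseTranscribeInput(body: { text?: unknown; audio?: unknown }): TranscribeInput {
  if (body.audio instanceof Uint8Array && body.audio.length > 0) {
    return { kind: 'audio', audio: body.audio };
  }
  if (typeof body.text === 'string' && body.text.trim() !== '') {
    return { kind: 'text', text: body.text.trim() };
  }
  throw new Error('Request must include a non-empty audio file or text');
}

// Wrap the result so the client never needs to know which path ran.
function buildResponse<C>(transcript: string, classification: C): TranscribeResponse<C> {
  return { transcript, classification };
}
```

In this sketch the audio path would run transcription first and then pass the resulting text to the same classifier the text path uses, so the two branches differ only in how the transcript is obtained.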

How it works:
- Chrome/Edge: Uses Web Speech API for real-time transcription
- iOS Safari: Records audio with MediaRecorder, sends to server for transcription
- Fallback is transparent to the user with appropriate UI feedback
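
The selection described above could be sketched roughly as follows. This is not the code in `useVoiceInput`; `pickVoiceMethod`, `capabilitiesFrom`, `statusText`, and the `BrowserCapabilities` type are hypothetical names, and the sketch assumes plain feature detection rather than user-agent sniffing. The status strings match the ones in the diff below.

```typescript
// Hypothetical names throughout; the real hook in this commit is useVoiceInput.
type VoiceMethod = 'web-speech' | 'media-recorder' | 'unsupported';

interface BrowserCapabilities {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
  MediaRecorder?: unknown;
  hasGetUserMedia?: boolean;
}

// Read capabilities from a window-like object (typed loosely on purpose).
function capabilitiesFrom(g: any): BrowserCapabilities {
  return {
    SpeechRecognition: g.SpeechRecognition,
    webkitSpeechRecognition: g.webkitSpeechRecognition,
    MediaRecorder: g.MediaRecorder,
    hasGetUserMedia: typeof g.navigator?.mediaDevices?.getUserMedia === 'function',
  };
}

// Prefer the Web Speech API; fall back to recording audio for the server.
function pickVoiceMethod(caps: BrowserCapabilities): VoiceMethod {
  if (caps.SpeechRecognition || caps.webkitSpeechRecognition) return 'web-speech';
  if (caps.MediaRecorder && caps.hasGetUserMedia) return 'media-recorder';
  return 'unsupported';
}

// Status line shown under the microphone button.
function statusText(isListening: boolean, usesFallback: boolean): string {
  if (!isListening) return 'Click the microphone to start';
  return usesFallback ? 'Recording... Speak now' : 'Listening... Speak now';
}
```

In a browser, `pickVoiceMethod(capabilitiesFrom(window))` would yield the method, and `usesFallback` would correspond to the result being `'media-recorder'`.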

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-02 05:59:26 +00:00
parent ff69848ec5
commit 330c776124
3 changed files with 190 additions and 26 deletions


@@ -42,7 +42,7 @@ export function VoiceInputButton({
   const [isProcessing, setIsProcessing] = useState(false);
   const [classificationResult, setClassificationResult] = useState<any>(null);
-  const { isListening, isSupported, transcript, error, startListening, stopListening, reset } =
+  const { isListening, isSupported, transcript, error, usesFallback, startListening, stopListening, reset } =
     useVoiceInput();
   // Auto-classify when we get a final transcript
@@ -215,10 +215,18 @@ export function VoiceInputButton({
   {/* Status text */}
   <Typography variant="body1" color="text.secondary" gutterBottom>
     {isListening
-      ? 'Listening... Speak now'
+      ? usesFallback
+        ? 'Recording... Speak now'
+        : 'Listening... Speak now'
       : 'Click the microphone to start'}
   </Typography>
+  {usesFallback && !isListening && !transcript && (
+    <Typography variant="caption" color="text.secondary" sx={{ mt: 1, display: 'block' }}>
+      Using audio recording mode (iOS Safari)
+    </Typography>
+  )}
   {/* Transcript */}
   {transcript && (
     <Box sx={{ mt: 3, p: 2, bgcolor: 'grey.100', borderRadius: 1 }}>