AI Voice Tab Process - App Perspective
Last Updated: December 2024
Overview
This document explains the AI Voice Tab functionality from the mobile app's perspective. When a user records audio and sends it to the AI voice server, the app collects and transmits 8 data fields as multipart form data, one of which is the audio file itself. The sections below cover the complete process flow, data collection methods, authentication, the credit system, and server response handling.
0. AI Token Authentication System
Before recording can begin, the app must obtain a valid AI token for server authentication.
Token Endpoint: https://kagtryxgjwavupzlmlzv.supabase.co/functions/v1/generate_ai_token
Method: POST
Authorization: User's Supabase JWT (Bearer token)
Token Request
{
"request_type": "ai_voice_chat_token"
}
Token Response
{
"ai_token": "eyJhbGciOiJIUzI1NiIs...",
"expires_in": 600 // seconds (10 minutes)
}
Token Validation Rules
- Valid: Token exists AND expires in more than 3 minutes
- Invalid: No token OR token expires within 3 minutes
- Token is stored in memory only (not persisted to storage)
- New token requested automatically when invalid
Code Location: _isAITokenValid() at line ~1750, _requestAIToken() at line ~1772 in ai_voice_tab.dart
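The validation rules above can be sketched as follows. This is a minimal illustration in Python, not the Dart code in ai_voice_tab.dart; the AIToken class and both function names are hypothetical, and computing the expiry as "request time + expires_in" is an assumption about how the app tracks it.

```python
from dataclasses import dataclass
from typing import Optional

# A token must have more than this much lifetime left to count as valid.
MIN_REMAINING_SECONDS = 3 * 60


@dataclass
class AIToken:
    """In-memory token holder (hypothetical; the token is never persisted)."""
    value: str
    expires_at: float  # Unix time in seconds


def token_from_response(response: dict, now: float) -> AIToken:
    """Build a token from the generate_ai_token response.

    Assumption: expiry is derived as request time + expires_in.
    """
    return AIToken(value=response["ai_token"],
                   expires_at=now + response["expires_in"])


def is_token_valid(token: Optional[AIToken], now: float) -> bool:
    """Valid only if a token exists AND expires in more than 3 minutes."""
    if token is None:
        return False
    return (token.expires_at - now) > MIN_REMAINING_SECONDS
```

With a 600-second token, this check starts failing 7 minutes after issue, which is when the app would request a fresh token.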
0.5 Credit System
Users have a credit-based system that limits AI voice chat usage.
Credit Types
- Weekly Credits: Reset periodically (tracked by weekly_reset_date)
- Extra/Bank Credits: Purchased credits that don't expire
Credit Check Logic
// User can record if:
// 1. Weekly credits not exhausted (weekly_used_credits < weekly_limit_credits) OR
// 2. Bank credits available (bank_used_credits < bank_limit_credits)
Credit Update Flow
- Server returns ai_usage_in_seconds in its response
- App rounds up to the nearest integer for credits used
- If weekly limit is 0: credits added directly to bank usage
- If weekly credits available: deduct from weekly first
- If weekly exhausted: deduct from bank credits
- Updated credits saved to user_profile.json
Source: user_profile.json → ai_usage object
Code Location: _hasAvailableCredits() at line ~1669, _updateAIUsageCredits() at line ~1581
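A minimal Python sketch of the credit check and update rules described above (the app's own implementation is in Dart). The function names are hypothetical. One detail is an assumption because the rules do not cover it: when a single response costs more than the remaining weekly credits, the whole amount is charged to weekly usage rather than split between weekly and bank.

```python
import math


def has_available_credits(usage: dict) -> bool:
    """User can record if weekly OR bank credits remain."""
    weekly_left = usage["weekly_used_credits"] < usage["weekly_limit_credits"]
    bank_left = usage["bank_used_credits"] < usage["bank_limit_credits"]
    return weekly_left or bank_left


def apply_usage(usage: dict, ai_usage_in_seconds: float) -> dict:
    """Return a copy of the ai_usage object with the new usage applied."""
    credits = math.ceil(ai_usage_in_seconds)  # round up to a whole credit
    updated = dict(usage)
    weekly_limit = updated["weekly_limit_credits"]
    weekly_used = updated["weekly_used_credits"]
    if weekly_limit == 0 or weekly_used >= weekly_limit:
        # No weekly allowance, or it is exhausted: charge the bank.
        updated["bank_used_credits"] += credits
    else:
        # Weekly credits are used first (assumption: no splitting on overflow).
        updated["weekly_used_credits"] += credits
    return updated
```

For example, a response reporting 45.5 seconds costs 46 credits.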
1. Data Payloads Sent to AI Voice Server
When the user sends audio, the following 8 data fields are collected and sent to the server as multipart form data:
Field name: audio_file
Format: Recorded WAV file (44.1kHz, 128kbps)
Max duration: 2 minutes (120 seconds)
Filename: recording.wav
Field name: course_progress
Source: AppData/JSONs/User_Audio_Usage/[CourseCode]_Usage.json
{
"course_code": "C14ENF",
"title": "Confidence",
"audio_files": {
"1": {
"title": "How to Use These Confidence Building Sessions",
"no_of_plays": 1
},
"2": {
"title": "Release Social Anxiety and Build Inner Strength",
"no_of_plays": 3
},
"3": {
"title": "Overcome Fear and Embrace Your True Self",
"no_of_plays": 0
}
}
}
Code Location: _collectCourseProgress() at line ~1449
Field name: journal_responses
Source: AppData/JSONs/User_Workbook_Responses/[CourseCode]1_response.json
Example filename: C14ENF1_response.json
{
"has_journal": true,
"entries": {
"daily_entries": [
{
"date": "2024-11-14",
"mood_rating": "7",
"anxiety_level": 4,
"triggers": ["work_stress", "social_situations"],
"thoughts_feelings": "Felt anxious about the presentation..."
}
]
}
}
If no journal found:
{
"has_journal": false,
"entries": ["User has not used the journal yet"]
}
Code Location: _collectJournalResponses() at line ~1482
Field name: user_context
Source: user_profile.json
{
"current_course": "C14ENF",
"user_name": "John"
}
Code Location: _collectUserContext() at line ~1521
Field name: conversation
Source: AppData/JSONs/Conversations/[CourseCode]_conversation.json
Content: Raw JSON string of previous conversation or "No prior conversation exists"
Code Location: _loadConversation() at line ~1532
Field name: voice
Source: user_profile.json → preferred_ai_voice
Default: "tara"
Note: User profile is reloaded before each request to get the latest voice preference
Field name: session_id
Format: UUID v4 (generated per request)
Example: a1b2c3d4-e5f6-7890-abcd-ef1234567890
Field name: timestamp
Format: ISO8601 timestamp
Example: 2024-12-23T14:30:00.000Z
Server Endpoint: http://worker1.ipnoelp.com/webhook/api/v1/voice-chat-TEST
Method: POST (multipart/form-data)
Authorization: Bearer [AI Token] (NOT user JWT)
Timeout: 240 seconds (4 minutes) send/receive
Content-Type: multipart/form-data
Accept: application/json
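The seven text fields that accompany audio_file could be assembled roughly as below. This Python sketch only builds the field values and sends nothing. The function name is hypothetical, and serializing the JSON payloads as strings is an assumption about how they are encoded in the form.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional

NO_CONVERSATION = "No prior conversation exists"
DEFAULT_VOICE = "tara"


def build_form_fields(course_progress: dict,
                      journal_responses: dict,
                      user_context: dict,
                      conversation_json: Optional[str] = None,
                      voice: Optional[str] = None,
                      now: Optional[datetime] = None) -> dict:
    """Return the text fields sent alongside the audio_file part."""
    now = now or datetime.now(timezone.utc)
    # ISO8601 in UTC with milliseconds and a trailing Z,
    # e.g. 2024-12-23T14:30:00.000Z
    timestamp = (now.astimezone(timezone.utc)
                 .isoformat(timespec="milliseconds")
                 .replace("+00:00", "Z"))
    return {
        # Assumption: JSON payloads travel as serialized strings.
        "course_progress": json.dumps(course_progress),
        "journal_responses": json.dumps(journal_responses),
        "user_context": json.dumps(user_context),
        # Raw JSON string of the saved conversation, or the fallback text.
        "conversation": conversation_json or NO_CONVERSATION,
        "voice": voice or DEFAULT_VOICE,
        "session_id": str(uuid.uuid4()),  # new UUID v4 per request
        "timestamp": timestamp,
    }
```

The request itself would attach these fields plus the recorded file and set the Authorization header to the AI token, not the user JWT.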
2. Process Flow Chart
Complete flow from user interaction to AI response:
Pre-Recording Phase
1. User presses record button → Check internet connectivity
2. Check if user has available credits (weekly OR bank) → If no credits, show "Credit Limit Reached" dialog
3. Check AI token validity → If invalid/expiring, request new token from generate_ai_token endpoint
   Status: "Authenticating..."
4. Check microphone permission → Request if needed → Show settings dialog if permanently denied
Recording Phase
5. Clear any cached previous response → Start recording (WAV, 44.1kHz, 128kbps)
   Status: "Listening..."
   Timer shown: MM:SS
6. Recording continues until user stops OR max 2 minutes reached
Processing Phase
7. User stops recording → Save audio file
   Status: "Sending..."
8. Collect all data payloads (course progress, journal, user context, conversation, voice)
9. POST to voice-chat endpoint with AI token authentication
   Upload progress logged to console
10. After 7 seconds, start rotating status text every 7 seconds:
    "Thinking..." → "Processing..." → "Analyzing..." → "Reasoning..." → "Reflecting..." → "Contemplating..." → "Evaluating..." → "Pondering..."
Response Phase
11. Server responds with audio_url, ai_usage_in_seconds, and conversation
12. Update AI usage credits locally → Save conversation to file
    Status: "Receiving..."
13. Download AI response audio from audio_url
    Save to: AppData/AI_Voice_Responses/response_[timestamp].wav
14. Play AI audio response using just_audio
    Status: "Replying..."
15. Playback complete → Mark response as replayable → Return to idle
    Status: "Press record button"
    Shows: "or replay response" link
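The rotating status text in step 10 can be expressed as a function of elapsed time. This Python sketch is illustrative only; the function name is hypothetical. Two details are assumptions: elapsed time is measured from the moment "Sending..." is first shown, and the list wraps back to "Thinking..." after "Pondering...".

```python
THINKING_WORDS = [
    "Thinking...", "Processing...", "Analyzing...", "Reasoning...",
    "Reflecting...", "Contemplating...", "Evaluating...", "Pondering...",
]
ROTATE_EVERY_SECONDS = 7


def processing_status(elapsed_seconds: float) -> str:
    """Status text shown while waiting for the server."""
    if elapsed_seconds < ROTATE_EVERY_SECONDS:
        return "Sending..."
    # First word appears at 7s, then advances every further 7s.
    index = int(elapsed_seconds // ROTATE_EVERY_SECONDS) - 1
    # Assumption: wrap around once the list is exhausted.
    return THINKING_WORDS[index % len(THINKING_WORDS)]
```

With the 240-second request timeout, the full list of eight words is reached after 56 seconds, so long requests need some rule for what follows "Pondering...".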
3. Journal Responses Collection
Implementation Process (_collectJournalResponses method)
1. Directory Path: Get the application documents directory path
2. File Pattern Search: Look in the User_Workbook_Responses directory for a file named exactly [CourseCode]1_response.json
   Example: If course is "C14ENF", look for "C14ENF1_response.json"
3. File Check: Check if the expected file exists at the exact path
4. Data Extraction: If the file exists, read and parse the JSON and extract the "data" field
Return Data Structure
If journal found:
{
"has_journal": true,
"entries": [journal_data from "data" field]
}
If no journal found:
{
"has_journal": false,
"entries": ["User has not used the journal yet"]
}
Code Location: Lines 1482-1518 in ai_voice_tab.dart → _collectJournalResponses() method
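The lookup steps above amount to the following, shown here as a Python sketch rather than the Dart method. Several points are assumptions: that the AppData folder sits directly under the documents directory, that the "data" value is passed through unchanged as entries, and that an unreadable or malformed file is treated the same as a missing one.

```python
import json
from pathlib import Path


def no_journal() -> dict:
    """Fallback payload when no journal file is available."""
    return {"has_journal": False,
            "entries": ["User has not used the journal yet"]}


def collect_journal_responses(documents_dir: str, course_code: str) -> dict:
    """Build the journal_responses payload for one course."""
    # Exact filename: [CourseCode]1_response.json
    path = (Path(documents_dir) / "AppData" / "JSONs"
            / "User_Workbook_Responses" / f"{course_code}1_response.json")
    if not path.is_file():
        return no_journal()
    try:
        payload = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        # Assumption: a broken file is reported as "no journal".
        return no_journal()
    if not isinstance(payload, dict) or "data" not in payload:
        return no_journal()
    return {"has_journal": True, "entries": payload["data"]}
```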
4. Server Response Handling
Expected Response Format
{
"audio_url": "https://server.com/path/to/ai_response.wav",
"ai_usage_in_seconds": 45.5,
"conversation": { ... updated conversation history ... }
}
Response Processing Steps
1. URL Extraction: Extract audio_url from server response
2. Credit Update: If ai_usage_in_seconds present, round up and update local credits
3. Conversation Save: If conversation present, save to [CourseCode]_conversation.json
4. Download: Download AI response audio file from provided URL
   Timeout: 120 seconds
5. Storage: Save to AppData/AI_Voice_Responses/response_[timestamp].wav
6. Playback: Use just_audio player to play the response
7. Replay Cache: Mark response as replayable for user to replay later
8. Cleanup: Files are cleaned up when widget is disposed OR when new recording starts
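Steps 1 to 5 can be summarized as a small planning function that inspects the response and decides what to do next. This Python sketch performs no I/O. The function name and the returned keys are hypothetical, the format of [timestamp] is left to the caller because the document does not specify it, and raising an error on a missing audio_url is an assumption.

```python
import math

DOWNLOAD_TIMEOUT_SECONDS = 120


def plan_response_handling(response: dict, course_code: str,
                           timestamp: str) -> dict:
    """Derive the follow-up actions from a voice-chat response."""
    audio_url = response.get("audio_url")
    if not audio_url:
        # Assumption: a response without audio_url is treated as an error.
        raise ValueError("response has no audio_url")
    usage = response.get("ai_usage_in_seconds")
    has_conversation = response.get("conversation") is not None
    return {
        "audio_url": audio_url,
        # Credits are only updated when the server reports usage.
        "credits_used": math.ceil(usage) if usage is not None else None,
        # Conversation is only saved when the server returns one.
        "conversation_path": (
            f"AppData/JSONs/Conversations/{course_code}_conversation.json"
            if has_conversation else None),
        "audio_path": f"AppData/AI_Voice_Responses/response_{timestamp}.wav",
        "download_timeout": DOWNLOAD_TIMEOUT_SECONDS,
    }
```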
Error Handling
- No internet: "No internet connection" (recording blocked)
- No credits: Shows "Credit Limit Reached" dialog with purchase option
- Token failure: "Failed to connect to AI. You can try switching to other voices in General Settings, but it might result in fewer conversations."
- Server errors: "Failed to connect to AI. You can try switching to other voices in General Settings, but it might result in fewer conversations."
- Download failure: "Failed to download response"
- Permission denied: Shows "Microphone Permission Required" dialog with settings button
- Permanently denied: Opens system settings via openAppSettings()
5. Recording Modes
Tap Mode (Default)
- Tap microphone button to start recording
- Tap again to stop recording and send
- Max duration: 2 minutes
Hold Mode
- Press and hold microphone button for 500ms+ to activate
- Release to stop recording and send immediately
- If permission dialog appears during hold, switches to tap mode after permission granted
Code Location: _onButtonPress() at line ~2182, _onButtonRelease() at line ~2196
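The difference between the two modes comes down to how long the button was held before release. A minimal Python sketch follows; both function names are hypothetical and this is not the logic of _onButtonPress() or _onButtonRelease() verbatim.

```python
HOLD_THRESHOLD_MS = 500  # holding at least this long activates hold mode


def press_mode(press_duration_ms: int) -> str:
    """Classify a button press as 'tap' or 'hold'."""
    return "hold" if press_duration_ms >= HOLD_THRESHOLD_MS else "tap"


def should_stop_on_release(press_duration_ms: int,
                           permission_dialog_shown: bool = False) -> bool:
    """Whether releasing the button stops the recording and sends it."""
    if permission_dialog_shown:
        # The hold was interrupted by the permission dialog, so the
        # recording continues in tap mode and a second tap ends it.
        return False
    return press_mode(press_duration_ms) == "hold"
```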
6. UI States
State Machine
States: 'idle' | 'recording' | 'processing' | 'playback'
idle:
- Status: "Press record button" (online) or "Offline" (offline)
- Microphone button glows
- Replay link shown if previous response available
- Tutorial border animation pulsing
recording:
- Status: "Listening..."
- Timer displayed (MM:SS)
- Stop/square icon shown in button
- Buddha orb pulsing
processing:
- Status: "Sending..." → then rotating thinking words
- Loading dots animation in button
- Buddha orb pulsing
playback:
- Status: "Replying..."
- Waveform animation in button
- Buddha orb pulsing
Connectivity States
- Online: Full functionality, status shows "Press record button"
- Offline: Recording blocked, status shows "Offline", orb appears static
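The four states and their initial status text can be modelled as a small state machine. This is a Python sketch with hypothetical names. The transition table is an assumption inferred from the process flow: it includes returning to idle on errors and going from idle straight to playback when a response is replayed.

```python
from enum import Enum


class VoiceState(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    PROCESSING = "processing"
    PLAYBACK = "playback"


# Assumed transitions; error paths fall back to IDLE.
TRANSITIONS = {
    VoiceState.IDLE: {VoiceState.RECORDING, VoiceState.PLAYBACK},
    VoiceState.RECORDING: {VoiceState.PROCESSING, VoiceState.IDLE},
    VoiceState.PROCESSING: {VoiceState.PLAYBACK, VoiceState.IDLE},
    VoiceState.PLAYBACK: {VoiceState.IDLE},
}


def next_state(current: VoiceState, target: VoiceState) -> VoiceState:
    """Move to target, rejecting transitions the table does not allow."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def status_text(state: VoiceState, online: bool = True) -> str:
    """Initial status line for each state."""
    if state is VoiceState.IDLE:
        return "Press record button" if online else "Offline"
    return {
        VoiceState.RECORDING: "Listening...",
        VoiceState.PROCESSING: "Sending...",
        VoiceState.PLAYBACK: "Replying...",
    }[state]
```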
7. File Storage Paths
| Purpose | Path |
| Recording (temp) | [temp]/voice_recording_[timestamp].wav |
| AI Response | AppData/AI_Voice_Responses/response_[timestamp].wav |
| Conversation History | AppData/JSONs/Conversations/[CourseCode]_conversation.json |
| Audio Usage | AppData/JSONs/User_Audio_Usage/[CourseCode]_Usage.json |
| Journal Responses | AppData/JSONs/User_Workbook_Responses/[CourseCode]1_response.json |
| User Profile | AppData/JSONs/user_profile.json |