
Building a Voice Memo App with React Native + Whisper

Paweł Karniej·February 2026

Voice memos are boring. Until you add AI.

I built YapperX in 2024 — a lightweight companion for capturing quick thoughts and voice memos.

Here's exactly how I built it, including the complete React Native + Whisper integration.

Table of Contents

  • Why Voice + AI is the Future
  • App Architecture Overview
  • Audio Recording in React Native
  • Whisper Transcription Setup
  • AI Summarization with GPT-4
  • Smart Categorization
  • Search and Organization
  • UI/UX for Voice Apps
  • Performance Optimization
  • Monetization Strategy
  • Real User Feedback and Lessons

    Why Voice + AI is the Future

    The problem with traditional voice memos:

    • Hard to search through
    • No organization
    • Forget what you recorded
    • Can't quickly find specific information

    The AI solution:

    • Automatic transcription with Whisper
    • AI-powered summaries and key points
    • Smart categorization and tagging
    • Full-text search across all recordings

    Market opportunity:

    • Built-in voice memo apps suck
    • Most voice apps focus on transcription only
    • Adding AI analysis creates defensible value

    YapperX approach:

    • Simple recording → transcription → summary flow
    • Freemium model with credits for AI features
    • Focus on making voice memos actually searchable
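
    The flow above can be sketched in a few lines. This is an illustrative model, not YapperX's actual code: each stage is an injected async function, so the pipeline can be exercised without real audio or API calls.

    ```typescript
    // Illustrative sketch of the recording → transcription → summary flow.
    // Stage implementations are injected, which keeps the pipeline testable.
    interface MemoPipeline {
      transcribe: (audioUri: string) => Promise<string>
      summarize: (transcription: string) => Promise<string>
    }

    async function processMemo(audioUri: string, deps: MemoPipeline) {
      const transcription = await deps.transcribe(audioUri)
      const summary = await deps.summarize(transcription)
      return { audioUri, transcription, summary }
    }
    ```

    Everything that follows in this post is an implementation of one of these two stages, plus the storage and UI around them.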

    App Architecture Overview

    React Native App (Expo)
        ↓
    Audio Recording (expo-av)
        ↓
    Local Storage (SQLite + FileSystem)
        ↓
    Convex Functions
        ↓
    OpenAI Whisper + GPT-4

    Key architectural decisions:

  • Expo for rapid development: Native audio handling without ejecting
  • Local-first approach: Works offline, syncs when connected
  • SQLite for metadata: Fast search and organization
  • Convex for AI processing: Secure API key handling
  • File-based audio storage: Keep recordings on device
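
    The local-first decision implies some kind of outbound queue: mutations happen against local storage immediately, and upload work is retried when connectivity returns. A minimal sketch of that idea (illustrative only; the class and names here are not from the app):

    ```typescript
    // Minimal local-first sync queue: operations accumulate offline and are
    // flushed when a connection is available. Failed sends stay queued.
    type PendingOp = { id: string; kind: 'upload' | 'delete'; payload: unknown }

    class SyncQueue {
      private queue: PendingOp[] = []

      enqueue(op: PendingOp) {
        this.queue.push(op)
      }

      // Attempt every queued op; `send` returns true on success.
      // Returns the number of ops still pending afterwards.
      async flush(send: (op: PendingOp) => Promise<boolean>): Promise<number> {
        const remaining: PendingOp[] = []
        for (const op of this.queue) {
          const ok = await send(op).catch(() => false)
          if (!ok) remaining.push(op)
        }
        this.queue = remaining
        return remaining.length
      }
    }
    ```

    In practice the queue itself would also be persisted (to SQLite), so pending uploads survive an app restart.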

    Audio Recording in React Native

    Basic Recording Setup

    // hooks/useAudioRecording.ts
    import { Audio, AVPlaybackStatus } from 'expo-av'
    import { useState, useRef } from 'react'
    import * as FileSystem from 'expo-file-system'
    
    interface Recording {
      id: string
      uri: string
      duration: number
      createdAt: Date
    }
    
    export const useAudioRecording = () => {
      const [recording, setRecording] = useState<Audio.Recording>()
      const [isRecording, setIsRecording] = useState(false)
      const [isPlaying, setIsPlaying] = useState(false)
      const [recordings, setRecordings] = useState<Recording[]>([])
      const sound = useRef<Audio.Sound>()
    
      const startRecording = async () => {
        try {
          // Request permissions
          const permission = await Audio.requestPermissionsAsync()
          if (permission.status !== 'granted') {
            throw new Error('Audio permission not granted')
          }
    
          // Configure audio mode
          await Audio.setAudioModeAsync({
            allowsRecordingIOS: true,
            playsInSilentModeIOS: true,
            shouldDuckAndroid: true,
            playThroughEarpieceAndroid: false,
          })
    
          // Start recording
          const { recording } = await Audio.Recording.createAsync({
            ...Audio.RecordingOptionsPresets.HIGH_QUALITY,
            android: {
              ...Audio.RecordingOptionsPresets.HIGH_QUALITY.android,
              extension: '.m4a',
              outputFormat: Audio.RECORDING_OPTION_ANDROID_OUTPUT_FORMAT_MPEG_4,
              audioEncoder: Audio.RECORDING_OPTION_ANDROID_AUDIO_ENCODER_AAC,
              sampleRate: 44100,
              numberOfChannels: 2,
              bitRate: 128000,
            },
            ios: {
              ...Audio.RecordingOptionsPresets.HIGH_QUALITY.ios,
              extension: '.m4a',
              outputFormat: Audio.RECORDING_OPTION_IOS_OUTPUT_FORMAT_MPEG4AAC,
              audioQuality: Audio.RECORDING_OPTION_IOS_AUDIO_QUALITY_HIGH,
              sampleRate: 44100,
              numberOfChannels: 2,
              bitRate: 128000,
              linearPCMBitDepth: 16,
              linearPCMIsBigEndian: false,
              linearPCMIsFloat: false,
            },
          })
    
          setRecording(recording)
          setIsRecording(true)
        } catch (err) {
          console.error('Failed to start recording', err)
          throw err
        }
      }
    
      const stopRecording = async () => {
        if (!recording) return null
    
        try {
          setIsRecording(false)
          await recording.stopAndUnloadAsync()
          
          const uri = recording.getURI()
          const status = await recording.getStatusAsync()
          
          if (uri && status.isDoneRecording) {
            // Move the temp recording to a permanent file in the document directory
            const filename = `recording-${Date.now()}.m4a`
            const permanentUri = `${FileSystem.documentDirectory}${filename}`
            await FileSystem.moveAsync({
              from: uri,
              to: permanentUri,
            })
    
            const newRecording: Recording = {
              id: Date.now().toString(),
              uri: permanentUri,
              duration: status.durationMillis || 0,
              createdAt: new Date(),
            }
    
            setRecordings(prev => [newRecording, ...prev])
            setRecording(undefined)
            
            return newRecording
          }
        } catch (error) {
          console.error('Error stopping recording:', error)
          throw error
        }
        
        return null
      }
    
      const playRecording = async (uri: string) => {
        try {
          if (sound.current) {
            await sound.current.unloadAsync()
          }
    
          const { sound: newSound } = await Audio.Sound.createAsync(
            { uri },
            { shouldPlay: true }
          )
          
          sound.current = newSound
          setIsPlaying(true)
    
          newSound.setOnPlaybackStatusUpdate((status: AVPlaybackStatus) => {
            if (status.isLoaded && status.didJustFinish) {
              setIsPlaying(false)
            }
          })
        } catch (error) {
          console.error('Error playing recording:', error)
        }
      }
    
      const stopPlayback = async () => {
        if (sound.current) {
          await sound.current.stopAsync()
          setIsPlaying(false)
        }
      }
    
      const deleteRecording = async (recordingId: string) => {
        const recordingToDelete = recordings.find(r => r.id === recordingId)
        if (recordingToDelete) {
          try {
            await FileSystem.deleteAsync(recordingToDelete.uri)
            setRecordings(prev => prev.filter(r => r.id !== recordingId))
          } catch (error) {
            console.error('Error deleting recording:', error)
          }
        }
      }
    
      return {
        recording,
        isRecording,
        isPlaying,
        recordings,
        startRecording,
        stopRecording,
        playRecording,
        stopPlayback,
        deleteRecording,
      }
    }

    Recording UI Component

    // components/RecordingButton.tsx
    import React, { useRef, useEffect } from 'react'
    import { View, TouchableOpacity, Text, Animated } from 'react-native'
    import { useAudioRecording } from '../hooks/useAudioRecording'
    
    export const RecordingButton = () => {
      const { isRecording, startRecording, stopRecording } = useAudioRecording()
      const pulseAnim = useRef(new Animated.Value(1)).current
    
      useEffect(() => {
        if (isRecording) {
          const pulseAnimation = Animated.loop(
            Animated.sequence([
              Animated.timing(pulseAnim, {
                toValue: 1.2,
                duration: 1000,
                useNativeDriver: true,
              }),
              Animated.timing(pulseAnim, {
                toValue: 1,
                duration: 1000,
                useNativeDriver: true,
              }),
            ])
          )
          pulseAnimation.start()
        } else {
          pulseAnim.setValue(1)
        }
      }, [isRecording])
    
      const handlePress = async () => {
        if (isRecording) {
          await stopRecording()
        } else {
          await startRecording()
        }
      }
    
      return (
        <View style={{ alignItems: 'center', justifyContent: 'center' }}>
          <Animated.View
            style={{
              transform: [{ scale: pulseAnim }],
            }}
          >
            <TouchableOpacity
              onPress={handlePress}
              style={{
                width: 80,
                height: 80,
                borderRadius: 40,
                backgroundColor: isRecording ? '#FF3B30' : '#007AFF',
                alignItems: 'center',
                justifyContent: 'center',
                shadowColor: '#000',
                shadowOffset: { width: 0, height: 2 },
                shadowOpacity: 0.25,
                shadowRadius: 4,
                elevation: 5,
              }}
            >
              <View
                style={{
                  width: isRecording ? 20 : 30,
                  height: isRecording ? 20 : 30,
                  borderRadius: isRecording ? 4 : 15,
                  backgroundColor: 'white',
                }}
              />
            </TouchableOpacity>
          </Animated.View>
          
          <Text style={{
            marginTop: 12,
            fontSize: 16,
            fontWeight: '500',
            color: isRecording ? '#FF3B30' : '#007AFF'
          }}>
            {isRecording ? 'Stop' : 'Record'}
          </Text>
        </View>
      )
    }

    Whisper Transcription Setup

    Convex Function for Transcription

    // convex/functions/transcribe-audio/index.ts
    import { serve } from 'https://deno.land/std@0.168.0/http/server.ts'
    import { createClient } from 'https://esm.sh/@convex/convex-js@2'
    
    const openaiApiKey = Deno.env.get('OPENAI_API_KEY')
    const convexUrl = Deno.env.get('CONVEX_URL')
    const convexServiceKey = Deno.env.get('CONVEX_DEPLOY_KEY')
    
    const corsHeaders = {
      'Access-Control-Allow-Origin': '*',
      'Access-Control-Allow-Headers': 'authorization, x-client-info, apikey, content-type',
    }
    
    serve(async (req) => {
      if (req.method === 'OPTIONS') {
        return new Response('ok', { headers: corsHeaders })
      }
    
      try {
        const formData = await req.formData()
        const audioFile = formData.get('audio') as File
        const userId = formData.get('userId') as string
        const recordingId = formData.get('recordingId') as string
    
        if (!audioFile || !userId) {
          return new Response(
            JSON.stringify({ error: 'Missing audio file or user ID' }),
            { status: 400, headers: corsHeaders }
          )
        }
    
        // Initialize Convex client
        const convex = createClient(convexUrl!, convexServiceKey!)
    
        // Check user's transcription credits
        const { data: user, error: userError } = await convex
          .from('users')
          .select('transcription_credits, subscription_tier')
          .eq('id', userId)
          .single()
    
        if (userError || !user) {
          return new Response(
            JSON.stringify({ error: 'User not found' }),
            { status: 404, headers: corsHeaders }
          )
        }
    
        if (user.subscription_tier === 'free' && user.transcription_credits <= 0) {
          return new Response(
            JSON.stringify({ error: 'No transcription credits remaining' }),
            { status: 402, headers: corsHeaders }
          )
        }
    
        // Prepare form data for Whisper API
        const whisperFormData = new FormData()
        whisperFormData.append('file', audioFile)
        whisperFormData.append('model', 'whisper-1')
        whisperFormData.append('response_format', 'verbose_json')
        whisperFormData.append('language', 'en') // omit this line to let Whisper auto-detect
    
        // Call OpenAI Whisper API
        const response = await fetch('https://api.openai.com/v1/audio/transcriptions', {
          method: 'POST',
          headers: {
            'Authorization': `Bearer ${openaiApiKey}`,
          },
          body: whisperFormData,
        })
    
        if (!response.ok) {
          const error = await response.text()
          console.error('Whisper API error:', error)
          return new Response(
            JSON.stringify({ error: 'Transcription failed' }),
            { status: 500, headers: corsHeaders }
          )
        }
    
        const transcriptionData = await response.json()
    
        // Store transcription in database
        const { error: insertError } = await convex
          .from('transcriptions')
          .insert({
            id: recordingId,
            user_id: userId,
            transcription: transcriptionData.text,
            segments: transcriptionData.segments,
            language: transcriptionData.language,
            duration: transcriptionData.duration,
            created_at: new Date().toISOString(),
          })
    
        if (insertError) {
          console.error('Database insert error:', insertError)
          return new Response(
            JSON.stringify({ error: 'Failed to save transcription' }),
            { status: 500, headers: corsHeaders }
          )
        }
    
        // Deduct credit (if not unlimited)
        if (user.subscription_tier !== 'unlimited') {
          await convex
            .from('users')
            .update({ 
              transcription_credits: Math.max(0, user.transcription_credits - 1) 
            })
            .eq('id', userId)
        }
    
        return new Response(
          JSON.stringify({
            transcription: transcriptionData.text,
            segments: transcriptionData.segments,
            language: transcriptionData.language,
            duration: transcriptionData.duration,
          }),
          { headers: { ...corsHeaders, 'Content-Type': 'application/json' } }
        )
    
      } catch (error) {
        console.error('Transcription error:', error)
        return new Response(
          JSON.stringify({ error: 'Internal server error' }),
          { status: 500, headers: corsHeaders }
        )
      }
    })
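
    One thing the function above does not handle is transient failures: Whisper calls occasionally hit rate limits or network blips, and a single failed fetch currently surfaces straight to the user. A small retry helper with exponential backoff (my own illustrative addition, not part of the original function) covers most of these:

    ```typescript
    // Retry an async call with exponential backoff: 500ms, 1s, 2s, ...
    // Rethrows the last error once all attempts are exhausted.
    async function withRetry<T>(
      fn: () => Promise<T>,
      attempts = 3,
      baseDelayMs = 500,
    ): Promise<T> {
      let lastError: unknown
      for (let i = 0; i < attempts; i++) {
        try {
          return await fn()
        } catch (err) {
          lastError = err
          if (i < attempts - 1) {
            await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i))
          }
        }
      }
      throw lastError
    }
    ```

    Wrapping the Whisper fetch in `withRetry(() => fetch(...))` is usually enough; anything still failing after three attempts is worth reporting to the user.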

    React Native Transcription Hook

    // hooks/useTranscription.ts
    import { useState } from 'react'
    import { convex } from '../lib/convex'
    import { useAuth } from './useAuth'
    
    export interface TranscriptionResult {
      transcription: string
      segments?: Array<{
        start: number
        end: number
        text: string
      }>
      language?: string
      duration?: number
    }
    
    export const useTranscription = () => {
      const [isTranscribing, setIsTranscribing] = useState(false)
      const { user } = useAuth()
    
      const transcribeAudio = async (
        audioUri: string, 
        recordingId: string
      ): Promise<TranscriptionResult> => {
        if (!user) {
          throw new Error('User not authenticated')
        }
    
        setIsTranscribing(true)
    
        try {
          // Create form data
          const formData = new FormData()
          formData.append('audio', {
            uri: audioUri,
            type: 'audio/m4a',
            name: 'recording.m4a',
          } as any)
          formData.append('userId', user.id)
          formData.append('recordingId', recordingId)
    
          // Call transcription function
          const { data, error } = await convex.action('transcribe-audio', {
            body: formData,
          })
    
          if (error) {
            console.error('Transcription error:', error)
            throw error
          }
    
          return data
        } catch (error) {
          console.error('Transcription failed:', error)
          throw error
        } finally {
          setIsTranscribing(false)
        }
      }
    
      const getStoredTranscription = async (recordingId: string) => {
        const { data, error } = await convex
          .from('transcriptions')
          .select('*')
          .eq('id', recordingId)
          .single()
    
        if (error) {
          console.error('Error fetching transcription:', error)
          return null
        }
    
        return data
      }
    
      return {
        transcribeAudio,
        getStoredTranscription,
        isTranscribing,
      }
    }

    AI Summarization with GPT-4

    Summarization Edge Function

    // convex/functions/summarize-transcription/index.ts
    // (env vars, corsHeaders, and the Convex client are set up as in transcribe-audio above)
    serve(async (req) => {
      const { transcription, userId, recordingId, summaryType = 'brief' } = await req.json()
    
      // Check the user's AI credits before spending tokens
      const { data: user, error: userError } = await convex
        .from('users')
        .select('ai_credits, subscription_tier')
        .eq('id', userId)
        .single()

      if (userError || !user) {
        return new Response(
          JSON.stringify({ error: 'User not found' }),
          { status: 404, headers: corsHeaders }
        )
      }

      if (user.subscription_tier === 'free' && user.ai_credits <= 0) {
        return new Response(
          JSON.stringify({ error: 'No AI credits remaining' }),
          { status: 402, headers: corsHeaders }
        )
      }
    
      const prompts = {
        brief: `Summarize this transcription in 2-3 bullet points, focusing on key information:
    
    ${transcription}`,
        
        detailed: `Analyze this transcription and provide:
    1. Main topics discussed
    2. Key decisions or action items
    3. Important dates, numbers, or names mentioned
    4. Overall summary
    
    Transcription:
    ${transcription}`,
        
        actionItems: `Extract action items and next steps from this transcription:
    
    ${transcription}
    
    Format as:
    - [ ] Action item 1
    - [ ] Action item 2
    etc.`
      }
    
      try {
        const response = await fetch('https://api.openai.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            'Authorization': `Bearer ${openaiApiKey}`,
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'gpt-4',
            messages: [
              { 
                role: 'system', 
                content: 'You are a helpful assistant that summarizes voice memos clearly and concisely.' 
              },
              { 
                role: 'user', 
                content: prompts[summaryType] || prompts.brief 
              }
            ],
            max_tokens: 500,
            temperature: 0.3,
          }),
        })
    
        const data = await response.json()
        const summary = data.choices[0].message.content
    
        // Store summary in database
        await convex
          .from('summaries')
          .upsert({
            recording_id: recordingId,
            user_id: userId,
            summary_type: summaryType,
            summary: summary,
            created_at: new Date().toISOString(),
          })
    
        // Deduct credit
        if (user.subscription_tier !== 'unlimited') {
          await convex
            .from('users')
            .update({ ai_credits: Math.max(0, user.ai_credits - 1) })
            .eq('id', userId)
        }
    
        return new Response(
          JSON.stringify({ summary }),
          { headers: corsHeaders }
        )
    
      } catch (error) {
        return new Response(
          JSON.stringify({ error: 'Summarization failed' }),
          { status: 500, headers: corsHeaders }
        )
      }
    })

    React Native Summary Hook

    // hooks/useSummary.ts
    import { useState } from 'react'
    import { convex } from '../lib/convex'
    import { useAuth } from './useAuth'

    export const useSummary = () => {
      const [isSummarizing, setIsSummarizing] = useState(false)
      const { user } = useAuth()
    
      const generateSummary = async (
        transcription: string,
        recordingId: string,
        summaryType: 'brief' | 'detailed' | 'actionItems' = 'brief'
      ) => {
        setIsSummarizing(true)
    
        try {
          const { data, error } = await convex.action('summarize-transcription', {
            body: { 
              transcription, 
              userId: user?.id, 
              recordingId, 
              summaryType 
            }
          })
    
          if (error) throw error
          return data.summary
        } catch (error) {
          console.error('Summary generation failed:', error)
          throw error
        } finally {
          setIsSummarizing(false)
        }
      }
    
      const getStoredSummary = async (recordingId: string, summaryType: string) => {
        const { data, error } = await convex
          .from('summaries')
          .select('summary')
          .eq('recording_id', recordingId)
          .eq('summary_type', summaryType)
          .single()
    
        return error ? null : data?.summary
      }
    
      return { generateSummary, getStoredSummary, isSummarizing }
    }

    Smart Categorization

    Auto-categorization with GPT-4

    // convex/functions/categorize-recording/index.ts
    // (env vars, corsHeaders, and the Convex client are set up as in transcribe-audio above)
    serve(async (req) => {
      const { transcription, userId, recordingId } = await req.json()
    
      const categories = [
        'Work/Business',
        'Personal/Ideas', 
        'Meeting Notes',
        'Shopping/Tasks',
        'Health/Medical',
        'Creative/Projects',
        'Learning/Education',
        'Other'
      ]
    
      const prompt = `Categorize this voice memo transcription into one of these categories: ${categories.join(', ')}
    
    Also suggest 2-3 relevant tags for better organization.
    
    Transcription: ${transcription}
    
    Respond in JSON format:
    {
      "category": "chosen category",
      "tags": ["tag1", "tag2", "tag3"],
      "confidence": 0.95
    }`
    
      try {
        const response = await fetch('https://api.openai.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            'Authorization': `Bearer ${openaiApiKey}`,
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'gpt-4',
            messages: [
              { 
                role: 'system', 
                content: 'You are an expert at categorizing and tagging voice memos. Always respond with valid JSON.' 
              },
              { role: 'user', content: prompt }
            ],
            max_tokens: 200,
            temperature: 0.1,
          }),
        })
    
        const data = await response.json()
        const result = JSON.parse(data.choices[0].message.content)
    
        // Store categorization
        await convex
          .from('transcriptions')
          .update({
            category: result.category,
            tags: result.tags,
            categorization_confidence: result.confidence,
          })
          .eq('id', recordingId)
    
        return new Response(
          JSON.stringify(result),
          { headers: corsHeaders }
        )
    
      } catch (error) {
        return new Response(
          JSON.stringify({ error: 'Categorization failed' }),
          { status: 500, headers: corsHeaders }
        )
      }
    })
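
    A caveat about the bare `JSON.parse` above: even with "Always respond with valid JSON" in the system prompt, GPT-4 sometimes wraps its answer in a Markdown code fence, which makes parsing throw. A defensive parser (my own illustrative helper, not from the original function) strips an optional fence and fails soft:

    ```typescript
    // Parse JSON from a model response, tolerating an optional Markdown
    // code fence around it. Returns null instead of throwing on bad output.
    function parseModelJson<T>(content: string): T | null {
      const stripped = content
        .trim()
        .replace(/^```(?:json)?\s*/i, '')
        .replace(/\s*```$/, '')
      try {
        return JSON.parse(stripped) as T
      } catch {
        return null
      }
    }
    ```

    On a `null` result you can fall back to the 'Other' category rather than failing the whole request.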

    Search and Organization

    SQLite Database Schema

    -- Store recording metadata
    CREATE TABLE recordings (
      id TEXT PRIMARY KEY,
      user_id TEXT NOT NULL,
      uri TEXT NOT NULL,
      duration INTEGER,
      created_at DATETIME,
      title TEXT,
      category TEXT,
      tags TEXT, -- JSON array
      is_favorited BOOLEAN DEFAULT 0
    );
    
    -- Store transcriptions
    CREATE TABLE transcriptions (
      id TEXT PRIMARY KEY,
      recording_id TEXT,
      transcription TEXT,
      language TEXT,
      confidence REAL,
      FOREIGN KEY (recording_id) REFERENCES recordings (id)
    );
    
    -- Full-text search index
    CREATE VIRTUAL TABLE transcriptions_fts USING fts5(
      transcription,
      content='transcriptions',
      content_rowid='rowid'
    );
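
    One catch with this schema: an external-content fts5 table (`content='transcriptions'`) is not populated automatically, so as written, searches would never return anything. The standard fix is a set of triggers that mirror inserts, deletes, and updates into the index (sketch below; adjust names if your schema differs):

    ```sql
    -- Keep the FTS index in sync with the transcriptions table
    CREATE TRIGGER transcriptions_ai AFTER INSERT ON transcriptions BEGIN
      INSERT INTO transcriptions_fts(rowid, transcription)
      VALUES (new.rowid, new.transcription);
    END;

    CREATE TRIGGER transcriptions_ad AFTER DELETE ON transcriptions BEGIN
      INSERT INTO transcriptions_fts(transcriptions_fts, rowid, transcription)
      VALUES ('delete', old.rowid, old.transcription);
    END;

    CREATE TRIGGER transcriptions_au AFTER UPDATE ON transcriptions BEGIN
      INSERT INTO transcriptions_fts(transcriptions_fts, rowid, transcription)
      VALUES ('delete', old.rowid, old.transcription);
      INSERT INTO transcriptions_fts(rowid, transcription)
      VALUES (new.rowid, new.transcription);
    END;
    ```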

    Search Hook

    // hooks/useSearch.ts
    import { useState } from 'react'
    import * as SQLite from 'expo-sqlite'

    const db = SQLite.openDatabase('yapperx.db')
    
    export const useSearch = () => {
      const [searchResults, setSearchResults] = useState([])
      const [isSearching, setIsSearching] = useState(false)
    
      const searchRecordings = async (query: string, filters?: {
        category?: string
        tags?: string[]
        dateRange?: { start: Date; end: Date }
      }) => {
        setIsSearching(true)
    
        try {
          let sql = `
            SELECT 
              r.id,
              r.title,
              r.created_at,
              r.category,
              r.tags,
              t.transcription,
              highlight(transcriptions_fts, 0, '<mark>', '</mark>') as highlighted_text
            FROM recordings r
            LEFT JOIN transcriptions t ON r.id = t.recording_id
            LEFT JOIN transcriptions_fts ON transcriptions_fts.rowid = t.rowid
            WHERE 1=1
          `
    
          const params = []
    
          // Add search query
          if (query.trim()) {
            sql += ' AND transcriptions_fts MATCH ?'
            params.push(query)
          }
    
          // Add filters
          if (filters?.category) {
            sql += ' AND r.category = ?'
            params.push(filters.category)
          }
    
          if (filters?.dateRange) {
            sql += ' AND r.created_at BETWEEN ? AND ?'
            params.push(
              filters.dateRange.start.toISOString(),
              filters.dateRange.end.toISOString()
            )
          }
    
          sql += ' ORDER BY r.created_at DESC LIMIT 50'
    
          return new Promise((resolve, reject) => {
            db.transaction(tx => {
              tx.executeSql(
                sql,
                params,
                (_, { rows }) => {
                  const results = []
                  for (let i = 0; i < rows.length; i++) {
                    results.push(rows.item(i))
                  }
                  setSearchResults(results)
                  resolve(results)
                },
                (_, error) => {
                  console.error('Search error:', error)
                  reject(error)
                  return true
                }
              )
            })
          })
        } catch (error) {
          console.error('Search failed:', error)
        } finally {
          setIsSearching(false)
        }
      }
    
      return { searchRecordings, searchResults, isSearching }
    }
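
    Note that passing raw user input to `MATCH` is fragile: FTS5 treats quotes, `AND`, `OR`, and `*` as query syntax, so a stray character can change the query or throw. A small sanitizer (illustrative, not from the original hook) quotes each term so it is matched literally:

    ```typescript
    // Turn free-form user input into a safe FTS5 query: each whitespace-
    // separated term becomes a quoted phrase, with inner quotes doubled.
    function toFtsQuery(input: string): string {
      return input
        .trim()
        .split(/\s+/)
        .filter(Boolean)
        .map((term) => `"${term.replace(/"/g, '""')}"`)
        .join(' ')
    }
    ```

    In `searchRecordings`, pass `toFtsQuery(query)` as the `MATCH` parameter instead of `query` itself.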

    UI/UX for Voice Apps

    Recording List Component

    // components/RecordingsList.tsx
    import React from 'react'
    import { FlatList, View, Text, TouchableOpacity } from 'react-native'
    import { format } from 'date-fns'
    
    interface Recording {
      id: string
      uri: string
      title?: string
      transcription?: string
      category?: string
      createdAt: Date
      duration: number
    }
    
    export const RecordingsList = ({ 
      recordings, 
      onPlayRecording,
      onRecordingPress 
    }: {
      recordings: Recording[]
      onPlayRecording: (uri: string) => void
      onRecordingPress: (recording: Recording) => void
    }) => {
      const formatDuration = (milliseconds: number) => {
        const seconds = Math.floor(milliseconds / 1000)
        const minutes = Math.floor(seconds / 60)
        const remainingSeconds = seconds % 60
        return `${minutes}:${remainingSeconds.toString().padStart(2, '0')}`
      }
    
      const renderRecording = ({ item }: { item: Recording }) => {
        const previewText = item.transcription
          ? item.transcription.substring(0, 100) +
            (item.transcription.length > 100 ? '...' : '')
          : ''
    
        return (
          <TouchableOpacity 
            onPress={() => onRecordingPress(item)}
            style={{
              backgroundColor: 'white',
              padding: 16,
              marginHorizontal: 16,
              marginVertical: 8,
              borderRadius: 12,
              shadowColor: '#000',
              shadowOffset: { width: 0, height: 1 },
              shadowOpacity: 0.22,
              shadowRadius: 2.22,
              elevation: 3,
            }}
          >
            <View style={{ flexDirection: 'row', justifyContent: 'space-between', alignItems: 'flex-start' }}>
              <View style={{ flex: 1 }}>
                <Text style={{ fontSize: 16, fontWeight: '600', marginBottom: 4 }}>
                  {item.title || 'Untitled Recording'}
                </Text>
                
                <Text style={{ fontSize: 14, color: '#666', marginBottom: 8 }}>
                  {format(new Date(item.createdAt), 'MMM d, h:mm a')} • {formatDuration(item.duration)}
                </Text>
    
                {item.category && (
                  <View style={{
                    backgroundColor: '#E3F2FD',
                    paddingHorizontal: 8,
                    paddingVertical: 4,
                    borderRadius: 12,
                    alignSelf: 'flex-start',
                    marginBottom: 8,
                  }}>
                    <Text style={{ fontSize: 12, color: '#1976D2' }}>
                      {item.category}
                    </Text>
                  </View>
                )}
    
                {previewText && (
                  <Text style={{ fontSize: 14, color: '#333', lineHeight: 20 }}>
                    {previewText}
                  </Text>
                )}
              </View>
    
              <TouchableOpacity
                onPress={() => onPlayRecording(item.uri)}
                style={{
                  width: 44,
                  height: 44,
                  borderRadius: 22,
                  backgroundColor: '#007AFF',
                  alignItems: 'center',
                  justifyContent: 'center',
                  marginLeft: 12,
                }}
              >
                <Text style={{ color: 'white', fontSize: 16 }}>▶</Text>
              </TouchableOpacity>
            </View>
          </TouchableOpacity>
        )
      }
    
      return (
        <FlatList
          data={recordings}
          keyExtractor={item => item.id}
          renderItem={renderRecording}
          style={{ flex: 1 }}
          showsVerticalScrollIndicator={false}
        />
      )
    }

    Recording Detail Screen

    // screens/RecordingDetailScreen.tsx
    import React, { useState } from 'react'
    import { ScrollView, View, Text, TouchableOpacity } from 'react-native'
    import { format } from 'date-fns'
    import { useSummary } from '../hooks/useSummary'

    export const RecordingDetailScreen = ({ route, navigation }) => {
      const { recording } = route.params
      const { generateSummary, isSummarizing } = useSummary()
      const [summary, setSummary] = useState('')
      const [summaryType, setSummaryType] = useState<'brief' | 'detailed' | 'actionItems'>('brief')
    
      const handleGenerateSummary = async () => {
        try {
          const newSummary = await generateSummary(
            recording.transcription, 
            recording.id, 
            summaryType
          )
          setSummary(newSummary)
        } catch (error) {
          console.error('Summary generation failed:', error)
        }
      }
    
      return (
        <ScrollView style={{ flex: 1, backgroundColor: '#F5F5F5' }}>
          {/* Recording Info */}
          <View style={{ backgroundColor: 'white', padding: 20, marginBottom: 12 }}>
            <Text style={{ fontSize: 24, fontWeight: 'bold', marginBottom: 8 }}>
              {recording.title || 'Untitled Recording'}
            </Text>
            
            <View style={{ flexDirection: 'row', alignItems: 'center', marginBottom: 16 }}>
              <Text style={{ color: '#666', marginRight: 16 }}>
                {format(new Date(recording.createdAt), 'MMM d, yyyy h:mm a')}
              </Text>
              <Text style={{ color: '#666' }}>
                {formatDuration(recording.duration)}
              </Text>
            </View>
    
            <PlaybackControls recording={recording} />
          </View>
    
          {/* Transcription */}
          <View style={{ backgroundColor: 'white', padding: 20, marginBottom: 12 }}>
            <Text style={{ fontSize: 18, fontWeight: '600', marginBottom: 12 }}>
              Transcription
            </Text>
            <Text style={{ fontSize: 16, lineHeight: 24, color: '#333' }}>
              {recording.transcription}
            </Text>
          </View>
    
          {/* Summary Section */}
          <View style={{ backgroundColor: 'white', padding: 20, marginBottom: 12 }}>
            <View style={{ flexDirection: 'row', justifyContent: 'space-between', alignItems: 'center', marginBottom: 16 }}>
              <Text style={{ fontSize: 18, fontWeight: '600' }}>AI Summary</Text>
              
              <View style={{ flexDirection: 'row' }}>
                {(['brief', 'detailed', 'actionItems'] as const).map((type) => (
                  <TouchableOpacity
                    key={type}
                    onPress={() => setSummaryType(type)}
                    style={{
                      paddingHorizontal: 12,
                      paddingVertical: 6,
                      borderRadius: 16,
                      backgroundColor: summaryType === type ? '#007AFF' : '#F0F0F0',
                      marginLeft: 8,
                    }}
                  >
                    <Text style={{
                      fontSize: 12,
                      color: summaryType === type ? 'white' : '#333',
                      textTransform: 'capitalize',
                    }}>
                      {type}
                    </Text>
                  </TouchableOpacity>
                ))}
              </View>
            </View>
    
            {summary ? (
              <Text style={{ fontSize: 16, lineHeight: 24, color: '#333' }}>
                {summary}
              </Text>
            ) : (
              <TouchableOpacity
                onPress={handleGenerateSummary}
                disabled={isSummarizing}
                style={{
                  backgroundColor: '#007AFF',
                  padding: 16,
                  borderRadius: 8,
                  alignItems: 'center',
                }}
              >
                <Text style={{ color: 'white', fontWeight: '600' }}>
                  {isSummarizing ? 'Generating...' : 'Generate Summary'}
                </Text>
              </TouchableOpacity>
            )}
          </View>
        </ScrollView>
      )
    }
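
    The detail screen calls `format` from date-fns and a `formatDuration` helper whose definition isn't shown above. A minimal version, assuming durations are stored in seconds, could look like this:

    ```typescript
    // Format a duration in seconds as m:ss (e.g. 95 -> "1:35")
    const formatDuration = (seconds: number): string => {
      const mins = Math.floor(seconds / 60)
      const secs = Math.floor(seconds % 60)
      return `${mins}:${secs.toString().padStart(2, '0')}`
    }

    console.log(formatDuration(95)) // "1:35"
    ```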

    Performance Optimization

    Lazy Loading and Virtualization

    // Use FlatList for large recording lists
    const ITEM_HEIGHT = 72 // fixed row height that getItemLayout relies on

    const RecordingsList = () => {
      const getItemLayout = (data, index) => ({
        length: ITEM_HEIGHT,
        offset: ITEM_HEIGHT * index,
        index,
      })
    
      return (
        <FlatList
          data={recordings}
          renderItem={renderItem}
          getItemLayout={getItemLayout} // Optimize scrolling
          removeClippedSubviews={true} // Memory optimization
          maxToRenderPerBatch={10} // Render in batches
          windowSize={10} // Keep items in memory
          initialNumToRender={5} // Initial render count
        />
      )
    }

    Audio File Optimization

    // expo-av has no post-recording compression API, so instead record
    // at a voice-optimized bitrate from the start
    const VOICE_RECORDING_OPTIONS: Audio.RecordingOptions = {
      ...Audio.RecordingOptionsPresets.LOW_QUALITY,
      android: {
        ...Audio.RecordingOptionsPresets.LOW_QUALITY.android,
        bitRate: 64000, // 64kbps is sufficient for voice
        sampleRate: 22050, // lower sample rate for voice
      },
      ios: {
        ...Audio.RecordingOptionsPresets.LOW_QUALITY.ios,
        bitRate: 64000,
        sampleRate: 22050,
      },
    }
    // Pass to Audio.Recording.createAsync(VOICE_RECORDING_OPTIONS)
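
    To sanity-check that 64 kbps is cheap enough to keep recordings on-device, the storage math is simple: bits per second divided by 8, times duration. A throwaway helper to illustrate:

    ```typescript
    // Estimate file size in bytes for a given bitrate (bits/s) and duration (s)
    const estimateAudioBytes = (bitrateBps: number, durationSec: number): number =>
      (bitrateBps / 8) * durationSec

    // One minute of 64kbps voice audio:
    console.log(estimateAudioBytes(64000, 60)) // 480000 bytes, about 470 KB
    ```

    At roughly half a megabyte per minute, even the free tier's 100MB cap holds hours of audio.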

    Offline-First Architecture

    // Queue transcriptions for when online
    const queueTranscription = async (recordingId: string, audioUri: string) => {
      await AsyncStorage.setItem(`pending_transcription_${recordingId}`, JSON.stringify({
        recordingId,
        audioUri,
        timestamp: Date.now(),
      }))
    }
    
    // Process queued transcriptions when online
    const processQueuedTranscriptions = async () => {
      const keys = await AsyncStorage.getAllKeys()
      const pendingKeys = keys.filter(key => key.startsWith('pending_transcription_'))
      
      for (const key of pendingKeys) {
        try {
          const data = JSON.parse(await AsyncStorage.getItem(key) || '{}')
          await transcribeAudio(data.audioUri, data.recordingId)
          await AsyncStorage.removeItem(key)
        } catch (error) {
          // Keep in queue for retry
        }
      }
    }
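
    One failure mode of the retry loop above: an item that keeps failing stays queued forever. A small pure helper (hypothetical, matching the queued JSON shape above) can decide when to give up on an entry:

    ```typescript
    interface QueuedTranscription {
      recordingId: string
      audioUri: string
      timestamp: number
    }

    // True if a queued item is older than maxAgeMs and should be dropped
    const isStale = (
      item: QueuedTranscription,
      now: number,
      maxAgeMs = 7 * 24 * 60 * 60 * 1000 // give up after one week
    ): boolean => now - item.timestamp > maxAgeMs

    const now = Date.now()
    console.log(isStale({ recordingId: 'r1', audioUri: 'file://a.m4a', timestamp: now - 1000 }, now)) // false
    ```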

    Monetization Strategy

    Freemium Model

    const CREDIT_LIMITS = {
      free: {
        transcriptions: 10,
        summaries: 5,
        storage: '100MB',
      },
      pro: {
        transcriptions: 500,
        summaries: 200,
        storage: '10GB',
        ai_analysis: true,
      },
      unlimited: {
        transcriptions: -1, // unlimited
        summaries: -1,
        storage: '100GB',
        ai_analysis: true,
        priority_support: true,
      }
    }
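
    The `-1` sentinel in CREDIT_LIMITS means any gating code needs a dedicated check. A sketch of that gate (the helper name is mine, not YapperX's actual code):

    ```typescript
    // Gate an AI feature on remaining credits; a limit of -1 means unlimited
    const hasCredits = (limit: number, used: number): boolean =>
      limit === -1 || used < limit

    console.log(hasCredits(10, 10)) // false: free-tier cap reached
    console.log(hasCredits(-1, 9999)) // true: unlimited tier
    ```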

    Pricing Strategy

    YapperX pricing (based on actual data):

    • Free: 10 transcriptions/month
    • Pro Monthly: $4.99
    • Pro Annual: $29.99
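
    Note the relationship between the two paid tiers: the annual plan works out to roughly half the effective monthly rate, a common anchoring pattern. The arithmetic:

    ```typescript
    const monthly = 4.99
    const annual = 29.99

    const effectiveMonthly = annual / 12 // about $2.50/month
    const savings = 1 - annual / (monthly * 12) // about 0.50, i.e. ~50% off

    console.log(effectiveMonthly.toFixed(2), (savings * 100).toFixed(0) + '%')
    // "2.50 50%"
    ```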

    Paywall Placement

    // Show paywall after user gets value
    const checkAndShowPaywall = (transcriptionsUsed: number) => {
      if (transcriptionsUsed >= 10) {
        // Hard limit reached
        showPaywall('limit_reached')
      } else if (transcriptionsUsed >= 8) {
        // Second paywall before hitting the limit
        showPaywall('approaching_limit')
      } else if (transcriptionsUsed === 3) {
        // First paywall after they've seen the value
        showPaywall('first_value_experienced')
      }
    }

    Real User Feedback and Lessons

    What Users Love

    "Finally, I can find that important thing I recorded 3 months ago"

    "The AI summaries are surprisingly good"

    "Works perfectly offline, syncs when I'm back online"

    What Users Complained About

    "Transcription takes too long" (Fixed with optimistic UI updates)

    "App crashes with long recordings" (Fixed with audio compression)

    "Can't organize recordings" (Fixed with categories and tags)

    Lessons Learned

    1. Audio Quality Matters

    Users will abandon if transcription quality is poor. Spend time on recording settings.

    2. Offline-First is Critical

    Voice memos are often recorded when connectivity is poor. Build for offline.

    3. AI Adds Real Value

    Summaries and categorization are the features users pay for, not just transcription.

    4. Search is Everything

    The ability to search through all recordings is what keeps users engaged long-term.
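
    The idea behind that search is easy to prototype without a database. A naive in-memory sketch (a real app would push this into SQLite full-text search, as described in the architecture section):

    ```typescript
    interface Recording {
      id: string
      title: string
      transcription: string
    }

    // Naive full-text search: match recordings whose title or transcription
    // contains every query token (case-insensitive)
    const searchRecordings = (recordings: Recording[], query: string): Recording[] => {
      const tokens = query.toLowerCase().split(/\s+/).filter(Boolean)
      return recordings.filter(r => {
        const haystack = `${r.title} ${r.transcription}`.toLowerCase()
        return tokens.every(t => haystack.includes(t))
      })
    }

    const results = searchRecordings(
      [{ id: '1', title: 'Standup', transcription: 'Ship the invoice feature by Friday' }],
      'invoice friday'
    )
    console.log(results.length) // 1
    ```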

    5. Simple UI Wins

    Voice apps should be fast and simple. Complex interfaces kill the experience.

    Next Steps

    To build your own voice memo app with AI:

  • Start with basic recording: Get audio recording working perfectly first
  • Add transcription: Use the Whisper setup above
  • Build search: Full-text search is crucial for retention
  • Add AI features: Summaries and categorization drive conversions
  • Optimize performance: Focus on offline-first and fast UI

    Want the complete setup? Ship React Native includes:

    • Complete voice memo app template
    • Whisper transcription ready to go
    • AI summarization and categorization
    • Search and organization features
    • Monetization setup with RevenueCat

    Get Ship React Native and start building your AI-powered voice app today.


    Written by Paweł Karniej, creator of YapperX. Follow @thepawelk for more real-world React Native insights.