olich97/vokala

πŸ—£οΈ Vokala

A powerful real-time speech translation Android app that combines cutting-edge audio processing with machine learning to break down language barriers.

✨ Features

  • Real-time Voice Activity Detection - Advanced WebRTC VAD for accurate speech detection
  • Multi-speaker Recognition - Identify and track different speakers in conversations
  • Audio Visualization - Real-time waveform and spectrum analysis
  • Rust-powered Audio Engine - High-performance native audio processing
  • Modern Android UI - Built with Jetpack Compose for a smooth user experience
  • Offline-first Architecture - Designed to run fully on-device without internet connectivity (planned)

πŸ—οΈ Architecture

Vokala uses a hybrid architecture combining the best of native Android development with high-performance Rust:

  • Android Frontend: Kotlin + Jetpack Compose for the UI
  • Rust Core: Native audio processing with WebRTC VAD and signal analysis
  • ML Pipeline: TensorFlow Lite integration for on-device inference
  • JNI Bridge: Seamless communication between Kotlin and Rust
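
As a rough illustration of the JNI bridge listed above, the sketch below exports a Rust function that Kotlin could call. The class name `com.vokala.NativeBridge`, the function `frameSizeSamples`, and its parameters are assumptions made for this example and are not taken from the Vokala sources. It passes only primitive values, so no `JNIEnv` calls are needed; real audio buffers would normally be handled through the `jni` crate or a direct buffer.

```rust
use std::ffi::c_void;

/// Number of samples in one audio frame for a given sample rate and frame length.
/// Returns 0 for non-positive inputs so the caller never sees a negative size.
fn frame_size_samples(sample_rate_hz: i32, frame_ms: i32) -> i32 {
    if sample_rate_hz <= 0 || frame_ms <= 0 {
        return 0;
    }
    // Widen to i64 so the multiplication cannot overflow i32.
    let samples = (sample_rate_hz as i64 * frame_ms as i64) / 1000;
    samples.min(i32::MAX as i64) as i32
}

/// JNI entry point for a hypothetical Kotlin class `com.vokala.NativeBridge`.
/// The matching Kotlin declaration would be:
///   external fun frameSizeSamples(sampleRateHz: Int, frameMs: Int): Int
/// On toolchains older than Rust 1.82, write `#[no_mangle]` instead.
#[unsafe(no_mangle)]
#[allow(non_snake_case)]
pub extern "system" fn Java_com_vokala_NativeBridge_frameSizeSamples(
    _env: *mut c_void,  // JNIEnv*, unused because only primitives are exchanged
    _this: *mut c_void, // jobject or jclass, unused
    sample_rate_hz: i32, // jint
    frame_ms: i32,       // jint
) -> i32 {
    frame_size_samples(sample_rate_hz, frame_ms)
}
```

Keeping the arithmetic in a plain Rust function and the JNI export as a thin wrapper lets `cargo test` cover the logic without a JVM.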

🚀 Current Status

✅ Phase 1: Core Audio Pipeline (Completed)

  • Rust audio processing with WebRTC VAD
  • Real-time audio visualization
  • Speaker identification framework
  • Audio debug panel

🔄 Phase 2: Advanced Speech Recognition (In Progress)

  • Whisper model integration for offline ASR
  • Language identification
  • Mobile performance optimization
  • Multi-language support

📋 Upcoming Phases

  • Phase 3: Neural Machine Translation
  • Phase 4: Voice Cloning & Synthesis
  • Phase 5: UI/UX Enhancement
  • Phase 6: Performance Optimization & Testing

πŸ› οΈ Tech Stack

Android

  • Language: Kotlin
  • UI Framework: Jetpack Compose
  • Architecture: MVVM with ViewModels
  • Build System: Gradle with Kotlin DSL
  • Min SDK: 26 (Android 8.0)
  • Target SDK: 35

Rust Core (lingua_core)

  • Audio Processing: WebRTC VAD, RustFFT, Spectrum Analysis
  • ML Framework: Tract (TensorFlow/ONNX)
  • Signal Processing: DASP, Rubato, RealFFT
  • Threading: Crossbeam for concurrent processing
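
To give a feel for the concurrent processing mentioned above, here is a minimal sketch of a worker thread that receives audio frames over a channel and sends back per-frame results. It uses the standard library's `std::sync::mpsc` in place of Crossbeam to stay dependency-free, and the `FrameStats` type and function names are hypothetical, not part of `lingua_core`.

```rust
use std::sync::mpsc;
use std::thread;

/// Per-frame result sent back from the worker (fields are illustrative).
#[derive(Debug, Clone, PartialEq)]
struct FrameStats {
    index: usize,
    peak: f32,
}

/// Largest absolute sample value in a frame; 0.0 for an empty frame.
fn peak_amplitude(frame: &[f32]) -> f32 {
    frame.iter().fold(0.0_f32, |acc, s| acc.max(s.abs()))
}

/// Analyze frames on a worker thread. Frames go in through one channel and
/// results come back through another, in the order the frames were sent.
fn process_frames(frames: Vec<Vec<f32>>) -> Vec<FrameStats> {
    let (frame_tx, frame_rx) = mpsc::channel::<Vec<f32>>();
    let (stats_tx, stats_rx) = mpsc::channel::<FrameStats>();

    let worker = thread::spawn(move || {
        // This loop ends once the sending side of `frame_rx` is dropped.
        for (index, frame) in frame_rx.iter().enumerate() {
            let stats = FrameStats { index, peak: peak_amplitude(&frame) };
            if stats_tx.send(stats).is_err() {
                break; // receiver is gone, stop early
            }
        }
    });

    for frame in frames {
        frame_tx.send(frame).expect("worker thread stopped unexpectedly");
    }
    drop(frame_tx); // signal end of stream to the worker

    // Collect until the worker finishes and drops its sender.
    let results: Vec<FrameStats> = stats_rx.iter().collect();
    worker.join().expect("worker thread panicked");
    results
}
```

In a live pipeline the capture callback would hold the sending end and the UI or JNI layer would poll the results channel, which keeps analysis work off the audio thread.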

Dependencies

  • TensorFlow Lite for on-device ML
  • JTransforms for audio processing
  • Material 3 Design System
  • Kotlin Coroutines for async operations

🚀 Getting Started

Prerequisites

  • Android Studio Arctic Fox or later
  • Rust toolchain (for building lingua_core)
  • Android NDK 27.2.12479018
  • JDK 17

Setup

  1. Clone the repository

    git clone https://github.com/olich97/vokala.git
    cd vokala
  2. Install Rust and Android targets

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    rustup target add aarch64-linux-android armv7-linux-androideabi x86_64-linux-android
  3. Build and run

    ./gradlew assembleDebug
    ./gradlew installDebug

Project Structure

vokala/
├── app/                          # Android application
│   ├── src/main/java/com/vokala/
│   │   ├── MainActivity.kt       # Entry point
│   │   ├── core/                 # Core business logic
│   │   │   ├── models/           # Model management
│   │   │   ├── speaker/          # Speaker identification
│   │   │   └── translation/      # Translation engine
│   │   └── ui/                   # Compose UI components
│   │       ├── main/             # Main screen & viewmodel
│   │       ├── components/       # Reusable UI components
│   │       └── theme/            # App theming
├── lingua_core/                  # Rust audio processing library
│   ├── src/
│   │   ├── lib.rs               # JNI interface & main logic
│   │   └── speaker.rs           # Speaker identification
│   └── Cargo.toml               # Rust dependencies
└── models/                      # ML models directory

🎯 Key Features in Detail

Audio Processing Pipeline

The Rust-powered audio engine provides:

  • Voice Activity Detection: WebRTC VAD with configurable sensitivity
  • Real-time Analysis: FFT-based spectrum analysis and visualization
  • Speaker Identification: Embedding-based speaker recognition
  • Audio Preprocessing: Noise reduction and signal enhancement
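
The building blocks above can be sketched in plain Rust. The example below is only an illustration under stated assumptions: it uses a simple RMS energy gate as a stand-in for WebRTC VAD, a direct O(N²) DFT instead of RustFFT, and cosine similarity as one common way to compare speaker embeddings. All function names and thresholds are hypothetical and do not come from `lingua_core`.

```rust
use std::f32::consts::PI;

/// Root-mean-square level of a frame; 0.0 for an empty frame.
fn rms(frame: &[f32]) -> f32 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = frame.iter().map(|s| s * s).sum();
    (sum_sq / frame.len() as f32).sqrt()
}

/// Energy gate: treats a frame as speech when its RMS exceeds `threshold`.
/// This is a simplified stand-in, not the WebRTC VAD algorithm.
fn is_speech(frame: &[f32], threshold: f32) -> bool {
    rms(frame) > threshold
}

/// Magnitude spectrum for bins 0..=N/2, computed with a direct DFT.
/// Fine for small frames; an FFT library is the practical choice.
fn magnitude_spectrum(frame: &[f32]) -> Vec<f32> {
    let n = frame.len();
    if n == 0 {
        return Vec::new();
    }
    (0..=n / 2)
        .map(|k| {
            let (mut re, mut im) = (0.0_f32, 0.0_f32);
            for (t, &x) in frame.iter().enumerate() {
                // Reduce k*t modulo n before converting to keep the angle precise.
                let angle = -2.0 * PI * ((k * t) % n) as f32 / n as f32;
                re += x * angle.cos();
                im += x * angle.sin();
            }
            (re * re + im * im).sqrt()
        })
        .collect()
}

/// Cosine similarity between two embeddings.
/// Returns None when lengths differ, inputs are empty, or a vector is all zeros.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

A typical flow would gate each frame with `is_speech`, compute a spectrum for visualization, and compare an embedding of the speech segment against stored speaker embeddings, accepting the best match above a chosen similarity threshold.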

Android Integration

  • Permissions: Automatic audio recording permission handling
  • Real-time UI: Smooth audio visualization with Compose Canvas
  • Background Processing: Efficient audio pipeline with minimal UI blocking
  • Model Management: On-device ML model loading and inference

🔧 Development

Building the Rust Core

cd lingua_core
cargo build --release --no-default-features

Running Tests

# Rust tests
cd lingua_core
cargo test

# Android tests
./gradlew test
./gradlew connectedAndroidTest

Debug Features

  • Audio debug panel for testing voice detection
  • Real-time audio metrics display
  • Speaker identification visualization
  • Performance monitoring

📱 Supported Devices

  • Minimum: Android 8.0 (API 26)
  • Architecture: ARM64, ARMv7, x86_64
  • Permissions: Microphone access required

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • WebRTC team for the excellent VAD implementation
  • OpenAI for Whisper speech recognition
  • The Rust audio processing community
  • Android Jetpack Compose team

Note: Vokala is currently in active development. Some features may be experimental or incomplete. Check the progress tracking for the latest updates.
