🗣️ Vokala

A real-time speech translation app for Android that combines native audio processing with on-device machine learning to break down language barriers.

✨ Features

Real-time Voice Activity Detection - Advanced WebRTC VAD for accurate speech detection
Multi-speaker Recognition - Identify and track different speakers in conversations
Audio Visualization - Real-time waveform and spectrum analysis
Rust-powered Audio Engine - High-performance native audio processing
Modern Android UI - Built with Jetpack Compose for a smooth user experience
Offline-first Architecture - Designed to work without internet connectivity (planned, not yet available)

🏗️ Architecture

Vokala uses a hybrid architecture that pairs a native Android frontend with a Rust core for performance-critical audio processing:

Android Frontend: Kotlin + Jetpack Compose for the UI
Rust Core: Native audio processing with WebRTC VAD and signal analysis
ML Pipeline: TensorFlow Lite integration for on-device inference
JNI Bridge: Connects the Kotlin app to the Rust core
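To make the Rust core's voice-detection role more concrete, here is a minimal sketch of frame-based activity detection. It is not Vokala's code and it does not use WebRTC VAD: it swaps in a plain RMS energy threshold so the example stays dependency-free. The function names, the 16 kHz / 20 ms framing, and the threshold value are all assumptions chosen for illustration.

```rust
/// Samples per frame for a given sample rate and frame duration.
fn frame_len(sample_rate_hz: usize, frame_ms: usize) -> usize {
    sample_rate_hz * frame_ms / 1000
}

/// RMS level of one frame of 16-bit PCM, normalized to 0.0..=1.0.
fn frame_rms(frame: &[i16]) -> f32 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f64 = frame
        .iter()
        .map(|&s| {
            let x = s as f64 / 32768.0;
            x * x
        })
        .sum();
    (sum_sq / frame.len() as f64).sqrt() as f32
}

/// Flags each complete frame as active (true) or silent (false) by comparing
/// its RMS level with a fixed threshold. A trailing partial frame is dropped.
/// Hypothetical helper: an energy threshold, not the WebRTC VAD algorithm.
fn detect_activity(pcm: &[i16], samples_per_frame: usize, threshold: f32) -> Vec<bool> {
    if samples_per_frame == 0 {
        return Vec::new();
    }
    pcm.chunks_exact(samples_per_frame)
        .map(|frame| frame_rms(frame) > threshold)
        .collect()
}

fn main() {
    let len = frame_len(16_000, 20); // 320 samples per 20 ms frame at 16 kHz

    // One silent frame followed by one frame of a 440 Hz tone.
    let mut pcm = vec![0i16; len];
    pcm.extend((0..len).map(|n| {
        let t = n as f32 / 16_000.0;
        ((2.0 * std::f32::consts::PI * 440.0 * t).sin() * 16_000.0) as i16
    }));

    println!("{:?}", detect_activity(&pcm, len, 0.02)); // [false, true]
}
```

A fixed energy threshold reacts to any loud sound, not only speech, which is one reason a dedicated VAD is used in the real pipeline.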

🚀 Current Status

✅ Phase 1: Core Audio Pipeline (Completed)

Rust audio processing with WebRTC VAD
Real-time audio visualization
Speaker identification framework
Audio debug panel

🔄 Phase 2: Advanced Speech Recognition (In Progress)

Whisper model integration for offline ASR
Language identification
Mobile performance optimization
Multi-language support

📋 Upcoming Phases

Phase 3: Neural Machine Translation
Phase 4: Voice Cloning & Synthesis
Phase 5: UI/UX Enhancement
Phase 6: Performance Optimization & Testing

🛠️ Tech Stack

Android

Language: Kotlin
UI Framework: Jetpack Compose
Architecture: MVVM with ViewModels
Build System: Gradle with Kotlin DSL
Min SDK: 26 (Android 8.0)
Target SDK: 35

Rust Core (`lingua_core`)

Audio Processing: WebRTC VAD, RustFFT, Spectrum Analysis
ML Framework: Tract (inference for ONNX and TensorFlow models)
Signal Processing: DASP, Rubato, RealFFT
Threading: Crossbeam for concurrent processing
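As a rough illustration of the concurrent processing mentioned above, the sketch below hands audio frames from a producer to a worker thread over a channel. It uses the standard library's `std::sync::mpsc` as a stand-in for Crossbeam so that it runs without extra crates. The `process_on_worker` function and the per-frame peak computation are hypothetical, not taken from `lingua_core`.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends `frames` to a worker thread over a channel and returns the peak
/// absolute sample value of each frame, in the order the frames were sent.
fn process_on_worker(frames: Vec<Vec<i16>>) -> Vec<u16> {
    let (frame_tx, frame_rx) = mpsc::channel::<Vec<i16>>();
    let (result_tx, result_rx) = mpsc::channel::<u16>();

    // Worker: stands in for the analysis stage of an audio pipeline.
    let worker = thread::spawn(move || {
        for frame in frame_rx {
            let peak = frame.iter().map(|s| s.unsigned_abs()).max().unwrap_or(0);
            if result_tx.send(peak).is_err() {
                break; // the receiving side has gone away
            }
        }
    });

    // Producer: stands in for the audio capture callback.
    for frame in frames {
        frame_tx.send(frame).expect("worker thread is alive");
    }
    drop(frame_tx); // closing the channel lets the worker's loop finish

    let results: Vec<u16> = result_rx.iter().collect();
    worker.join().expect("worker thread panicked");
    results
}

fn main() {
    let frames = vec![vec![0, -5, 3], vec![100, -200], vec![]];
    println!("{:?}", process_on_worker(frames)); // [5, 200, 0]
}
```

Keeping capture and analysis on separate threads means a slow analysis step delays results rather than stalling the capture loop.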

Dependencies

TensorFlow Lite for on-device ML
JTransforms for audio processing
Material 3 Design System
Kotlin Coroutines for async operations

🚀 Getting Started

Prerequisites

Android Studio Arctic Fox or later (a recent release is recommended, since the project targets SDK 35 and uses JDK 17)
Rust toolchain (for building lingua_core)
Android NDK 27.2.12479018
JDK 17

Setup

Clone the repository

git clone https://github.com/yourusername/vokala.git
cd vokala

Install Rust and Android targets

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup target add aarch64-linux-android armv7-linux-androideabi x86_64-linux-android

Build and run

./gradlew assembleDebug
./gradlew installDebug

Project Structure

vokala/
├── app/                          # Android application
│   ├── src/main/java/com/vokala/
│   │   ├── MainActivity.kt       # Entry point
│   │   ├── core/                 # Core business logic
│   │   │   ├── models/           # Model management
│   │   │   ├── speaker/          # Speaker identification
│   │   │   └── translation/      # Translation engine
│   │   └── ui/                   # Compose UI components
│   │       ├── main/             # Main screen & viewmodel
│   │       ├── components/       # Reusable UI components
│   │       └── theme/            # App theming
├── lingua_core/                  # Rust audio processing library
│   ├── src/
│   │   ├── lib.rs                # JNI interface & main logic
│   │   └── speaker.rs            # Speaker identification
│   └── Cargo.toml                # Rust dependencies
└── models/                       # ML models directory

🎯 Key Features in Detail

Audio Processing Pipeline

The Rust-powered audio engine provides:

Voice Activity Detection: WebRTC VAD with configurable sensitivity
Real-time Analysis: FFT-based spectrum analysis and visualization
Speaker Identification: Embedding-based speaker recognition
Audio Preprocessing: Noise reduction and signal enhancement
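The spectrum analysis step can be sketched as follows. This is an illustrative example, not the project's implementation: it computes a magnitude spectrum with a direct O(n²) DFT instead of the RustFFT-based code the pipeline is described as using. The function names, window size, and sample rate are assumptions.

```rust
use std::f32::consts::PI;

/// Magnitude spectrum of a real signal via a direct O(n^2) DFT.
/// Returns n/2 + 1 bins; bin k corresponds to k * sample_rate / n Hz.
fn magnitude_spectrum(samples: &[f32]) -> Vec<f32> {
    let n = samples.len();
    if n == 0 {
        return Vec::new();
    }
    (0..=n / 2)
        .map(|k| {
            let (mut re, mut im) = (0.0f32, 0.0f32);
            for (i, &x) in samples.iter().enumerate() {
                // Reduce k*i modulo n so the angle stays within one turn.
                let angle = -2.0 * PI * ((k * i) % n) as f32 / n as f32;
                re += x * angle.cos();
                im += x * angle.sin();
            }
            (re * re + im * im).sqrt()
        })
        .collect()
}

/// Index of the strongest bin, or None for an empty spectrum.
fn peak_bin(spectrum: &[f32]) -> Option<usize> {
    spectrum
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}

fn main() {
    let n = 64;
    let sample_rate = 16_000;

    // A sine wave with exactly 8 cycles in the window lands in bin 8.
    let tone: Vec<f32> = (0..n)
        .map(|i| (2.0 * PI * 8.0 * i as f32 / n as f32).sin())
        .collect();

    let spectrum = magnitude_spectrum(&tone);
    let bin = peak_bin(&spectrum).unwrap();
    println!("peak bin {} ({} Hz)", bin, bin * sample_rate / n); // peak bin 8 (2000 Hz)
}
```

A direct DFT is only practical for small windows. An FFT produces the same bins in O(n log n) time, which matters for real-time visualization.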

Android Integration

Permissions: Runtime request and handling of the RECORD_AUDIO permission
Real-time UI: Smooth audio visualization with Compose Canvas
Background Processing: Efficient audio pipeline with minimal UI blocking
Model Management: On-device ML model loading and inference

🔧 Development

Building the Rust Core

cd lingua_core
cargo build --release --no-default-features

Running Tests

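# Rust tests
cd lingua_core
cargo test
cd ..

# Android tests (run from the repository root)
./gradlew test
# connectedAndroidTest needs a connected device or a running emulator
./gradlew connectedAndroidTest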

Debug Features

Audio debug panel for testing voice detection
Real-time audio metrics display
Speaker identification visualization
Performance monitoring
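For a sense of what real-time audio metrics might look like, here is a small sketch that computes peak level, RMS level, and a clipped-sample count for one buffer. The `LevelMetrics` struct, the `measure` function, and the -120 dBFS floor are hypothetical choices for this example, not the debug panel's actual metrics.

```rust
/// Level metrics for one buffer of samples normalized to -1.0..=1.0.
#[derive(Debug, PartialEq)]
struct LevelMetrics {
    peak_dbfs: f32,
    rms_dbfs: f32,
    clipped: usize,
}

/// Converts a linear amplitude to dBFS, floored at -120 dB for silence.
fn to_dbfs(linear: f32) -> f32 {
    if linear <= 1e-6 {
        -120.0
    } else {
        20.0 * linear.log10()
    }
}

/// Computes peak level, RMS level, and the number of samples at or beyond
/// full scale. An empty buffer reports the -120 dB floor and zero clipping.
fn measure(samples: &[f32]) -> LevelMetrics {
    if samples.is_empty() {
        return LevelMetrics { peak_dbfs: -120.0, rms_dbfs: -120.0, clipped: 0 };
    }
    let peak = samples.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
    let mean_sq = samples.iter().map(|&x| x * x).sum::<f32>() / samples.len() as f32;
    let clipped = samples.iter().filter(|x| x.abs() >= 1.0).count();
    LevelMetrics {
        peak_dbfs: to_dbfs(peak),
        rms_dbfs: to_dbfs(mean_sq.sqrt()),
        clipped,
    }
}

fn main() {
    // A half-scale square wave: peak and RMS are both about -6 dBFS.
    let m = measure(&[0.5, -0.5, 0.5, -0.5]);
    println!(
        "peak {:.1} dBFS, rms {:.1} dBFS, clipped {}",
        m.peak_dbfs, m.rms_dbfs, m.clipped
    ); // peak -6.0 dBFS, rms -6.0 dBFS, clipped 0
}
```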

📱 Supported Devices

Minimum: Android 8.0 (API 26)
Architecture: ARM64, ARMv7, x86_64
Permissions: Microphone access required

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

WebRTC team for the excellent VAD implementation
OpenAI for Whisper speech recognition
The Rust audio processing community
Android Jetpack Compose team

Note: Vokala is in active development, and some features may be experimental or incomplete. See VOKALA_PROGRESS.md for the latest status.
