Fake Voice Detection

🌐 Website: echo-reality-check.vercel.app

Overview

Our platform detects whether a given voice recording is real (human) or synthetically generated by AI. As synthetic voices become increasingly indistinguishable from real human speech, the risk of audio-based impersonation and misinformation is growing rapidly.

What It Does

Users can upload any audio clip (phone calls, voice notes, podcast snippets, or social media uploads), and our system analyzes it using models trained on both real and synthetic speech. Within seconds, the platform reports whether the voice is real or AI-generated, along with a confidence score and key signal indicators.

How It Works

Models: We use an ensemble learning approach, combining multiple models specialized in audio anomaly detection and deepfake recognition. The ensemble outputs a confidence score indicating how strongly the system believes the audio is real or fake.
Input Methods: Users can submit audio through the upload option in the UI or through API integration. Both routes support near real-time inference and return structured prediction data.
Final Verdict: Each analysis returns the following:
- Classification: Labels the clip as either "Real" or "Fake".
- Confidence Score: A percentage indicating how confident the model is in its prediction.
- Feature-Based Explanation: Highlights the specific audio features that influenced the decision, such as unnatural frequency shifts, missing micro-modulations, or robotic harmonics.
- Deviation Metrics: Shows how much the audio differs from the learned human baseline across the most deviated features.
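
The text above does not say how the deviation metrics are computed. As a minimal sketch, assuming each feature is compared against a per-feature mean and standard deviation learned from real voices, a z-score ranking might look like this. The function name `top_deviations`, the feature names, and the baseline numbers are all hypothetical.

```python
def top_deviations(features, baseline, k=3):
    """Rank features by absolute z-score against a human-speech baseline.

    features: {name: value} measured on the clip under test
    baseline: {name: (mean, std)} learned from real voices (assumed format)
    """
    scores = {}
    for name, value in features.items():
        mean, std = baseline[name]
        if std <= 0:
            continue  # skip features with a degenerate baseline
        scores[name] = (value - mean) / std
    # Largest absolute deviation first
    ranked = sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]


# Illustrative values only; not taken from the real system
clip = {"spectral_centroid": 2600.0, "mfcc_1": -210.0, "zcr": 0.08}
human = {
    "spectral_centroid": (2000.0, 200.0),
    "mfcc_1": (-200.0, 20.0),
    "zcr": (0.10, 0.02),
}
most_deviated = top_deviations(clip, human, k=2)
```

Under this assumption, the features with the largest absolute z-scores are the ones a report would surface as "most deviated".
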
The platform is optimized for speed. Typical analysis time is under 5 seconds, making it usable in near real-time applications such as voice verification, emergency fraud checks, or content screening pipelines.
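
The ensemble scoring and final verdict described in this section could be sketched as follows. This is an illustration under stated assumptions, not the platform's actual combination rule: it assumes soft voting (a weighted average of each model's estimated probability that the clip is fake) and a 0.5 decision cutoff. The names `Verdict` and `ensemble_verdict` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    label: str         # "Real" or "Fake"
    confidence: float  # percentage between 50.0 and 100.0


def ensemble_verdict(fake_probs, weights=None):
    """Soft voting: weighted average of per-model P(fake), cut at 0.5 (assumed)."""
    if not fake_probs:
        raise ValueError("need at least one model score")
    if weights is None:
        weights = [1.0] * len(fake_probs)
    if len(weights) != len(fake_probs):
        raise ValueError("one weight per model score is required")
    p_fake = sum(p * w for p, w in zip(fake_probs, weights)) / sum(weights)
    label = "Fake" if p_fake >= 0.5 else "Real"
    # Confidence is the averaged probability of whichever label was chosen
    confidence = p_fake if label == "Fake" else 1.0 - p_fake
    return Verdict(label, round(confidence * 100, 1))


# Three hypothetical model scores for one clip
verdict = ensemble_verdict([0.9, 0.8, 0.7])
```

Averaging probabilities rather than counting votes keeps a graded confidence score, which matches the percentage the platform reports alongside its label.
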

Dataset: The models are trained on a dataset containing both real human voices and AI-generated samples.

Features Used

- Our models were initially trained on a set of 72 extracted audio features, such as Mel-spectrograms, MFCCs, and spectral centroid.
- These features feed into both the detection models and the explanation engine, making results both accurate and interpretable.
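
To make one of the listed features concrete, here is a minimal sketch of the spectral centroid, the magnitude-weighted mean frequency of a frame. It uses a naive DFT from the standard library so that it runs without dependencies; this is an assumption made for illustration, and a real pipeline would more likely rely on an audio library with an FFT. The function name and the test tone are hypothetical.

```python
import cmath
import math


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one audio frame."""
    n = len(frame)
    weighted = 0.0
    total = 0.0
    # Naive DFT over the non-negative frequency bins; fine for a short demo frame
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        mag = abs(coeff)
        weighted += (k * sample_rate / n) * mag
        total += mag
    return weighted / total if total else 0.0


# A pure 1 kHz tone sampled at 8 kHz; it falls exactly on a DFT bin,
# so the centroid lands at about 1000 Hz
tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(64)]
centroid_hz = spectral_centroid(tone, 8000)
```

A clip's centroid, together with the other extracted features, would then form the feature vector passed to the detection models.
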

💻 Try It Out

Visit echo-reality-check.vercel.app to try our proof of concept (POC):
- Click on one of the 15 available audio samples.
- Get an instant prediction with a confidence score.
- Compare how well the model performs on real vs. fake voices.