World Journal of Advanced Research and Reviews

eISSN: 2581-9615 || CODEN (USA): WJARAI || Impact Factor: 8.2 || ISSN Approved Journal

FusionNet: A parallel deep learning model for speech recognition with feature clustering

Revati Harichandra Ramteke 1, * and Seema B. Rathod 2

1 Research Scholar, MTech Computer Science and Engineering, SIPNA College of Engineering and Technology, Amravati. 

2 Professor, Computer Science and Engineering, SIPNA College of Engineering and Technology, Amravati.

Research Article

World Journal of Advanced Research and Reviews, 2025, 28(03), 001-008

Article DOI: 10.30574/wjarr.2025.28.3.4010

DOI url: https://doi.org/10.30574/wjarr.2025.28.3.4010

Received 22 October 2025; revised on 29 November 2025; accepted on 01 December 2025

FusionNet is a parallel, hybrid deep-learning framework for speech recognition and on-device speech-to-text processing. The system is implemented as an Android application (Java/XML) and integrated with Firebase Realtime Database for secure, per-user data management. Audio input passes through a multi-stage preprocessing pipeline in which MFCC, spectral, and temporal features are extracted and then clustered with K-Means to group acoustically similar speech segments. The clustered representations are processed in parallel by a dual-branch architecture: a Convolutional Neural Network (CNN) that learns spectral signatures and a Bidirectional Long Short-Term Memory (BiLSTM) network that models temporal dependencies. The embeddings from the two branches are fused and classified with a Random Forest classifier to improve prediction stability in noisy or accent-variable conditions.

To improve semantic clarity, an NLP engine backed by a generative AI model refines the raw transcriptions, corrects contextual errors, and extracts user intent. Real-time inference runs through TensorFlow Lite (TFLite), enabling low-latency, energy-efficient execution directly on mobile hardware without cloud dependency. FusionNet demonstrates robustness to ambient noise, speaker variability, and multilingual input, making it a practical and scalable option for voice-driven applications. The hybrid architecture combines clustering, parallel deep learning, classical ML classification, and generative AI reasoning to deliver an accurate speech recognition system suited to real-world deployment.
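The feature-clustering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain Lloyd's-algorithm K-Means in pure Python, a deterministic initialisation from the first k points, and toy 3-dimensional vectors standing in for per-frame MFCC, spectral, and temporal features. The names `kmeans` and `frames` are hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on a list of equal-length feature vectors.

    Assumption for this sketch: the first k points seed the centroids,
    which keeps the result deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each frame goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
    return labels, centroids


# Toy feature vectors (hypothetical values, not data from the paper).
frames = [
    [0.1, 0.2, 0.1],
    [5.0, 5.1, 4.9],
    [0.2, 0.1, 0.0],
    [5.2, 4.8, 5.1],
    [0.0, 0.3, 0.2],
    [4.9, 5.0, 5.0],
]

labels, centroids = kmeans(frames, k=2)
print(labels)
```

In a pipeline like the one described, the resulting cluster labels (or the grouped frames) would be passed on to the CNN and BiLSTM branches; how FusionNet encodes that hand-off is not stated in the abstract.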

Speech Recognition; FusionNet; MFCC; CNN–BiLSTM; Feature Clustering; K-Means; Random Forest; NLP; Generative AI; Speech-To-Text; On-Device AI; TensorFlow Lite; Mobile Deep Learning; Firebase Realtime Database; Multilingual Processing

https://journalwjarr.com/sites/default/files/fulltext_pdf/WJARR-2025-4010.pdf

Revati Harichandra Ramteke and Seema B. Rathod. FusionNet: A parallel deep learning model for speech recognition with feature clustering. World Journal of Advanced Research and Reviews, 2025, 28(03), 001-008. Article DOI: https://doi.org/10.30574/wjarr.2025.28.3.4010.

Copyright © 2025 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.
