Search Results
-
Predicting spotify audio features from Last.fm tags
Music information retrieval (MIR) is an interdisciplinary research field that focuses on the extraction, processing, and knowledge discovery of...
-
Flow-Audio-Synth: A Video-to-Audio Model which Captures Dynamic Features
With the rapid development of cross-modal generative models, video-to-audio generation has become a novel endeavor. Some previous papers process the...
-
MP3 Audio watermarking using calibrated side information features for tamper detection and localization
Audio contents are frequently stocked up and transmitted in compressed formats. Among the many existing audio compression schemes, MPEG-1 Audio Layer...
-
LIFA: Language identification from audio with LPCC-G features
In Western countries, speech recognition-based technologies have significantly developed compared to the countries of the South Asian subcontinent...
-
Comparative analysis of audio classification with MFCC and STFT features using machine learning techniques
In the era of automated and digitalized information, advanced computer applications deal with a major part of the data that comprises audio-related...
-
Time-frequency visual representation and texture features for audio applications: a comprehensive review, recent trends, and challenges
The conventional audio feature extraction methods employed in the audio analysis are categorized into time-domain and frequency-domain. Recently, a...
-
Enhanced On-Device Video Summarization Using Audio and Visual Features
Video Summarization is gaining popularity, as there is a lot of content available. Video Summarization is about determining the key or primary moments...
-
Audio-Visual Segmentation by Leveraging Multi-scaled Features Learning
Audio-visual segmentation with semantics (AVSS) is an advanced approach that enriches Audio-visual segmentation (AVS) by incorporating object...
-
AudioFormer: Channel Audio Encoder Based on Multi-granularity Features
To solve the problem of poor standardized feature extraction methods for speech emotion recognition tasks and insufficient depth representation...
-
Robust multimedia spam filtering based on visual, textual, and audio deep features and random forest
Nowadays, there is a growing demand among Internet and social media users for improved protection against spam. Despite numerous studies focused on...
-
Generating Smooth Mood-Dynamic Playlists with Audio Features and KNN
Users curate music playlists for many purposes, including focus, enjoyment and therapy. Popular music streaming services generate playlists...
-
Audio-Visual Segmentation with Semantics
We propose a new problem called audio-visual segmentation (AVS), in which the goal is to output a pixel-level map of the object(s) that produce sound...
-
Efficient audio-visual emotion recognition approach
Emotion recognition systems are gaining more and more importance in Artificial Intelligence. Most recently, approaches for emotion recognition based...
-
Multimodal fusion for audio-image and video action recognition
Multimodal Human Action Recognition (MHAR) is an important research topic in computer vision and event recognition fields. In this work, we address...
-
Detecting audio copy-move forgery with an artificial neural network
Given how easily audio data can be obtained, audio recordings are subject to both malicious and unmalicious tampering and manipulation that can...
-
Cascaded cross-modal transformer for audio–textual classification
Speech classification tasks often require powerful language understanding models to grasp useful features, which becomes problematic when limited...
-
SVMFI: speaker video multi-frame interpolation with the guidance of audio
Due to network constraints like latency and bandwidth, speaker videos often suffer from low frame rates and frequent frame drops. Video frame...
-
Pushing the boundaries of deepfake audio detection with a hybrid MFCC and spectral contrast approach
The proliferation of deepfake audio content presents a formidable challenge in today’s digital landscape, necessitating advanced detection techniques...
-
Neural Network-Based Multi-class Model for Abnormal Heartbeat Audio Signal Detection
Detection of abnormal heartbeats, or arrhythmias, is crucial for early diagnosis and management of cardiac diseases. Traditional methods, such as...
-
Leveraging CNN and principal component analysis for dynamic variance control in audio compression
This study addresses challenges arising from large audio file storage needs and rising network bandwidth demands. In this paper, a novel audio codec...