Search Page | SpringerLink

Article

Predicting spotify audio features from Last.fm tags

Music information retrieval (MIR) is an interdisciplinary research field that focuses on the extraction, processing, and knowledge discovery of...

Jaime Ramírez Castillo, M. Julia Flores, Philippe Leray in Multimedia Tools and Applications

02 November 2023

Conference paper

Flow-Audio-Synth: A Video-to-Audio Model which Captures Dynamic Features

With the rapid development of cross-modal generative models, video-to-audio generation has become a novel endeavor. Some previous papers process the...

Yupeng Zheng, Zixiang Lu, ... Xiangzeng Liu in Pattern Recognition and Computer Vision

2025

Article

MP3 Audio watermarking using calibrated side information features for tamper detection and localization

Audio contents are frequently stocked up and transmitted in compressed formats. Among the many existing audio compression schemes, MPEG-1 Audio Layer...

Salma Masmoudi, Maha Charfeddine, Chokri Ben Amar in Multimedia Tools and Applications

17 January 2024

Article

LIFA: Language identification from audio with LPCC-G features

In Western countries, speech recognition-based technologies have significantly developed compared to the countries of the South Asian subcontinent...

Himadri Mukherjee, Ankita Dhar, ... Umapada Pal in Multimedia Tools and Applications

14 December 2023

Article

Comparative analysis of audio classification with MFCC and STFT features using machine learning techniques

In the era of automated and digitalized information, advanced computer applications deal with a major part of the data that comprises audio-related...

Mahendra Kumar Gourisaria, Rakshit Agrawal, ... Pradeep Kumar Singh in Discover Internet of Things

03 January 2024 Open access

Article

Time-frequency visual representation and texture features for audio applications: a comprehensive review, recent trends, and challenges

The conventional audio feature extraction methods employed in the audio analysis are categorized into time-domain and frequency-domain. Recently, a...

Yogita D. Mistry, Gajanan K. Birajdar, Archana M. Khodke in Multimedia Tools and Applications

16 March 2023

Conference paper

Enhanced On-Device Video Summarization Using Audio and Visual Features

Video Summarization is gaining popularity, as there is a lot of content available. Video Summarization is about determining the key or primary moments...

Lokesh kumar Thandaga Nagaraju, Ranjitha B, Jani Basha Shaik in Computer Vision and Image Processing

2024

Conference paper

Audio-Visual Segmentation by Leveraging Multi-scaled Features Learning

Audio-visual segmentation with semantics (AVSS) is an advanced approach that enriches Audio-visual segmentation (AVS) by incorporating object...

Sze An Peter Tan, Guangyu Gao, Jia Zhao in MultiMedia Modeling

2024

Conference paper

AudioFormer: Channel Audio Encoder Based on Multi-granularity Features

To solve the problem of poor standardized feature extraction methods for speech emotion recognition tasks and insufficient depth representation...

Jialin Wang, Yunfeng Xu, ... Shaojie Zhao in Neural Information Processing

2024

Article

Robust multimedia spam filtering based on visual, textual, and audio deep features and random forest

Nowadays, there is a growing demand among Internet and social media users for improved protection against spam. Despite numerous studies focused on...

Marouane Kihal, Lamia Hamza in Multimedia Tools and Applications

03 April 2023

Conference paper

Generating Smooth Mood-Dynamic Playlists with Audio Features and KNN

Users curate music playlists for many purposes, including focus, enjoyment and therapy. Popular music streaming services generate playlists...

Shaurya Gaur, Patrick J. Donnelly in Artificial Intelligence in Music, Sound, Art and Design

2024

Article

Audio-Visual Segmentation with Semantics

We propose a new problem called audio-visual segmentation (AVS), in which the goal is to output a pixel-level map of the object(s) that produce sound...

Jinxing Zhou, Xuyang Shen, ... Yiran Zhong in International Journal of Computer Vision

15 October 2024

Article

Efficient audio-visual emotion recognition approach

Emotion recognition systems are gaining more and more importance in Artificial Intelligence. Most recently, approaches for emotion recognition based...

Areej Alasiry, Majd Al-Hussain, ... N. Ben Hadj-Alouane in Multimedia Tools and Applications

07 January 2025

Article

Multimodal fusion for audio-image and video action recognition

Multimodal Human Action Recognition (MHAR) is an important research topic in computer vision and event recognition fields. In this work, we address...

Muhammad Bilal Shaikh, Douglas Chai, ... Naveed Akhtar in Neural Computing and Applications

09 January 2024 Open access

Article

Detecting audio copy-move forgery with an artificial neural network

Given how easily audio data can be obtained, audio recordings are subject to both malicious and unmalicious tampering and manipulation that can...

Fulya Akdeniz, Yaşar Becerikli in Signal, Image and Video Processing

11 January 2024

Article

Cascaded cross-modal transformer for audio–textual classification

Speech classification tasks often require powerful language understanding models to grasp useful features, which becomes problematic when limited...

Nicolae-Cătălin Ristea, Andrei Anghel, Radu Tudor Ionescu in Artificial Intelligence Review

02 August 2024 Open access

Article

SVMFI: speaker video multi-frame interpolation with the guidance of audio

Due to network constraints like latency and bandwidth, speaker videos often suffer from low frame rates and frequent frame drops. Video frame...

Qianrui Wang, Dengshi Li, ... Aolei Chen in Multimedia Tools and Applications

12 December 2023

Article

Pushing the boundaries of deepfake audio detection with a hybrid MFCC and spectral contrast approach

The proliferation of deepfake audio content presents a formidable challenge in today’s digital landscape, necessitating advanced detection techniques...

Ameni Jellali, Ines Ben Fredj, Kaïs Ouni in Multimedia Tools and Applications

22 July 2024

Article

Neural Network-Based Multi-class Model for Abnormal Heartbeat Audio Signal Detection

Detection of abnormal heartbeats, or arrhythmias, is crucial for early diagnosis and management of cardiac diseases. Traditional methods, such as...

Pavan P. Kashyap, Revanasiddappa Madihalli, ... S. Rohith in SN Computer Science

23 December 2024

Article

Leveraging CNN and principal component analysis for dynamic variance control in audio compression

This study addresses challenges arising from large audio file storage needs and rising network bandwidth demands. In this paper, a novel audio codec...

Asish Debnath, Uttam Kr. Mondal in International Journal of Information Technology

18 August 2024
