Trainable windowing coefficients in DNN for raw audio classification

dc.creator: García, Mario Alejandro
dc.creator: Destefanis, Eduardo
dc.creator: Rosset, Ana Lorena
dc.date.accessioned: 2025-10-13T20:41:41Z
dc.date.issued: 2020
dc.description.abstract: An artificial neural network for audio classification is proposed. It includes the windowing operation of raw audio and the calculation of the power spectrogram. A windowing layer is initialized with a Hann window and its weights are adapted during training. The non-trainable weights of the spectrogram calculation are initialized with the discrete Fourier transform coefficients. The tests are performed on the Speech Commands dataset. Results show that adapting the windowing coefficients produces a moderate accuracy improvement. It is concluded that the gradient of the error function can be propagated through the neural calculation of the power spectrum. It is also concluded that the training of the windowing layer improves the model’s ability to generalize.
dc.description.affiliation: Fil: García, Mario Alejandro. Universidad Tecnológica Nacional. Facultad Regional Córdoba. Grupo de Inteligencia Artificial; Argentina.
dc.description.affiliation: Fil: Destefanis, Eduardo. Universidad Tecnológica Nacional. Facultad Regional Córdoba. Centro de Investigación en Informática para la Ingeniería; Argentina.
dc.description.affiliation: Fil: Rosset, Ana Lorena. Universidad Nacional de Córdoba. Facultad de Ciencias Médicas. Escuela de Fonoaudiología; Argentina.
dc.description.peerreviewed: Peer Reviewed
dc.format: pdf
dc.identifier.uri: https://hdl.handle.net/20.500.12272/13952
dc.language.iso: en
dc.publisher: Cloud Computing, Big Data & Emerging Topics
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [en]
dc.rights.holder: García, Mario Alejandro; Rosset, Ana Lorena; Destefanis, Eduardo.
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.use: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.source: Communications in Computer and Information Science (1291)
dc.subject: Deep learning
dc.subject: Deep neural network
dc.subject: Speech recognition
dc.subject: Raw audio
dc.title: Trainable windowing coefficients in DNN for raw audio classification
dc.type: info:eu-repo/semantics/article
dc.type.version: publisherVersion
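
The abstract describes a front end in which raw audio frames are multiplied by a window and then projected onto fixed discrete Fourier transform coefficients to obtain a power spectrogram. The following is a minimal NumPy sketch of that forward computation only, not the authors' implementation. The function names, the frame length of 256 samples, and the hop of 128 samples are assumptions chosen for illustration; in a trainable setting the `window` array would be the parameter updated by gradient descent, while the DFT matrices would stay fixed.

```python
import numpy as np


def hann_window(n):
    """Symmetric Hann window, used here as the initial windowing weights."""
    k = np.arange(n)
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * k / (n - 1))


def dft_weights(n):
    """Fixed (non-trainable) real and imaginary DFT matrices, shape (n, n//2 + 1)."""
    samples = np.arange(n)
    bins = np.arange(n // 2 + 1)
    angles = 2.0 * np.pi * np.outer(samples, bins) / n
    return np.cos(angles), -np.sin(angles)


def power_spectrogram(audio, window, w_real, w_imag, hop):
    """Frame the signal, apply the window, and compute |DFT|^2 per frame."""
    n = window.shape[0]
    num_frames = 1 + (audio.shape[0] - n) // hop
    # Index matrix: one row of sample positions per frame.
    idx = np.arange(n)[None, :] + hop * np.arange(num_frames)[:, None]
    frames = audio[idx] * window   # windowing "layer" (element-wise weights)
    real = frames @ w_real         # dense layer with cosine weights
    imag = frames @ w_imag         # dense layer with negative sine weights
    return real ** 2 + imag ** 2   # power spectrum, shape (num_frames, n//2 + 1)


# Hypothetical usage on random data standing in for a raw audio clip.
rng = np.random.default_rng(0)
audio = rng.standard_normal(1000)
window = hann_window(256)
w_real, w_imag = dft_weights(256)
spec = power_spectrogram(audio, window, w_real, w_imag, hop=128)
```

Because every step is a multiplication, a matrix product, or a square, the whole computation is differentiable with respect to `window`, which is consistent with the abstract's conclusion that the error gradient can be propagated through the power spectrum calculation.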

Files

Original bundle

Name: Trainable Windowing Coefficients in DNN for Raw Audio Classification.pdf
Size: 484.28 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 3.63 KB
Format: Item-specific license agreed upon to submission