Trainable windowing coefficients in DNN for raw audio classification
dc.creator | García, Mario Alejandro | |
dc.creator | Destefanis, Eduardo | |
dc.creator | Rosset, Ana Lorena | |
dc.date.accessioned | 2025-10-13T20:41:41Z | |
dc.date.issued | 2020 | |
dc.description.abstract | An artificial neural network for audio classification is proposed. It includes the windowing operation on raw audio and the calculation of the power spectrogram. A windowing layer is initialized with a Hann window and its weights are adapted during training. The non-trainable weights of the spectrogram calculation are initialized with the discrete Fourier transform coefficients. The tests are performed on the Speech Commands dataset. Results show that adapting the windowing coefficients produces a moderate accuracy improvement. It is concluded that the gradient of the error function can be propagated through the neural calculation of the power spectrum. It is also concluded that training the windowing layer improves the model’s ability to generalize. | |
dc.description.affiliation | Fil: García, Mario Alejandro. Universidad Tecnológica Nacional. Facultad Regional Córdoba. Grupo de Inteligencia Artificial; Argentina. | |
dc.description.affiliation | Fil: Destefanis, Eduardo. Universidad Tecnológica Nacional. Facultad Regional Córdoba. Centro de Investigación en Informática para la Ingeniería; Argentina. | |
dc.description.affiliation | Fil: Rosset, Ana Lorena. Universidad Nacional de Córdoba. Facultad de Ciencias Médicas. Escuela de Fonoaudiología; Argentina. | |
dc.description.peerreviewed | Peer Reviewed | |
dc.format | ||
dc.identifier.uri | https://hdl.handle.net/20.500.12272/13952 | |
dc.language.iso | en | |
dc.publisher | Cloud Computing, Big Data & Emerging Topics | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | en |
dc.rights.holder | García, Mario Alejandro; Rosset, Ana Lorena; Destefanis, Eduardo. | |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | |
dc.rights.use | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
dc.source | Communications in Computer and Information Science (1291) | |
dc.subject | Deep learning | |
dc.subject | Deep neural network | |
dc.subject | Speech recognition | |
dc.subject | Raw audio | |
dc.title | Trainable windowing coefficients in DNN for raw audio classification | |
dc.type | info:eu-repo/semantics/article | |
dc.type.version | publisherVersion |
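
The abstract describes a windowing layer initialized with a Hann window, followed by a power spectrogram computed with fixed discrete Fourier transform coefficients. A minimal NumPy sketch of that forward pass is given below. It is an illustration under stated assumptions, not the authors' implementation: the function names, the periodic form of the Hann window, the frame length, and the hop size are all hypothetical choices, and the training step that adapts the window weights is not shown.

```python
import numpy as np


def hann_window(n_fft):
    """Initial windowing weights (periodic Hann form is an assumption)."""
    n = np.arange(n_fft)
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * n / n_fft)


def dft_matrices(n_fft):
    """Fixed (non-trainable) DFT coefficients for the non-negative frequencies."""
    k = np.arange(n_fft // 2 + 1)[:, None]   # frequency bin index
    n = np.arange(n_fft)[None, :]            # sample index within a frame
    angle = 2.0 * np.pi * k * n / n_fft
    return np.cos(angle), -np.sin(angle)     # real part, imaginary part


def power_spectrogram(audio, window, hop):
    """Frame the raw audio, apply the window weights, and return |DFT|^2."""
    n_fft = window.shape[0]
    cos_m, sin_m = dft_matrices(n_fft)
    n_frames = 1 + (len(audio) - n_fft) // hop
    frames = np.stack(
        [audio[i * hop: i * hop + n_fft] for i in range(n_frames)]
    )
    # Windowing layer: element-wise product with the (trainable) weights.
    windowed = frames * window
    # Two dense products with fixed weights give the real and imaginary parts.
    real = windowed @ cos_m.T
    imag = windowed @ sin_m.T
    return real ** 2 + imag ** 2
```

Because every step is an element-wise product, a matrix product, or a square, the output is differentiable with respect to `window`, which is what allows an automatic differentiation framework to adapt those weights during training.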
Files
Original bundle
- Name: Trainable Windowing Coefficients in DNN for Raw Audio Classification.pdf
- Size: 484.28 KB
- Format: Adobe Portable Document Format