Title: Trainable windowing coefficients in DNN for raw audio classification
Authors: García, Mario Alejandro; Rosset, Ana Lorena; Destefanis, Eduardo
Dates: 2025-10-13; 2020
URI: https://hdl.handle.net/20.500.12272/13952
Type: info:eu-repo/semantics/article
Format: pdf
Language: en
Keywords: Deep learning; Deep neural network; Speech recognition; Raw audio
License: Attribution-NonCommercial-NoDerivatives 4.0 International (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: An artificial neural network for audio classification is proposed. The network includes the windowing operation on raw audio and the calculation of the power spectrogram. A windowing layer is initialized with a Hann window, and its weights are adapted during training. The non-trainable weights of the spectrogram calculation are initialized with the discrete Fourier transform coefficients. The tests are performed on the Speech Commands dataset. Results show that adapting the windowing coefficients produces a moderate accuracy improvement. It is concluded that the gradient of the error function can be propagated through the neural calculation of the power spectrum. It is also concluded that training the windowing layer improves the model's ability to generalize.
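
The computation summarized in the abstract (a trainable window, fixed DFT weights, and a power spectrum through which the error gradient can flow) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the periodic Hann variant and the one-sided set of frequency bins are assumptions, and plain Python lists stand in for a deep learning framework.

```python
import math

def hann_window(n):
    # Periodic Hann window (assumed variant) used to initialize the trainable weights.
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def dft_weights(n):
    # Fixed (non-trainable) DFT coefficients for bins 0..n//2 (assumed one-sided).
    bins = n // 2 + 1
    cos_w = [[math.cos(2.0 * math.pi * k * i / n) for i in range(n)] for k in range(bins)]
    sin_w = [[-math.sin(2.0 * math.pi * k * i / n) for i in range(n)] for k in range(bins)]
    return cos_w, sin_w

def power_spectrum(frame, window, cos_w, sin_w):
    # Forward pass: window the frame, apply the DFT weights, then square and add.
    xw = [x * w for x, w in zip(frame, window)]
    re = [sum(c * v for c, v in zip(row, xw)) for row in cos_w]
    im = [sum(s * v for s, v in zip(row, xw)) for row in sin_w]
    power = [r * r + q * q for r, q in zip(re, im)]
    return power, re, im

def window_gradient(frame, grad_power, re, im, cos_w, sin_w):
    # Backward pass: dLoss/dw[i], given grad_power[k] = dLoss/dP[k].
    # Since P[k] = re[k]^2 + im[k]^2, the chain rule gives
    # dP[k]/dw[i] = 2 * (re[k] * cos_w[k][i] + im[k] * sin_w[k][i]) * frame[i].
    grad = [0.0] * len(frame)
    for k, g in enumerate(grad_power):
        for i, x in enumerate(frame):
            grad[i] += g * 2.0 * (re[k] * cos_w[k][i] + im[k] * sin_w[k][i]) * x
    return grad

# Illustrative input: one 32-sample frame holding a cosine centred on bin 4.
N = 32
frame = [math.cos(2.0 * math.pi * 4 * i / N) for i in range(N)]
window = hann_window(N)
cos_w, sin_w = dft_weights(N)
power, re, im = power_spectrum(frame, window, cos_w, sin_w)

# Gradient of the toy loss sum(power) with respect to the window coefficients.
grad = window_gradient(frame, [1.0] * len(power), re, im, cos_w, sin_w)
```

In a training loop, `grad` would feed a step such as `window[i] -= learning_rate * grad[i]`, while `cos_w` and `sin_w` stay fixed, which mirrors the split between trainable and non-trainable weights described in the abstract.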