  1. Is it better to use a very short noise sample, such as 0.25 s, so that there is definitely no other sound in it, or an interval of several seconds so that there is more data to work with? The only other sound in the longer interval would be minor, even quieter than the static.
  2. Is there a better way to improve intelligibility than what I did before? My main problem at the moment is that when the speaker touches the microphone or breathes into it, the result is extremely loud and I cannot hear what he is saying in those moments. Is there a solution for that? To remove random noise, I tried restricting the allowed frequencies to 85-255 Hz (I read on Google that human speech is in that range), but this removed too much of the speech as well.

Previous work. Can I do these things better? The volume varied a lot because he sometimes talks directly into the microphone, sometimes walks away, and sometimes the microphone touches something, which is very loud. I resolved this with "Normalise". I reduced the static with spectral subtraction based on a noise sample.
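
To make clear what I mean by spectral subtraction based on a noise sample, here is a minimal Python sketch of the idea as I understand it. It is not the implementation my audio editor uses: the frame size `FRAME`, the function names, the naive DFT, and the non-overlapping rectangular frames are all assumptions chosen to keep the example short and free of dependencies. The noise profile is an average over the frames of the noise-only sample, which is where the length of that sample comes in.

```python
import cmath
import math
import random

FRAME = 32  # samples per frame; hypothetical value chosen for illustration


def dft(frame):
    """Naive discrete Fourier transform (O(n^2)), enough for a small sketch."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def split_frames(signal, size):
    """Non-overlapping frames; trailing samples that do not fill a frame are dropped."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def noise_profile(noise, size):
    """Average magnitude per frequency bin over all frames of the noise-only sample."""
    mags = [[abs(c) for c in dft(f)] for f in split_frames(noise, size)]
    if not mags:
        raise ValueError("noise sample is shorter than one frame")
    return [sum(column) / len(mags) for column in zip(*mags)]


def spectral_subtract(signal, profile, size):
    """Subtract the noise magnitude from each bin, keeping the original phase."""
    out = []
    for f in split_frames(signal, size):
        cleaned = []
        for c, noise_mag in zip(dft(f), profile):
            mag = max(abs(c) - noise_mag, 0.0)  # never go below zero
            cleaned.append(cmath.rect(mag, cmath.phase(c)))
        out.extend(idft(cleaned))
    return out


# Usage on synthetic data: a tone buried in random noise (made-up test signal).
random.seed(0)
tone = [math.sin(2 * math.pi * 4 * t / FRAME) for t in range(FRAME * 40)]
noisy = [s + random.uniform(-0.2, 0.2) for s in tone]
noise_only = [random.uniform(-0.2, 0.2) for _ in range(FRAME * 20)]

profile = noise_profile(noise_only, FRAME)
cleaned = spectral_subtract(noisy, profile, FRAME)
```

In this sketch a longer noise sample simply means more frames are averaged into `profile`. A real tool would presumably use an FFT with overlapping, windowed frames; the sketch only shows the step of subtracting the average noise magnitude.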
