datasets
This is yet another attempt to maintain a list of datasets directly related to music information retrieval (MIR). Other lists that I have found are the ISMIR page, this web page, and this web page. If you are interested in speech processing, you can find a table of speech datasets on this page. If you are interested in multitrack audio, the Open Multitrack Testbed should be a good starting point. UPF also has an excellent page with datasets for world music, including Indian art music, Turkish Makam music, and Beijing Opera. A curated list of MIDI sources can be found here. Two additional general resources are piano-midi.de for MIDI files and freesound.org for audio files.
If you know of other datasets that should be included in this list, please create an issue or pull request, or just send me a note.