4.2. User Interface
The emphasis of this project is on investigating the proposed recognition techniques, so little time will be spent on designing a user interface. If possible, a simple user interface could be developed for training and testing the networks.
4.3. Pre-processing
4.3.1. Reading data from .rcy files
The application is required to read data from one or more .rcy files at a time. The sample rate of each file must be recorded. The starting point of each drum hit should be detectable, and it should be noted if the hit length is shorter than the maximum FFT length. A variable number of adjacent fixed-length sample data segments (of nominal length 100ms) should be readable from the start of each drum hit, together with a sample segment taken from the very end of each drum sample (for noise reduction purposes).
All sample segment data should be passable to the FFT algorithm.
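The segmenting requirement can be illustrated with a minimal sketch. It assumes the .rcy file has already been decoded into a list of mono sample values and that the hit boundaries are known. The function name, its parameters, the default maximum FFT length of 4096 and the zero-padding of short segments are all illustrative assumptions, not part of the specification.

```python
def extract_segments(samples, sample_rate, hit_start, hit_end,
                     n_segments, segment_ms=100.0, max_fft_len=4096):
    """Slice one drum hit into adjacent fixed-length segments.

    samples     : decoded mono samples for the whole break (assumed input)
    hit_start   : index of the first sample of the hit
    hit_end     : index one past the last sample of the hit
    max_fft_len : hypothetical maximum FFT length, in samples
    """
    seg_len = int(round(sample_rate * segment_ms / 1000.0))
    hit = samples[hit_start:hit_end]

    # Note hits that are shorter than the maximum FFT length.
    too_short = len(hit) < max_fft_len

    def pad(seg):
        # Assumption: short segments are zero-padded to the nominal length.
        return list(seg) + [0.0] * (seg_len - len(seg))

    # Adjacent segments taken from the start of the hit.
    segments = [pad(hit[i * seg_len:(i + 1) * seg_len])
                for i in range(n_segments)]

    # One segment from the very end of the hit, for noise reduction.
    tail = pad(hit[-seg_len:])
    return segments, tail, too_short
```

Each returned segment has the same length, so it could be passed to the FFT routine without further checks.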
Identification files recording the actual type of each drum within a .rcy file should be creatable and readable.
4.3.2. Processing FFT data
The FFT output should be quantised into a variable number of discrete frequency bands. High-pass filtering must be available during quantisation; the cutoff will nominally be set at 40Hz (to remove unwanted subsonic frequencies) and should be variable during the training phase (in order to remove low frequencies from higher-pitched drum types, e.g. hats and congas). Frequencies above 14kHz should be discarded. The amplitude and frequency of the peak magnitude should be recorded for each drum. It should be possible to normalise the frequency magnitude vectors for each drum either individually or with respect to the break (.rcy file) to which it belongs. Noise filtering of the type discussed earlier must be available.
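The quantisation step might look like the sketch below. It assumes the FFT magnitudes are given as a list indexed by bin number, so that bin k corresponds to k * sample_rate / fft_len Hz. The specification does not say how bands are spaced or how normalisation is defined, so the equal-width bands, the summing of magnitudes within a band and the divide-by-maximum normalisation are assumptions, as are the function names.

```python
def quantise_spectrum(magnitudes, sample_rate, fft_len, n_bands,
                      highpass_hz=40.0, lowpass_hz=14000.0):
    """Sum FFT magnitudes into n_bands equal-width bands (an assumption)."""
    bin_hz = sample_rate / fft_len
    band_width = (lowpass_hz - highpass_hz) / n_bands
    bands = [0.0] * n_bands
    peak_mag, peak_freq = 0.0, None

    for k, mag in enumerate(magnitudes):
        freq = k * bin_hz
        # High-pass filter and discard everything above the upper limit.
        if freq < highpass_hz or freq > lowpass_hz:
            continue
        idx = min(int((freq - highpass_hz) / band_width), n_bands - 1)
        bands[idx] += mag
        # Track the amplitude and frequency of the peak magnitude.
        if mag > peak_mag:
            peak_mag, peak_freq = mag, freq

    return bands, peak_mag, peak_freq


def normalise(bands, reference=None):
    """Scale a band vector by its own maximum, or by a supplied reference
    such as the largest band value found anywhere in the same break."""
    ref = max(bands) if reference is None else reference
    return [b / ref for b in bands] if ref > 0 else list(bands)
```

Raising `highpass_hz` during training would remove low-frequency content for higher-pitched drums, while passing a per-break `reference` to `normalise` keeps the relative levels of drums within one break.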
4.4. Neural Networks
All network types should be capable of saving and reading all weight vectors and of recording total classification error for the complete test data.
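As a rough illustration of these requirements, the sketch below stores weight vectors as nested lists in a JSON file and counts misclassified test items. The file format, the function names and the reading of "total classification error" as a count of misclassifications are assumptions. A summed output error would be an equally plausible definition.

```python
import json


def save_weights(path, weights):
    """Write weight vectors (nested lists of floats) to a JSON file."""
    with open(path, "w") as fh:
        json.dump(weights, fh)


def load_weights(path):
    """Read weight vectors previously written by save_weights."""
    with open(path) as fh:
        return json.load(fh)


def total_classification_error(predicted, actual):
    """Count test items whose predicted drum type differs from the actual one."""
    return sum(1 for p, a in zip(predicted, actual) if p != a)
```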