5. DISCUSSION AND CONCLUSIONS

Our approach can be contrasted with other work that looks at the detail of rhythm patterns. Laroche has presented a system able to extract ‘swing’ represented by slight systematic timing shifts of beats 2 and 4; his approach encodes more musical knowledge (with a correspondingly narrower applicability) than we wished to use [9]. Paulus and Klapuri present a system for comparing the rhythm patterns between two different pieces, overcoming variations in drum sounds with cepstral normalization and minor timing differences through dynamic time warping, but they do not attempt to build a parametric space of rhythmic variants [11].
One natural application for the eigenrhythm representation is the generation of rhythm patterns that interpolate between different points, e.g. in the plane of figure 4. We have built a crude Matlab interface to synthesize the rhythm patterns corresponding to any value of the first eight eigenrhythms, allowing us to investigate the space, but we hope to produce something a bit more interactive.

In evaluating our current system, it became clear that the 2 s excerpts used in modeling were too short. Many tracks had drum patterns that repeated at a larger scale than this, and the system had little chance of finding the appropriate downbeat within such patterns when only part of the cycle was modeled. Even so, the downbeat detection appears to require a more sophisticated approach. Lacking some absolute principle to decide where the cycle starts, we believe that our technique of matching a model derived from actual data is the right basic approach, but it may need to be seeded with ground truth on actual downbeats for at least some of the training examples, and could require a family of prototype patterns (e.g. finding the temporal alignment that supports the best fit from a set of eigenrhythms) rather than relying on a single, average template.

As noted above, our initial interest was simply to collect a large body of drum patterns to see what the principal components would be like. However, a more careful musical information extraction technique would consider the entire drum track of a piece, looking for the regularly repeating patterns and perhaps also modeling the less repetitive breaks and ornamentations. We think the eigenanalysis should also be applicable to drum breaks, if they can be effectively extracted, although because their duration is less constrained some kind of sequential structure (such as a hidden Markov model) might be appropriate.
One could imagine a Markov model where each state is represented by values or a distribution in eigenrhythm space, and transition probabilities encode the likely evolution of the entire piece.
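As a purely illustrative sketch of that idea (not something we have built), the Python fragment below samples a bar-by-bar sequence from a small Markov chain whose states are Gaussian distributions in a two-dimensional eigenrhythm space. The state names, means, spreads, and transition probabilities are all invented for the example; in practice they would have to be estimated from data, and the space would have more than two dimensions.

```python
import random

# Hypothetical states: each one is a mean point and a spread (standard
# deviation) in a 2-D eigenrhythm coefficient space. Values are made up.
STATES = {
    "verse":  {"mean": (0.8, -0.2), "std": 0.05},
    "chorus": {"mean": (-0.4, 0.6), "std": 0.05},
    "break":  {"mean": (0.0, 0.0),  "std": 0.30},  # breaks vary more
}

# Hypothetical transition probabilities; each row sums to 1.
TRANSITIONS = {
    "verse":  {"verse": 0.80, "chorus": 0.15, "break": 0.05},
    "chorus": {"verse": 0.20, "chorus": 0.70, "break": 0.10},
    "break":  {"verse": 0.60, "chorus": 0.40, "break": 0.00},
}


def sample_piece(n_bars, start="verse", seed=0):
    """Return a list of (state, coefficients) pairs, one per bar."""
    rng = random.Random(seed)
    state = start
    piece = []
    for _ in range(n_bars):
        spec = STATES[state]
        # Draw this bar's eigenrhythm coefficients around the state mean.
        coeffs = tuple(rng.gauss(m, spec["std"]) for m in spec["mean"])
        piece.append((state, coeffs))
        # Pick the next state according to the current row of the chain.
        names = list(TRANSITIONS[state])
        weights = [TRANSITIONS[state][name] for name in names]
        state = rng.choices(names, weights=weights, k=1)[0]
    return piece


if __name__ == "__main__":
    for bar, (state, coeffs) in enumerate(sample_piece(8), start=1):
        print(bar, state, [round(c, 2) for c in coeffs])
```

Each sampled coefficient vector could then be turned back into a drum pattern by adding the weighted eigenrhythms to the mean pattern, in the same way as for any other point in the space.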
There are many details even in the work we have described that deserve closer examination. Where we have investigated alternatives at all, our main metric has been the genre classification accuracy, which is so low as to be suspect and doesn’t directly address our main interest of defining a space of ‘good’ drum patterns. One idea is to replace the asymmetric envelopes used to represent each drum event with a smoother shape like a full Gaussian. This might allow small time shifts (like Laroche’s ‘swing’) to be effectively encoded by eigenvectors that can perform a linear cross-fade between two nearby peaks.

If, on the other hand, we wished to pursue the genre classification application, we could look at more discriminative ways to define our basis functions, such as using Linear Discriminant Analysis (LDA) in place of PCA. LDA is another procedure for finding a low-dimensional projection of a dataset, but it uses class labels associated with each training pattern to find the projections that maximally separate the classes, and which are therefore the most useful in classification [3].

There are several other interesting projection algorithms to consider. Independent Component Analysis (ICA) finds basis projections that are not orthogonal but which maximize the statistical independence of the projected coefficients, which can be a more semantically relevant decomposition [7]. Non-negative Matrix Factorization (NMF) finds linear basis sets where all the coefficients are positive or zero, so each pattern is approximated by a process of ‘adding in’ parts, rather than the balancing contrasts seen in our eigenrhythms [10]. When the underlying dataset is intrinsically nonnegative, as in our case, this can be an interesting alternative transformation; some previous applications to musical audio are reported in [14].
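The Gaussian-envelope idea mentioned earlier in this discussion can be checked numerically. The sketch below is only an illustration under assumed values (unit-height Gaussians, centers at 0 and 1, time measured in arbitrary units): it mixes two Gaussian peaks with weights (1 - alpha) and alpha and reports where the maximum of the mix falls. The function names and the grid-search step are choices made for the example.

```python
import math


def gaussian(t, center, sigma):
    """Unit-height Gaussian envelope centered at `center`."""
    return math.exp(-0.5 * ((t - center) / sigma) ** 2)


def crossfade_peak(alpha, c0=0.0, c1=1.0, sigma=1.0, step=0.001):
    """Locate the maximum of (1 - alpha) * g(c0) + alpha * g(c1).

    A plain grid search over [c0 - 3*sigma, c1 + 3*sigma]; adequate for
    an illustration, not tuned for speed or precision.
    """
    lo, hi = c0 - 3 * sigma, c1 + 3 * sigma
    n = int(round((hi - lo) / step))
    best_t, best_v = lo, -1.0
    for i in range(n + 1):
        t = lo + i * step
        v = ((1 - alpha) * gaussian(t, c0, sigma)
             + alpha * gaussian(t, c1, sigma))
        if v > best_v:
            best_t, best_v = t, v
    return best_t


if __name__ == "__main__":
    # Wide envelopes (peak separation equal to sigma): the peak slides.
    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
        print("sigma=1.0", alpha, round(crossfade_peak(alpha), 3))
    # Narrow envelopes (separation = 5 * sigma): the peak stays put.
    print("sigma=0.2", 0.4, round(crossfade_peak(0.4, sigma=0.2), 3))
```

When the two centers are close relative to sigma, the mix has a single peak that moves smoothly (though not exactly linearly) from one center to the other as alpha goes from 0 to 1, so a linear cross-fade behaves like a small time shift. When the envelopes are narrow relative to the separation, the mix simply has two peaks and the maximum stays at whichever one has the larger weight, which is why a smoother, wider shape would be needed for this encoding to work.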
In conclusion, we have introduced a new representation for the complex but constrained class of popular music drum patterns, and derived our basic eigenrhythm patterns by scaling and aligning a corpus of drum tracks from real pieces, encoded as MIDI files. We hope to use larger datasets and a deeper analysis to come up with a more general model of the stylistic variations in rhythm patterns, and we hope to be able to train from, and apply to, actual recorded waveforms.
Eigenrhythms: Download Long Version, ISMIR Version (pdf format)