(PhD) Physical Modelling meets Machine Learning: Performing Music with a Virtual String Ensemble
This dissertation describes a new method of computer performance of bowed string instruments (violin, viola, cello) using physical simulations and intelligent feedback control. Computer synthesis of music performed by bowed string instruments is a challenging problem. Unlike instruments whose notes originate with a single discrete excitation (e.g., piano, guitar, drum), bowed string instruments are controlled with a continuous stream of excitations (i.e., the bow scraping against the string). Most existing synthesis methods use recorded audio samples, which perform quite well for single-excitation instruments but poorly for continuous-excitation instruments.
This work improves the realism of synthesized violin, viola, and cello sound by generating audio from a model of the physical behaviour of the instruments. A string’s wave equation is decomposed into 40 modes of vibration, which can be acted upon by any combination of three external forces: a bow scraping against the string, a left-hand finger pressing down, and a right-hand finger plucking. The vibration of each string exerts force against the instrument bridge; these forces are summed and convolved with the instrument body impulse response to create the final audio output. In addition, right-hand haptic output is created from the force of the bow against the string. Physical constants from ten real instruments (five violins, two violas, and three cellos) were measured and used in these simulations. The physical modelling was implemented in a high-performance library capable of simulating audio on a desktop computer one hundred times faster than real time. The program also generates animated video of the instruments being performed.
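The signal path described above (modes driven by an external force, summed at the bridge, then convolved with a body impulse response) can be illustrated with a minimal sketch. This is not the dissertation's library: the function names, the two-pole resonator used for each mode, the damping law, the equal bridge weights, and the three-sample impulse response are all assumptions chosen to keep the example short.

```python
import math

def modal_string(force, f0=196.0, n_modes=40, sample_rate=44100.0,
                 excite_pos=0.12, damping=2.0):
    """Drive n_modes damped resonators with `force`; return their sum at the bridge.

    Each mode is a two-pole digital resonator (an assumed discretisation).
    excite_pos is the excitation point as a fraction of the string length.
    """
    dt = 1.0 / sample_rate
    bridge = [0.0] * len(force)
    for n in range(1, n_modes + 1):
        theta = 2.0 * math.pi * f0 * n * dt        # phase advance per sample
        r = math.exp(-damping * n * dt)            # per-sample decay (assumed law)
        a1, a2 = 2.0 * r * math.cos(theta), r * r
        coupling = math.sin(n * math.pi * excite_pos)  # mode shape at the excitation point
        y1 = y2 = 0.0                              # previous two displacements of this mode
        for k in range(len(force)):
            x_prev = force[k - 1] if k > 0 else 0.0
            y = a1 * y1 - a2 * y2 + coupling * x_prev
            bridge[k] += y                         # equal bridge weights, for simplicity
            y2, y1 = y1, y
    return bridge

def convolve(signal, impulse_response):
    """Direct-form convolution of the bridge signal with a body impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

# Usage: an idealised pluck (a single force impulse) on a string tuned to 196 Hz.
pluck = [1.0] + [0.0] * 499
bridge_force = modal_string(pluck)
body_ir = [1.0, 0.5, 0.25]                         # placeholder, not a measured response
audio = convolve(bridge_force, body_ir)
```

A bowed note would replace the single impulse with a continuous friction force that depends on the string's own motion, which is what makes bowing harder to simulate than plucking.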
To perform music with the physical models, a virtual musician interprets the musical score and generates actions which are then fed into the physical model. The resulting audio and haptic signals are examined with a support vector machine, which adjusts the bow force in order to establish and maintain a good timbre. This intelligent feedback control is trained with human input, but once the initial training is complete the virtual musician performs autonomously. A PID controller adjusts the position of the left-hand finger to correct any flaws in the pitch. Some performance parameters (initial bow force, force correction, and lifting factors) require an initial value for each string and musical dynamic; these are calibrated automatically using the previously trained support vector machines. The timbre judgements are retained after each performance and are used to pre-emptively adjust bowing parameters to avoid or mitigate problematic timbre in future performances of the same music.
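The pitch-correction loop mentioned above can be sketched with a textbook PID controller. Everything specific here is an assumption made for illustration: the ideal-string pitch model, the error measured in cents, the gains, and the function names. None of it is taken from the dissertation's implementation.

```python
import math

class PID:
    """Textbook PID controller with rectangular integration."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def pitch_hz(open_hz, finger_pos):
    """Ideal-string pitch; finger_pos is the distance from the nut as a fraction of length."""
    return open_hz / (1.0 - finger_pos)

def cents(target_hz, measured_hz):
    """Signed pitch error in cents (positive when the measured pitch is flat)."""
    return 1200.0 * math.log2(target_hz / measured_hz)

def correct_pitch(target_hz, open_hz, finger_pos, steps=1000, dt=0.01):
    """Move the finger at a velocity set by the PID output (illustrative gains)."""
    pid = PID(kp=0.01, ki=0.01, kd=0.0001)
    for _ in range(steps):
        error = cents(target_hz, pitch_hz(open_hz, finger_pos))
        finger_pos += pid.update(error, dt) * dt
    return finger_pos

# Usage: a finger placed flat of A4 on a string tuned to D4 (293.66 Hz).
final_pos = correct_pitch(target_hz=440.0, open_hz=293.66, finger_pos=0.30)
final_error = cents(440.0, pitch_hz(293.66, final_pos))
```

In the real system the measured pitch would come from the simulated audio rather than from a closed-form formula, so the gains would need tuning against that slower and noisier measurement.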
The system is capable of playing sheet music with approximately the same ability level as a human music student after two years of training. Due to the number of instruments measured and the generality of the machine learning, music can be performed with ensembles of up to ten stringed instruments, each with a distinct timbre. This provides a baseline for future work in computer control and expressive music performance of virtual bowed string instruments.
I am currently preparing an 8-12 page booklet about the PhD.
phd-percival.pdf (approx. 11 MB)