The initial approach was the following:
The Kinect data stream was interpreted by a C# program built on the official Kinect SDK. Each gesture was to be mapped directly to its corresponding feature and then sent out via OSC (implemented using the Bespoke OSC Library). The OSC data was to be sent over an Ethernet connection to a second computer running Ableton Live, where all sound generation and manipulation would happen. There, the OSC stream was received and mapped through the Max4Live plugin Livegrabber.
This rather complicated setup was chosen both for its expected overall performance and because of prior experience with Ableton Live and the .NET C# environment. After many unsuccessful attempts at synchronising the many components, this configuration was deemed too complex and was abandoned in favour of the setup described in the following paragraph.
The final setup was surprisingly simple: Processing was used both to read and interpret the Kinect data and to play the music and modify it accordingly. The SimpleOpenNI library was used for the first part and Minim for the second.
Performance is sufficient for smooth image drawing and sound manipulation.
The mapping is still a work-in-progress:
- Overall volume is controlled by the user's distance.
- Cutoff frequency of a low-pass filter is mapped to the relative height of the right hand.
- Delay volume is mapped to the relative height of the left hand.
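The three mappings above can be sketched as plain linear range conversions. This is a minimal sketch in Java (the language underlying Processing), not the project's actual code: the class `GestureMapping`, the distance and cutoff ranges, the "closer is louder" direction, and the choice of hip and head height as the reference for "relative height" are all assumptions made for illustration.

```java
/** Sketch of the gesture-to-sound mapping. All ranges below are assumed values. */
final class GestureMapping {
    // Assumed working ranges; none of these numbers come from the project.
    static final float NEAR_MM = 1000f;        // closest expected user distance
    static final float FAR_MM = 3500f;         // farthest expected user distance
    static final float MIN_CUTOFF_HZ = 200f;   // low-pass cutoff with the hand lowered
    static final float MAX_CUTOFF_HZ = 8000f;  // low-pass cutoff with the hand raised

    /** Linearly maps value from [inLo, inHi] to [outLo, outHi], clamped to the output range. */
    static float mapClamped(float value, float inLo, float inHi, float outLo, float outHi) {
        float t = (value - inLo) / (inHi - inLo);
        t = Math.max(0f, Math.min(1f, t));
        return outLo + t * (outHi - outLo);
    }

    /** Relative hand height: 0 at hip level, 1 at head level (assumed normalisation). */
    static float relativeHeight(float handY, float hipY, float headY) {
        return mapClamped(handY, hipY, headY, 0f, 1f);
    }

    /** Overall volume in [0, 1]; closer users are louder (assumed direction). */
    static float volume(float distanceMm) {
        return mapClamped(distanceMm, NEAR_MM, FAR_MM, 1f, 0f);
    }

    /** Low-pass cutoff in Hz from the relative height of the right hand. */
    static float cutoffHz(float rightHandRelative) {
        return mapClamped(rightHandRelative, 0f, 1f, MIN_CUTOFF_HZ, MAX_CUTOFF_HZ);
    }

    /** Delay volume in [0, 1] from the relative height of the left hand. */
    static float delayVolume(float leftHandRelative) {
        return mapClamped(leftHandRelative, 0f, 1f, 0f, 1f);
    }

    public static void main(String[] args) {
        float rightHand = relativeHeight(500f, 0f, 1000f);
        System.out.println("volume      = " + volume(2250f));
        System.out.println("cutoff (Hz) = " + cutoffHz(rightHand));
        System.out.println("delay       = " + delayVolume(0.25f));
    }
}
```

Clamping keeps the audio parameters inside a safe range when a joint position briefly leaves the expected interval, for example when tracking is lost for a frame.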
To limit complexity, it was decided to implement full functionality for a single user first before introducing multi-user interaction. If multiple users are detected, only the last one has an active role.
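One way to realise the "last user is active" rule is to keep the user IDs in detection order and always pick the final entry. The sketch below assumes that "last" means "most recently detected" and that the tracking library reports when a user appears or disappears; the class `ActiveUserTracker` and its method names are hypothetical and not taken from the project.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: keeps user IDs in detection order; the most recent one is active. */
final class ActiveUserTracker {
    static final int NO_USER = -1;  // sentinel meaning "nobody in front of the camera"

    private final List<Integer> users = new ArrayList<>();

    /** Call when the tracking library reports a new user (hypothetical hook). */
    void userDetected(int userId) {
        if (!users.contains(userId)) {
            users.add(userId);
        }
    }

    /** Call when the tracking library reports that a user has left (hypothetical hook). */
    void userLost(int userId) {
        users.remove(Integer.valueOf(userId));  // remove by value, not by index
    }

    /** ID of the most recently detected user still present, or NO_USER. */
    int activeUser() {
        return users.isEmpty() ? NO_USER : users.get(users.size() - 1);
    }

    public static void main(String[] args) {
        ActiveUserTracker tracker = new ActiveUserTracker();
        tracker.userDetected(1);
        tracker.userDetected(2);
        System.out.println("active user: " + tracker.activeUser());
        tracker.userLost(2);
        System.out.println("active user: " + tracker.activeUser());
    }
}
```

When the active user leaves, control falls back to the previously detected user, and the `NO_USER` result can be used to decide when to show the idle message.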
As planned, the visual interface is implemented as a greyscale point cloud that draws the depth map of the camera and differentiates multiple users by colour. If no player is present, a "MOVE!" message is displayed. A screenshot of the visual interface during the testing phase can be seen below. Please note that the representation looks much better in full screen.
The following features are still to be implemented:
- Smoothing for big parameter jumps (probably simple interpolation)
- More than one musical pattern
- Proper multi-user support
- Special gestures (clapping etc.)
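The smoothing item in this list could be approached with a simple per-frame linear interpolation towards the target value. The following is only a sketch of that idea under stated assumptions: the class `SmoothedParameter` and the interpolation factor are hypothetical, and the factor would need tuning against the actual frame rate.

```java
/** Sketch: moves a parameter a fixed fraction of the way to its target on every frame. */
final class SmoothedParameter {
    private float current;
    private final float factor;  // 0 = never moves, 1 = jumps immediately (assumed tuning knob)

    SmoothedParameter(float initial, float factor) {
        if (factor < 0f || factor > 1f) {
            throw new IllegalArgumentException("factor must be in [0, 1]");
        }
        this.current = initial;
        this.factor = factor;
    }

    /** Advances one frame towards target and returns the smoothed value. */
    float update(float target) {
        current += (target - current) * factor;
        return current;
    }

    float value() {
        return current;
    }

    public static void main(String[] args) {
        // A sudden jump of the raw value from 0 to 1 is spread over several frames.
        SmoothedParameter cutoff = new SmoothedParameter(0f, 0.5f);
        for (int frame = 0; frame < 4; frame++) {
            System.out.println("frame " + frame + ": " + cutoff.update(1f));
        }
    }
}
```

A smaller factor gives smoother but slower parameter changes, so there is a trade-off between suppressing jumps and keeping the sound responsive to the player's movement.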