Statistical approach for sounds modeling
This article introduces a mathematical approach to extracting statistical parameters from noise-like sounds. This approach could define a new spectral model or extend existing ones to analyze and synthesize such complex sounds. In the future, this method could also allow electro-acoustic composers to perform musical transformations. We have carried out several synthesis tests with synthetic and natural sounds using these parameters. The main assumption is that the analyzed sounds must not contain any transient or slowly varying deterministic component. The natural sounds resynthesized during our experiments sound like the originals, even if some defects are perceptible because of the limitations of the analysis. We propose some initial solutions for future work.
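The analysis/resynthesis idea described above can be sketched as follows: estimate per-bin magnitude statistics of a noise-like sound over short frames, then resynthesize with random phases and with magnitudes drawn from those statistics. This is a minimal NumPy illustration; the frame size, hop, and Gaussian magnitude model are our own assumptions, not the parameters used in the paper.

```python
import numpy as np

def analyze_noise_stats(x, frame=256, hop=128):
    """Estimate per-bin STFT magnitude mean/std of a noise-like signal."""
    win = np.hanning(frame)
    frames = [x[i:i + frame] * win for i in range(0, len(x) - frame, hop)]
    mags = np.abs(np.fft.rfft(frames, axis=1))
    return mags.mean(axis=0), mags.std(axis=0)

def synthesize_noise(mean, std, n_frames=100, frame=256, hop=128, seed=0):
    """Resynthesize: random magnitudes from the stats, random phases."""
    rng = np.random.default_rng(seed)
    out = np.zeros(hop * n_frames + frame)
    for k in range(n_frames):
        mag = np.maximum(mean + std * rng.standard_normal(mean.shape), 0.0)
        phase = rng.uniform(-np.pi, np.pi, mean.shape)
        out[k * hop:k * hop + frame] += np.fft.irfft(mag * np.exp(1j * phase), n=frame)
    return out
```

Note that overlap-add of random-phase frames is only one of several possible resynthesis strategies; it preserves the coarse spectral statistics but not any deterministic component, consistent with the paper's assumption.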
A system for data-driven concatenative sound synthesis
In speech synthesis, concatenative data-driven synthesis methods prevail. They use a database of recorded speech and a unit selection algorithm that selects the segments that best match the utterance to be synthesized. Transferring these ideas to musical sound synthesis allows a new method of high-quality sound synthesis. Usual synthesis methods are based on a model of the sound signal, and it is very difficult to build a model that preserves all the fine details of sound. Concatenative synthesis achieves this by using actual recordings. This data-driven approach (as opposed to a rule-based approach) takes advantage of the information contained in the many sound recordings. For example, very natural-sounding transitions can be synthesized, since unit selection is aware of the context of the database units. The Caterpillar software system has been developed to allow data-driven concatenative unit selection sound synthesis. It allows high-quality instrument synthesis with high-level control, explorative free synthesis from arbitrary sound databases, or resynthesis of a recording with sounds from the database. It is based on the software-engineering concept of component-oriented software, increasing flexibility and facilitating reuse.
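Unit selection of the kind described above is usually cast as a path search that minimizes a target cost (how well a unit matches the specification) plus a concatenation cost (how well consecutive units join). The following dynamic-programming sketch over toy feature vectors illustrates the principle; the Euclidean costs and the single weight are illustrative assumptions, not Caterpillar's actual cost functions.

```python
import numpy as np

def select_units(targets, database, w_concat=1.0):
    """Pick one database unit per target frame, minimising target cost
    plus weighted concatenation cost by dynamic programming (Viterbi)."""
    T, U = len(targets), len(database)
    tgt = np.array([[np.linalg.norm(t - u) for u in database] for t in targets])
    cat = np.array([[np.linalg.norm(a - b) for b in database] for a in database])
    cost = tgt[0].copy()
    back = np.zeros((T, U), dtype=int)
    for i in range(1, T):
        # total[p, u]: best cost ending in unit u, coming from unit p
        total = cost[:, None] + w_concat * cat + tgt[i][None, :]
        back[i] = total.argmin(axis=0)
        cost = total.min(axis=0)
    path = [int(cost.argmin())]
    for i in range(T - 1, 0, -1):
        path.append(int(back[i][path[-1]]))
    return path[::-1]
```

With the concatenation weight set to zero this degenerates to nearest-neighbour lookup per target; raising it trades segment fidelity against smoother joins, which is the essence of context-aware selection.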
Harmonic-band wavelet coefficient modeling for pseudo-periodic sounds processing
In previous papers [1], [2] we introduced a model for pseudo-periodic sounds based on Wornell's results [3] concerning the synthesis of 1/f noise by means of the wavelet transform (WT). This method provided a good model for representing not only the harmonic part of real-life sounds but also the stochastic components. The latter are of fundamental importance from a perceptual point of view, since they contain all the information related to the natural dynamics of musical timbres. In this paper we introduce a refinement of the method, making the spectral-model technique more flexible and the resynthesis coefficient model more accurate. In this way we obtain a powerful tool for sound processing and cross-synthesis.
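Wornell's construction synthesizes 1/f-like noise by giving the wavelet coefficients at each scale a variance that shrinks geometrically with scale. A crude stand-in for that idea, summing zero-order-hold octave-band noise instead of running a true inverse wavelet transform, can be sketched as follows (the level count and hold interpolation are our simplifications):

```python
import numpy as np

def one_over_f_noise(n, gamma=1.0, levels=10, seed=0):
    """Approximate 1/f^gamma noise: sum octave-band noise whose variance
    scales as 2^(-gamma*j) at scale j (Wornell-style sketch)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for j in range(levels):
        m = max(1, n >> j)                 # coarser scale -> fewer samples
        band = rng.standard_normal(m) * 2.0 ** (-gamma * j / 2.0)
        reps = -(-n // m)                  # ceil(n / m)
        x += np.repeat(band, reps)[:n]     # hold each coarse sample
    return x
```

The harmonic-band refinement of the paper applies this kind of stochastic coefficient model per harmonic band rather than to the broadband signal, which this sketch does not attempt.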
Flexible software framework for modal synthesis
Modal synthesis is an important area of physical modeling whose exploration in the past has been held back by a large number of control parameters, the scarcity of general-purpose design tools and the difficulty of obtaining the computational power required for real-time synthesis. This paper presents an overview of a flexible software framework facilitating the design and control of instruments based on modal synthesis. The framework is designed as a hierarchy of polymorphic synthesis objects, representing modal structures of various complexity. As a method of generalizing all interactions among the elements of a modal system, an abstract notion of energy is introduced, and a set of energy transfer functions is provided. Such abstraction leads to a design where the dynamics of interactions can be largely separated from the specifics of particular modal structures, yielding an easily configurable and expandable system. A real-time version of the framework has been implemented as a set of C++ classes along with an integrating shell and a GUI, and is currently being used to design and play modal instruments, as well as to survey fundamental properties of various modal algorithms.
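At its core, a modal structure is a bank of exponentially damped sinusoids that an excitation sets ringing together. A minimal sketch of that core follows; the frequencies, decay rates, and amplitudes are arbitrary illustrative values, and the paper's energy-transfer abstraction and object hierarchy are not modeled here.

```python
import numpy as np

def modal_impulse_response(freqs, decays, amps, sr=44100, dur=0.5):
    """Impulse response of a modal structure: a sum of damped sinusoids,
    one per mode (frequency in Hz, decay in 1/s, linear amplitude)."""
    t = np.arange(int(sr * dur)) / sr
    out = np.zeros_like(t)
    for f, d, a in zip(freqs, decays, amps):
        out += a * np.exp(-d * t) * np.sin(2 * np.pi * f * t)
    return out
```

Everything a modal framework adds on top of this kernel, such as interaction dynamics, coupling, and real-time control, amounts to deciding how energy enters, leaves, and moves between such mode banks.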
3D graphics tools for sound collections
Most current tools for working with sound operate on single sound files, use 2D graphics and offer limited interaction to the user. In this paper we describe a set of tools, based on interactive 3D graphics, for working with collections of sounds. These tools form two families: sound-analysis visualization displays and model-based controllers for sound synthesis algorithms. We describe the general techniques we have used to develop these tools and give specific case studies from each family. Several collections of sounds were used for development and evaluation: a set of musical instrument tones, a set of sound effects, a set of FM radio audio clips belonging to several music genres, and a set of mp3 rock song snippets.
Analysing auditory representations for sound classification with self-organising neural networks
Three different auditory representations—Lyon’s cochlear model, Patterson’s gammatone filterbank combined with Meddis’ inner hair cell model, and mel-frequency cepstral coefficients—are analyzed in connection with self-organizing maps to evaluate their suitability for a perceptually justified classification of sounds. The self-organizing maps are trained with a uniform set of test sounds preprocessed by the auditory representations. The structure of the resulting feature maps and the trajectories of the individual sounds are visualized and compared to one another. While MFCC proved to be a very efficient representation, the gammatone model produced the most convincing results.
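The self-organising maps used in such studies reduce to a short update loop: for each input vector, find the best-matching unit and pull it and its grid neighbours toward the input. The sketch below works on generic feature vectors; the grid size, neighbourhood width, and learning-rate schedule are illustrative choices, not those used with the auditory representations above.

```python
import numpy as np

def train_som(data, grid=(8, 8), epochs=20, lr=0.5, sigma=2.0, seed=0):
    """Minimal self-organising map: move the best-matching unit (BMU)
    and its grid neighbours toward each input vector."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.standard_normal((h, w, data.shape[1]))
    ys, xs = np.mgrid[0:h, 0:w]
    for e in range(epochs):
        a = lr * (1 - e / epochs)                    # decaying learning rate
        for v in data:
            d = ((weights - v) ** 2).sum(axis=2)     # distance to every unit
            by, bx = np.unravel_index(d.argmin(), d.shape)
            nb = np.exp(-((ys - by) ** 2 + (xs - bx) ** 2) / (2 * sigma ** 2))
            weights += a * nb[..., None] * (v - weights)
    return weights
```

After training, the feature-map structure and sound trajectories mentioned in the abstract correspond to the BMU positions of the preprocessed sounds on this grid.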
Visualization and calculation of the roughness of acoustical musical signals using the Synchronization Index Model (SIM)
The synchronization index model of sensory dissonance and roughness accounts for the degree of phase-locking to a particular frequency that is present in the neural patterns. Sensory dissonance (roughness) is defined as the energy of the relevant beating frequencies in the auditory channels with respect to the total energy. The model takes rate-code patterns at the level of the auditory nerve as input and outputs a sensory dissonance (roughness) value. The synchronization index model entails a straightforward visualization of the principles underlying sensory dissonance and roughness, in particular in terms of (i) roughness contributions with respect to cochlear mechanical filtering (on a critical-band scale), and (ii) roughness contributions with respect to phase-locking synchrony (i.e., the synchronization index for the relevant beating frequencies on a frequency scale). This paper presents the concept and implementation of the synchronization index model and its application to musical scales.
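The core ratio, energy at beating frequencies over total energy, can be illustrated in a drastically simplified single-channel form: measure how much of a signal envelope's energy falls in the "rough" beating range. This ignores cochlear filtering and the auditory-nerve rate-code model entirely, so it is only a sketch of the synchronization-index principle; the band limits below are illustrative assumptions.

```python
import numpy as np

def roughness_index(x, sr=8000, beat_lo=20.0, beat_hi=150.0):
    """Crude single-channel SIM-style measure: envelope energy in the
    rough beating range relative to total envelope energy."""
    env = np.abs(x)                         # stand-in for a rate-code envelope
    spec = np.abs(np.fft.rfft(env - env.mean())) ** 2
    freqs = np.fft.rfftfreq(len(env), 1 / sr)
    band = (freqs >= beat_lo) & (freqs <= beat_hi)
    total = spec[1:].sum()
    return spec[band].sum() / total if total > 0 else 0.0
```

Two close tones (e.g. 440 Hz and 470 Hz, beating at 30 Hz) score markedly higher on this index than a single pure tone, matching the perceptual intuition the model formalizes per auditory channel.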
The best of two worlds: retrieving and browsing
This paper describes the combination of two software systems for working with music corpora in electronic formats. A set of algorithms dealing with music score processing has been developed in CPN View, a class library for representing music scores. These facilitate access to the ever-increasing collections of music corpora [1]. The Sonic Browser, a browser that uses sonic spatialization for navigating music or sound databases, has been developed to the proof-of-concept and prototype implementation stage. Previous work has demonstrated that with the Sonic Browser users can find a particular melody in a set of melodies up to 28% faster than with visual browsing [2].
Blackboard system and top-down processing for the transcription of simple polyphonic music
A system is proposed to perform the automatic music transcription of simple polyphonic tracks using top-down processing. It is composed of a blackboard system of three hierarchical levels, receiving its input from a segmentation routine in the form of an averaged STFT matrix. The blackboard contains a hypotheses database, a scheduler and knowledge sources, one of which is a neural network chord recogniser with the ability to reconfigure the operation of the system, allowing it to output more than one note hypothesis at a time. The basic implementation is explained, and some examples are provided to illustrate the performance of the system. The weaknesses of the current implementation are shown and next steps for further development of the system are defined.
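The blackboard control loop itself can be shown in a few lines: knowledge sources read a shared hypothesis database and post new hypotheses until nothing changes. The toy below (fixed-order scheduling, hard-coded partials, and partial-to-note grouping by exact harmonic matching) illustrates only the architecture; the hypothesis types, sources, and scheduler are our inventions, not the paper's three-level system or its neural-network chord recogniser.

```python
class Blackboard:
    """Toy blackboard: run knowledge sources over a shared hypothesis
    database until a full pass adds nothing new."""
    def __init__(self):
        self.hypotheses = []

    def run(self, sources):
        changed = True
        while changed:
            changed = False
            for source in sources:          # trivial scheduler: fixed order
                for h in source(self.hypotheses):
                    if h not in self.hypotheses:
                        self.hypotheses.append(h)
                        changed = True
        return self.hypotheses

def partial_source(hyps):
    """Hypothetical low-level source: posts detected spectral partials."""
    return [('partial', f) for f in (440.0, 880.0, 1320.0)]

def note_source(hyps):
    """Hypothetical grouping source: posts a note when its 2nd and 3rd
    harmonics are also present as partial hypotheses."""
    partials = {f for kind, f in hyps if kind == 'partial'}
    return [('note', f) for f in partials
            if f * 2 in partials and f * 3 in partials]
```

In the paper's system the analogous sources operate on an averaged STFT matrix at three hierarchical levels, and the chord recogniser can reconfigure the operation of the system top-down.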
Robust multipitch estimation for the analysis and manipulation of polyphonic musical signals
A method for the estimation of the multiple pitches of concurrent musical sounds is described. Experimental data comprised sung vowels and the whole pitch range of 26 musical instruments. Multipitch estimation was performed at the level of a single time frame for random pitch and sound source combinations. Note error rates for mixtures ranging from one to six simultaneous sounds were 2.1 %, 2.4 %, 3.8 %, 8.1 %, 12 %, and 18 %, respectively. In musical interval and chord identification tasks, the algorithm outperformed the average of ten trained musicians. Particular emphasis was laid on robustness in the presence of other sounds and noise. The algorithm is based on an iterative estimation and separation procedure and is able to resolve at least a couple of the most prominent pitches even in ten-sound polyphonies. Sounds that exhibit inharmonicities can be handled without problems, and the inharmonicity factor and spectral envelope of each sound are estimated along with the pitch. Examples are given of musical signal manipulations that become possible with the proposed method.
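The iterative estimation-and-separation loop can be sketched in its simplest form: find the candidate fundamental with the highest harmonic spectral sum, cancel its harmonics from the spectrum, and repeat. This toy (fixed harmonic count, 1 Hz candidate grid, brute-force cancellation bands) is a stand-in for the paper's far more robust algorithm and models neither inharmonicity nor spectral-envelope estimation.

```python
import numpy as np

def iterative_multipitch(x, sr, n_pitches=2, f_lo=80.0, f_hi=1000.0):
    """Toy iterative multipitch estimation: harmonic-summation salience,
    then spectral cancellation of the detected sound, repeated."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), 1 / sr)
    cands = np.arange(f_lo, f_hi, 1.0)
    pitches = []
    for _ in range(n_pitches):
        # salience: sum of magnitudes at the first 8 harmonics of each candidate
        sal = [sum(spec[np.argmin(np.abs(freqs - h * f0))] for h in range(1, 9))
               for f0 in cands]
        f0 = float(cands[int(np.argmax(sal))])
        pitches.append(f0)
        # separation: zero narrow bands around the detected harmonics
        for h in range(1, 9):
            spec[np.abs(freqs - h * f0) < 10.0] = 0.0
    return pitches
```

Cancelling the detected harmonics before the next pass is what lets the later iterations recover weaker concurrent pitches instead of re-detecting the dominant one.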