INDEX

College of Santa Fe Auditory Theory

Lecture 018 Timbre I

INSTRUCTOR CHARLES FEILDING

  1. What is timbre?
  2. Acoustics of timbre
  3. Note Envelope
  4. Note onset
  5. Psychoacoustics of timbre
  6. Critical Bands and Timbre
  7. Acoustic cues and timbre perception
  8. The pipe organ as a timbral synthesizer
  9. Brain Bullets

5.1 What is timbre?

Pitch and loudness are two of the three descriptors of musical sounds commonly used by musicians, the third being 'timbre'. Pitch relates to issues such as notes on a score, key, melody, harmony, tuning systems, and intonation in performance. Loudness relates to matters such as musical dynamics (e.g. pp, p, mp, mf, f, ff) and the balance between members of a musical ensemble (e.g. between individual parts, choir and orchestra, or soloist and accompaniment). Timbre relates to sound quality; descriptions include: mellow, rich, covered, open, dull, bright, dark, strident, grating, harsh, shrill, sonorous, sombre, colourless and lacklustre. Timbral descriptors are therefore used to indicate the perceived quality or tonal nature of a sound which can also have a particular pitch and loudness.

There is no subjective rating scale against which timbre judgements can be made, unlike pitch and loudness which can, on average, be reliably rated by listeners on scales from 'high' to 'low'. The commonly quoted American National Standards Institute formal definition of timbre reflects this: 'Timbre is that attribute of auditory sensation in terms of which a listener can judge two sounds similarly presented and having the same loudness and pitch as being dissimilar' (ANSI, 1960). In other words, two sounds that are perceived as being different but which have the same perceived loudness and pitch differ by virtue of their timbre. The timbre of a note is the aspect by which a listener recognizes the instrument which is playing a note when, for example, instruments play notes with the same pitch, loudness and duration. The definition given by Scholes (1970) encompasses some timbral descriptors: 'Timbre means tone quality - coarse or smooth, ringing or more subtly penetrating, "scarlet" like that of a trumpet, "rich brown" like that of a cello, or "silver" like that of the flute. These colour analogies come naturally to every mind ... The one and only factor in sound production which conditions timbre is the presence or absence, or relative strength or weakness, of overtones'. (Table 3.1 gives the relationship between overtones and harmonics.) Whilst his colour analogies might not come naturally to every mind, Scholes' later comments about the acoustic nature of sounds which have different timbres are a useful contribution to the acoustic discussion of the timbre of musical sounds.

When considering the notes played on pitched musical instruments, timbre relates to those aspects of the note which can be varied without affecting the pitch, duration or loudness of the note as a whole, such as the spectral components present and the way in which their frequencies and amplitudes vary during the sound. In Chapter 4 the acoustics of musical instruments is considered in terms of the output from the instrument as a consequence of the effect of the sound modifiers on the sound input (e.g. Figure 4.2). What is not considered, due to the complexity of modelling, is the acoustic development from silence at the start of the note and back to silence at the end. It is then convenient to consider a note in terms of three phases: the 'onset' or 'attack' (the build-up from silence at the start of the note), the 'steady state' (the main portion of the note), and the 'offset' or 'release' (the return to silence at the end of the note after the energy source is stopped). The onset and offset portions of a note tend to last for a short time, of the order of a few tens of milliseconds (or a few hundredths of a second). Changes that occur during the onset and offset phases, and in particular during the onset, turn out to have a very important role in defining the timbre of a note. In this chapter, timbre is considered in terms of the acoustics of sounds which have different timbres, and the psychoacoustics of how sounds are perceived. Finally, the pipe organ is reviewed in terms of its capacity to synthesize different timbres.

5.2 Acoustics of timbre

The description of the acoustics of notes played on musical instruments presented in Chapter 4 was, in many cases, supported by plots of waveforms and spectra of the outputs from some instruments (Figures 4.17, 4.22, 4.24 and 4.29). Except in the plots for the plucked notes on the lute and guitar (Figure 4.11), where the waveforms are for the whole note and the spectra are for a single spectral analysis, the waveform plots show a few cycles from the steady-state phase of the note concerned and the spectral plots are based on averaging together individual spectral measurements taken during the steady-state phase. The number of spectra averaged together depends on how long the steady-state portion of the note lasts. For the single notes illustrated in Chapter 4, spectral averaging was carried out over approximately a quarter to three quarters of a second, depending on the length of the note available. An alternative way of thinking about this is in terms of the number of cycles of the waveform over which the averaging takes place, which would be 110 cycles for a quarter of a second to 330 cycles for three quarters of a second for A4 (f0 = 440 Hz), or approximately 65 cycles to 196 cycles for C4 (f0 = 261.6 Hz). Such average spectra are commonly used for analysing the frequency components of musical notes, and they are known as 'long-term average spectra' or 'LTAS'. One main advantage of using LTAS is that the spectral features of interest during the steady-state portion of the note are enhanced in the resulting plot by the averaging process with respect to competing acoustic sounds such as background noise which change over the period of the LTAS and thus average towards zero.
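The averaging idea behind an LTAS can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `ltas` and `cycles_in_window`, the frame length, the Hann window and the synthetic test note are all choices made here, not details of the analyses used for the Chapter 4 figures.

```python
import numpy as np

def ltas(signal, sample_rate, frame_len=2048):
    """Average the power spectra of successive Hann-windowed frames."""
    window = np.hanning(frame_len)
    n_frames = len(signal) // frame_len
    power = np.zeros(frame_len // 2 + 1)
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len] * window
        power += np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    return freqs, power / n_frames

def cycles_in_window(f0_hz, seconds):
    """Number of waveform cycles that fit into the averaging window."""
    return f0_hz * seconds

# Synthetic stand-in for a steady-state note: A4 plus its second harmonic,
# with added noise (all values here are illustrative assumptions).
sample_rate = 16000
t = np.arange(int(0.75 * sample_rate)) / sample_rate
rng = np.random.default_rng(0)
note = (np.sin(2 * np.pi * 440 * t)
        + 0.5 * np.sin(2 * np.pi * 880 * t)
        + 0.3 * rng.standard_normal(t.size))

freqs, spectrum = ltas(note, sample_rate)
peak_hz = freqs[np.argmax(spectrum)]  # lands on the analysis bin nearest 440 Hz
```

In this sketch the steady harmonic peaks are the same in every frame, whereas the noise floor fluctuates from frame to frame, so averaging more frames gives a smoother floor against which the harmonics stand out.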

LTAS cannot, however, be used to investigate acoustic features that change rapidly such as the onset and offset of musical notes, because these will also tend to average towards zero. In terms of the timbre of the note, it is not only the variations that occur during the onset and offset that are of interest, but also how they change with time. Therefore an analysis method is required in which the timing of acoustic changes during a note is preserved in the result. One analysis technique commonly used for the acoustic analysis of speech is a plot of amplitude, frequency and time known as a 'spectrogram'. Frequency is plotted on the vertical axis, time on the horizontal axis and amplitude is plotted as the darkness on a grey scale, or in some cases the colour, of the spectrogram. The upper plot in Figure 5.1 shows a spectrogram and acoustic pressure waveform of C4 played on a principal 8' (open flue), the same note for which an LTAS is presented in Figure 4.17. The LTAS plot in Figure 4.17 showed that the first and second harmonics dominate the spectrum, the amplitude of the third harmonic being approximately 8 dB lower than the first harmonic, and with energy also clearly visible in the fourth, fifth,


Fig 5.1 Waveform and spectrogram of whole note (upper) and onset phase (lower) for C4 played on the principal 8' stop (open flue) for which an LTAS is shown in Fig 4.17. Note onset, steady state, and offset phases are marked.


seventh and eighth harmonics whose amplitudes are at least 25 dB lower than that of the first harmonic.

A spectrogram shows which frequency components are present (measured against the vertical axis), at what amplitude (blackness of marking) and when (measured against the horizontal axis). Thus harmonics are represented on spectrograms as horizontal lines, where the vertical position of the line marks the frequency, the horizontal extent shows the time for which that harmonic lasts, and the blackness of the line indicates its amplitude. The spectrogram in Figure 5.1 shows three black horizontal lines which are the first three harmonics of the principal note (since the frequency axis is linear, they are equally spaced). The first and second harmonics are slightly blacker (and thicker) than the third, reflecting the amplitude difference as shown in Figure 4.17. The fourth, fifth and seventh harmonics are visible and their amplitude relative to the first harmonic is reflected in the blackness with which they are plotted.
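The way a spectrogram arranges frequency, time and level can be illustrated with a short-time Fourier analysis. The sketch below is a minimal example, not the analysis used for Figure 5.1: the function names, frame length, hop size and the synthetic three-harmonic note (with the third harmonic set about 8 dB below the first, echoing the description above) are all assumptions made for illustration.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_len=1024, hop=256):
    """Return (freqs, times, level_db); level_db is shaped (frequency, time)."""
    window = np.hanning(frame_len)
    starts = np.arange(0, len(signal) - frame_len + 1, hop)
    frames = np.array([signal[s:s + frame_len] * window for s in starts])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    level_db = 20.0 * np.log10(magnitude + 1e-12)  # small offset avoids log(0)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (starts + frame_len / 2) / sample_rate  # frame centres in seconds
    return freqs, times, level_db

def harmonic_level_db(freqs, level_db, frequency_hz, search_bins=2):
    """Time-averaged level of the strongest row close to frequency_hz."""
    centre = int(np.argmin(np.abs(freqs - frequency_hz)))
    rows = level_db[max(centre - search_bins, 0):centre + search_bins + 1]
    return float(rows.mean(axis=1).max())

# Synthetic one-second "note" on C4 with three harmonics (illustrative values).
sample_rate = 8000
f0 = 261.6
t = np.arange(sample_rate) / sample_rate
amplitudes = [1.0, 0.9, 0.4]  # 0.4 is roughly 8 dB below 1.0
note = sum(a * np.sin(2 * np.pi * (n + 1) * f0 * t)
           for n, a in enumerate(amplitudes))

freqs, times, level_db = spectrogram(note, sample_rate)
first = harmonic_level_db(freqs, level_db, f0)
third = harmonic_level_db(freqs, level_db, 3 * f0)
```

Each row of `level_db` corresponds to one frequency and each column to one instant, so a sustained harmonic appears as a row of consistently high values, the numerical counterpart of a dark horizontal line.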

5.2.1 Note envelope

The onset, steady-state, and offset phases of the note are indicated above the waveform in the Figure, and these are determined mainly with reference to the spectrogram because they relate to the changes in spectral content at the start and end of the note, leaving the steady portion in between. However, 'steady state' does not mean that no aspect of the note varies. The timbre of a principal organ stop sounds 'steady' during a prolonged note such as that plotted, which lasts for approximately 2 seconds, but it is clear from the acoustic pressure waveform plot in Figure 5.1 that the amplitude, or 'envelope', varies even during the so-called 'steady-state' portion of this note. This is an important aspect of musical notes to be aware of when, for example, synthesising notes of musical instruments; particularly if using looping techniques on a sampling synthesiser.

For the principal pipe, the end of the note begins when the key is released and the air flowing from the organ bellows to drive the air reed is stopped. In the note offset for this example, which lasts approximately 200 ms, the high harmonics die away well before the first and second. However, interpretation of note offsets is rather difficult if the recording has been made in an enclosed environment as opposed to free space (see Chapter 4), since any reverberation due to the acoustics of the space is also being analysed (see Chapter 6). It is difficult to see the details of the note onset in this example, due to the time scale required to view the complete note.

The note onset phase is particularly important to perceived timbre. Since listeners can reliably perceive the timbre of notes during the steady-state phase, it is clear that the offset phase is rather less important to the perception of timbre than the onset and steady-state phases. The onset phase is also more acoustically robust to the effects of the local environment in which the notes are played, since colouration of the direct sound by the first reflection (see Chapter 6) may occur after the onset phase is complete (and therefore transmitted uncoloured to the listener). By definition the first reflection certainly occurs after part of the note onset has been heard uncoloured. The onset phase is therefore a vital element and the offset phase an important factor in terms of timbre perception. Spectrograms whose time scales are expanded to cover the time of the note onset phase are particularly useful when analysing notes acoustically. The lower plot in Figure 5.1 shows an expanded timescale version of the upper plot in the figure, showing the note onset phase, which lasts approximately 70 ms, and the start of the steady-state phase. It can be seen that the detail of the onset instant of each of the harmonics is clearly visible, with the second harmonic starting approximately 30 ms before the first and third harmonics. This is a common feature of organ pipes voiced with a chiff or consonantal onset which manifests itself acoustically in the onset phase as an initial jump to the first, or sometimes higher, overblown mode. The first overblown mode for an open flue pipe is to the second harmonic (see Chapter 4). Careful listening to pipes voiced with a chiff will reveal that open pipes start briefly an octave high since their first overblown mode is the second harmonic, and stopped pipes start an octave and a fifth high since their first overblown mode is the third harmonic.
The fourth harmonic in the figure starts with the third and its amplitude briefly drops 60 ms into the sound when the fifth starts, and the seventh starts almost with the second and its amplitude drops 30 ms later. The effect of the harmonic build-up on the acoustic pressure waveform can be observed in the figure in terms of the changes in its shape, particularly the gradual increase in amplitude, during the onset phase. The onset phase for this principal organ pipe is a complex series of acoustic events, or acoustic 'cues', which are available as potential contributors to the listener's perception of the timbre of the note.
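Reading the entry time of each harmonic from an expanded spectrogram can be imitated numerically. The following is a rough sketch under several assumptions: the function name `harmonic_onset_times`, the 50% threshold, the short frame and hop sizes, and the synthetic note are all hypothetical choices. The test note uses f0 = 250 Hz so that its harmonics fall exactly on analysis bins, and its second harmonic is made to start 30 ms before the first and third, mimicking the chiff described above rather than reproducing the measured pipe.

```python
import numpy as np

def harmonic_onset_times(signal, sample_rate, f0, n_harmonics,
                         frame_len=256, hop=16, fraction=0.5):
    """Estimate when each harmonic first exceeds a fraction of its peak level."""
    window = np.hanning(frame_len)
    starts = np.arange(0, len(signal) - frame_len + 1, hop)
    frames = np.array([signal[s:s + frame_len] * window for s in starts])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # shaped (time, frequency)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    centres = (starts + frame_len / 2) / sample_rate  # frame centres in seconds
    onsets = []
    for n in range(1, n_harmonics + 1):
        track = magnitude[:, np.argmin(np.abs(freqs - n * f0))]
        first_frame = np.argmax(track > fraction * track.max())
        onsets.append(float(centres[first_frame]))
    return onsets

# Synthetic onset: harmonic 2 enters at 50 ms, harmonics 1 and 3 at 80 ms.
sample_rate = 8000
f0 = 250.0
t = np.arange(int(0.3 * sample_rate)) / sample_rate
start_s = {1: 0.080, 2: 0.050, 3: 0.080}
amplitude = {1: 1.0, 2: 0.9, 3: 0.4}
note = sum(amplitude[n] * (t >= start_s[n]) * np.sin(2 * np.pi * n * f0 * t)
           for n in (1, 2, 3))

onsets = harmonic_onset_times(note, sample_rate, f0, 3)
lead_s = onsets[0] - onsets[1]  # how far the second harmonic leads the first
```

The short frame gives a time resolution of a few milliseconds at the cost of frequency resolution, which is the same trade-off made when a spectrogram's time scale is expanded to show the onset phase.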

5.2.2 Note onset

In order to provide some data to enable appreciation of the importance of the note onset phase for timbre perception, Figures 5.2 to 5.4 are presented in which the note onset and start of the steady-state phases for four organ stops, four woodwind instruments and four brass instruments respectively are presented for the note C4 (except for the trombone and tuba for which the note is C3). By way of a caveat it should be noted that

Fig 5.2 Waveform (upper) and spectrogram (lower) of the note onset phase for C4 played on the pipe organ stops listed. LTAS for the hautbois and trompette notes are shown in Fig 4.22 and for the gedackt in Fig 4.17

these figures are presented to provide only examples of the general nature of the acoustics of the note onset phases for these instruments. Had these notes been played at a different loudness, by a different player, on a different instrument, in a different environment, or even just a second time by the same player on the same instrument in the same place while attempting to keep all details constant, the waveforms and spectra would probably be noticeably different.

The organ stops for which waveforms and spectra are illustrated in Figure 5.2 are three reed stops: hautbois and trompette (LTAS in Figure 4.22) and a regal, and gedackt (LTAS in Figure 4.17) which is an example of a stopped flue pipe. The stopped flue supports only the odd modes (see Chapter 4), and during the onset phase of this particular example, the fifth harmonic sounds first, which is the second overblown mode sounding two octaves and a major third above the fundamental (see Figure 3.3),

Fig 5.3 Waveform (upper) and spectrogram (lower) of the note onset phase for C4 played on the instruments listed. LTAS for the clarinet and saxophone are shown in Fig 4.24


followed by the fundamental and then the third harmonic, giving a characteristic chiff to the stop. The onset phase for the reed stops is considerably more complicated since many more harmonics are present in each case. The fundamental for the hautbois and regal is evident first, and the second harmonic for the trompette. In all cases, the fundamental exhibits a frequency rise at onset during the first few cycles of reed vibration. The staggered times of entry of the higher harmonics form part of the acoustic timbral characteristic of each particular stop, the trompette having all harmonics present up to the 4 kHz upper frequency bound of the plot, the hautbois having all harmonics up to about 2.5 kHz, and the regal exhibiting little or no evidence (at this amplitude setting of the plot) of the fourth or eighth harmonics.

Figure 5.3 shows plots for four woodwind instruments: clarinet, oboe, tenor saxophone and flute. For these particular examples of the clarinet and tenor saxophone, the fundamental is apparent first and the oboe note begins with the second harmonic, followed by the third and fourth harmonics after approximately 5 ms, and then the fundamental some 8 ms later. The higher harmonics of the clarinet are apparent nearly 30 ms after the fundamental; the dominance of the odd harmonics is discussed in Chapter 4. This particular example of C4 from the flute begins with a notably 'breathy' onset just prior to and as the fundamental component starts. This can be seen in the frequency region of the spectrogram that is above 2 kHz lasting some 70 ms. The higher harmonics enter one by one approximately 80 ms after the fundamental starts. The rather long note onset phase is characteristic of a flute note played with some deliberation. The periodicity in the waveforms develops gradually, and in all cases, there is an appreciable time over which the amplitude of the waveform reaches its steady state.

Figure 5.4 shows plots for four brass instruments. The notes played on the trumpet and French horn are C4 and those for the trombone and tuba are C3. The trumpet is the only example with energy in high harmonic components in this particular example, with the fourth, fifth and sixth harmonics having the highest amplitudes. The other instruments in this figure do not have energy apparent above approximately the fourth harmonic (French horn and tuba) or the sixth harmonic for the trombone. The note onset phase for all four instruments starts with the fundamental (noting that this is rather weak for the trombone) and continues with increasing frequency. The waveforms in all cases become periodic almost immediately.

Waveforms and spectrograms are presented in the upper plot of Figure 5.5 for C4 played with a bow on a violin. Approximately 250 ms into the violin note, vibrato is apparent as a frequency variation particularly in the high harmonics. This is a feature of using a linear frequency scale, since a change of x Hz in f0 will manifest itself as a change of 2x Hz in the second harmonic, 3x Hz in the third harmonic and so on. In general the frequency change in the nth harmonic will be nx Hz, therefore the frequency variation in the upper harmonics during vibrato will be greater than that for the lower harmonics when frequency is plotted on a linear scale as in the figure. Vibrato often has a delayed start, as in this example, as the player makes subtle intonation adjustments to the note. This particular bowed violin note has an onset phase of approximately 160 ms and an offset phase of some 250 ms.
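The nx Hz relationship can be checked with a couple of lines of arithmetic. In the sketch below the function names and the 3 Hz vibrato excursion are illustrative assumptions, not values measured from the violin note; the second function expresses the same excursion in cents to show that, on a logarithmic frequency scale, every harmonic would move by the same amount.

```python
import math

def harmonic_deviation_hz(f0_deviation_hz, harmonic_number):
    """Frequency change of the nth harmonic for a change of x Hz in f0."""
    return harmonic_number * f0_deviation_hz

def deviation_cents(f0_hz, f0_deviation_hz, harmonic_number):
    """The same change expressed in cents (hundredths of a semitone)."""
    nominal = harmonic_number * f0_hz
    shifted = nominal + harmonic_deviation_hz(f0_deviation_hz, harmonic_number)
    return 1200.0 * math.log2(shifted / nominal)

f0 = 261.6  # C4
x = 3.0     # hypothetical vibrato excursion of f0 in Hz
for n in (1, 2, 5, 10):
    print(n, harmonic_deviation_hz(x, n), round(deviation_cents(f0, x, n), 2))
```

The excursion in hertz grows in proportion to the harmonic number, which is why vibrato is most obvious in the upper harmonics of a linear-frequency spectrogram, while the excursion in cents is identical for every harmonic.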

Fig 5.4 Waveform (upper) and spectrogram (lower) of the note onset phase for C4 played on a trumpet and French horn and C3 played on a trombone and a tuba


Finally in this section, a note is analysed from a CD recording of a professional tenor singing the last three syllables of the word 'vittoria' (i.e. 'toria') on Bb4 from the second act of Tosca by Puccini (lower plot in Figure 5.5). This is a moment in the score when the orchestra stops playing and leaves the tenor singing alone. The orchestra stops approximately 500 ms into this example; its spectrographic record can be seen particularly in the lower left-hand corner of the spectrogram, where it is almost impossible to analyse any detailed acoustic patterning. This provides just a hint at the real acoustic analysis task facing the hearing system when listening to music. The spectrogram of the tenor shows the harmonics and the extent of the vibrato very clearly, and his singer's formant (compare with Figure 4.38) can be clearly seen in the frequency region between 2.4 kHz and 3.5 kHz. The first and third of the three syllables ('toria') are

Fig 5.5 Waveform (upper) and spectrogram (lower) of C4 played on a violin and analysed from a CD recording of the last three syllables of the word vittoria from act 2 of Tosca by Puccini sung by a professional tenor.


long, and the second ('ri') is considerably shorter in this particular tenor's interpretation. The second syllable manifests itself as the dip in amplitude of all harmonics just over half-way through the spectrogram.

5.3 Psychoacoustics of timbre

A number of psychoacoustic experiments have been carried out to explore listeners' perceptions of the timbre of musical instruments and the acoustic factors on which it depends. Such experiments have demonstrated, for example, that listeners cannot reliably identify musical instruments if the onset and offset phases of notes are removed. For example, if recordings of a note played on a violin open string and the same note played on a trumpet are modified to remove their onset and offset phases in each case, it becomes very difficult to tell them apart. The detailed acoustic nature of a number of example onset phases is provided in Figures 5.1 to 5.5, from which differences can be noted. Thus, for example the initial scraping of the bow on a stringed instrument, the consonant-like onset of a note played on a brass instrument, the breath noise of the flautist, the initial flapping of a reed, the percussive thud of a piano hammer and the final fall of the jacks of a harpsichord back onto the strings are all vital acoustic cues to the timbral identity of an instrument. Careful attention must be paid to such acoustic features, for example when synthesising acoustic musical instruments if the resulting timbre is to sound convincingly natural to listeners.

5.3.1 Critical bands and timbre

A psychoacoustic description of timbre perception must be based on the nature of the critical bandwidth variation with frequency since this describes the nature of the spectral analysis carried out by the hearing system. The variation in critical bandwidth is such that it becomes wider with increasing frequency, and the general conclusion was drawn in the section on pitch perception in Chapter 3 (Section 3.2) that no harmonic above about the fifth to seventh is resolved no matter what the value of f0. Harmonics below the fifth to seventh are therefore resolved separately by the hearing system (e.g. see Figure 3.11), which suggests that these harmonics might play a distinct and individual role in timbre perception. Harmonics above the fifth or seventh, on the other hand, which are not separately isolated by the hearing system, are not likely to have such a strong individual effect on timbre perception, but could affect it as groups that lie within a particular critical band. Based on this notion, the perceived timbre of instruments for which the results of acoustic analysis are presented in this book is reviewed, bearing in mind that these analyses are for single examples of notes played on these instruments by a particular player on a particular instrument at a particular loudness and pitch in a particular acoustic environment. Instruments amongst those for which spectra have been presented that have significant amplitudes in harmonics above the fifth or seventh during their steady-state phases include organ reed stops (see Figures 4.22 and 5.2), the tenor saxophone (see Figures 4.24 and 5.3), the trumpet (see Figure 5.4), the violin and professional singing voice (see Figure 5.5). The timbres of such instruments might be compared with those of other instruments using descriptive terms such as 'bright', 'brilliant', or 'shrill'.
Instruments which do not exhibit energy in harmonics above the fifth or seventh during their steady-state phases include the principal 8' (see Figures 4.17 and 5.1), the gedackt 8' (see Figures 4.17 and 5.2), the clarinet, oboe and flute (see Figures 4.24 and 5.3), and the trombone, French horn and tuba (see Figures 4.29 and 5.4). In comparison with their counterpart organ stops or other instruments of their category (woodwind or brass), their timbres might be described as being: 'less bright' or 'dark', 'less brilliant' or 'dull', or 'less shrill' or 'bland'.

Within this latter group of instruments there is an additional potential timbral grouping between those instruments which exhibit all harmonics up to the fifth or seventh, such as the clarinet, oboe and flute, compared with those which just have a few low harmonics such as the principal 8', gedackt 8', trombone, French horn and tuba. It may come as a surprise to find the flute in the same group as the oboe and clarinet, but the lack of the seventh harmonic in the flute spectrum compared to the clarinet and oboe (see Figure 5.3) is crucial. Notes excluding the seventh harmonic sound considerably less 'reedy' than those with it; the seventh harmonic is one of the lowest which is not resolved by the hearing system (provided the sixth and/or eighth are/is also present). This last point is relevant to the clarinet where the seventh harmonic is present but both the sixth and eighth are weak. The clarinet has a particular timbre of its own due to the dominance of the odd harmonics in its output, and it is often described as being 'nasal'. Organists who are familiar with the effect of the tierce (1 3/5') and the rarely found septieme (1 1/7') stops (see Section 5.4) will appreciate the particular timbral significance of the fifth and seventh harmonics respectively and the 'reediness' they tend to impart to the overall timbre when used in combination with other stops.

Struck percussion instruments which make use of bars, membranes or plates as their vibrating system (described in Section 4.4) have a distinct timbral quality of their own. This is due to the non-harmonic relationship between the frequencies of their natural modes, which provides a clear acoustic cue to their family identity and gives this class of instruments its characteristic 'clanginess'.

5.3.2 Acoustic cues and timbre perception

Timbre judgements are highly subjective and therefore individualistic. Unlike pitch or loudness judgements, where listeners might be asked to rate sounds on scales of low to high or soft to loud respectively, there is no 'right' answer for timbre judgements. Listeners will usually be asked to compare the timbre of different sounds and rate each numerically between two opposite extremes of descriptive adjectives, for example on a one to ten scale between 'bright' (1) and 'dark' (10) or 'brilliant' (1) and 'dull' (10), and a number of such descriptive adjective pairs could be rated for a particular sound. The average of ratings obtained from a number of listeners is often used to give a sound an overall timbral description. Hall (1991) suggests that it is theoretically possible that one day up to five specific rating scales could be 'sufficient to accurately identify almost any timbre'.

Researchers have attempted to identify relationships between particular features in the outputs from acoustic musical instruments and their perceived timbre. A significant experiment in this field was conducted by Grey (1977). Listeners were asked to rate the similarity between recordings of pairs of synthesised musical instruments on a numerical scale from one to thirty. All sounds were equalised in pitch, loudness and duration. The results were analysed by 'multidimensional scaling', which is a computational technique that places the instruments in relation to each other in a multidimensional space based on the similarity ratings given by listeners. In Grey's experiment, a three-dimensional space was chosen and each dimension in the resulting three-dimensional representation was then investigated in terms of the acoustic differences between the instruments lying along it 'to explore the various factors which contributed to the subjective distance relationships'. Grey identified the following acoustic factors with respect to each of the three axes: (1) 'spectral energy distribution' observed as increasing high-frequency components in the spectrum; (2) 'synchronicity in the collective attacks and decays of upper harmonics' from sounds with note onsets in which all harmonics enter in close time alignment to those in which the entry of the harmonics is tapered; and (3) from sounds with 'precedent high-frequency, low-amplitude energy, most often inharmonic energy, during the attack phase' to those without high-frequency attack energy. These results serve to demonstrate that (a) useful experimental work can be and has been carried out on timbre, and (b) acoustic conclusions can be reached which fit in with other observations, for example the emphasis of Grey's axes (2) and (3) on the note onset phase.
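To give a feel for what multidimensional scaling does, the sketch below implements classical (Torgerson) scaling, which places items in a space so that the distances between them match a given table of dissimilarities. This is a simpler stand-in chosen for illustration and is not claimed to be the procedure Grey used; the function name `classical_mds` is hypothetical, and the input here is a set of synthetic points rather than listener data. Similarity ratings would first need converting to dissimilarities, for instance by subtracting each rating from the top of the scale, which is also an assumption.

```python
import numpy as np

def classical_mds(dissimilarity, n_dims=3):
    """Place items in n_dims dimensions from a symmetric dissimilarity matrix."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared dissimilarities to obtain inner products.
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_dims]          # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))  # guard tiny negatives
    return eigvecs[:, order] * scale

# Check on synthetic data: six points with known positions in three dimensions.
rng = np.random.default_rng(1)
points = rng.standard_normal((6, 3))
distances = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

embedded = classical_mds(distances, n_dims=3)
recovered = np.linalg.norm(embedded[:, None, :] - embedded[None, :, :], axis=-1)
```

The recovered configuration reproduces the distances between items but its axes are arbitrary up to rotation and reflection, which is why each axis has to be interpreted afterwards by inspecting the acoustic properties of the items lying along it.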

The sound of an acoustic musical instrument is always changing, even during the rather misleadingly so-called 'steady-state' portion of a note. This is clearly shown, for example in the waveforms and spectrograms for the violin and sung notes in Figure 5.5. Pipe organ notes are often presented as being 'steady' due to the inherent air flow regulation within the instrument, but Figure 5.1 shows that even the acoustic output from a single organ pipe has an amplitude envelope that is not particularly steady. This effect manifests itself perceptually extremely clearly when attempts are made to synthesise the sounds of musical instruments electronically and no attempt is made to vary the sound in any way during its steady state. Variation of some kind is needed during any sound in order to hold the listener's attention. The acoustic communication of new information to a listener, whether speech, music, environmental sounds or warning signals from a natural or person-made source, requires that the input signal varies in some way with time. Such variation may be of the pitch, loudness or timbre of the sound. The popularity of post-processing effects, particularly chorus (see Chapter 7), either as a feature on synthesisers themselves or as a studio effects unit, reflects this. However, whilst these can make sounds more interesting to listen to by time variation imposed by adding post-processing, such an addition rarely does anything to improve the overall naturalness of a synthesised sound.

A note from any acoustic musical instrument typically changes dynamically throughout in its pitch, loudness and timbre. Pitch and loudness have one-dimensional subjective scales from 'low' to 'high' which can be related fairly directly to physical changes which can be measured, but timbre has no such one-dimensional subjective scale. Methods have been proposed to track the dynamic nature of timbre based on the manner in which the harmonic content of a sound changes throughout. The 'tristimulus diagram' described by Pollard and Jansson (1982) is one such method in which the time course of individual notes is plotted on a triangular graph such as the example plotted in Figure 5.6. The graph is plotted based on the proportion of energy in (1) the second, third and fourth harmonics or 'mid' frequency components (Y axis); (2) the high-frequency partials, which here are the fifth and above, or 'high' frequency components (X axis); and (3) the fundamental or f0 (where X and Y tend towards zero). The corners of the plot in Figure 5.6 are marked 'mid', 'high' and 'f0' to indicate this. A point on a tristimulus diagram therefore indicates the relationship between f0, harmonics which are resolved and harmonics which are not resolved.
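The three proportions that locate a point on such a diagram can be sketched as follows. This follows the description above in terms of proportions of energy; the exact weighting used by Pollard and Jansson is not reproduced here, and the function name `tristimulus` and the example harmonic energies are hypothetical.

```python
def tristimulus(harmonic_energies):
    """Return (f0, mid, high) proportions from per-harmonic energies.

    harmonic_energies[0] is the fundamental, [1] the second harmonic, etc.
    """
    total = sum(harmonic_energies)
    if total <= 0:
        raise ValueError("total energy must be positive")
    fundamental = harmonic_energies[0] / total
    mid = sum(harmonic_energies[1:4]) / total   # harmonics 2, 3 and 4 (Y axis)
    high = sum(harmonic_energies[4:]) / total   # harmonics 5 and above (X axis)
    return fundamental, mid, high

# Hypothetical energies for two instants of a note (not measured data):
# early in the onset only low harmonics are present, later the upper ones grow.
early = tristimulus([4.0, 2.0, 1.0, 1.0])
later = tristimulus([2.0, 1.0, 1.0, 0.0, 2.0, 1.0, 1.0])
```

Because the three proportions always sum to one, two of them are enough to fix a point, which is why the diagram can be drawn on a plane with 'mid' on one axis and 'high' on the other; evaluating the function at successive instants traces the track of a note across the triangle.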

The tristimulus diagram enables the dynamic relationship between high, mid and f0 to be plotted as a line, and a number

Fig 5.6 Approximate timbre representation by means of a tristimulus diagram for note onsets of notes played on a selection of instruments. In each case the note onset tracks along the lines towards the open circle which represents the approximate steady state position. Mid represents strong mid frequency partials, high represents strong high frequency partials and f0 represents strong fundamental.


are shown in the figure for the note onset phases of notes from a selection of instruments (data from Pollard and Jansson, 1982). The time course is not even and is not calibrated here for clarity. The approximate steady-state position of each note is represented by the open circle, and the start of the note is at the other end of the line. The note onsets in these examples lasted as follows: gedackt (10-60 ms); trumpet (10-100 ms); clarinet (30-160 ms); principal (10-150 ms); and viola (10-65 ms). The track taken by each note is very different and the steady-state positions lie in different locations. Pollard and Jansson present data for additional notes on some of these instruments which suggest that each instrument maintains its approximate position on the tristimulus representation as shown in the figure. This provides a method for visualising timbral differences between instruments which is based on critical band analysis. It also provides a particular representation which gives an insight as to the nature of the patterns which could be used to represent timbral differences perceptually.

There is still much work to be done on timbre to understand it more fully. Whilst there are results and ideas which indicate what acoustic aspects of different instruments contribute to the perception of their timbre differences, such differences are far too coarse to explain how experienced listeners are able to tell apart the timbres of, for example, violins made by different makers. The importance of timbre in music performance has been realised for many hundreds of years, as manifested in the so-called 'king' of instruments, the pipe organ, well before there was any detailed knowledge of the function of the human hearing system.

5.4 The pipe organ as a timbral synthesiser

There are references to the existence of some form of pipe organ since at least 250 BC (e.g. Sumner, 1975), and it is one of the earliest forms of an acoustic timbral synthesiser based on the 'harmonic additive synthesis' principle for the production of sound. In harmonic additive synthesis, the timbre of the output sound is manipulated by means of adding harmonics together, and the stops of a pipe organ provide the means for this process.

An organ stop which has the same f0 values as on a piano (i.e. f0 for its A4 is 440 Hz; see Figure 3.21) is known as an 'eight foot' (8') rank on the manuals and a 'sixteen foot' (16') rank on the pedals, because eight and sixteen feet are the approximate lengths of open pipes of the bottom note of a manual (C2) and the pedals (C1) respectively. A 4' rank and a 2' rank would sound one and two octaves higher than an 8' rank respectively, and a 32' rank would sound one octave lower than a 16' rank. It should be noted that the footage terminology is used to denote the sounding pitch of the rank and gives no indication as to whether open or stopped pipes are employed. Thus the bottom pipe of a stopped rank on a manual sounding a pitch equivalent to a rank of 8' open pipes would be four feet long physically, but its stop knob would be labelled 8'.

Organs have a number of stops on each manual of various footages, most of which are flues. Some are voiced to be used alone as solo stops, usually as 8' stops, but the majority are voiced to blend together, allowing variations in loudness and timbre to be achieved by acoustic synthesis involving drawing different combinations of stops. The timbral changes are controlled by reinforcing the natural harmonics of the 8' harmonic series on the manuals (16' harmonic series for the pedals). The following equation relates the footage of a stop to the member of the 8' natural harmonic series which its f0 reinforces:

Stop footage = 8/N (5.1)

where N = harmonic number (1, 2, 3, ...)



Example 5.1 Find the footage of pipe organ stops which reinforce the third and sixth natural harmonics of the 8' harmonic series.

The third harmonic is reinforced by a stop of 8/3 = 2 2/3'

The sixth harmonic is reinforced by a stop of 8/6 = 1 1/3'
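The arithmetic of Equation 5.1 can be checked with a short Python sketch. The function names and the optional base footage parameter (8 for the manuals, 16 for the pedals) are illustrative choices, not part of the original text.

```python
from fractions import Fraction


def stop_footage(harmonic_number, base_footage=8):
    """Footage of the stop whose f0 reinforces the given harmonic (Equation 5.1)."""
    if harmonic_number < 1:
        raise ValueError("harmonic numbers start at 1")
    return Fraction(base_footage, harmonic_number)


def organ_label(footage):
    """Format an exact footage as it would appear on a stop knob, e.g. 2 2/3'."""
    whole, rest = divmod(footage, 1)
    if rest == 0:
        return f"{whole}'"
    if whole == 0:
        return f"{rest}'"
    return f"{whole} {rest}'"


for n in (3, 6):
    print(n, organ_label(stop_footage(n)))
```

Using exact fractions rather than floating point keeps the mixed-number labels of Example 5.1 (2 2/3' and 1 1/3') intact.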


However, it is important to note that a single 8' principal stop, the foundation tone of the organ, produces a sound which is itself rich in harmonics (see Figure 5.1). Therefore the addition of a 4' principal will enhance not only the second harmonic of the 8' stop, but also all the other even harmonics. The odd harmonics of the 8' pipe are not members of the harmonic series of the 4' pipe. In general, when a stop is added whose f0 is set to reinforce a member (n = 1, 2, 3, 4, ...) of the natural harmonic series at 8' pitch on the manuals (16' pitch on the pedals), it enhances the (2n, 3n, 4n, ...) members also. Those stops which reinforce harmonics which are not in unison (1:1) with, or a whole number of octaves (i.e. 2:1, 4:1, 8:1, ... 2^n:1) away from, the first harmonic are known as 'mutation' stops.
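The two rules in the paragraph above can be expressed as a minimal Python sketch. It assumes that harmonics are simply numbered from 1 in the 8' (or 16') series; the function names and the upper limit on the harmonic number are hypothetical.

```python
def enhanced_harmonics(n, upper_limit=16):
    """Members of the base harmonic series enhanced by a stop reinforcing harmonic n.

    A stop tuned to harmonic n also enhances harmonics 2n, 3n, 4n, ...
    """
    return list(range(n, upper_limit + 1, n))


def is_mutation(n):
    """True if harmonic n is neither the unison nor a whole number of octaves above it.

    Unison and octave harmonics are the powers of two (1, 2, 4, 8, ...),
    which are the only positive integers with a single bit set.
    """
    if n < 1:
        raise ValueError("harmonic numbers start at 1")
    return (n & (n - 1)) != 0


print(enhanced_harmonics(2, 12))                     # the 4' stop: even harmonics only
print([n for n in range(1, 9) if is_mutation(n)])    # harmonics needing mutation stops
```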

There is a basic pipe organ timbral problem when tuning the instrument to equal temperament (see Chapter 3). Stops have to be tuned in their appropriate integer frequency ratio (see Figure 3.3) to reinforce harmonics appropriately, but as a result they are not themselves in equal temperament, and they therefore introduce beats when chords are played. For example, suppose two stops are drawn, an 8' and a 2 2/3'. The 2 2/3' stop sounds an octave and a fifth above the 8' stop and reinforces the third harmonic of the 8' harmonic series, so it must be exactly in tune with the third harmonic of the 8' stop. Thus if middle C is played with these two stops drawn, the f0 of the C on the 2 2/3' rank will be exactly in tune with the third harmonic of the C on the 8' rank. If the organ is tuned in equal temperament and the G above middle C is also played to form a two-note chord, the second harmonic of the G on the 8' rank will beat with the f0 of the C on the 2 2/3' rank as well as with the third harmonic of the C on the 8' rank. Equal-tempered tuning thus colours with beats the desired effect of adding mutation stops to build up the timbre of the organ. Mutation stops therefore tended to go out of fashion with the introduction of equal-tempered tuning on pipe organs (Padgham, 1986). Recent revivals in authentic performance of early music have extended to the pipe organ, with the use of non-equal-tempered tuning systems and increased use of mutation stops. This gives new life particularly to contrapuntal music.
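The size of the beating in this example can be estimated with a short Python sketch. It assumes A4 = 440 Hz, that middle C is C4 (nine equal-tempered semitones below A4) and that the G above it is two semitones below A4; the function names are illustrative only.

```python
A4_HZ = 440.0


def equal_tempered_f0(semitones_from_a4):
    """f0 of a note a given number of equal-tempered semitones away from A4."""
    return A4_HZ * 2 ** (semitones_from_a4 / 12)


def mutation_beat_hz():
    """Beat rate between the 2 2/3' C and the second harmonic of the 8' G."""
    c4 = equal_tempered_f0(-9)     # middle C on the 8' rank
    g4 = equal_tempered_f0(-2)     # G above middle C on the 8' rank
    mutation_f0 = 3 * c4           # 2 2/3' rank tuned pure to the third harmonic of C
    g_second_harmonic = 2 * g4     # second harmonic of the equal-tempered G
    return abs(mutation_f0 - g_second_harmonic)


print(round(mutation_beat_hz(), 2), "Hz")
```

Under these assumptions the two components differ by a little under 1 Hz, which would be heard as a slow beat colouring the chord.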

 

You Need to Know

 

Pitch relates to issues such as notes on a score, key, melody, harmony, tuning systems, and intonation in performance. Loudness relates to matters such as musical dynamic (e.g. pp, p, mp, mf, f, ff, etc.) and the balance between members of a musical ensemble (e.g. between individual parts, choir and orchestra, or soloist and accompaniment). Timbre relates to sound quality; descriptions include: mellow, rich, covered, open, dull, bright, dark, strident, grating, harsh, shrill, sonorous, sombre, colourless and lacklustre. Timbral descriptors are therefore used to indicate the perceived quality or tonal nature of a sound which can also have a particular pitch and loudness.

 

The commonly quoted American National Standards Institute formal definition of timbre is: 'Timbre is that attribute of auditory sensation in terms of which a listener can judge two sounds similarly presented and having the same loudness and pitch as being dissimilar' (ANSI, 1960).

 

When considering the notes played on pitched musical instruments, timbre relates to those aspects of the note which can be varied without affecting the pitch, duration or loudness of the note as a whole, such as the spectral components present and the way in which their frequencies and amplitudes vary during the sound.

 

It is convenient, then, to consider a note in terms of three phases: the 'onset' or 'attack' (the build-up from silence at the start of the note), the 'steady state' (the main portion of the note), and the 'offset' or 'release' (the return to silence at the end of the note after the energy source is stopped). The onset and offset portions of a note tend to last for a short time, of the order of a few tens of milliseconds (a few hundredths of a second). Changes that occur during the onset and offset phases, and in particular during the onset, turn out to have a very important role in defining the timbre of a note.
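The three phases can be pictured with a deliberately simplified amplitude envelope. The Python sketch below assumes linear ramps and a perfectly flat steady state, which real notes do not have; the default durations are hypothetical values chosen only to match the 'few tens of milliseconds' scale mentioned above.

```python
def envelope(t, onset=0.03, steady=1.0, offset=0.05):
    """Amplitude (0.0 to 1.0) at time t seconds for an idealised three-phase note."""
    if t < 0:
        return 0.0
    if t < onset:                    # onset (attack): build-up from silence
        return t / onset
    if t < onset + steady:           # steady state: idealised here as constant
        return 1.0
    end = onset + steady + offset
    if t < end:                      # offset (release): return to silence
        return (end - t) / offset
    return 0.0


# Sample the envelope every 10 ms over the whole note.
samples = [envelope(i * 0.01) for i in range(110)]
```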

 

In terms of the timbre of the note, it is not only the variations that occur during the onset and offset that are of interest, but also how they change with time.

 

The term 'steady state' does not mean that no aspect of the note varies. The timbre of a principal organ stop sounds 'steady' during a prolonged note such as that plotted in Figure 5.1, which lasts for approximately 2 seconds, but it is clear from the acoustic pressure waveform that the amplitude, or 'envelope', varies even during the so-called 'steady-state' portion of this note. This is an important aspect of musical notes to be aware of when, for example, synthesising notes of musical instruments, particularly if using looping techniques on a sampling synthesiser.

 

Psychoacoustics of timbre

Psychoacoustic experiments have demonstrated that listeners cannot reliably identify musical instruments if the onset and offset phases of notes are removed. For example, if recordings of a note played on a violin open string and the same note played on a trumpet are modified to remove their onset and offset phases in each case, it becomes very difficult to tell them apart.

 

Thus, for example, the initial scraping of the bow on a stringed instrument, the consonant-like onset of a note played on a brass instrument, the breath noise of the flautist, the initial flapping of a reed, the percussive thud of a piano hammer and the final fall of the jacks of a harpsichord back onto the strings are all vital acoustic cues to the timbral identity of an instrument. Careful attention must be paid to such acoustic features, for example when synthesising acoustic musical instruments, if the resulting timbre is to sound convincingly natural to listeners.

 

Critical bands and timbre

The critical bandwidth becomes wider with increasing frequency, and the general conclusion was drawn in the section on pitch perception in Chapter 3 (Section 3.2) that no harmonic above about the fifth to seventh is resolved, no matter what the value of f0.

Harmonics below the fifth to seventh are therefore resolved separately by the hearing system (e.g. see Figure 3.11), which suggests that these harmonics might play a distinct and individual role in timbre perception.

Harmonics above the fifth or seventh, on the other hand, which are not separately isolated by the hearing system, are not likely to have such a strong individual effect on timbre perception, but could affect it as groups that lie within a particular critical band.

Instruments amongst those for which spectra have been presented that have significant amplitudes in harmonics above the fifth or seventh during their steady-state phases include organ reed stops, the tenor saxophone, the trumpet, the violin and the professional singing voice. The timbres of such instruments might be compared with those of other instruments using descriptive terms such as 'bright', 'brilliant', or 'shrill'. Instruments which do not exhibit energy in harmonics above the fifth or seventh during their steady-state phases include the principal 8', the gedackt 8', the clarinet, oboe and flute, and the trombone, French horn and tuba. In comparison with their counterpart organ stops or other instruments of their category (woodwind or brass), their timbres might be described as being 'less bright' or 'dark', 'less brilliant' or 'dull', or 'less shrill' or 'bland'.

Within this latter group of instruments there is an additional potential timbral grouping between those instruments which exhibit all harmonics up to the fifth or seventh, such as the clarinet, oboe and flute, and those which have just a few low harmonics, such as the principal 8', gedackt 8', trombone, French horn and tuba. It may come as a surprise to find the flute in the same group as the oboe and clarinet, but the lack of the seventh harmonic in the flute spectrum compared to the clarinet and oboe (see Figure 5.3) is crucial. Notes excluding the seventh harmonic sound considerably less 'reedy' than those with it; the seventh harmonic is one of the lowest which is not resolved by the hearing system (provided the sixth and/or eighth are/is also present). This last point is relevant to the clarinet, where the seventh harmonic is present but both the sixth and eighth are weak. The clarinet has a particular timbre of its own due to the dominance of the odd harmonics in its output, and it is often described as being 'nasal'.

Percussion instruments in which a bar, membrane or plate is struck to set it vibrating have a distinct timbral quality of their own. This is due to the non-harmonic relationship between the frequencies of their natural modes, which provides a clear acoustic cue to their family identity and gives this class of instruments its characteristic 'clanginess'.

Acoustic cues and timbre perception

Timbre judgements are highly subjective and therefore individualistic. Unlike pitch or loudness judgements, where listeners might be asked to rate sounds on scales of low to high or soft to loud respectively, there is no 'right' answer for timbre judgements.

 

In a multidimensional scaling study of timbre judgements, Grey identified the following acoustic factors with respect to each of the three axes of the resulting timbre space: (1) 'spectral energy distribution', observed as increasing high-frequency components in the spectrum; (2) 'synchronicity in the collective attacks and decays of upper harmonics', from sounds with note onsets in which all harmonics enter in close time alignment to those in which the entry of the harmonics is tapered; and (3) from sounds with 'precedent high-frequency, low-amplitude energy, most often inharmonic energy, during the attack phase' to those without high-frequency attack energy.

Variation of some kind is needed during any sound in order to hold the listener's attention. The acoustic communication of new information to a listener, whether speech, music, environmental sounds or warning signals from a natural or person-made source, requires that the input signal varies in some way with time. Such variation may be of the pitch, loudness or timbre of the sound. The popularity of post-processing effects, particularly chorus (see Chapter 7), either as a feature on synthesisers themselves or as a studio effects unit, reflects this. However, whilst the time variation imposed by such post-processing can make sounds more interesting to listen to, it rarely does anything to improve the overall naturalness of a synthesised sound.

The pipe organ as a timbral synthesiser

An organ stop which has the same f0 values as on a piano (i.e. f0 for its A4 is 440 Hz) is known as an 'eight foot' (8') rank on the manuals and a 'sixteen foot' (16') rank on the pedals, because eight and sixteen feet are the approximate lengths of open pipes of the bottom note of a manual (C2) and the pedals (C1) respectively. A 4' rank and a 2' rank would sound one and two octaves higher than an 8' rank respectively, and a 32' rank would sound one octave lower than a 16' rank. It should be noted that the footage terminology is used to denote the sounding pitch of the rank and gives no indication as to whether open or stopped pipes are employed. Thus the bottom pipe of a stopped rank on a manual sounding a pitch equivalent to a rank of 8' open pipes would be four feet long physically, but its stop knob would be labelled 8'. Organs have a number of stops on each manual of various footages, most of which are flues. Some are voiced to be used alone as solo stops, usually as 8' stops, but the majority are voiced to blend together, allowing variations in loudness and timbre to be achieved by acoustic synthesis involving drawing different combinations of stops. The timbral changes are controlled by reinforcing the natural harmonics of the 8' harmonic series on the manuals (16' harmonic series for the pedals).
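The idea of building a timbre by drawing stops can be sketched in Python as a sum of sinusoids. This is only an illustration of additive synthesis under stated assumptions: each rank is modelled as three harmonics with made-up relative amplitudes, and the function names are hypothetical rather than a description of how any real organ pipe behaves.

```python
import math


def rank_f0(key_f0, footage, base_footage=8):
    """Sounding f0 of a rank when a key of the given 8' pitch is pressed."""
    return key_f0 * base_footage / footage


def stop_sample(t, key_f0, footage, partial_amplitudes=(1.0, 0.5, 0.25)):
    """One waveform sample at time t (seconds) from a single rank.

    partial_amplitudes are illustrative values for the rank's own harmonics.
    """
    f0 = rank_f0(key_f0, footage)
    return sum(
        amplitude * math.sin(2 * math.pi * (k + 1) * f0 * t)
        for k, amplitude in enumerate(partial_amplitudes)
    )


def registration_sample(t, key_f0, footages):
    """Sum the samples of every stop that is drawn for the pressed key."""
    return sum(stop_sample(t, key_f0, footage) for footage in footages)


# One second of A4 at 8 kHz sampling with 8', 4' and 2 2/3' stops drawn.
wave = [registration_sample(n / 8000, 440.0, [8, 4, 8 / 3]) for n in range(8000)]
```

Changing the list of footages changes which members of the 8' harmonic series are reinforced, which is the sense in which the stops act as the controls of an additive synthesiser.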

However, it is important to note that a single 8' principal stop, the foundation tone of the organ, produces a sound which is itself rich in harmonics (see Figure 5.1). Therefore the addition of a 4' principal will enhance not only the second harmonic of the 8' stop, but also all the other even harmonics. The odd harmonics of the 8' pipe are not members of the harmonic series of the 4' pipe. In general, when a stop is added whose f0 is set to reinforce a member (n = 1, 2, 3, 4, ...) of the natural harmonic series at 8' pitch on the manuals (16' pitch on the pedals), it enhances the (2n, 3n, 4n, ...) members also. Those stops which reinforce harmonics which are not in unison (1:1) with, or a whole number of octaves (i.e. 2:1, 4:1, 8:1, ... 2^n:1) away from, the first harmonic are known as 'mutation' stops.

 

There is a basic pipe organ timbral problem when tuning the instrument to equal temperament (see Chapter 3). Stops have to be tuned in their appropriate integer frequency ratio (see Figure 3.3) to reinforce harmonics appropriately, but as a result they are not themselves in equal temperament, and they therefore introduce beats when chords are played. For example, suppose two stops are drawn, an 8' and a 2 2/3'. The 2 2/3' stop sounds an octave and a fifth above the 8' stop and reinforces the third harmonic of the 8' harmonic series, so it must be exactly in tune with the third harmonic of the 8' stop. Thus if middle C is played with these two stops drawn, the f0 of the C on the 2 2/3' rank will be exactly in tune with the third harmonic of the C on the 8' rank. If the organ is tuned in equal temperament and the G above middle C is also played to form a two-note chord, the second harmonic of the G on the 8' rank will beat with the f0 of the C on the 2 2/3' rank as well as with the third harmonic of the C on the 8' rank. Equal-tempered tuning thus colours with beats the desired effect of adding mutation stops to build up the timbre of the organ. Mutation stops therefore tended to go out of fashion with the introduction of equal-tempered tuning on pipe organs (Padgham, 1986). Recent revivals in authentic performance of early music have extended to the pipe organ, with the use of non-equal-tempered tuning systems and increased use of mutation stops. This gives new life particularly to contrapuntal music.