The Fine Art of Mickey-Mousing
Notes on Music and Film (3)
A car is shown in drone’s-eye view on the highway. Diving down and zooming in, the propellered camera briefly hovers close to the vehicle. Through the window we see a couple arguing. At least, that’s what we guess they’re doing, watching their hands and faces; all we hear is highway noise. Then, miraculously and suddenly, we’re inside on the back seat. Now we hear what it’s all about: …
Close coordination of sound and image is the modicum of realism that we expect in the cinema: its eye-ear synchrony roughly mirrors that of the real world. Roughly; for particularly in the manipulation of perspective there is room for creative freedom, which may easily solidify into convention. Suppose we hear the dialogue up close right from the start, with the camera still high up in the air. The split audiovisual perspective will hardly be noticed if the next shot reveals the quarrellers, as we expected.
What seems a matter of course in perception, the synchrony of hearing and seeing, is not a straightforward matter technically. The specific requirements of photography and sound recording are not easily reconciled. Just think of The Dueling Cavalier, that movie-within-the-movie in Singin’ in the Rain, as a demonstration of what may go wrong when sound is added in a silent movie studio. It may remind us of the brief moment in film history when synchronized sound was an option, but not a matter of course.

“Audiovisual Counterpoint”
In their famous Statement on Sound (1928), the Soviet film directors Eisenstein, Pudovkin and Alexandrov expressed their concern that theatrical, synchronized dialogue would destroy what they called “the culture of montage”: the freedom to create new meanings through unexpected combinations of diverse elements. Freedom from realistic synchrony should allow for “an orchestral counterpoint of visual and aural images” (Buhler 2018: 26, 27).
The musical term ‘counterpoint’, the combination of two or more distinct musical voices in orderly harmonic progressions, is here applied to contrasting messages in two different media. But the significance of harmonious ‘voices’ tends to get lost in that transfer. In practice, what it amounts to is mostly contrast (or ‘dissonance’, to stay with musical terms, as Eisenstein et al. in fact speak of ‘discord’; Chion 2019: 36, Buhler 2018: 27). Typical are instances of ironic juxtaposition, as when Shostakovich “scores a scene in which the heroine sobs out her agony to a party official with light-hearted, percussive music…” (Kalinak 2010: 59).
There is, evidently, a profound difference between the cinematic functions of sound – ‘sound effects’ – and those of music. While sound generally is the product of things happening on- or offscreen, audiences are quite used to accepting the presence of music that has no clear origin in the plot – the cinematic equivalent of theatrical pit music. In opera and film, music is supposed to bear some significant relation to visible events; but that relation may be of many kinds. For the spectator it remains a continuous guessing game, played mostly subconsciously, and often with no precise answers.

Breaking the Noise Barrier
Even though profound, the distinction between sound effect and music is not rigid, and when music meets the image it may easily collapse. This is bound to happen when music is the exclusive sound world, as is the case in opera. It implies that anything that makes a sound does so musically (or should). Primarily, that is the human voice: speaking becomes singing. But the principle extends to all sounds. From clashing swords to waterfalls – when a sound is called for, it is provided by the orchestra, in a ‘musicalized’ form. In opera, unscored stage sounds are musically destructive. But, more interestingly: when music takes control of the total sound world, it may let us ‘hear’ events that in reality are noiseless, such as human gestures, and above all, emotions.
It is in the early animated cartoon that sound effects are most consistently drawn into the sphere of music, and vice versa. Cartoon music continually breaks through the noise barrier, the fine line that separates ordinary sounds from music. Following the action, all the sounds – all the wheezes, boings, whooshes and splashes – are made part of the music, as in comic books they are drawn into language.
Precisely this, according to Adorno and Eisler (Composing for the Films, 1947), is the source of the peculiar musical humour (Witz) of the cartoon. Music is reduced, for a moment, to sound object. One might call it ‘the objectification gimmick’.
For Adorno and Eisler, this is the manifestation of a broader cultural phenomenon: “the idea of technification”, which in the animated cartoon “has most deeply penetrated into the function of music” (2006: 133, my transl.). It is in itself already “somewhat comical” when music – ‘live’ by nature, something that exists only in the act of making music – is commodified, transformed into an endlessly reproducible object.
That part of the argument will probably be lost on present-day audiences. Technification has progressed so much further that the old cartoons may sound and look nostalgically theatrical, while ‘real’ music is at risk of becoming a vanishing rarity. Even so, the Witz or gimmick remains effective: the fact that music may break the noise barrier without harm to its integrity.
It is not surprising, maybe, that the early cartoon shuns dialogue, but is full of on-stage music, music that is created by the characters themselves. On-stage (or ‘diegetic’) on-screen music is at the same time music (evidently), and sound – because as such it belongs to the world that is represented. Music-making is in these cartoons a grim obsession: Mickey’s whistling, drumming and tooting in Steamboat Willie (1928) is as fanatic as his grin. He even turns his fellow animals into helpless musical instruments: a new twist to the objectification gimmick.
Rarely, maybe, have music and cruelty been such close companions.

“Harmful Duplication”
The term mickey-mousing soon became standard for close synchronization of music and screen events. It carried a pejorative meaning almost from the start (Goldmark 2013: 230); but that had little to do with animal ethics. One reason is that in close synchronization music becomes too noticeable. It may even seem to dominate the image – something unacceptable in classical Hollywood screen drama.
In the context of musical aesthetics, the practice runs up against a concern that upgrading the sound effect to music implies a degradation of music to mere sound. The shock when music is brought down to earth, reduced to its physical characteristics, may actually be profound. It is not only an aesthetic shock (art versus real life), but also a cognitive one: recognizing sound as music implies qualities that lie well beyond the physical domain. (Imagine being told by your beloved: what a fine body you are!)
The practice also clashes with the idea that the audiovisual media should have their own integrity. According to Adorno and Eisler, music should not double what is already in the image: “harmful duplication” (schädliche Verdopplung) may be the result when music takes on an “illustrative” function (2006: 19). This is particularly reprehensible when those musical illustrations have degraded to cliché signals: the “Aha, nature” experience. (Aha, Romans/American Indians. Aha, jolly Irish.)
The alternative to such illustration is, again, ‘dramaturgical counterpoint’. The term has an aura of the learned, complex, and abstract. But it is not easy to imagine instances in which music and image relate as quasi-independent ‘voices’. Perhaps the opening scene described above counts as a contrapuntal moment: image (car on the highway) and sound (couple quarrelling) are synchronous in the story line, but represent different perspectives.
Most often, however, counterpoint boils down to contrast. And counterpoint in this sense may become cliché as easily as ‘illustration’. Just think of one of the worst ‘contrapuntal’ clichés in contemporary cinema: elegiac music accompanying scenes of extreme violence.
It has not escaped notice that Hanns Eisler’s own film scores may blatantly contradict the principle, even in the examples highlighted in Composing for the Films. His music for Joris Ivens’ silent documentary film Rain (Regen, 1929), composed in 1941, has been called “unremittingly descriptive” (Cook 200: 64). The wind is here rendered (“reproduced”, “translated”) by violin trills; shaking tree branches by a phrase in the piano; rain drops by second intervals; and a downpour by a tremolo (Adorno and Eisler 2006: 110-111).
Eisler’s score has a second life away from the screen as a chamber music piece titled 14 Ways of Describing Rain (Vierzehn Arten den Regen zu beschreiben). If this really is what it is, a description, it must have been even more duplicative as film music than any ‘illustration’. What is the point in describing what is visibly there?
Eisler’s 14 Ways might be called modernist mickey-mousing; and one may wonder what’s wrong with that. As an essay in advanced film music aesthetics, however, it is curiously anachronistic: the film Rain belongs to the silent era, when music had the privilege of not adding to, but being the sound world.
(Ivens’ Rain was originally released without sound; the Dutch composer-writer Lou Lichtveld wrote the first score in 1932.)

Tom-and-Jerrying
Mickey-mousing may have a reputation for being vulgar, crude and simplistic, but it becomes truly interesting when we start asking how exactly music relates to the worlds of sound and vision. In other words, when we consciously play the guessing game: why music?
Take, for instance, The Invisible Mouse, a 1947 Tom and Jerry cartoon with music by Scott Bradley.
About halfway through the film we see a matchbook that seems to be dancing in the air, held (as we know) by the invisible mouse. The tauntingly good-humoured march tune that accompanies its movements supposedly expresses the mouse’s mood and intention. When a match seems to bend all by itself and then breaks away from the carton, the tune ‘bends down’ with it in what must be an auditory manifestation of the visible movement – even though in reality the bending is noiseless. It is an auditory image (‘illustration’?) of bending, rather than a bending noise. The match is struck with a realistic whoosh, but when the flames break out, an orchestral trill presents an auditory ‘flickering’ that again has no plausible realistic origin. It may also express excitement; but whose could it be? The cat is still blissfully asleep; the mouse is out of the frame; maybe it’s ours?
When the fire bell rings, it is, we must assume, an internal alarm in the cat. But when the piano is played by invisible hands (or feet), it is ‘on-stage’ music. In this way, from moment to moment the music displays an amazing spectrum of functions. Most remarkable is that we grasp all this intuitively, without a moment’s puzzlement.
That it can all be seamlessly integrated into the musical continuity is above all due to the steady pulse which reigns over the rhythm. The images have been timed to a clear beat, as a dance, in the first stage of design. We may easily follow the beat even when we silence the music.