Vocal extraction and re-mixing
Why is suppressing the backing music more difficult than suppressing
vocals?
It is often thought, even by professionals, that if you can remove
vocals by subtracting the left and right channels (phase cancellation),
then you can also extract vocals using the same principle. A very
simple argument shows that this reasoning is in fact wrong.
Suppose we represent the backing music by M, the vocal by V, and the
left and right channels by L and R respectively.
If the vocal is centre-panned, then its left and right parts are
the same (VL = VR = V), so that we can say L = ML + V and R = MR + V.
Taking L-R or R-L cancels the V (vocal) component and leaves the
difference of ML and MR, because L-R = (ML + V) - (MR + V) = ML - MR.
Assuming ML and MR are different, 'some' music is left (side effects
such as reverb are disregarded here, for simplicity).
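The cancellation can be checked numerically. The following is a minimal sketch, assuming random noise as a stand-in for real audio; the variable names mirror the symbols used in the text (V, ML, MR, L, R), and `side` is a name introduced here for the difference signal.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Stand-in signals (assumption: white noise instead of real audio).
V = rng.standard_normal(n)   # centre-panned vocal, identical in both channels
ML = rng.standard_normal(n)  # backing music as heard in the left channel
MR = rng.standard_normal(n)  # backing music as heard in the right channel

L = ML + V
R = MR + V

# Subtracting the channels removes V and leaves the difference of the music.
side = L - R
vocal_cancelled = np.allclose(side, ML - MR)
print(vocal_cancelled)
```

Because ML and MR differ, `side` is not silent: it is the 'some' music that remains once the vocal has cancelled.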
Now, it is often claimed that subtracting the vocal-less track from a
(mono) mix with vocal will similarly cancel the music and leave
the vocal. This is not true.
Let's say the vocal-less track is L-R, as above. A mono (sum) mix
is L+R. Subtracting one from the other yields 2R, since
(L+R) - (L-R) = 2R, or 2L if R-L is used as the vocal-less track.
Either way, the result is certainly not an enhanced vocal with
suppressed music.
Any variation on this, involving factors of 1/2 etc., will not make
any difference: the end result is always just a linear mix of the left
and right channels, and there is no reason at all why the backing
music should be suppressed.
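The same stand-in signals make this concrete. This is a sketch under the same assumptions as before (white noise instead of audio); `combine` is a hypothetical helper introduced only to show that any weighting of the sum and difference signals collapses to a plain mix of L and R.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
V = rng.standard_normal(n)   # centre-panned vocal
ML = rng.standard_normal(n)  # music, left
MR = rng.standard_normal(n)  # music, right
L, R = ML + V, MR + V

mono = L + R   # mono (sum) mix with vocal
side = L - R   # 'vocal-less' difference track

# (L+R) - (L-R) = 2R: just the right channel, music included.
result = mono - side
is_twice_right = np.allclose(result, 2 * R)
still_has_music = np.allclose(result, 2 * MR + 2 * V)

def combine(a, b):
    """Hypothetical helper: weighted mix of the sum and difference tracks."""
    return a * mono + b * side

# a*(L+R) + b*(L-R) = (a+b)*L + (a-b)*R for any weights a and b.
a, b = 0.5, -0.25
is_linear_mix = np.allclose(combine(a, b), (a + b) * L + (a - b) * R)
print(is_twice_right, still_has_music, is_linear_mix)
```

Whatever weights are chosen, the output contains the music of whichever channels have a non-zero coefficient, at the same level relative to the vocal as in those channels.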
The point is that a vocal can be removed because it is basically
mono, and occupies a single point in space (disregarding reverb).
This is not true for the backing music, which is spatially spread
out, which is why suppressing the backing music is fundamentally
more difficult. For the same reason, reverb, which spreads the vocal
in space, is well known to limit how successfully vocals can be
removed.
Another approach sometimes found is to derive a spectral mask from the
vocal-less track, which is then used to reduce the backing music by
spectral subtraction (as in noise reduction). This is a quite different
technique. It may work occasionally and yield a very moderate reduction
of the backing music. However, it is an extremely cumbersome procedure,
and the results are usually very disappointing.
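For illustration, spectral subtraction of this kind could be sketched roughly as follows. This is not any particular product's method: `spectral_subtract`, `frame` and `alpha` are hypothetical names, and the non-overlapping rectangular frames are a simplification chosen to keep the example short (practical tools generally use overlapping, windowed frames). The signals are again white-noise stand-ins.

```python
import numpy as np

def spectral_subtract(mix, reference, frame=256, alpha=1.0):
    """Hypothetical helper: lower the magnitude spectrum of `mix` by
    `alpha` times the magnitude spectrum of `reference`, frame by frame,
    keeping the phase of `mix`."""
    n_frames = len(mix) // frame
    out = np.zeros(n_frames * frame)
    for i in range(n_frames):
        seg = slice(i * frame, (i + 1) * frame)
        X = np.fft.rfft(mix[seg])
        S = np.fft.rfft(reference[seg])
        # Subtract magnitudes and clip at zero so no bin goes negative.
        mag = np.maximum(np.abs(X) - alpha * np.abs(S), 0.0)
        Y = mag * np.exp(1j * np.angle(X))
        out[seg] = np.fft.irfft(Y, n=frame)
    return out

rng = np.random.default_rng(2)
n = 1024
V = rng.standard_normal(n)
ML = rng.standard_normal(n)
MR = rng.standard_normal(n)
L, R = ML + V, MR + V

mono = L + R   # mix with vocal
side = L - R   # vocal-less track used to build the mask

estimate = spectral_subtract(mono, side)
print(estimate.shape)
```

The mask only knows the magnitude of ML - MR, not of the ML + MR that is actually present in the mono mix, which is one way to see why the reduction of the backing music is moderate at best.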
One other approach that does sometimes work is to subtract an original
instrumental from the (exact same) track with vocal. You are then
using the fact that the 'separation' has practically been done for
you by the engineer who mixed the track with and without vocals.
Obviously, you need to have both tracks available to do this!
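A small sketch shows why the two tracks have to be the exact same: the subtraction works sample by sample, so even a one-sample offset leaves music behind. The signals are white-noise stand-ins (an assumption that exaggerates the effect of misalignment compared with real music), and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
instrumental = rng.standard_normal(n)  # original instrumental version
vocal = rng.standard_normal(n)         # vocal added by the mix engineer
full_mix = instrumental + vocal        # the same track with vocal

# Perfectly matched tracks: the music cancels and only the vocal remains.
extracted = full_mix - instrumental
exact = np.allclose(extracted, vocal)

# The instrumental shifted by one sample: the music no longer cancels.
shifted = np.roll(instrumental, 1)
residual = (full_mix - shifted) - vocal
misaligned_leak = np.sqrt(np.mean(residual ** 2))  # RMS of leftover music
print(exact, misaligned_leak > 0)
```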
If your efforts to extract a vocal that you really want fail, we
can help.
Now, as CSP's vocal removal does not rely on the cancellation
principle, and leaves the stereo image intact, CSP can also perform
vocal enhancement or extraction. Though this is still more
difficult than vocal suppression, on most stereo material a significant
reduction of the backing music can be obtained with little effect on
the vocal itself.
In most cases, we can make 'a-capellas' that make re-mixing a lot
easier!