Audio Breakpoints
To clear up a potential mistake, this blogpost is about breakpoints for audio files, not the ones we use for debugging code ;-). A breakpoint file forms the basis for what is often called ‘envelopes’ or ‘automation tracks’ in DAWs.
They are simple files that contain pairs of timestamp:value
data. With this simple structure, they
allow us to specify what certain properties of the sound file should look like at various points in time.
For example, this is a screenshot of an automation track I made in FL Studio to apply panning on a rendition of Africa, by Toto:
Here you can compare the two sound files (easier on headphones):
In this post we will go over the fundamentals of working with breakpoint files, and in the next post we will look at how we can change the panning code from the previous post to make use of this.
Such a pan could for example have been encoded like this:
time (sec) | value |
---|---|
0 | 1 |
5 | -1 |
10 | 0 |
13.37 | 0.55 |
As the first value encodes time, the first value has to be a strictly rising value (until we invent time travel). The second value however depends on what exactly you are ‘automating’. In the case of our left-right pan we can encode it with values ranging from -1 to 1. As we’ve seen in the previous post this will allow us to modify the samples in a suitable way for getting this effect. Do note however that our panning function is not a perfect pan yet - so we won’t get quite the same result as FL Studio does. More on that later.
Linear interpolation
As you might have noticed, the time values in our breakpoint file do not have to increment at the same time-increments as our samples. That’s great because we can keep the files smaller, but we don’t want the audio to ‘jump’ from left to right at a certain instant in time. If we encode the values:
time (sec) | value |
---|---|
0 | 1 |
5 | -1 |
We don’t mean to say “start at 1, and jump to -1 at second number 5”. What we actually want to say is “Start at 1, and gradually decrease until -1”. The tool to solve this is Linear Interpolation.
So now it’s just a matter of finding out which two values in a breakpoint file the sample we are
processing fall inbetween of. Thus, if our sample is played 2.5
seconds into the song, we find
that the value must be half way between the values at second 0
and second 5
, leaving us with a
pan of 0
(Thus, entirely balanced left-right audio). The values 0
and 5
we will call the
span
.
Recipe
(As usual, all the code can be found on Github.
Having this background information on breakpoint files, we also have a rough recipe in mind for how to work with breakpoint files. There’s a few steps our breakpoint module will need to do:
- Read a breakpoint file
- Parse strings to “Time-Value” pairs
- Given a frame time, find the span it falls between
- Use linear interpolation to find the exact value
There’ll be a few edge cases to take care off, but this is a rough outline of what we need.
Parsing breakpoint files
The first function we should implement is the function to parse an actual file to a slice of
breakpoints. We’re assuming breakpoints will be passed as a file, but we’ll actually take an
abstraction of this and just accept an io.Reader
instead.
Our breakpoint time-value pars will be encoded like time:value
. Although you could use a different
separator if you want to, it’ll just be a minor adjustment to the code.
First we can define our breakpoint type:
|
|
Hence we take some input from an io.Reader, we parse it into separate lines. Then for each line we
split on our separator (:
) and turn these values into float64
values. These get bundled up into
our Breakpoint
struct and then added to a slice called []Breakpoint
which we will return to the
user. This code could use some slice bounds checks, but this way it is easier to read for this post
|
|
Finding the correct value
The other important part for our breakpoint module is to actually return a value given a slice of breakpoints and a requested time. As mentioned earlier, this will be done with linear interpolation on the values that we find for a given span. The first edge case we need to think about: What if our data point lies after the last entry? In this case, no interpolation nees to happen and we just return the last value.
The first step will be to find the correct span. We can iterate over all Time-Value pairs until we
have exceeded the Time
part, thus knowing that the last Time we saw is the ‘start’ of our span.
|
|
With this code we have found the correct startSpan, and we also have a small guard statement in case there are no breakpoints being passed.
The first edge case can be handled here, if our startSpan
is larger than the amount of breakpoints in
the slice, we can just return the value of the last breakpoint we encountered in the loop.
|
|
Now that we have taken care of this edge case, we actually can retrieve a span. And we hit the second edge-case, what if the two times in our breakpoint are the same? Imagine this:
time(sec) | value |
---|---|
5 | 1 |
5 | -1 |
When this happens we have an instant jump. In this case we have to return the value associated with
the last entry. Thus in our example, The result would be -1
. This could be the case if the user
would want a gradual rise to the value 1
in the first 5 seconds, and then an immediate jump to
-1
. This can be detected if the ‘distance’ between our two timestamps is zero.
|
|
Finally we get past the ‘edge cases’ and end up at the interpolation part. We can finish the
function with this using the width
we calculated above:
|
|
Great, with this we are all set to start working on our first automation track!
Resources
If you liked this and want to know when I write new posts, the best way to keep up to date is by following me on twitter. ;-)