Earlier this year, I had a desire to capture time-lapse video of some construction that would take an unknown amount of time, occurring mostly in short bursts of activity separated by idle intervals of varying length. Because the overall recording time was unknown, this presented an interesting set of challenges that I wrote some software to address, using a Raspberry Pi and a USB webcam. (See the end of this article for the complete source.)
There is no shortage of articles around the web describing how to capture time lapse videos with a Raspberry Pi, such as Caroline Dunn’s article in Tom’s Hardware. I find that most of these involve some kind of script to take still images at intervals, and another component (usually using ffmpeg) to combine those images into a video.
Although this approach works fine, it suffers from several obvious shortcomings:
- Frames stored as individual image files tend to require much more space than an equivalent video file, since each frame is usually very similar to the preceding ones: video codecs are designed to take advantage of this redundancy.
- Captured images and generated videos must be manually retrieved (or deleted) on capture completion. For long captures or when generating multiple sequences, this may require regular human intervention to ensure the system is still running and storage is still available.
Since the events I wished to capture would take an unknown amount of time (possibly months), it was important to me that the system be reliable, not require regular manual maintenance, and not use more storage space than necessary. To that end, I chose to write my own set of scripts, described in the rest of this article.
The hardware I had available amounted to a Raspberry Pi 2 (and a USB WiFi adapter) with 32GB microSD card as boot volume, a Logitech C925e USB webcam, and extension cords and power supply suitable to place the system wherever I needed. The Pi 2 is a rather dated and slow machine by now, but ought to be sufficient for this application.
The Logitech C925e advertises support for 1080p video at 30 frames per second, which is a much higher framerate than I need for time-lapse video capture (around one frame per second seems fine) but a respectable resolution. Investigating the camera’s actual capabilities shows it’s actually capable of higher resolution than that:
```
$ ffmpeg -f v4l2 -list_formats all -i /dev/video0
[video4linux2,v4l2 @ 0x1083cd0] Raw       : yuyv422 : YUYV 4:2:2 : 640x480 160x90 160x120 176x144 320x180 320x240 352x288 432x240 640x360 800x448 800x600 864x480 960x720 1024x576 1280x720 1600x896 1920x1080 2304x1296 2304x1536
[video4linux2,v4l2 @ 0x1083cd0] Compressed: mjpeg : Motion-JPEG : 640x480 160x90 160x120 176x144 320x180 320x240 352x288 432x240 640x360 800x448 800x600 864x480 960x720 1024x576 1280x720 1600x896 1920x1080
```
This output indicates that the camera supports output as either raw video or Motion JPEG in a variety of resolutions up to 1920x1080 pixels. Raw video can also be output at higher resolutions up to 2304x1536, but at a lower framerate (around 12 fps, it turns out). Since I didn’t care about any framerate higher than about 1 fps, it was a logical choice to run at that higher resolution.
The basic use of ffmpeg for this application uses a V4L2 device as input (the webcam) and outputs at a very low framerate to a file:

```
ffmpeg -f v4l2 -video_size 2304x1536 -i /dev/video0 \
    -vf fps=0.1 -t 60 timelapse.mkv
```

In this instance I've captured video at one frame per 10 seconds (`-vf fps=0.1`) and chosen to capture one minute of video (`-t 60`). In the final script these are configurable, but this illustrates the concept nicely.
Although the required framerate for video encoding is low in this application, video encode performance remains a concern because the Raspberry Pi 2 is not a very powerful computer. I wanted to use a video codec that achieves good compression, and it needed to do so with enough speed that frames can be encoded at least as frequently as they are captured.
Some versions of the Raspberry Pi support hardware-accelerated H.264 encoding (in a quick search, it’s unclear if that includes the Pi 2), but I didn’t try to make that work: there doesn’t seem to be support for the relevant hardware in ffmpeg so I would have needed to use some other software to do the encoding, and it’s unclear what the hardware encoder’s limitations are (such as maximum resolution). I instead did some manual experimentation by doing a live video capture with assorted codecs:
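That experimentation can be sketched roughly as follows; the codec list and 30-second duration here are illustrative, not the exact commands I ran:

```
# Capture a short live clip with each candidate codec, then compare
# the achieved framerate (from ffmpeg's progress output) and file size
for codec in libx264 libvpx libvpx-vp9 libx265; do
    ffmpeg -f v4l2 -video_size 2304x1536 -i /dev/video0 \
        -an -c:v "$codec" -t 30 "bench-$codec.mkv"
done
```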
VP9 and H.265 are attractive choices because they are very good at efficiently compressing video, but I found their performance too poor to be usable for this application (achieving less than 0.1 frames per second at the target resolution). Both x264 and VP8 perform acceptably and achieve similar compression, so I opted to use VP8 since it's not legally encumbered by patent licensing requirements (unlike H.264).
I somewhat arbitrarily chose a maximum bitrate of 20 megabits per second and a CRF of 4 to achieve a high-quality encode, and ended up with this command:

```
ffmpeg -f v4l2 -video_size 2304x1536 -i /dev/video0 \
    -an -c:v libvpx -b:v 20M -crf 4 \
    -vf fps=0.1 -t 60 timelapse.mkv
```
With a way to use ffmpeg to capture time-lapse video directly (no intermediate image files!), we can move to thinking about where video will be stored.
As mentioned earlier, I only have a 32GB SD card at hand for this Raspberry Pi, and I don’t trust it not to corrupt data without warning (both the Pi itself and the card; I don’t really trust either with my data) so it seemed important to ensure that the Pi’s local storage would not fill up with video and prevent further recording, as well as to copy data off the Pi shortly after its creation. I chose to address this by having the system upload video to Google Cloud Storage (GCS). There’s nothing particularly special about GCS over one of the many other object storage systems available from many different service providers; it was just convenient for me to use GCS.
I didn't want to periodically clean up video (by deleting old files) in case of an upload failure, and it's useful to get more frequent feedback on how video capture is going in the form of segments that can be viewed immediately. I therefore wanted the system to incrementally upload video as it is captured, rather than uploading larger chunks at long intervals (say, every day). Incremental upload also helps reduce the bandwidth needs of the system, since the total data transfer is spread over a longer interval.
As discussed above, by capturing video at a low frame rate rather than individual images at the same rate we can save storage space, improve picture quality, or possibly both. Incrementally uploading video files is somewhat more challenging than handling a similar collection of still images however, since still images have a convenient 1:1 relation to the captured frames (so it’s easy to assume that a file’s existence implies a complete frame) whereas video files become larger over time as frames are added.
To incrementally upload video, we need to choose a container that remains valid when more data is appended to it, then design a method to efficiently append new data to what’s already present in remote storage.
ISO MPEG-4 containers (`.mp4` files) are a common choice for video files and are supported by most software. However, this container is not well-suited to this application because by default some metadata (the `MOOV` atom) gets placed at the end of the file. ffmpeg can put that at the beginning of a file to make it "streamable" using the `-movflags faststart` option, but that doesn't really solve the live capture problem because the metadata stored in the `MOOV` atom needs to be derived from the entire encoded file: ffmpeg implements `faststart` simply by outputting a file, then moving the `MOOV` block to the beginning and copying the rest of the file to follow it. Since this requires the entire input to be available first, it is not appropriate for a pseudo-live stream.
The Matroska container (`.mkv`, also conventionally used for `.webm`) on the other hand turns out to work well for this application: I found that ffmpeg does update some headers at the beginning of a Matroska file when it stops encoding (similar to what it does for mp4 fast start), but the fields that get populated are not required to decode the video: they appear to only contain things like the total video length, which decoders do not require. In some experiments, I found that other programs were happy to play back a Matroska video that I had copied while ffmpeg was still encoding it, even when capturing live video without a defined duration; they simply stopped playback on reaching the end of the data. In some situations players failed to report the overall video length or showed a wrong duration when asked to decode such a truncated file, but they were still able to play back everything that was present.
Recalling that I chose to use Google Cloud Storage to store captured video, `gsutil` is a convenient way to interface with the object storage system from shell scripts. Since `ffmpeg` is also easily driven from a shell script, the default choice for implementing the entire system was also shell, rather than some other (perhaps less quirky) programming language.
To implement incremental upload of files, the general algorithm for copying a ‘source’ file on the local system to a ‘destination’ file on remote storage can be expressed as follows (assuming, as we established in the previous section, that files are only appended to):
- Check whether the destination file exists
- If not, copy the entire source file to the destination file and exit
- Get the size of the destination file
- If it is the same size as the source file, do nothing and exit
- Append bytes from the source file, starting at an offset equal to the destination file's size, to the destination file
Somewhat problematically, in most object storage systems like Google Cloud Storage, “objects” (files in our abstraction) are immutable: it is not possible to modify an object in place. Making changes to an existing object will then usually involve making a copy of the object with the changes applied, and doing so is most obviously implemented by downloading the original and making changes, then uploading the changed version (possibly replacing the original object).
It should be obvious that appending to a file stored on GCS by downloading it and re-uploading doesn't achieve the goal of incremental upload, since in that case we could simply upload the entire local file. Fortunately, it's possible to "compose" an object from multiple pieces: given two objects, `gsutil compose` can be used to concatenate them into a single object without making a copy of either. With that operation, appending to a file for incremental upload is simply a matter of uploading the new data as a new object, then performing a `compose` operation to add it to the original object.
The following shell script implements this incremental upload; I call it `gcs-incremental`. When passed the path to a local file and a location on Cloud Storage, it implements the algorithm described above.
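Reconstructed as a sketch from the description in this article (the exact argument handling, size extraction, and temporary-object naming are assumptions):

```
#!/bin/sh
# gcs-incremental (sketch): incrementally upload a local, append-only file
# to Google Cloud Storage.
# Usage: gcs-incremental <local-file> gs://<bucket>/<object>
set -e

SRC="$1"
DST="$2"

src_size=$(stat -c %s "$SRC")

# gsutil stat prints properties in an HTTP-header-like format; extract
# the Content-Length value. Empty if the object does not exist yet.
dst_size=$(gsutil -q stat "$DST" 2>/dev/null \
    | sed -n 's/^ *Content-Length: *\([0-9]*\).*/\1/p' || true)

if [ -z "$dst_size" ]; then
    # Destination doesn't exist: upload the entire file
    gsutil cp "$SRC" "$DST"
elif [ "$dst_size" -lt "$src_size" ]; then
    # Upload only the new bytes as a temporary object, then compose it
    # onto the original. The .part suffix is assumed to be unique;
    # concurrent uploaders of the same file would conflict.
    dd if="$SRC" bs=1M iflag=skip_bytes skip="$dst_size" \
        | gsutil cp - "$DST.part1"
    gsutil compose "$DST" "$DST.part1" "$DST"
    gsutil rm "$DST.part1"
fi

# Punch a hole over the uploaded data, freeing local disk space while
# keeping the file's apparent size unchanged.
fallocate --punch-hole --keep-size --offset 0 --length "$src_size" "$SRC"
```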
There are several aspects of this implementation worth noting:
- Getting the size of a file on GCS is slightly tricky because `gsutil stat` prints file properties in a format similar to HTTP headers. It's not too hard to extract a number from that output with a little text processing.
- `gsutil` doesn't provide a way to select only part of a file to upload, so we use `dd` to read part of the file and pipe the data to `gsutil`. Use of a pipe prevents parallelization of the upload, so performance is somewhat limited.
- In order to compose the old and new file parts, we need to write a temporary file. This script assumes that a suffix of `.part` and an integer is sufficiently unique to avoid potential conflicts, but it would probably misbehave if multiple uploaders were trying to update the same file.
- After a chunk of a file is uploaded, that data is erased from the local disk by using `fallocate` to punch a hole in the file, replacing all of the data that's been uploaded with zeroes and freeing any disk space it used.
The hole-punching in the source file is what allows the overall time-lapse capture system to assume that the amount of storage available is not a concern. Files that have been uploaded will remain on disk but consume essentially no space while retaining their original size, making it easy to tell whether a file has been uploaded in its entirety even after the fact. Failed uploads may cause increased disk usage (because holes will not be punched), but no loss of data so they can be retried later.
Having implemented mechanisms to capture video (easily done with `ffmpeg`) and to incrementally upload those videos to GCS (`gcs-incremental`), what remains is to combine them into something that will capture video and incrementally upload it.
Putting everything together into one script, the intended function can be summarized as follows:
- Capture video with ffmpeg for a chosen duration: in a loop, run `ffmpeg` and emit to a file with a chosen name, with capture duration set to the time remaining
- Periodically do an incremental upload of captured video files, stopping once video capture ends
- After the specified time has elapsed, ensure all videos are uploaded and exit
The video capture itself is fairly easy to write. Assuming a few variables specifying things like how long to capture for, what video device to use as input, and what framerate to output, I ended up with these shell functions:
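A sketch of those functions, with assumed variable names (`CAPTURE_DURATION`, `VIDEO_DEVICE`, `FRAMERATE`) rather than the exact original code:

```shell
# Current UNIX time in seconds; used to compute elapsed capture time
now() {
    date +%s
}

# Capture numbered video files (1.mkv, 2.mkv, ...) in the working
# directory until CAPTURE_DURATION seconds have elapsed. Restarting
# ffmpeg in a loop means a transient error only interrupts one segment.
run_capture() {
    start=$(now)
    n=1
    while true; do
        remaining=$((start + CAPTURE_DURATION - $(now)))
        [ "$remaining" -le 0 ] && break
        ffmpeg -f v4l2 -video_size 2304x1536 -i "$VIDEO_DEVICE" \
            -an -c:v libvpx -b:v 20M -crf 4 \
            -vf "fps=$FRAMERATE" -t "$remaining" "$n.mkv"
        n=$((n + 1))
    done
}
```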
`now` provides the current UNIX time, which is convenient for computing the total amount of time capture has been running. `run_capture` assumes its working directory is appropriate for storing video and captures a sequence of files `1.mkv` and so forth. Capturing a sequence of files ensures that if some transient error occurs (perhaps if the camera is accidentally unplugged) capture will resume without overwriting any older data.
Because the script needs to also run periodic uploads, `run_capture` is started in the background and its PID is saved so its status can be polled later, in particular to check whether it has completed and exited.
Periodically doing an incremental upload is slightly more interesting, because I wanted to upload video parts at a regular cadence without any particular dependence on how long the upload takes. A naive version might implement a simple algorithm:
- Wait for the chosen upload interval
- Do an incremental upload
- If capture is still running, go to step 1
If it takes any meaningful amount of time to perform the upload however, the uploads will be separated by the chosen interval and occur less frequently than intended. This is probably actually fine, but I chose to get clever with it to achieve a regular cadence.
If uploads should occur at intervals without regard for how long they take to complete, this implies there must be two concurrent processes: one that waits for intervals to expire (the "timer" task) and another that actually does the upload (the "uploader" task). This becomes tricky when we recall that `gcs-incremental` cannot be expected to work correctly if invoked in parallel, so there must be some mechanism to synchronize uploads, both between incremental uploads and with the final upload that runs once capture completes.
A reasonably simple approach to this problem in more capable (than shell scripting) programming languages is to use a multi-producer queue: the uploader task can pull upload jobs out of a queue and execute them serially, while the timer and capture tasks place new jobs into the queue as appropriate (at intervals and once capture completes, respectively).
In a shell script, I realized it’s possible to implement a queue with a pipe: if the receiver reads lines from a pipe in a loop until closed, other tasks can write lines to the pipe which will be processed in sequence. I ended up with this code for the receiver:
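A sketch of that receiver (the `msgpipe` and `pipe_holder` names come from the discussion below; the loop body and `GCS_PATH` are assumptions):

```
mkfifo msgpipe

# Uploader task: process one message at a time, uploading every video
# file in the working directory for each message received.
while read n; do
    for f in *.mkv; do
        gcs-incremental "$f" "$GCS_PATH/$f"
    done
done < msgpipe &
uploader_pid=$!

# Keep a writer attached to the pipe so the reader doesn't see
# end-of-file as soon as the first writer disconnects.
sleep infinity > msgpipe &
pipe_holder=$!

# Queue an upload of all current files by writing a line to the pipe
trigger_upload() {
    echo upload > msgpipe
}
```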
We create a named pipe `msgpipe` and direct data from that pipe to the input of a loop: the `while read n` loop reads lines from its input and stops once the input is closed, running `gcs-incremental` over all of the video files in its working directory.1 Invoking the `trigger_upload` function will queue uploads of all current files.
`pipe_holder` is an unusual component that is required only as a result of POSIX named pipe semantics: the read end of a pipe sees end-of-file once all writers disconnect, which in this application would happen after the first segment upload is triggered, because `trigger_upload` opens the pipe, writes to it and closes it again. The presence of the `pipe_holder` keeps a writer connected, preventing the uploader task from exiting until the `pipe_holder` itself exits.
Because it may be possible for concurrent writes to the pipe to be accidentally interleaved, I forced invocations of `trigger_upload` to be serialized through the use of signals:
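In sketch form (the `UPLOAD_INTERVAL` variable and exact loop structure are assumptions):

```
# Waker task: signal the main script at the chosen upload cadence
while true; do
    sleep "$UPLOAD_INTERVAL"
    kill -USR1 $$
done &
waker_pid=$!

# Run trigger_upload whenever SIGUSR1 arrives; the system guarantees
# only one handler runs at a time, serializing writes to the pipe.
trap trigger_upload USR1

# Sleep in a way a signal can interrupt: wait on a background sleep,
# since wait (unlike a foreground sleep) returns when a trap fires.
interruptible_sleep() {
    sleep "$1" &
    wait $!
}
```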
The loop represented by `waker_pid` simply waits at intervals and sends `SIGUSR1` to the main script. We `trap` that signal and in response execute `trigger_upload`, which writes to the pipe. Reception of this signal can interrupt a `wait`, as embodied in the `interruptible_sleep` function, and asynchronously trigger an upload, but it is guaranteed by the system (through the general semantics of signals) that only one handler will execute at a time.
The final piece is to wait for capture to complete and trigger one final upload. Since the PID of the capture task was stored, this is as simple as `wait`ing on it in a loop:
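Roughly like so (assuming the capture task's PID was saved in `capture_pid`):

```
wait_status=128
while [ "$wait_status" -ge 128 ]; do
    # wait returns 128 or greater when interrupted by a signal,
    # rather than when the awaited process has actually exited
    wait "$capture_pid"
    wait_status=$?
done
```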
The dance with `wait_status` here is required because `wait` can be interrupted by a signal, and in fact we expect it to be periodically interrupted by `SIGUSR1` when we want to trigger an upload (as described in the previous section). In this situation the return code of `wait` is documented to be 128 or greater, so only when the return code is less than 128 is the capture task known to have exited.
Once the capture task exits, all that remains is to clean up and ensure all data has been uploaded:
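The cleanup might be sketched as follows (PIDs as saved in the earlier fragments; the timeout loop is an assumed implementation of the one-hour limit):

```
# Stop the waker so no further uploads are triggered, then queue one
# final upload that is guaranteed to see all captured data
kill "$waker_pid"
trigger_upload

# Close the pipe; the uploader exits after draining remaining messages
kill "$pipe_holder"

# Wait up to an hour for the uploader to finish before giving up
deadline=$(( $(date +%s) + 3600 ))
while kill -0 "$uploader_pid" 2>/dev/null; do
    [ "$(date +%s)" -ge "$deadline" ] && break
    sleep 10
done
```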
We first terminate the waker task to prevent any more uploads from being triggered, and trigger a final upload. Since the capture task has exited by this point, this upload is guaranteed to see all the data that will ever exist.
After triggering the upload we kill the `pipe_holder`, closing `msgpipe`, which will make the uploader exit once it processes everything remaining in the pipe. To avoid waiting forever if there's a problem while uploading, I chose to wait only up to an hour for it to complete before exiting.
As discussed earlier, there are a few variables set at the top of the script guiding script operation. These mostly configure how video should be captured, but also specify the location in GCS for data storage:
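Those variables might look something like this; the names and defaults here are illustrative, based on the values discussed earlier:

```shell
VIDEO_DEVICE=/dev/video0   # V4L2 device to capture from
VIDEO_SIZE=2304x1536       # capture resolution
FRAMERATE=0.1              # output frames per second
CAPTURE_DURATION=3600      # total capture time, in seconds
UPLOAD_INTERVAL=600        # seconds between incremental uploads
GCS_PATH=gs://my-timelapse-bucket/video   # remote storage location
```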
Because it's useful to be able to change these without modifying the script, I opted to make it take paths on the command line indicating files that `timelapser` will execute during startup. `source`ing configuration files permits arbitrary configuration to be written easily and doesn't require any special parsing, which is convenient. The configuration I ended up using for actual video capture looks like this:
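Reconstructed as a sketch (variable names and the bucket are assumptions):

```shell
# 9-hour capture at 0.1 fps, uploading segments every 10 minutes
CAPTURE_DURATION=$((9 * 60 * 60))
FRAMERATE=0.1
UPLOAD_INTERVAL=600

# Capture into (and upload to) a directory named for the start time,
# so segments can be correlated with when they were captured
START_TIME=$(date +%Y-%m-%dT%H%M)
mkdir -p "$START_TIME"
cd "$START_TIME"
GCS_PATH="gs://my-timelapse-bucket/$START_TIME"
```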
This runs captures for 9 hours at 0.1 fps, uploading video segments every 10 minutes. Because video is captured to the working directory, the configuration ensures that a directory named for the current time is created to store video, and the same directory name is used in the remote storage. Using a directory name based on the start time allows video segments to be easily correlated with when they were actually captured.
To run the system automatically, I set up some systemd units that will capture a video every day during working hours. A timer that triggers on weekdays:
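The timer might look like this (the exact schedule in my setup may have differed):

```
[Unit]
Description=Capture daily time-lapse video

[Timer]
OnCalendar=Mon..Fri 08:00

[Install]
WantedBy=timers.target
```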
And the matching service that is started by the timer:
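Along these lines (paths and user setup are assumptions):

```
[Unit]
Description=Time-lapse video capture
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
User=timelapser
WorkingDirectory=/home/timelapser
ExecStart=/usr/bin/timelapser /etc/timelapser.conf
```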
The service runs the script as its own user, which is not strictly required but is convenient for confining the effects of video capture to a well-defined space (mostly that user's home directory).
I wanted to make it easy to deploy and manage the scripts, in particular to more easily handle changes and deploy to a fresh Pi in the future should I desire. The easy approach to deployment is to simply copy the scripts to the machine that runs them, but it's somewhat easier at deployment time to use the OS package manager. Since Debian derivatives are usually used on Raspberry Pis, I spent some time learning how to create Debian packages and constructed a `timelapser` package containing the scripts and configuration needed to run this system.
The package does the following:
- Install the systemd units (timer and service)
- Install the sample configuration file
It does not currently create the user or attempt to configure `gsutil`, so post-install operations should probably include:

- Create the user and allow access to video devices: `adduser timelapser --ingroup video` (optionally choosing a non-default location for the home directory and so forth)
- Log in to a Google cloud account for storage access: `sudo -u timelapser gcloud auth login`
- Edit `/etc/timelapser.conf` to configure video capture options and storage location
- If a bucket does not already exist, create a bucket to store uploaded video
To change the time at which video is captured, a systemd override for `timelapser.timer` can be used to replace the provided schedule.
Finally, the usual `systemd` commands can be used to start automatically running capture on a schedule:

```
systemctl enable --now timelapser.timer
```

Or to run it once and then stop (useful for testing configuration):

```
systemctl start timelapser.service
```
Congratulations on making it to the end of this article. Here is the result all in one place, much easier to use as-is than trying to reassemble it from the code fragments in the article.

- timelapser_1.0.tar.xz: complete code and packaging information, buildable with the standard Debian packaging tools
- gitlab.com/taricorp/timelapser: at the time of this writing, the same as the above source tarball, hosted on Gitlab. Easier to browse and may receive some updates.
Although I’m happy with how this system works, some additional work is called for once a complete time-lapse has been captured.
Because videos are captured such that they play back in real-time (with video duration being equal to the original amount of time over which it was captured), I first combine all the video segments for each day and add a time readout:
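That step can be sketched as a single ffmpeg invocation over the concat demuxer (the list-file name and overlay position are illustrative):

```
# segments.txt lists the day's files in order, one per line:
#   file '/path/to/1.mkv'
ffmpeg -f concat -safe 0 -i segments.txt \
    -vf "drawtext=text='%{pts\:hms}':x=(w-text_w)/2:y=10,setpts=PTS/600" \
    -an day.webm
```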
The `concat` input format allows days where video capture was interrupted and resumed later (writing to another file) to still result in a single video for the entire day. The `drawtext` filter applied via the `-vf` option takes the timestamp of each frame (starting at 0 at the beginning of the video), formats it as hours, minutes and seconds, then overlays the text at the top-middle of the video. `setpts` then takes the same timestamp and divides it by 600 (setting the output frame's timestamp to that new value), so the video plays back 600 times faster than the input.
Since there are sometimes long stretches of "nothing", it's useful to do further filtering of each day's video to drop frames where there's very little change, combining the videos for all days into a single longer video; this again uses the `concat` input format to ffmpeg, with a different set of filters:
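In sketch form, again over a concat list of the per-day videos:

```
# days.txt lists each day's combined video, one per line:
#   file '/path/to/day1.webm'
ffmpeg -f concat -safe 0 -i days.txt \
    -vf "select='gt(scene,0.02)',setpts=N/(30*TB)" \
    -an final.webm
```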
The `select` filter will use or discard frames, and `gt(scene,0.02)` will only select those frames that differ from the previous frame by more than two percent (according to some unspecified `scene` metric).2 To speed the resultant video up further, `setpts=N/(30*TB)` increases speed by a further 30 times.
Compared to the use of `PTS` in the earlier example, `N` (the output frame number) is used here because input frames are dropped: `PTS` is based on the input frame's time, so if `PTS` were used then the time filled by dropped frames would still exist in the output video while the frames themselves would not (the previous frame would continue to be shown). Since the goal of dropping similar frames is to reduce the runtime of the final video while preserving interesting activity, `N` is the better choice.
As I write this conclusion, I’ve used this set of scripts to capture two different sets of videos to good effect, each covering more than a month of real time. The results have been satisfactory, and the system has been entirely hands-off aside from initial configuration and disabling it again when I was done, which nicely fulfills the goal of having a system that requires minimal attention.
I notice while writing this that the uploader task could be made more efficient by receiving the name of a file to upload rather than a string that is otherwise ignored, which would allow it to inspect only the relevant file rather than all those that exist. ↩︎
I arrived at the 2% difference in scene metric by experiment, finding that value wasn’t too sensitive (changes in the display time didn’t cause frames to be retained, for instance) but also that it didn’t seem to drop interesting periods of action. ↩︎