My camera (an Olympus SP-570UZ) allows me to optionally record a four-second audio clip with each photo I take. I haven’t used this feature much because I typically upload my photos to Flickr, and there’s been no good way to associate the audio with the photo. Ideally, I would like an audio player to appear below the photo, but there aren’t really any public audio-sharing websites with much longevity. And, in any case, Flickr won’t allow me to embed an audio player in my photo description.
Recently, it occurred to me that since Flickr allows short movies (up to 1:30 long), maybe I could create a single-frame movie with the still picture as the frame and the audio as the soundtrack. Then the Flickr movie player would serve as the control for the audio, and the audio and the video would stay associated with each other.
I decided to try to use ffmpeg to create the movie, since it seems to be able to do almost anything with video and audio. The command line for ffmpeg is a bit obscure, so this blog post documents about two hours of my time spent getting it to work.
My camera produces 3648×2736 JPEG images, and the audio files are WAV files containing mono, 8-bit unsigned PCM samples at an 8 kHz sample rate. I decided my goal would be to create a motion JPEG (MJPEG) encoded AVI file with maximum quality.
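To confirm what a clip actually contains before feeding it to an encoder, something like the following sketch can help. It uses Python's standard `wave` module; the function name `describe_wav` is made up for this example, and it assumes the file is a well-formed RIFF/WAVE file that the module can parse, which may not hold for every camera.

```python
import wave

def describe_wav(path):
    """Return (channels, bits_per_sample, sample_rate_hz, duration_seconds)."""
    with wave.open(path, "rb") as w:
        channels = w.getnchannels()
        bits = w.getsampwidth() * 8      # sample width is reported in bytes
        rate = w.getframerate()
        duration = w.getnframes() / rate
    return channels, bits, rate, duration
```

For a clip like the ones described above, this should report one channel, 8 bits, 8000 Hz and a duration of roughly four seconds.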
I started by searching the web to see if anyone had done this before. By studying those examples and experimenting, I came up with the following ffmpeg command line:
ffmpeg.exe -loop_input -shortest -f image2 -r 0.25 -i P910033.jpg -i P910033.wav -vcodec mjpeg -qscale 1 -t 4 foo.avi
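Since the command line is obscure, here is the same invocation written as a commented argument list. This is only a sketch of my reading of each flag, not an authoritative reference; the helper name `build_still_movie_cmd` is hypothetical, and as far as I know newer ffmpeg builds have replaced `-loop_input` with `-loop 1`, so the list may need adjusting there.

```python
def build_still_movie_cmd(image, audio, output, seconds=4):
    """Assemble the argument list for the ffmpeg invocation shown above."""
    return [
        "ffmpeg",
        "-loop_input",            # keep re-reading the single input image
        "-shortest",              # stop when the shortest input runs out
        "-f", "image2",           # read the next input as a still image
        "-r", str(1 / seconds),   # 0.25 fps: one frame covers the 4 s clip
        "-i", image,
        "-i", audio,
        "-vcodec", "mjpeg",       # motion JPEG video
        "-qscale", "1",           # lowest quantizer scale, i.e. best quality
        "-t", str(seconds),       # cap the output duration
        output,
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)`; building it as a list avoids quoting problems with file names that contain spaces.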
Most of my attempts caused ffmpeg to hang. But eventually, I got the error message below:
Duration: 00:00:04.00, start: 0.000000, bitrate: N/A
Stream #0.0: Video: mjpeg, yuvj422p, 3648x2736, 0.25 tbr, 0.25 tbn, 0.25 tbc
[wav @ 01a80050]Estimating duration from bitrate, this may be inaccurate
Input #1, wav, from 'P6060033.wav':
Duration: 00:00:04.02, bitrate: 64 kb/s
Stream #1.0: Audio: pcm_u8, 8000 Hz, 1 channels, u8, 64 kb/s
[mp2 @ 01ac6310]Sampling rate 8000 is not allowed in mp2
Output #0, avi, to 'foo.avi':
Stream #0.0: Video: mjpeg, yuvj422p, 3648x2736, q=2-31, 200 kb/s, 90k tbn, 0.25 tbc
Stream #0.1: Audio: mp2, 8000 Hz, 1 channels, s16, 64 kb/s
Stream mapping:
Stream #0.0 -> #0.0
Stream #1.0 -> #0.1
Error while opening encoder for output stream #0.1 - maybe incorrect parameters?such as bit_rate, rate, width or height
At last, I understood the problem: MP2, the audio encoder ffmpeg picked for the AVI output, does not accept an 8 kHz sample rate, so the audio has to be resampled to some other rate. I decided to use Audacity, another open-source application, to upsample the sound. However, Audacity would not open the camera's WAV file directly.
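The `Sampling rate 8000 is not allowed in mp2` line in the log is the key one. As a rough sketch of why, the snippet below lists the sample rates defined for MPEG audio Layer II (MPEG-1 plus the MPEG-2 low-sampling-frequency extension) and picks out the ones that are whole multiples of the source rate, which are the simplest targets for upsampling. The names are invented for this example.

```python
# Sample rates defined for MPEG audio Layer II. 8000 Hz is not among them.
MP2_SAMPLE_RATES = (16000, 22050, 24000, 32000, 44100, 48000)

def mp2_rate_ok(rate_hz):
    """True if an MP2 stream can carry audio at this sample rate."""
    return rate_hz in MP2_SAMPLE_RATES

def integer_multiple_rates(rate_hz):
    """Allowed rates that are whole multiples of rate_hz."""
    return [r for r in MP2_SAMPLE_RATES if r % rate_hz == 0]
```

For an 8000 Hz source, `integer_multiple_rates(8000)` includes 48000, a clean factor of six.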
So I used Project->Import Raw Data and selected my WAV file. I set up the import with the following parameters:
I knew this would work, because the WAV file format consists of a header, followed by PCM data, in this case, 8 kHz unsigned samples. So the result in the audio editor would be an audio file with the WAV header as a noisy sound at the start, followed by the data I wanted. The selected (darker) portion of the WAV file below is the header. I used Edit->Cut to remove it.
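The header-then-data layout can also be located programmatically rather than by eye. The sketch below walks the RIFF chunks in a WAV file's bytes and returns where the PCM payload starts; `find_data_chunk` is a name made up for this example, and it assumes a plain little-endian RIFF/WAVE file.

```python
import struct

def find_data_chunk(raw):
    """Return (offset, size) of the PCM payload inside RIFF/WAVE bytes."""
    if raw[0:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # skip 'RIFF', the overall size field, and 'WAVE'
    while pos + 8 <= len(raw):
        chunk_id, size = struct.unpack_from("<4sI", raw, pos)
        if chunk_id == b"data":
            return pos + 8, size
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    raise ValueError("no data chunk found")
```

Everything before the returned offset is header material, the part that shows up as noise in a raw import. In a minimal WAV file that is 44 bytes, but a camera may write extra chunks, so the offset can be larger.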
Finally, I tried to save the audio at a different sample rate. The audio track has a pulldown menu that lets you change the sample rate, but it doesn’t do what I wanted: it plays the existing samples back at a different rate, with aliasing.
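The difference between relabeling the rate and actually resampling can be sketched in a few lines. The first function shows how long the same samples last when played at a new rate; the second creates new samples by linear interpolation. Linear interpolation is a crude stand-in chosen for brevity, not what Audacity does internally, and both function names are made up for this example.

```python
def relabel_duration(n_samples, new_rate_hz):
    """Duration in seconds if the same samples are played at a new rate."""
    return n_samples / new_rate_hz

def upsample_linear(samples, factor):
    """Upsample by an integer factor using linear interpolation."""
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(factor):
            out.append(round(a + (b - a) * k / factor))
    # Hold the final value so the output is exactly len(samples) * factor long.
    out.extend([samples[-1]] * factor)
    return out
```

A four-second clip has 32000 samples at 8 kHz. Relabeled as 48 kHz, it would play in well under a second; upsampled by a factor of six, it has 192000 samples and still lasts four seconds.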
Instead, after consulting the Audacity documentation, I discovered that you use the menu in the lower-left corner of the main Audacity window to set the sample rate.
Change this to 48000, and choose File->Export as WAV to save at the new sample rate. I re-ran ffmpeg, and the resulting AVI file would play in QuickTime and VLC player (although VLC crashes afterward), but it would not work in Windows Media Player (audio played, no video), DivX, RealPlayer or Flickr. So, I decided to try encoding to mp4 instead with the following command:
ffmpeg.exe -loop_input -shortest -f image2 -r 0.25 -i P910033.jpg -i P910033.wav bar.mp4
The resulting mp4 file plays in all the media players (although, again, VLC crashes after playing it), and Flickr can read it successfully as well. Here is what it looks like on Flickr:
Using size as a proxy for quality, however, the encoded video is much smaller than the input JPEG file. Can someone suggest additional flags to ffmpeg to improve the encoding quality?
Ben Mesander has more than 18 years of experience leading software development teams and implementing software. His strengths include Linux, C, C++, numerical methods, control systems and digital signal processing. His experience includes embedded software, scientific software and enterprise software development environments.