Author Topic: Coffeepost #1: Audio Formats "How I learned to stop Worrying and Love the Webm"  (Read 490 times)

Offline kanliot

  • Jr. Member
  • **
  • Posts: 51
We've had mp3 for 20 years now, so let's look back at the evolution of audio formats.  Back in the day mp3s sounded poor, and you'd have to send $$ to Germany to make music in the mp3 format.  (The mp3 patents have expired now.)

The loudness wars aren't over.  But they have wrecked audio.  Youtube's Loudness Wars Why I don't Buy Music Anymore

The first two successors to mp3 are Ogg and AAC.  AAC has broad hardware encoding support, tends to be stored in the MP4 container, but is patent-encumbered.  Ogg is a free format (the reference libraries are BSD-licensed, so it isn't GPL-encumbered), and it performs about as well as AAC.  Except actually there is a difference.  Although both sound about 20% or so better than mp3 at the same bitrate, AAC (found in .m4a files and video files) tends to be much better at encoding spatial data, like the spatial data in reverb or echo effects.  Also, since AAC encoders typically apply a low-pass filter that cuts the highest frequencies, Ogg generally sounds better than AAC in audio without that spatial stuff.  Drums, noise effects, and electronic synth instruments typically sound less distorted in Ogg at the same bitrate.

That said, Ogg is not a common format.  AAC in .m4a is a common format.  You'll typically find copyrighted .m4a files in an iTunes library.  Also, there are newer versions of AAC that sound much better at lower bitrates.  AAC is partially patent-encumbered.  FFmpeg comes with a free AAC encoder, but you can swap in the patent-encumbered AAC library for noncommercial use.
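To make the FFmpeg part concrete, here's a minimal sketch that only builds the command line, it doesn't run anything.  The function name and the 128k default are my own picks for illustration.  `aac` is FFmpeg's built-in encoder; `libfdk_aac` is the external one, and it only works if your FFmpeg build was compiled with that non-free library enabled.

```python
def aac_encode_command(src, dst, bitrate="128k", use_fdk=False):
    """Build an ffmpeg argument list that encodes src to AAC audio in dst."""
    # "aac" is FFmpeg's built-in encoder. "libfdk_aac" is an assumption about
    # your build: it is only available if FFmpeg was compiled with it.
    encoder = "libfdk_aac" if use_fdk else "aac"
    # -vn drops any video stream, -c:a picks the audio encoder,
    # -b:a sets the audio bitrate.
    return ["ffmpeg", "-i", src, "-vn", "-c:a", encoder, "-b:a", bitrate, dst]


if __name__ == "__main__":
    print(" ".join(aac_encode_command("song.wav", "song.m4a")))
```

You could hand the returned list to `subprocess.run` if ffmpeg is on your PATH.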

Let's look at some lossless formats.
APE, AIFF, WAV, and FLAC are some lossless formats.  You might know that when you save a JPEG image, you can save disk space by decreasing the quality.  The aforementioned formats don't do this; rather, they keep every sample exactly, at the sample rate in Hz.  You might have a 22050, 44100, or 48000 Hz audio file, and for each sample the lossless formats would store 8, 16, or 24 bits of data (the bit depth, which is independent of the sample rate).  WAV and AIFF store the samples uncompressed, while FLAC and APE compress them without throwing anything away.  FLAC and APE are suitable for archiving, although not as common as WAV.  AIFF is common, but can be surprising if you run across a 24-bit AIFF file.
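If you want to see what sample rate and bit depth mean for disk space, here's a rough sketch of the arithmetic for uncompressed audio (what WAV and AIFF usually hold).  The function names are mine, and it ignores file headers.  FLAC and APE files come out smaller because they compress losslessly.

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of raw uncompressed audio, ignoring any file header."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Bitrate of raw uncompressed audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


if __name__ == "__main__":
    # Three minutes of CD-style audio: 44100 Hz, 16 bit, stereo.
    print(pcm_size_bytes(44100, 16, 2, 180))   # 31752000 bytes, about 32 MB
    print(pcm_bitrate_kbps(44100, 16, 2))      # 1411.2
```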

None of the lossless formats are commonly found in media container files like .mp4.  A container file can contain a video track, several audio tracks, and a subtitle track, all in the same file.  It also allows tagging and seeking.  Ogg is a container format, which typically contains Vorbis audio.  I might have misspoken earlier, when I called Ogg/Vorbis files "Ogg".

Let's look at the three main containers used today in media files.  They are Ogg, MP4, and MKV.  None of the container formats are patent-encumbered; the catch is rather that some devices only like certain containers.  MP4 is the most common, and it's in wide use as a streaming format.  The .mp4 file extension kind of implies that a video stream is present in the container, but since it's just a container, any combination of video and audio could in fact be present.  An .aac file would be an audio stream (technically variants exist) outside of an MP4 container, and an .m4a file would be an MP4 file with no video, only AAC audio.  For some reason, the flexibility of the MP4 container format makes metadata tagging, like storing album art and track data, more difficult.
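Since the file extension only hints at the container, here's a small sketch that guesses the container from the first few bytes of a file instead.  The function names are hypothetical, and it assumes the usual layout (for MP4, the "ftyp" marker sitting at byte 4).  MKV and WebM start with the same bytes, so this can't tell them apart without reading further into the header.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container/format from the first 12 bytes of a file."""
    if header[:4] == b"OggS":
        return "ogg"
    if header[:4] == b"\x1a\x45\xdf\xa3":   # EBML header, shared by MKV and WebM
        return "matroska/webm"
    if header[4:8] == b"ftyp":              # covers .mp4 and .m4a
        return "mp4"
    if header[:4] == b"fLaC":
        return "flac"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "aiff"
    return "unknown"


def sniff_file(path):
    """Read just enough of the file to run the check above."""
    with open(path, "rb") as f:
        return sniff_container(f.read(12))
```

For example, `sniff_file("song.m4a")` should come back as "mp4" for a normal iTunes-style file, which matches the point that .m4a is just an MP4 container.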

The Ogg format has a nasty little drawback as well.  By definition, Ogg files don't include an index into the media that would let media players seek easily.  This generally makes seeking through an Ogg file, well, hit or miss.  The lack of an index does make the media file use less disk space.  You might save 10% or so.

Ogg containers shouldn't contain AAC streams, and MP4 containers shouldn't contain Vorbis streams.  This is a restriction that everyone respects, since the containers are used for different purposes.

Which takes us to the next container format, MKV.  MKV is a container format with a good index, but lousy tagging support, AFAIK.  You can use it for both AAC and Vorbis streams, so it has at least one use.  WebM files are actually MKV files with a smaller set of features, and they prohibit AAC streams.  You might put an AAC stream in a .mkv file, then rename that file to .mka to indicate that there is no video stream present.

WebM files can also contain Opus audio, which is a next-generation audio codec.  It sounds really good, better than Vorbis.  It's a free format, and Opus files encoded at about 420 kbit/s sound good enough to be mistaken for lossless audio.  One caveat: Opus doesn't support encoding audio at 44100 Hz; rather, the sample rate is converted to an Opus native rate (48000 Hz in this case).  Surprisingly, even with the change in sample rate, Opus files still sound great.  There is no downside.
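Here's a little sketch of the sample rate business.  Opus works at 8000, 12000, 16000, 24000, or 48000 Hz.  The rule below (pick the smallest native rate that isn't lower than the input, cap at 48000) is my assumption about how an encoder front end would choose; the actual resampling is left out.

```python
# Sample rates Opus works at natively.
OPUS_RATES_HZ = (8000, 12000, 16000, 24000, 48000)


def opus_coding_rate(input_rate_hz):
    """Pick the native rate an input would be resampled to.

    Assumed rule: the smallest native rate that is not lower than the
    input, capped at 48000 Hz.
    """
    for rate in OPUS_RATES_HZ:
        if input_rate_hz <= rate:
            return rate
    return OPUS_RATES_HZ[-1]


if __name__ == "__main__":
    print(opus_coding_rate(44100))  # 48000, the CD-rate case
```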

So what I'm getting at is that one of the best audio file formats is .webm.  But WebM files can contain Vorbis, Opus, or even audio with video.
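To sum up which audio goes in which container, here's a lookup table sketched from the rules of thumb in this post.  It is not taken from the container specs, the names are made up for illustration, and the Opus-in-Ogg entry is my addition (that's what .opus files are).

```python
# Rules of thumb from this post, not a statement of what the specs permit.
ALLOWED_AUDIO = {
    "ogg": {"vorbis", "opus"},          # .opus files are Ogg containers
    "mp4": {"aac"},                     # includes .m4a
    "mkv": {"aac", "vorbis", "opus"},   # includes .mka
    "webm": {"vorbis", "opus"},         # WebM prohibits AAC
}


def can_hold(container, codec):
    """True if the post's rules of thumb allow this codec in this container."""
    return codec.lower() in ALLOWED_AUDIO.get(container.lower(), set())


if __name__ == "__main__":
    print(can_hold("webm", "aac"))   # False
    print(can_hold("mkv", "aac"))    # True
```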

Lastly, a quick table of video formats:

DivX                   legacy
MPEG                   legacy
H.264 (aka AVC)        current, hardware encoding
H.265 (aka HEVC)       current, superior for lower bitrates
AV1                    next-gen, not patent encumbered, not ready yet
« Last Edit: March 31, 2019, 08:04:05 am by kanliot »