aomenc

aomenc (also referred to as AOM-AV1 or simply libaom) is a command-line application for encoding AV1, written in C and Assembly and developed by AOMedia. It is also the reference encoder for AV1.

Choosing forks

Mainline aomenc is unfortunately not perfect. It suffers from poor defaults, a heavy focus on PSNR that hurts its psycho-visual performance, and settings that do not behave the way their names suggest, among other issues. Fortunately, a couple of forks were created to address these problems.

These forks fix the poor decisions made by the original AOM developers and, most importantly, introduce new parameters and tunes that allow finer control over the encoder.

Our recommendation is to use either aom-av1-lavish or aom-psy101, as both are actively maintained, ship with good defaults, and have been extensively tested by the encoding community.

FFmpeg

aomenc is available in FFmpeg as libaom-av1. Check whether your build includes it by running ffmpeg -h encoder=libaom-av1. aomenc parameters that FFmpeg does not expose as its own options can be passed via -aom-params.

Mainline aomenc

FFmpeg builds usually ship encoder libraries in their most basic, default form (and therefore mainline aomenc), so using aomenc through FFmpeg is not recommended unless you build it yourself.
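As a rough sketch of how extra options reach the encoder, -aom-params takes key=value pairs joined by colons. The helper name join_aom_params and the file names are placeholders, and the option values are only illustrative, not recommendations.

```sh
#!/bin/sh
# Join key=value pairs with ':' as expected by FFmpeg's -aom-params option.
# join_aom_params is a hypothetical helper name, not part of FFmpeg.
join_aom_params() {
    out=""
    for kv in "$@"; do
        if [ -z "$out" ]; then out="$kv"; else out="$out:$kv"; fi
    done
    printf '%s\n' "$out"
}

params=$(join_aom_params enable-qm=1 quant-b-adapt=1 deltaq-mode=0)

# Print the command instead of running it so it can be reviewed first.
# input.mkv and output.mkv are placeholder file names.
echo ffmpeg -i input.mkv -c:v libaom-av1 -crf 32 -b:v 0 -cpu-used 4 \
    -aom-params "$params" output.mkv
```

Printing the command first keeps the sketch safe to run on a machine without a libaom-enabled FFmpeg build.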

Supported Color Space

aomenc supports the following color spaces:

Format        Chroma Subsampling   Supported Bit Depth(s)
YUV420P       4:2:0                8-bit
YUV422P       4:2:2                8-bit
YUV444P       4:4:4                8-bit
GBRP          -                    8-bit
GRAY8         -                    8-bit
YUV420P10LE   4:2:0                10-bit
YUV422P10LE   4:2:2                10-bit
YUV444P10LE   4:4:4                10-bit
GBRP10LE      -                    10-bit
GRAY10LE      -                    10-bit
YUV420P12LE   4:2:0                12-bit
YUV422P12LE   4:2:2                12-bit
YUV444P12LE   4:4:4                12-bit
GBRP12LE      -                    12-bit
GRAY12LE      -                    12-bit
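To connect the formats above to the --bit-depth option used later on this page, a small lookup like the following may help. This is only a sketch: bit_depth_for is a hypothetical helper name, it expects the lowercase pixel format names that ffprobe reports, and it also accepts gray, which is FFmpeg's name for 8-bit grayscale.

```sh
#!/bin/sh
# Map an FFmpeg pixel format name to the matching --bit-depth value.
# bit_depth_for is a hypothetical helper name.
bit_depth_for() {
    case "$1" in
        yuv420p|yuv422p|yuv444p|gbrp|gray8|gray) echo 8 ;;
        yuv420p10le|yuv422p10le|yuv444p10le|gbrp10le|gray10le) echo 10 ;;
        yuv420p12le|yuv422p12le|yuv444p12le|gbrp12le|gray12le) echo 12 ;;
        *) echo "unsupported pixel format: $1" >&2; return 1 ;;
    esac
}

# Typical use (input.mkv is a placeholder file name):
#   fmt=$(ffprobe -v error -select_streams v:0 \
#         -show_entries stream=pix_fmt -of csv=p=0 input.mkv)
#   bit_depth_for "$fmt"
bit_depth_for yuv420p10le   # prints 10
```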

Installation

The compilation parts of this installation will assume aom-av1-lavish as the default fork of choice.

aomenc should be available in your distribution's package manager.

If you want to compile one of the community forks instead, you will need CMake, Perl, GNU Make, and nasm (for x64; use yasm on x86).
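Before starting, it can be worth checking that these tools are actually on your PATH. The snippet below is a minimal sketch; missing_tools is a hypothetical helper name, and git is included on the assumption that you will clone the repository as shown in the next step.

```sh
#!/bin/sh
# Print the name of every tool given as an argument that is not on PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Build dependencies listed above, plus git for cloning.
# Swap nasm for yasm on 32-bit x86.
missing=$(missing_tools git cmake perl make nasm)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "All build tools found."
fi
```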

Clone the Endless_Merging branch of the aom-av1-lavish repo, then create and enter a build folder
git clone https://github.com/Clybius/aom-av1-lavish -b Endless_Merging
cd aom-av1-lavish && mkdir -p aom_build && cd aom_build
CMake configuration
cmake .. -DBUILD_SHARED_LIBS=0 -DENABLE_DOCS=0 -DCONFIG_TUNE_BUTTERAUGLI=0 -DCONFIG_TUNE_VMAF=0 -DCONFIG_AV1_DECODER=0 -DENABLE_TESTS=0 -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-flto -O3 -march=native" -DCMAKE_C_FLAGS="-flto -O3 -march=native -pipe -fno-plt" -DCMAKE_EXE_LINKER_FLAGS="-flto -O3 -march=native"

The CMake configuration above builds aomenc statically and disables docs (which require Doxygen), the extra tunes, tests, and the decoder. It also applies native CPU optimizations to help speed up the encoder.

Compile the encoder
make -j$(nproc)

The resulting binary will be in the folder you are currently in (aom_build).

Or, optionally, you can install it to your system, which may need elevated permissions.

make install

Alternatively, a precompiled AVX2-optimized binary can be installed for Linux via rAV1ator CLI.

Usage

AV1 Encoding

info

aomenc is designed around 2-pass encoding: it needs two passes to take full advantage of its efficiency, including better rate control and additional encoding features. Always use 2 passes when encoding.

Simple Y4M input with CQ 32, 1 pass, and raw IVF bitstream output
aomenc --end-usage=q --cq-level=32 --bit-depth=10 --passes=1 --ivf -o output.ivf input.y4m
Pipe from FFmpeg
ffmpeg -v error -i input.mkv -f yuv4mpegpipe -strict -1 - | aomenc - --end-usage=q --cq-level=32 --bit-depth=10 --passes=1 --ivf -o output.ivf
Pipe from FFmpeg, 2-pass, pass 1
ffmpeg -v error -i input.mkv -f yuv4mpegpipe -strict -1 - | aomenc - --end-usage=q --cq-level=32 --bit-depth=10 --passes=2 --pass=1 --fpf=aom-pass.log --ivf -o output.ivf
Pipe from FFmpeg, 2-pass, pass 2
ffmpeg -v error -i input.mkv -f yuv4mpegpipe -strict -1 - | aomenc - --end-usage=q --cq-level=32 --bit-depth=10 --passes=2 --pass=2 --fpf=aom-pass.log --ivf -o output.ivf
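Since both passes share every option except the pass number, the two commands can be generated from one template. The following is a sketch only: two_pass_cmd is a hypothetical helper, it assumes the first-pass statistics file is given through aomenc's --fpf option, and it assumes simple file names without spaces.

```sh
#!/bin/sh
# Print the FFmpeg | aomenc pipeline for one pass of a 2-pass encode.
# two_pass_cmd is a hypothetical helper.
# Arguments: input file, output file, pass number (1 or 2).
two_pass_cmd() {
    printf '%s\n' "ffmpeg -v error -i $1 -f yuv4mpegpipe -strict -1 - | aomenc - --end-usage=q --cq-level=32 --bit-depth=10 --passes=2 --pass=$3 --fpf=aom-pass.log --ivf -o $2"
}

# The commands are printed rather than executed so they can be reviewed.
# To run them after review, pipe the output to sh:
#   { two_pass_cmd input.mkv output.ivf 1; two_pass_cmd input.mkv output.ivf 2; } | sh
two_pass_cmd input.mkv output.ivf 1
two_pass_cmd input.mkv output.ivf 2
```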

AVIF Encoding

Using aomenc through avifenc is widely considered the best way to encode AVIF images: SVT-AV1 only supports 4:2:0 chroma subsampling, rav1e isn't fast enough for still images, and the libaom team has put more effort into intra coding than the teams behind the other prominent open-source AV1 encoders. A sample command for encoding AVIF looks like this:

avifenc -c aom -s 4 -j 8 -d 10 -y 444 --min 1 --max 63 -a end-usage=q -a cq-level=16 -a tune=ssim [input] output.avif

Where:

  • -c aom selects aom as the encoder.
  • -s 4 is the speed. Speeds 4 and below offer the best compression quality at the expense of longer encode times.
  • -j 8 is the number of threads the encoder is allowed to use. Increasing this past 12 will sometimes hurt encode times, as AVIF encoding via aomenc doesn't parallelize perfectly. Run a speed benchmark to find the value that works best for you.
  • -d 10 is the bit depth. Specifying a value below 10 isn't recommended, as it will hurt coding efficiency even with an 8-bit source image.
  • -y 444 is the chroma subsampling mode. 4:4:4 chroma subsampling tends to provide better compression than 4:2:0 with AVIF, though on some images 4:2:0 might be the better choice.
  • cq-level=16 is how you specify quality. Lower values give higher quality and larger files, while higher values give smaller, lower-quality output. It is preceded by -a because it is an aomenc option, not an avifenc one.
  • tune=ssim controls how the encoder handles RDO (rate-distortion optimization). This may be redundant with the default aomenc parameters, but specifying it explicitly guards against an unintended change if a default is modified in the future.
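To find a good -j value, one approach is to time the sample command at a few thread counts. The sketch below only prints the timed commands; avif_cmd is a hypothetical helper, input.png is a placeholder, and the list of thread counts is an arbitrary choice.

```sh
#!/bin/sh
# Print the sample avifenc command with a given thread count.
# avif_cmd is a hypothetical helper.
# Arguments: thread count, input file, output file.
avif_cmd() {
    printf '%s\n' "avifenc -c aom -s 4 -j $1 -d 10 -y 444 --min 1 --max 63 -a end-usage=q -a cq-level=16 -a tune=ssim $2 $3"
}

# Emit one timed command per candidate thread count for later comparison.
for jobs in 2 4 8 12 16; do
    echo "time $(avif_cmd "$jobs" input.png "out-j$jobs.avif")"
done
```

Running each printed line in an interactive shell and comparing wall-clock times should show where extra threads stop helping on your machine.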

Recommendations

aomenc unfortunately cannot make effective use of many threads, so a tool like Av1an is needed for parallelization. The parameters shown are biased towards Av1an and aom-av1-lavish usage, so if you plan on using standalone aomenc, adjust them as needed.

Here are some recommended parameters:

--bit-depth=10 --cpu-used=4 --end-usage=q --cq-level=24 --threads=2 --tile-columns=0 --tile-rows=0 --lag-in-frames=64 --tune-content=psy --tune=ssim --enable-keyframe-filtering=1 --disable-kf --kf-max-dist=9999 --enable-qm=1 --deltaq-mode=0 --aq-mode=0 --quant-b-adapt=1 --enable-fwd-kf=0 --arnr-strength=1 --sb-size=dynamic --enable-dnl-denoising=0 --denoise-noise-level=8

Now let's break it down.

  • --bit-depth=10 We use 10-bit because the higher internal precision reduces banding and tends to make the video smaller, even with 8-bit sources.

  • --cpu-used=4 This is the speed preset, which ranges from 0 to 9. 4 is the sweet spot; use 3 for more efficiency, 2 if you have a lot of time, or 6 if you want speed. Don't go above 6 (worst efficiency) or down to 0 (an encode could take weeks to finish).

  • --end-usage=q --cq-level=24 Selects constant-quality rate control, aomenc's rough equivalent of CRF in x264/x265, here at quality level 24.

  • --tile-columns=0 --tile-rows=0 These are the tile options. The encoder can split each frame into tiles to encode faster. See the image below (yellow lines):

[Image: Tile usage]

Do NOT use tiles at 1080p and below. Use --tile-columns=1 at 1440p (2K), and --tile-columns=2 with --tile-rows=1 at 2160p (4K).

If you would like an easy way to calculate the necessary number of tiles for your video, you can use the AV1 Encoding Calculator online or run this local tile calculator.
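The rule of thumb above can also be written as a tiny shell function. This is a sketch, not the calculator linked above: recommend_tiles is a hypothetical helper, it looks only at the frame height, and the cut-offs between the three named resolutions are an assumption.

```sh
#!/bin/sh
# Suggest tile flags from the frame height, following the rule of thumb above.
# recommend_tiles is a hypothetical helper; where exactly to switch between
# the three named resolutions is an assumption.
recommend_tiles() {
    height=$1
    if [ "$height" -le 1080 ]; then
        echo "--tile-columns=0 --tile-rows=0"
    elif [ "$height" -le 1440 ]; then
        echo "--tile-columns=1 --tile-rows=0"
    else
        echo "--tile-columns=2 --tile-rows=1"
    fi
}

recommend_tiles 1080   # prints --tile-columns=0 --tile-rows=0
recommend_tiles 2160   # prints --tile-columns=2 --tile-rows=1
```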

  • --lag-in-frames=64 Similar to rc-lookahead in x264/x265. Sets the number of frames to look ahead for frame type and rate control decisions, allowing for better compression decisions. Setting it to a value greater than 64 is generally not considered useful.

  • --aq-mode=0 Adaptive quantization mode, which remains a matter of debate. 0 is better most of the time, though some say 1 is also good.

  • --tune-content=psy --tune=ssim As the names suggest, these are tunes that affect the video output, sometimes for the better and sometimes for the worse.

info

Do not use tune-content=psy if you encode live action above cq-level=30.

info

If you use any of the VMAF tunes, you need to set --vmaf-model-path= to the location of your VMAF models.

  • --enable-keyframe-filtering=1 We set this to 1 for compatibility reasons. 2 is more efficient, but it causes seeking issues and FFmpeg cannot read the result.

  • --sb-size=dynamic Allows the encoder to use 128x128 block partitioning in addition to 64x64, which gives an efficiency boost.

  • --deltaq-mode=0 Set to 0 because it generally gives better results.

  • --arnr-strength=1 Controls how strong the filtering (smoothing) is, and has always been a hot topic. Most agree on the default of 4. Others prefer 1 for 2D animation and Pixar-style 3D CGI, 4 for live-action content, and a higher value for lower-bitrate encodes.

  • --disable-kf --enable-fwd-kf=0 We disable the encoder's own keyframe placement because Av1an has already done scene detection, so aomenc doesn't have to. It also speeds things up.

  • --kf-max-dist=9999 Maximum keyframe interval. We set it to a very high value since Av1an's scene detection already limits the keyframe interval to 240 by default.

  • --enable-chroma-deltaq=1 --enable-qm=1 --quant-b-adapt=1 Parameters that give a free efficiency boost; no tuning is needed.

  • --enable-dnl-denoising=0 Disables the encoder's built-in denoising step that runs when grain synthesis is enabled. You can optionally set it to 1 for fairly noisy video, since it works quite well (NLMeans is the denoiser used).

  • --denoise-noise-level=8 Enables AV1 grain synthesis, a technique where the encoder adds synthetic grain so the video looks more natural and artifacts are potentially hidden (real grain is hard to encode and inflates bitrate because of its randomness). Don't use high values (>12), since they create noticeable grain patterns.

info

You can use photon noise tables as an alternative via --film-grain-table. This is also conveniently available in Av1an as --photon-noise=X.
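Putting the recommended parameters together with Av1an might look roughly like the sketch below. It assumes an aom-av1-lavish build of aomenc is on your PATH (mainline does not know options such as --tune-content=psy), av1an_cmd is a hypothetical helper, the file names are placeholders, and the worker count of 4 is arbitrary.

```sh
#!/bin/sh
# Recommended aomenc parameters from this section, kept in one variable.
AOM_PARAMS="--bit-depth=10 --cpu-used=4 --end-usage=q --cq-level=24 --threads=2 --tile-columns=0 --tile-rows=0 --lag-in-frames=64 --tune-content=psy --tune=ssim --enable-keyframe-filtering=1 --disable-kf --kf-max-dist=9999 --enable-qm=1 --deltaq-mode=0 --aq-mode=0 --quant-b-adapt=1 --enable-fwd-kf=0 --arnr-strength=1 --sb-size=dynamic --enable-dnl-denoising=0 --denoise-noise-level=8"

# Print an Av1an command line; av1an_cmd is a hypothetical helper.
# Arguments: input file, output file, number of workers.
av1an_cmd() {
    printf 'av1an -i %s -e aom -w %s -v "%s" -o %s\n' "$1" "$3" "$AOM_PARAMS" "$2"
}

# Printed rather than executed, so the command can be reviewed first.
av1an_cmd input.mkv output.mkv 4
```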

Tips & Tricks

  1. Use --butteraugli-resize-factor=2 with any of the butteraugli-based tunes (lavish, butteraugli) to speed them up without much loss, and --butteraugli-intensity-target=250 to match the content light level.
  2. Use --arnr-maxframes to set the maximum number of reference frames used to filter the encode. Higher values make the video blurrier at high fidelity but look better at lower bitrates.