Encoding to AAC/ALAC with qaac.exe on Ubuntu Linux

The use of images for commercial and sales purposes, both within Japan and internationally, is strictly prohibited. When reposting, clearly indicate the source and give credit, and do not make any modifications. Posting on Taobao is refused without exception.

Reposting of figure photographs on all websites is prohibited for photos taken after August 20, 2023.

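Trying to move away from Nero AAC Enc, I experimented with SoX and various other tools, but nothing felt quite right. (If you don't like libav's aac codec to begin with, nothing you do will help...)

So I gave up and shifted back toward CoreAudio. However, for the past year I have been almost exclusively a penguin (Linux) user and no longer boot my Apple machine, so booting it just to encode is a hassle. I would like to handle this on a Linux box somehow, but there is no iTunes for Linux, and I don't know whether such a thing will be released in my lifetime.

Still, with a bit of extra effort, it is possible to create AAC with CoreAudio on Linux.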

Install WINE and 7z (p7zip)
Obtain iTunes6464Setup.exe
Extract AppleApplicationSupport64.msi
Extract the DLLs
Obtain qaac64.exe

sudo apt-get install wine p7zip-full

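Extracting CoreAudioToolbox.dll and the other libraries

Before that: which is better, the 32-bit or the 64-bit qaac?

In my environment, the 64-bit version encoded AAC 2.5 times faster. (Seriously.)

So this article is written around the 64-bit version. Note that iTunes6464Setup.exe also contains the 32-bit libraries, so downloading the 64-bit iTunes covers both cases.

First, use 7z to extract AppleApplicationSupport64.msi.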

7z e iTunes6464Setup.exe AppleApplicationSupport64.msi

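Next, extract only the libraries. Several files come out, so here the output directory is AAS64.

Note: if you are not comfortable with the shell and renaming the library files from the command line is difficult, it is easier to just install the MSI package with msiexec /a AppleApplicationSupport64.msi /qn.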

7z e -oAAS64 AppleApplicationSupport64.msi \*.dll

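The extracted libraries have x64_AppleApplicationSupport_ prepended to their names, and this prefix has to be removed. Here, shell parameter expansion is used to strip just the leading x64_AppleApplicationSupport_ from each file name.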

cd AAS64
for f in x64_*.dll; do mv "${f}" "${f##*_}"; done

If parameter expansion is not available, strip x64_AppleApplicationSupport_ with the help of some other command, or rename the files by hand. Here is an example that uses cut with _ as the delimiter.

for f in x64_*.dll; do mv "${f}" "$(echo "${f}" | cut -d_ -f3)"; done
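The two approaches are not quite identical: ${f##*_} drops everything up to the last underscore, whereas removing the literal prefix leaves any later underscores alone. The sketch below compares them. The function names and the file name foo_bar.dll are hypothetical, chosen only for illustration; none of the DLLs listed in this article contains an extra underscore.

```shell
# Hypothetical helpers comparing two ways of stripping the prefix.

# Remove the longest leading match of "*_", i.e. everything up to the LAST underscore.
strip_to_last_underscore() {
    printf '%s\n' "${1##*_}"
}

# Remove only the literal prefix; any other underscore in the name is kept.
strip_literal_prefix() {
    printf '%s\n' "${1#x64_AppleApplicationSupport_}"
}

# Both give the same result for the real library names...
strip_to_last_underscore x64_AppleApplicationSupport_CoreAudioToolbox.dll
strip_literal_prefix     x64_AppleApplicationSupport_CoreAudioToolbox.dll

# ...but differ for a (made-up) name that has an underscore of its own.
strip_to_last_underscore x64_AppleApplicationSupport_foo_bar.dll
strip_literal_prefix     x64_AppleApplicationSupport_foo_bar.dll
```

If a future iTunes package ever shipped a DLL with an underscore in its name, the literal-prefix form would be the safer of the two.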

The dynamic libraries after removing the prefix. Probably only about three of them are actually needed for AAC conversion.

ApplePushService.dll
AppleVersions.dll
ASL.dll
AVFoundationCF.dll
CFNetwork.dll
CoreADI.dll
CoreAudioToolbox.dll
CoreFoundation.dll
CoreGraphics.dll
CoreLSKD.dll
CoreMedia.dll
CoreText.dll
CoreVideo.dll
Foundation.dll
icudt49.dll
JavaScriptCore.dll
libcache.dll
libdispatch.dll
libexslt.dll
libicuin.dll
libicuuc.dll
libtidy.dll
libxml2.dll
libxslt.dll
main.dll
MediaAccessibility.dll
objc.dll
pthreadVC2.dll
QuartzCore.dll
SafariTheme.dll
SQLite3.dll
WebKit.dll
WebKitQuartzCoreAdditions.dll
WTF.dll
YSCrashDump.dll
YSUtilities.dll
zlib1.dll

Put these dynamic libraries either in the same directory as qaac64.exe, or in ${WINEPREFIX}/drive_c/windows/system32 (typically ~/.wine/drive_c/windows/system32), which corresponds to C:\windows\system32.
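For the system32 route, the target directory depends on whether WINEPREFIX is set. A minimal sketch of resolving it, assuming the libraries are still in the AAS64 directory used above; the function names are hypothetical.

```shell
# Print the directory that stands in for C:\windows\system32.
# Falls back to ~/.wine when WINEPREFIX is unset or empty.
wine_system32_dir() {
    printf '%s\n' "${WINEPREFIX:-$HOME/.wine}/drive_c/windows/system32"
}

# Copy the extracted libraries there, but only if the prefix already exists.
install_dlls() {
    dest=$(wine_system32_dir)
    if [ -d "$dest" ]; then
        cp AAS64/*.dll "$dest"/
    else
        echo "no Wine prefix found at: $dest" >&2
        return 1
    fi
}

# Show where the libraries would go (install_dlls itself is not run here).
wine_system32_dir
```

Keeping the DLLs next to qaac64.exe instead avoids touching the Wine prefix at all, which is the approach the rest of this article follows.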

Preparing to use qaac with WINE

Get qaac from the cabinet at https://sites.google.com/site/qaacpage/ (the latest version at the time of writing is qaac_2.59.zip).

Extract only the 64-bit version.

7z e -oqaac qaac_2.59.zip qaac_2.59/x64/\*

Move or copy the dynamic libraries from the previous step.

cp AAS64/*.dll qaac

Try running it. If the qaac version and the CoreAudioToolbox version are printed, it is ready to use. If you get a message like ERROR: CoreAudioToolbox.dll: Module not found., the libraries could not be loaded.

wine qaac/qaac64.exe --check

qaac 2.59, CoreAudioToolbox 7.9.9.6 libsoxconvolver 0.1.0 libsoxr-0.1.1
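In a setup script, this check can be automated by looking for a CoreAudioToolbox version number in the output. A rough sketch follows; the function name is hypothetical, and redirecting stderr in the usage comment is an assumption about where qaac prints its messages.

```shell
# Succeeds when the given --check output mentions a CoreAudioToolbox version.
# The error message contains "CoreAudioToolbox.dll:" instead, which does not match.
coreaudio_available() {
    printf '%s\n' "$1" | grep -q 'CoreAudioToolbox [0-9]'
}

# Usage sketch (not run here):
#   out=$(WINEDEBUG=fixme-all wine qaac/qaac64.exe --check 2>&1)
#   coreaudio_available "$out" || echo "CoreAudioToolbox.dll was not loaded" >&2
```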

Running qaac.exe produces WINE fixme messages, so you may want to set WINEDEBUG=fixme-all.

export WINEDEBUG=fixme-all

Add an alias to ~/.bashrc so that it can be called as qaac64. I always use --threading, so I think it is fine to include it here.

alias qaac64='WINEDEBUG=fixme-all wine ~/Applications/qaac_2.59/x64/qaac64.exe --threading'
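Bash does not expand aliases in non-interactive shells by default, so the alias is not available inside scripts. A shell function with the same behavior works in both cases. This is only a sketch: QAAC_EXE and QAAC_WINE are hypothetical variable names introduced here so that the paths can be overridden.

```shell
# Hypothetical variables: QAAC_EXE is the path to qaac64.exe, QAAC_WINE the Wine binary.
QAAC_EXE="${QAAC_EXE:-$HOME/Applications/qaac_2.59/x64/qaac64.exe}"
QAAC_WINE="${QAAC_WINE:-wine}"

# Same effect as the alias: silence fixme messages and always pass --threading.
# "$@" forwards every argument unchanged, including file names with spaces.
qaac64() {
    WINEDEBUG=fixme-all "$QAAC_WINE" "$QAAC_EXE" --threading "$@"
}
```

Calling qaac64 --check then behaves like the alias version.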

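AAC encoding from a CUE sheet

Create AAC from a CUE sheet made with cdrdao and toc2cue. The binary that cdrdao produces is big-endian by default, so s16b has to be specified (forget it and you get static, or rather white noise). Also, the CUE sheet produced by toc2cue is (probably) UTF-8, so unless --text-codepage 65001 is specified you get ERROR: 80004001: mlang->DetectCodepageInIStream(0, GetACP(), stream, encoding, &nscores). The CUE sheet does not have to be in the current directory.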

qaac64 --raw --raw-format s16b --text-codepage 65001 album.cue
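To make the endianness point concrete: 16-bit big-endian and little-endian samples differ only in the order of each byte pair, so the data can also be converted up front with dd. This is a sketch of that alternative, not something the steps in this article require; the function name and the file names in the comments are hypothetical.

```shell
# Swap each pair of bytes on stdin: 16-bit big-endian samples become little-endian.
swap16() {
    dd conv=swab 2>/dev/null
}

# Usage sketch (data.bin and data_le.bin are made-up file names):
#   swap16 < data.bin > data_le.bin
#   qaac64 --raw --raw-format s16l data_le.bin

# Tiny demonstration on four bytes.
printf 'abcd' | swap16   # prints: badc
```

When encoding through a CUE sheet, passing --raw-format s16b as shown above is simpler, since the CUE sheet refers to the original binary.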

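To encode the album as a single file instead of splitting it per track, add the --concat option. According to the help text, --concat requires an output filename given with -o (album.m4a here is just a placeholder name).

qaac64 --raw --raw-format s16b --text-codepage 65001 --concat -o album.m4a album.cue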
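Since the CUE sheet is only presumed to be UTF-8, it may be worth checking before passing --text-codepage 65001. A minimal sketch using iconv, which fails when the input is not valid in the source encoding; the function names are hypothetical.

```shell
# Succeeds if the file is valid UTF-8 (plain ASCII also counts as valid UTF-8).
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Print the code page option for a UTF-8 CUE sheet; print nothing otherwise,
# so that a different code page can be chosen by hand.
codepage_args() {
    if is_utf8 "$1"; then
        printf '%s\n' '--text-codepage 65001'
    fi
}

# Usage sketch (album.cue is a placeholder; the expansion is left unquoted
# on purpose so that it splits into two arguments):
#   qaac64 --raw --raw-format s16b $(codepage_args album.cue) album.cue
```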

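I thought the same when I used afconvert on Mac OS: the quality settings are hard to understand. For --tvbr, 127 is the highest quality, but internally the value is rounded to one of a fixed set of steps. For --abr and the other bitrate modes, specifying 0 means the highest, and the maximum bitrate is chosen automatically. For --quality, 0 is low quality and 2 is high quality, yet the file comes out smaller with the high-quality setting 2.

Resampling

For resampling such as 44.1 kHz -> 48 kHz, specify 48000 with --rate. The --native-resampler option sets the resampler quality, and bats,127 is the highest (is 127 rounded down to 96?). The argument of this option must be joined with =, not with a space.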

qaac64 --rate 48000 --native-resampler=bats,127 infile.wav
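Because a mistyped resampler argument is easy to miss, the option string can be assembled by a small helper that checks the values against the ranges stated in qaac's help (line, norm or bats, and a quality of 0-127). This is a sketch; the function name is hypothetical.

```shell
# Build the --native-resampler option, always joined with "=" and ",".
native_resampler_opt() {
    # $1: complexity (line|norm|bats), $2: quality (integer 0-127)
    case "$1" in
        line|norm|bats) ;;
        *) echo "complexity must be line, norm or bats" >&2; return 1 ;;
    esac
    case "$2" in
        ''|*[!0-9]*) echo "quality must be a non-negative integer" >&2; return 1 ;;
    esac
    if [ "$2" -gt 127 ]; then
        echo "quality must be between 0 and 127" >&2
        return 1
    fi
    printf '%s\n' "--native-resampler=$1,$2"
}

# Usage sketch (infile.wav is a placeholder):
#   qaac64 --rate 48000 "$(native_resampler_opt bats 127)" infile.wav
```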

For the highest-quality conversion of an audio CD extracted with cdrdao and toc2cue, it would be something like this.

qaac64 \
    --threading \
    --raw \
    --raw-format s16b \
    --text-codepage 65001 \
    --cvbr 0 \
    --quality 2 \
    --rate 48000 \
    --native-resampler=bats,127 \
    album.cue

Quality comparison

The original CD-quality WAV file.

[Image: original CD-quality WAV file]

Encoded with the default settings. This is implicitly --quality 2. (TVBR q91, Quality 96)

[Image: encoded with default settings, --quality 2]

--quality 0. (TVBR q91, Quality 32)

[Image: encoded with --quality 0]

--quality 0 and --quality 2 (the default) are hard to tell apart side by side, but overlaying them shows that --quality 0 loses more data.

[Animated image: --quality 0 and --quality 2 overlaid]

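Script

Probably still being written.

Problems

Tags such as ${artist} and ${album} cannot be used inside --fname-format. The only ones that seem to work are ${track} and ${tracknumber}?

So the only option is to fill them in on the shell side.

qaac 2.59 help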
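One way to do that on the shell side is to read the album-level fields out of the CUE sheet and use them for the output directory via -d. The sketch below assumes that album-level TITLE and PERFORMER appear as unindented lines, with track-level entries indented; the function name is hypothetical.

```shell
# Print the value of the first unindented KEY "value" line of a CUE sheet.
cue_field() {
    # $1: CUE sheet path, $2: field name (e.g. TITLE or PERFORMER)
    # [[:space:]]* also tolerates a trailing carriage return.
    sed -n "s/^$2 \"\(.*\)\"[[:space:]]*\$/\1/p" "$1" | head -n 1
}

# Usage sketch (album.cue is a placeholder; qaac64 is the alias defined earlier).
# Names containing "/" would need sanitizing before being used as directories.
#   artist=$(cue_field album.cue PERFORMER)
#   album=$(cue_field album.cue TITLE)
#   mkdir -p "$artist/$album"
#   qaac64 --raw --raw-format s16b --text-codepage 65001 -d "$artist/$album" album.cue
```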

qaac 2.59
Usage: qaac [options] infiles....

"-" as infile means stdin.
On ADTS/WAV output mode, "-" as outfile means stdout.

Main options:
--formats              Show available AAC formats and exit
-a, --abr <bitrate>    AAC ABR mode / bitrate
-V, --tvbr <n>         AAC True VBR mode / quality [0-127]
-v, --cvbr <bitrate>   AAC Constrained VBR mode / bitrate
-c, --cbr <bitrate>    AAC CBR mode / bitrate
                       For -a, -v, -c, "0" as bitrate means "highest".
                       Highest bitrate available is automatically chosen.
                       For LC, default is -V90
                       For HE, default is -v0
--he                   HE AAC mode (TVBR is not available)
-q, --quality <n>      AAC encoding Quality [0-2]
--adts                 ADTS output (AAC only)
--no-smart-padding     Don't apply smart padding for gapless playback.
                       By default, beginning and ending of input is
                       extrapolated to achieve smooth transition between
                       songs. This option also works as a workaround for
                       bug of CoreAudio HE-AAC encoder that stops encoding
                       1 frame too early.
                       Setting this option can lead to gapless playback
                       issue especially on HE-AAC.
                       However, resulting bitstream will be identical with
                       iTunes only when this option is set.
-d <dirname>           Output directory. Default is current working dir.
--check                Show library versions and exit.
-A, --alac             ALAC encoding mode
-D, --decode           Decode to a WAV file.
--caf                  Output to CAF file instead of M4A/WAV/AAC.
--play                 Decode to a WaveOut device (playback).
-r, --rate <keep|auto|n>
                       keep: output sampling rate will be same as input
                             if possible.
                       auto: output sampling rate will be automatically
                             chosen by encoder.
                       n: desired output sampling rate in Hz.
--lowpass <number>     Specify lowpass filter cut-off frequency in Hz.
                       Use this when you want lower cut-off than
                       Apple default.
-b, --bits-per-sample <n>
                       Bits per sample of output (for WAV/ALAC only)
--no-dither            Turn off dither when quantizing to lower bit depth.
--peak                 Scan + print peak (don't generate output file).
                       Cannot be used with encoding mode or -D.
                       When DSP options are set, peak is computed
                       after all DSP filters have been applied.
--gain <f>             Adjust gain by f dB.
                       Use negative value to decrese gain, when you want to
                       avoid clipping introduced by DSP.
-N, --normalize        Normalize (works in two pass. can generate HUGE
                       tempfile for large piped input)
--drc <thresh:ratio:knee:attack:release>
                       Dynamic range compression.
                       Loud parts over threshold are attenuated by ratio.
                         thresh: threshold (in dBFS, < 0.0)
                         ratio: compression ratio (> 1.0)
                         knee: knee width (in dB, >= 0.0)
                         attack: attack time (in millis, >= 0.0)
                         release: release time (in millis, >= 0.0)
--limiter              Apply smart limiter that softly clips portions
                       where peak exceeds (near) 0dBFS
--start <[[hh:]mm:]ss[.ss..]|<n>s|<mm:ss:ff>f>
                       Specify start point of the input.
                       You specify either in seconds(hh:mm:ss.sss..form) or
                       number of samples followed by 's' or
                       cuesheet frames(mm:ss:ff form) followed by 'f'.
                       Example:
                         --start 4010160s : start at 4010160 samples
                         --start 1:30:70f : same as above, in cuepoint
                         --start 1:30.93333 : same as above
--end <[[hh:]mm:]ss[.ss..]|<n>s|<mm:ss:ff>f>
                       Specify end point of the input (exclusive).
--delay <[[hh:]mm:]ss[.ss..]|<n>s|<mm:ss:ff>f>
                       Specify amount of delay.
                       When positive value is given, silence is prepended
                       at the begining to achieve specified amount of delay.
                       When negative value is given, specified length is
                       dropped from the beginning.
--no-delay             Compensate encoder delay by prepending 960 samples
                       of scilence, then trimming 3 AAC frames from
                       the beginning (and also tweak iTunSMPB).
                       This option is mainly intended for resolving
                       A/V sync issue of video.
--num-priming <n>      (Experimental). Set arbitrary number of priming
                       samples in range from 0 to 2112 (default 2112).
                       Applicable only for AAC LC.
                       --num-priming=0 is the same as --no-delay.
                       Doesn't work with --no-smart-padding.
--gapless-mode <n>     Encoder delay signaling for gapless playback.
                         0: iTunSMPB (default)
                         1: ISO standard (elst + sbgp + sgpd)
                         2: Both
--matrix-preset <name> Specify user defined preset for matrix mixer.
--matrix-file <file>   Matrix file for remix.
--no-matrix-normalize  Don't automatically normalize(scale) matrix
                       coefficients for the matrix mixer.
--chanmap <n1,n2...>   Rearrange input channels to the specified order.
                       Example:
                         --chanmap 2,1 -> swap L and R.
                         --chanmap 2,3,1 -> C+L+R -> L+R+C.
--chanmask <n>         Force input channel mask(bitmap).
                       Either decimal or hex number with 0x prefix
                       can be used.
                       When 0 is given, qaac works as if no channel mask is
                       present in the source and picks default layout.
--no-optimize          Don't optimize MP4 container after encoding.
--tmpdir <dirname>     Specify temporary directory. Default is %TMP%
-s, --silent           Suppress console messages.
--verbose              More verbose console messages.
-i, --ignorelength     Assume WAV input and ignore the data chunk length.
--threading            Enable multi-threading.
-n, --nice             Give lower process priority.
--sort-args            Sort filenames given by command line arguments.
--text-codepage <n>    Specify text code page of cuesheet/chapter/lyrics.
                       Example: 1252 for Latin-1, 65001 for UTF-8.
                       Use this when bogus values are written into tags
                       due to automatic encoding detection failure.
-S, --stat             Save bitrate statistics into file.
--log <filename>       Output message to file.

Option for output filename generation:
--fname-from-tag       Generate filename based on metadata of input.
                       By default, output filename will be the same as input
                       (only different by the file extension).
                       Name generation can be tweaked by --fname-format.
--fname-format <string>
                       Format string for output filename.

Option for single output:
-o <filename>          Specify output filename
--concat               Encodes whole inputs into a single file.
                       Requires output filename (with -o)

Option for cuesheet input only:
--cue-tracks <n[-n][,n[-n]]*>
                       Limit extraction to specified tracks.
                       Tracks can be specified with comma separated numbers.
                       Hyphen can be used to denote range of numbers.
                       Tracks non-existent in the cue are just ignored.
                       Numbers must be in the range 0-99.
                       Example:
                         --cue-tracks 1-3,6-9,11
                           -> equivalent to --cue-tracks 1,2,3,6,7,8,9,11
                         --cue-tracks 2-99
                           -> can be used to skip first track (and HTOA)

Options for Raw PCM input only:
-R, --raw              Raw PCM input.
--raw-channels <n>     Number of channels, default 2.
--raw-rate <n>         Sample rate, default 44100.
--raw-format <str>     Sample format, default S16L.
                       Sample format spec:
                       1st char: S(igned) | U(nsigned) | F(loat)
                       2nd part: Bitwidth
                       Last part: L(ittle Endian) | B(ig Endian)
                       Last part can be omitted, L is assumed by default.
                       Cases are ignored. u16b is OK.

Options for CoreAudio sample rate converter:
--native-resampler[=line|norm|bats,n]
                       Arguments are optional.
                       Without argument, codec default SRC is used.
                       With argument, dedicated AudioConverter is used for
                       sample rate conversion.
                       '--native-resampler' and arguments must be delimited
                       by a '=' (space is not usable here).
                       Arguments must be delimited by a ','(comma).
                       First argument is sample rate converter complexity,
                       and one of line, norm, bats.
                         line: linear (worst, don't use this)
                         norm: normal
                         bats: mastering (best, but quite sloooow)
                       Second argument is sample rate converter quality,
                       which is an integer between 0-127.
                       Example:
                         --native-resampler
                         --native-resampler=norm,96

Tagging options:
 (same value is set to all files, so use with care for multiple files)
--title <string>
--artist <string>
--band <string>        This means "Album Artist".
--album <string>
--grouping <string>
--composer <string>
--comment <string>
--genre <string>
--date <string>
--track <number[/total]>
--disk <number[/total]>
--compilation[=0|1]
                       By default, iTunes compilation flag is not set.
                       --compilation or --compilation=1 sets flag on.
                       --compilation=0 is same as default.
--lyrics <filename>
--artwork <filename>
--artwork-size <n>     Specify maximum width or height of artwork in pixels.
                       If specified artwork (with --artwork) is larger than
                       this, artwork is automatically resized.
--copy-artwork         Copy front cover art(APIC:type 3) from the source.
                       When --artwork is also given, this option is ignored.
--chapter <filename>
                       Set chapter from file.
--tag <fcc>:<value>
                       Set iTunes pre-defined tag with fourcc key
                       and value.
                       1) When key starts with U+00A9 (copyright sign),
                          you can use 3 chars starting from the second char
                          instead.
                       2) Some known tags having type other than UTF-8 string
                          are taken care of. Others are just stored as UTF-8
                          string.
--tag-from-file <fcc>:<path>
                       Same as above, but value is read from file.
--long-tag <key>:<value>
                       Set long tag (iTunes custom metadata) with
                       arbitrary key/value pair. Value is always stored as
                       UTF8 string.