Archiving Music via youtube-dl

An educational guide to using youtube-dl for music archival

NOTE: Since I’m really slow at typing things up, the entire youtube-dl DMCA incident happened somewhere in the middle of writing this post.

Intro

Ok, let me set the stage.

There’s this song that I’ve been listening to for a week straight and I want to buy it so I have a local copy (yes, I buy music).

I go to Bandcamp / Amazon Music / Patreon and find out that it isn’t listed.

So… I send a donation to the content creator so that I can sleep at night, and proceed to re-learn how to use youtube-dl.

Since I’ve encountered this very specific circumstance several times, I decided to write a guide on it.

The song I’ll be using in the example is Giga & KIRA - GETCHA! (Self Cover ver.). The self-cover is not available for purchase at this time.

If the song ever becomes available for legitimate purchase, I will change the example to feature another song which is unavailable for purchase. I will proceed to do this until I no longer need to download music this way.

By the time this is posted, I will have purchased the original song through legitimate means.

I’ll be using the following sources for downloading:

  • YouTube
  • Niconico
  • SoundCloud (maybe in a future update)

YouTube

Check formats

$ youtube-dl -F https://youtu.be/IxIfymd5g5Y
[info] Available formats for IxIfymd5g5Y:
format code  extension  resolution note
249          webm       audio only tiny   54k , opus @ 50k (48000Hz), 1.38MiB
250          webm       audio only tiny   72k , opus @ 70k (48000Hz), 1.82MiB
140          m4a        audio only tiny  131k , m4a_dash container, mp4a.40.2@128k (44100Hz), 3.55MiB
251          webm       audio only tiny  141k , opus @160k (48000Hz), 3.60MiB
394          mp4        256x144    144p   90k , av01.0.00M.08, 30fps, video only, 1.99MiB
278          webm       256x144    144p   97k , webm container, vp9, 30fps, video only, 2.52MiB
160          mp4        256x144    144p  152k , avc1.4d400c, 30fps, video only, 2.87MiB
395          mp4        426x240    240p  178k , av01.0.00M.08, 30fps, video only, 4.13MiB
242          webm       426x240    240p  227k , vp9, 30fps, video only, 5.65MiB
133          mp4        426x240    240p  239k , avc1.4d4015, 30fps, video only, 5.06MiB
396          mp4        640x360    360p  367k , av01.0.01M.08, 30fps, video only, 8.63MiB
243          webm       640x360    360p  413k , vp9, 30fps, video only, 10.20MiB
134          mp4        640x360    360p  560k , avc1.4d401e, 30fps, video only, 10.00MiB
397          mp4        854x480    480p  672k , av01.0.04M.08, 30fps, video only, 15.51MiB
244          webm       854x480    480p  787k , vp9, 30fps, video only, 17.35MiB
135          mp4        854x480    480p  876k , avc1.4d401f, 30fps, video only, 15.30MiB
398          mp4        1280x720   720p 1410k , av01.0.05M.08, 30fps, video only, 31.02MiB
247          webm       1280x720   720p 1553k , vp9, 30fps, video only, 31.15MiB
136          mp4        1280x720   720p 1626k , avc1.4d401f, 30fps, video only, 27.97MiB
399          mp4        1920x1080  1080p 2567k , av01.0.08M.08, 30fps, video only, 55.40MiB
248          webm       1920x1080  1080p 2687k , vp9, 30fps, video only, 66.27MiB
137          mp4        1920x1080  1080p 4430k , avc1.640028, 30fps, video only, 94.87MiB
18           mp4        640x360    360p  733k , avc1.42001E, 30fps, mp4a.40.2@ 96k (44100Hz), 20.09MiB (best)

The best audio quality is probably format 251 (opus @160k).

$ youtube-dl -f 251 https://youtu.be/IxIfymd5g5Y -o '%(id)s.%(ext)s'
[download] Destination: IxIfymd5g5Y.webm
[download] 100% of 3.60MiB in 00:01
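
Picking the format by eye works, but the same choice can be scripted. The sketch below is not part of youtube-dl: best_audio_format is a hypothetical helper that reads the -F table on stdin and prints the format code of the audio-only row with the highest listed bitrate. It assumes the column layout shown above (total bitrate in the sixth whitespace-separated field), which may differ in other youtube-dl versions.

```shell
# Hypothetical helper: read `youtube-dl -F` output on stdin and print the
# format code of the "audio only" row with the highest total bitrate.
# Assumes the table layout shown above, where field 6 looks like "141k".
best_audio_format() {
    awk '
    /audio only/ {
        rate = $6
        sub(/k$/, "", rate)          # "141k" -> "141"
        if (rate + 0 > best_rate) {  # keep the highest bitrate seen so far
            best_rate = rate + 0
            best_code = $1
        }
    }
    END { if (best_code != "") print best_code }
    '
}

# Usage (needs network access, so it is left commented out):
# youtube-dl -f "$(youtube-dl -F https://youtu.be/IxIfymd5g5Y | best_audio_format)" \
#     -o '%(id)s.%(ext)s' https://youtu.be/IxIfymd5g5Y
```

youtube-dl also has a built-in selector, -f bestaudio, which applies its own ranking and is usually the simpler choice.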

Check details using ffprobe

$ ffprobe -i IxIfymd5g5Y.webm
ffprobe version 4.3.1 Copyright (c) 2007-2020 the FFmpeg developers
Input #0, matroska,webm, from 'IxIfymd5g5Y.webm':
  Metadata:
    encoder         : google/video-file
  Duration: 00:03:49.92, start: -0.007000, bitrate: 131 kb/s
    Stream #0:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
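
The 131 kb/s that ffprobe reports here is lower than the 160k in the format name. As far as I can tell, the container-level figure is just an average: file size divided by duration. A quick sanity check, using a hypothetical helper avg_kbps (not part of ffmpeg) and the 3.60MiB and 00:03:49.92 values from the output above:

```shell
# Hypothetical helper: average bitrate in kb/s from a size in MiB and a
# duration in seconds (1 MiB = 1048576 bytes, 1 kb = 1000 bits).
avg_kbps() {
    awk -v mib="$1" -v secs="$2" \
        'BEGIN { printf "%d\n", mib * 1048576 * 8 / secs / 1000 }'
}

avg_kbps 3.60 229.92   # 3 min 49.92 s = 229.92 s; prints 131
```

That matches ffprobe's figure, which suggests it describes the file's overall average rather than the encoder's nominal target.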

ffmpeg has been known to mess with bitrates when copying between container formats, so I’ve found it better to just use mkvextract to make sure no loss happens during the copy. The ffmpeg equivalent is left commented out for reference.

$ # ffmpeg -i IxIfymd5g5Y.webm -vn -acodec copy IxIfymd5g5Y.opus
$ mkvextract "IxIfymd5g5Y.webm" tracks 0:"IxIfymd5g5Y.opus"

Niconico

Check formats

$ youtube-dl -F https://www.nicovideo.jp/watch/sm37579910
[info] Available formats for sm37579910:
format code                extension  resolution note
h264_360p_low-aac_64kbps   mp4        640x360     364k video@ 300k, audio@ 64k
h264_360p_low-aac_192kbps  mp4        640x360     492k video@ 300k, audio@192k
h264_360p-aac_64kbps       mp4        640x360     664k video@ 600k, audio@ 64k
h264_360p-aac_192kbps      mp4        640x360     792k video@ 600k, audio@192k
h264_480p-aac_64kbps       mp4        854x480    1664k video@1600k, audio@ 64k
h264_480p-aac_192kbps      mp4        854x480    1792k video@1600k, audio@192k
h264_720p-aac_64kbps       mp4        1280x720   2064k video@2000k, audio@ 64k
h264_720p-aac_192kbps      mp4        1280x720   2192k video@2000k, audio@192k
h264_1080p-aac_64kbps      mp4        1920x1080  4064k video@4000k, audio@ 64k
h264_1080p-aac_192kbps     mp4        1920x1080  4192k video@4000k, audio@192k (best)

I’m going to choose h264_360p_low-aac_192kbps because it pairs the highest audio quality with the lowest video quality. This matters because Niconico doesn’t have any CDNs close to the US (where I am), so we’re downloading this from Japan.

We want to get the best audio while making this as fast as possible.

$ youtube-dl -f h264_360p_low-aac_192kbps https://www.nicovideo.jp/watch/sm37579910 -o '%(id)s.%(ext)s'
[download] Destination: sm37579910.mp4
[download] 100% of 14.79MiB in 00:30
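
The "highest audio, lowest video" rule can also be scripted. This is only a sketch: pick_nico_format is a hypothetical helper, not a youtube-dl feature, and it assumes the note column looks like the table above (video@ and audio@ followed by a bitrate in k). It reads the -F output on stdin and prints the format code with the highest audio bitrate, breaking ties by the lowest video bitrate.

```shell
# Hypothetical helper: choose the Niconico format with the best audio and,
# among those, the smallest video stream. Reads `youtube-dl -F` output on stdin.
pick_nico_format() {
    awk '
    /video@/ && /audio@/ {
        # Strip everything up to the marker, then everything from the "k" on.
        v = $0; sub(/.*video@ */, "", v); sub(/k.*/, "", v)
        a = $0; sub(/.*audio@ */, "", a); sub(/k.*/, "", a)
        v += 0; a += 0
        if (best == "" || a > best_a || (a == best_a && v < best_v)) {
            best = $1; best_a = a; best_v = v
        }
    }
    END { if (best != "") print best }
    '
}

# Usage (needs network access, so it is left commented out):
# youtube-dl -F https://www.nicovideo.jp/watch/sm37579910 | pick_nico_format
```

On the table above this lands on the same format I picked by hand.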

Check details using ffprobe

$ ffprobe -i sm37579910.mp4
ffprobe version 4.3.1 Copyright (c) 2007-2020 the FFmpeg developers
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'sm37579910.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
  Duration: 00:03:49.97, start: 0.000000, bitrate: 539 kb/s
    Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p(tv, bt709), 640x360 [SAR 1:1 DAR 16:9], 299 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 236 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
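
For scripting, the full ffprobe dump is more than is needed. A minimal sketch, assuming ffprobe is on PATH: audio_codec asks only for the codec name of the first audio stream, and audio_out_name builds an output file name from it. Both function names are hypothetical helpers made up for this example.

```shell
# Print only the codec name of the first audio stream (e.g. "aac" or "opus").
audio_codec() {
    ffprobe -v error -select_streams a:0 \
        -show_entries stream=codec_name \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Swap the file extension for the codec name:
# sm37579910.mp4 + aac -> sm37579910.aac
audio_out_name() {
    printf '%s.%s\n' "${1%.*}" "$2"
}

# Usage (needs ffprobe and the downloaded file, so it is left commented out):
# audio_out_name sm37579910.mp4 "$(audio_codec sm37579910.mp4)"
```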

ffmpeg has been known to mess with bitrates when copying between container formats, so I’ve found it better to just use mp4box to make sure no loss happens during the copy.

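You’ll notice that the ffmpeg command outputs an m4a file while the mp4box command outputs a raw aac file. The reason is that, for whatever reason, aac output from ffmpeg seems to lose more quality. The 2 passed to -raw is the MP4 track ID of the audio track, which shows up above as Stream #0:1 in ffprobe’s zero-based numbering.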

$ # ffmpeg -i sm37579910.mp4 -vn -acodec copy sm37579910.m4a
$ mp4box -raw 2 sm37579910.mp4 -out sm37579910.aac

Note: There is also NicoBox, an official music-only app from Niconico. It seems like Niconico does the muxing in the backend, and the app has a call which skips this step. I need to set up a MITM proxy between my phone and my PC, or decompile the app, for more info.

Future Work

  • Add Notes for SoundCloud

    I believe this involves some fun with cookies iirc

  • Add Notes for mp3tag

    I usually use the GUI tbh, but it might be fun to learn how to use the tool from the cli.

  • Run a MITM proxy on NicoBox

    Don’t get your hopes up.