Downloading certain videos of a certain university

This is a rather amusing story of my successful attempt at downloading certain videos from a certain university. I have to say it has been a rather educational experience.

The video for The Degree That Matters

Well, I call it The Degree That Matters because two of my good friends were in that ceremony. For some reason, I also happen to know a lot of people who study that degree.

The certain ceremony for this degree happened to be the first ceremony of the certain week: it took place first thing on Monday morning. The university promised to provide a live stream of the ceremony. Unfortunately, the third-party provider was underprepared. They did not expect so many people to attempt to stream the ceremony, so their server crashed. This was really disappointing, because I got up early to watch the certain stream live in the lecture theatre. A friend and I sat in the lecture theatre for 1.5 hours and saw nothing, because the lecture theatre used the same public live stream URL provided by that third party. Effectively, I got up early for nothing. However, this did lead to the decision to provide an online video recording of the live stream.

A quick inspection of the source code of the web page with the video recording did not reveal where the source video was located, so I fired up the Developer Tools of Google Chrome. I immediately realised that the full video was split into multiple segments, and that a piece of JavaScript was sending out requests for the individual video segments. It is possible to replay an HTTP request sent from the browser to the server by right-clicking the individual network event and clicking “Copy as cURL”. The copied text is a ``cURL`` command that replays the chosen HTTP request. However, I still had to download the video segment by segment, then merge the whole video.

So I seeked to the beginning and the end of the video and recorded the segment filenames. I wrote a for loop in Bash which enumerated all the ``cURL`` commands necessary for downloading every segment. I then merged the video segments together using ``ffmpeg``; the details of how I merged the video fragments together are described in the next section.
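
A loop of that kind could look roughly like the sketch below. The base URL, the `segment` prefix, the `.ts` extension and the segment numbers are all hypothetical; the real values come from the request copied with “Copy as cURL”, which also carries whatever headers or cookies the server expects.

```shell
# Hypothetical values: substitute the URL prefix and the first/last segment
# numbers observed in the browser's Network tab.
base_url="https://video.example.com/ceremony"
first=1
last=5

# Print one URL per segment: <base>/segment<first>.ts ... <base>/segment<last>.ts
segment_urls() {
    for i in $(seq "$2" "$3"); do
        printf '%s/segment%d.ts\n' "$1" "$i"
    done
}

# Dry run: print the cURL commands instead of executing them.
# Remove "echo" to download for real, and add any -H options
# from the copied command.
segment_urls "$base_url" "$first" "$last" | while read -r url; do
    echo curl --fail --silent --show-error -O "$url"
done
```

Printing the commands first makes it easy to check the segment range before anything is requested from the server.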

I have to say that in some ways it was great that the live stream failed; otherwise I would not have been able to download the video stream, and the best I could have done was a screen capture. I did not figure out how to download a live stream until Wednesday morning. My housemates had their ceremony on Tuesday morning, and all I could do for them was a screen capture.

Downloading other certain videos

Another one of my friends had her certain ceremony on Friday. After obtaining videos for two separate ceremonies, I wondered if I could take my art further. I felt the screen captures I did for my housemates were not good enough.

I thought about capturing my own network traffic, then extracting the video fragments from the network traffic dump. There are two problems with this approach:

  1. The network traffic dump will contain traffic irrelevant to video capture.
  2. The website uses HTTPS.

To solve problem 1), we use a virtual machine to achieve network isolation: the virtual machine cannot see network traffic that it did not generate. To solve problem 2), we launch our browser with the environment variable $SSLKEYLOGFILE set, so that it logs the TLS master secret.

The rest of this section details the setup of my capture environment. We assume you are running Debian Buster [1].

Setting your environment for processing the network dump

I install the following packages:

   wireshark tshark ffmpeg

We need Wireshark [2] to configure the SSL decryption settings. We need TShark [3] to extract video fragments from the HTTP packets. We need ffmpeg [4] to merge the video fragments together.

Setting up the virtual machine

I decided to use Oracle VirtualBox [5] to run my virtual machine. Again, I used Debian Buster as the guest operating system. Please make sure you have a desktop environment installed in your guest operating system, because you need the GUI to run the browser. I also installed the following extra packages:

  tcpdump chromium

We need Chromium [6] to play the certain video; it honours the $SSLKEYLOGFILE environment variable. We need tcpdump [7] to capture the network traffic.

You also need to set up a shared folder between your virtual machine and the host. Please follow the guide here [8].

Configuration for SSL decryption

Please review the information at this link [9]. It explains how to set the $SSLKEYLOGFILE environment variable so that the browser generates the Key Log File, which captures the pre-master secret. It also shows the configuration required for Wireshark / TShark to decrypt HTTPS traffic.

Please note that, in my own experience, the Firefox [10] that came with Debian refused to log the pre-master secret despite the $SSLKEYLOGFILE environment variable being set. If you insist on using a browser that does not honour $SSLKEYLOGFILE, you might want to try mitmproxy [11], which can generate its own Key Log File.

Finally, TShark does not actually read $SSLKEYLOGFILE; I configured the location of the Key Log File in Wireshark's GUI instead.
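
As a possible alternative to the GUI setting, TShark's `-o` option overrides a preference for a single run, so the Key Log File can be named on the command line. This is a sketch rather than the setup used here. The preference is called `ssl.keylog_file` in Wireshark 2.x (the series shipped with Debian Buster) and `tls.keylog_file` from 3.0 onward; `keys.log` is a placeholder file name and `keylog_pref` is a helper made up for this example.

```shell
# Hypothetical helper: pick the preference name from the TShark major version.
keylog_pref() {
    if [ "$1" -ge 3 ]; then
        echo tls.keylog_file
    else
        echo ssl.keylog_file
    fi
}

# Assumption: major version 2, as on Debian Buster; check with `tshark --version`.
major=2

# Print the resulting command rather than running it.
echo tshark -o "$(keylog_pref "$major"):keys.log" -r dump.pcap --export-objects "http,destdir"
```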

Capturing the data

In your virtual machine, launch chromium, and verify that the Key Log File is being generated. (Please note that if you are making a new capture, the old Key Log File should be deleted.)
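
A minimal sketch of that launch step, assuming a POSIX shell in the guest. The function name `launch_with_keylog` and the path `$HOME/sslkeys.log` are made up for illustration; the function deletes any old Key Log File and then starts the given command with $SSLKEYLOGFILE pointing at the fresh path.

```shell
# $1 = path of the Key Log File, remaining arguments = the browser command.
launch_with_keylog() {
    keylog=$1
    shift
    rm -f "$keylog"               # discard secrets left over from an earlier capture
    SSLKEYLOGFILE=$keylog "$@"    # the variable is set only for this command
}

# Example use inside the guest (placeholder path):
#   launch_with_keylog "$HOME/sslkeys.log" chromium &
# After loading any HTTPS page, the file should exist and be non-empty:
#   [ -s "$HOME/sslkeys.log" ] && echo "Key Log File is being written"
```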

Run the following command to start capturing the network traffic:

      sudo tcpdump -i enp0s3 -nn -s0 -vvv port 443 -w dump.pcap
      

After the video has ended, press Ctrl+C to terminate tcpdump, and close Chromium. Copy dump.pcap and the Key Log File to the host.
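
The interface name `enp0s3` in the tcpdump command above may differ on other guests. One way to look it up, sketched here on the assumption that the guest has the `ip` tool and a single default route, is to read the device of the default route. `default_iface` is a name made up for this example.

```shell
# Read `ip route` output on stdin and print the device of the default route.
default_iface() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++)
            if ($i == "dev") { print $(i + 1); exit }
    }'
}

# Example use inside the guest:
#   iface=$(ip route show default | default_iface)
#   sudo tcpdump -i "$iface" -nn -s0 -vvv port 443 -w dump.pcap
```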

Processing the network traffic dump

The network traffic dump must be processed on the host, because TShark uses a lot of memory (8GB!!!).

Run the following command to extract video segments from the HTTPS packets:

  tshark -r dump.pcap --export-objects "http,destdir"

The above command creates a new directory named destdir. I suppose you could attempt to do this in the Wireshark GUI; however, I can guarantee that you will find it extremely painful [12].

We can then merge the video fragments together using the following two commands:

  # Write the list in one go, so re-running the command does not append duplicate entries
  printf '%s\n' destdir/*.ts* | grep -v '(' | sort -V | sed 's/^/file /' > list
  ffmpeg -safe 0 -f concat -i list -c copy -bsf:a aac_adtstoasc output.mp4

The first command generates the list of video fragments to be concatenated. The grep -v step leaves out any filename containing an opening parenthesis. Note the -V option of sort: with it, the filenames are sorted in “natural sort” order, so the numbers “1 3 10 2” are sorted into “1 2 3 10” rather than “1 10 2 3”. Without it, sort compares text character by character.
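
The difference between the two orderings can be seen with a few made-up segment names (the `seg*.ts` names are hypothetical; `sort -V` is assumed to be the GNU coreutils version found on Debian):

```shell
# Plain sort compares character by character, so seg10.ts lands before seg2.ts.
printf '%s\n' seg1.ts seg3.ts seg10.ts seg2.ts | LC_ALL=C sort

# Version sort compares the embedded numbers as numbers.
printf '%s\n' seg1.ts seg3.ts seg10.ts seg2.ts | sort -V
```

The first pipeline prints seg1.ts, seg10.ts, seg2.ts, seg3.ts; the second prints seg1.ts, seg2.ts, seg3.ts, seg10.ts, which is the order the segments need to be in for concatenation.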

Ethical Statement

Jura V jnf na haqretenq, V jnf gbyq gung V unir gb jevgr rguvpny fgngrzrag sbe zl svany lrne cebwrpg, bgurejvfr V jbhyq ybfr znexf. Fb V guvax V cebonoyl fubhyq nqq na rguvpny fgngrzrag sbe guvf jro cntr.

Dhvgr senaxyl, V qba'g guvax vg vf snve gb punetr arj tenqhngrf gjragl cbhaqf sbe gur QIQ irefvba bs gur prerzbal, be gjragl-svir cbhaqf sbe gur UQ irefvba bs gur prerzbal ba n HFO zrzbel fgvpx. N ybg bs havirefvgl cebivqrf guvf xvaq bs ivqrb sbe serr, sbe rknzcyr Havirefvgl bs Lbex. Ehzbhe fnlf Havirefvgl bs Ongu nyfb cebivqrf vg sbe serr sbe gurve tenqhngrf.

Bar bs gur zber nzhfvat pbairefngvba V unq jnf jvgu n TC. Ur nfxrq zr jul V qba'g punetr svsgrra cbhaqf sbe gurfr ivqrbf, zl ercyl gb uvz jnf gung “V ungr pncvgnyvfz”. 1)V qb unccra gb oryvrir punetvat fb zhpu sbe gur tenqhngvba ivqrb vf rkcybvgngvir. Naq lrf, ba guvf bppnfvba, V jbhyq yvxr gb qrabhapr pncvgnyvfz. YBAT YVIR PBZZHAVFZ!!!

Naljnl, gur crbcyr jub qb Gur Qrterr Gung Znggref unir gb qb n FWG. V qrsvavgryl jbhyq abg cnff gur PF rdhvinyrag bs FWG.

Other notes

I have no idea why the certain ceremony video file for The Degree That Matters is bigger than the others (2.7GB vs 1.1GB). I don't know if it actually has more entropy than the other certain ceremony videos, or if whoever made it used a lower compression setting. To be fair, the standard variant of The Degree That Matters takes 5 years, compared to 3 years for a normal degree. Perhaps the file size reflects that.

1)
Jryy, V jvyy qrsvavgryl trg zlfrys vagb gebhoyr vs V fgneg fryyvat gurfr ivqrbf ng n purncre cevpr! V znl or qhzo, ohg V'z abg fghcvq.
  • public/downloading_certain_videos_of_a_certain_university.txt
  • Last modified: 2019/07/25 17:54
  • by fangfufu