As lazy afternoons often go, I found myself searching for a movie on a reputable streaming service with a red logo. Being out of luck there, I ventured into the darker “free (totally legit) streaming” websites. That is where I stumbled upon a very amateur-looking website that seemed to churn out bitrates faster than you could hand out candy bars in the local school playground.
Given my background as a programmer, I was miles away from expecting this kind of performance from this small and cheap website. Therefore, I decided to peek behind the scenes of this dollar store content farm. It actually turned out pretty dope!
To find out how a website is built, I like to start by looking at the XHR requests; they give you an overview of what content is hosted and where.
After a few requests, I noticed a weird pattern: there were multiple requests to some Google servers!
That really bugged me.
Doesn’t Google care about intellectual property and copyright? Is that not the sole reason behind the Content ID system for YouTube?
To answer these questions, we need to understand how the site is streaming to your PC.
How it’s made
I will try to explain it in the same order that I discovered it:
First, I saw a request of type:
Being a web programmer, I wasn’t really familiar with the concepts behind the m3u8 file format. In short, Wikipedia explains it as: a text file indexing the locations of all the video chunks.
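To make that concrete, here is a minimal sketch in Python of what reading such an index boils down to. The playlist below is made up (the host cdn.example.com and the chunk names are placeholders, not what the site served), and the parser ignores everything a real player would also interpret, such as durations or encryption tags.

```python
from __future__ import annotations


def segment_uris(playlist_text: str) -> list[str]:
    """Return the chunk locations listed in an HLS (.m3u8) media playlist."""
    uris = []
    for raw_line in playlist_text.splitlines():
        line = raw_line.strip()
        # Lines starting with '#' are tags or comments; blank lines carry nothing.
        if not line or line.startswith("#"):
            continue
        uris.append(line)
    return uris


# Hypothetical playlist: the host name and file names are invented for the example.
EXAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
https://cdn.example.com/video/chunk0.ts
#EXTINF:10.0,
https://cdn.example.com/video/chunk1.ts
#EXT-X-ENDLIST
"""

if __name__ == "__main__":
    for uri in segment_uris(EXAMPLE_PLAYLIST):
        print(uri)
```

Every line that is not a tag is simply the address of one chunk, which is all the player needs to know where to fetch the next piece of video.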
At that point you might think:
WTF, what chunks? What location? Why is that so complicated?
If you display a picture on a website, you need it to be completely displayed as soon as the website is loaded. Video content bypasses this restriction by buffering a few minutes of content and waiting for the user to consume it before requesting more.
This saves bandwidth when the user doesn’t watch the whole video, and the website loads faster.
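As a rough illustration of that buffering idea, here is a small Python sketch that decides which chunks a player would request next. The function name, the 120-second target and the assumption that chunks are always downloaded whole are mine; real players use more elaborate logic.

```python
def next_segments_to_fetch(durations, position, buffered_until, target_buffer=120.0):
    """Pick the chunk indices to request so that about `target_buffer`
    seconds of video lie ahead of the playhead.

    durations      -- length in seconds of each chunk, in playback order
    position       -- current playback time in seconds
    buffered_until -- time in seconds up to which chunks are already downloaded
    """
    wanted_until = position + target_buffer
    to_fetch = []
    start = 0.0
    for index, duration in enumerate(durations):
        end = start + duration
        # Needed if the chunk is not fully buffered yet and it starts
        # before the point we want to have buffered.
        if end > buffered_until and start < wanted_until:
            to_fetch.append(index)
        start = end
    return to_fetch
```

With thirty 10-second chunks, a player at the very start would ask for the first twelve chunks and leave the rest on the server until the viewer gets closer to them.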
So, we’ve seen that it makes sense to split a video into chunks in order to send smaller packets to the user. Now let’s get to their location, because that’s more interesting than it sounds.
If someone asks me to design a system to send over chunks of video, I would probably end up with the following link architecture:
//or something like this
Which is completely fine if you rely on your own architecture and servers.
BUT, pirates do not, and here is some context to explain why.
Infrastructure is complex and very expensive
Getting a $10 VPS from DigitalOcean is pretty easy; hosting videos of your cat for your family on it is doable, albeit arguably annoying. Try to send the same cat video a thousand times simultaneously all over the world, reliably, and your system will need some more meatballs on its pasta to handle that load.
Pirates have the same problem: they need to reach and entertain as many people as possible but they eventually encounter 3 problems:
- Infrastructure for video is expensive: you need lots of storage, lots of servers spread around the globe, lots of redundancy, compression (and therefore CPU) power, bandwidth management and so much more…
- Infrastructure for video is complex: building a good, available video streaming system requires pretty sharp ops (engineers specialised in keeping the systems running at all times, even under heavy traffic) and programming skills, which are pretty rare and expensive on the market.
- A legal issue: if you start hosting illegal content on your servers, you will get strongly worded letters from the rights holders and their lawyers…
Just host pictures on Drive
The secretly repressed pirate apprentice in you should be pretty desperate by now and think that they will never become the new Kim Dotcom, because we haven’t explained how to scale your infrastructure. Easy enough if you have cash to burn, but if you don’t, Google will deliver on that promise for next to nothing: they have a horde of distributed servers, they are experts at what they are doing, they are never down and they can handle any volume. You just need to be decently smart (at least a bit smarter than their robots for the moment).
What I found was that the content available at the previous URL (https://lh3.googleusercontent.com/d/xxxxxxxxx) is a 1x1 pixel PNG that weighs 4.8MB (it’s a huge pixel!).
I was a bit confused, since I don’t work in the cyber security field, but a friend of mine (cheers Axel Soll) told me: “now that’s getting interesting!”.
So let’s just open the (decoded) binary:
At first, we see the .PNG magic code, which signals the start of the image, and the IEND marker, which indicates that the next 4 bytes will be the CRC, after which you would expect the image file to end. But it doesn’t, and what comes after contains video-related keywords like
What happens if I remove the PNG part of the binary?
Let’s do it:
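A minimal Python sketch of that removal, assuming the downloaded bytes start with a regular PNG whose chunks can be walked one by one until IEND. The function names are mine, and the CRCs are skipped over rather than verified.

```python
from __future__ import annotations

import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the 8 bytes every PNG file starts with


def png_end_offset(data: bytes) -> int:
    """Return the offset of the first byte after the PNG's IEND chunk."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("data does not start with the PNG signature")
    offset = len(PNG_SIGNATURE)
    # A chunk is: 4-byte big-endian length, 4-byte type,
    # `length` data bytes, then a 4-byte CRC (not checked here).
    while offset + 12 <= len(data):
        (length,) = struct.unpack(">I", data[offset:offset + 4])
        chunk_type = data[offset + 4:offset + 8]
        offset += 12 + length
        if offset > len(data):
            break  # the chunk claims more bytes than the file holds
        if chunk_type == b"IEND":
            return offset
    raise ValueError("no complete IEND chunk found")


def split_png_and_payload(data: bytes) -> tuple[bytes, bytes]:
    """Split a blob into the leading PNG and whatever bytes follow it."""
    end = png_end_offset(data)
    return data[:end], data[end:]
```

Reading the downloaded file as bytes, calling `split_png_and_payload` on it and writing the second part to a new file gives you the bytes that follow the image, without the pixel in front.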
And push the Frankenstein file to VLC:
Let’s summarize this
Getting around as an illegal streaming host is becoming so complex that it has become simpler to hide the content on third-party servers of the internet giants (I found the same behaviour on VK servers without the PNG trick), which offer great infrastructure at a reasonable price.
The recipe is pretty straightforward:
1. Create an HLS stream from your MP4 source video. This means: take the original video, cut it into ~300 chunks of 5MB and create an index for these files.
I found an example of this command here:
ffmpeg -i input.mp4 -profile:v baseline -level 3.0 -s 640x360 -start_number 0 -hls_time 10 -hls_list_size 0 -f hls index.m3u8
2. For each chunk, add this 1x1 pixel before the actual data in order to hide the real content. Congrats, the video chunk has been transformed into an image! 🎉
3. Upload every “image” to Google servers. I guess you could send it via the Google Drive API but I’m not sure of that.
4. Update your m3u8 index file with the link at which the Google API saved each chunk. This is why I highlighted that working with unpredictable chunk file locations was more interesting than it sounded.
5. Host the m3u8 text file anywhere and link the video player to it.
I find it fascinating how this cat and mouse game is evolving. *Grabs Popcorn*
BUT, I have to express my disappointment.
Yes, it’s a smart move to hide the video payload behind a PNG, but you can do so much more as a mouse! 🐭
I did some tests with encryption and embedded the payload as the image content itself, resulting in cooler scrambled pictures. I will try to write an article about all that if you find this interesting.
On the other hand, I also wrote some algorithms to detect this suspicious behaviour, and I am preparing an article on that as well. That will be the 🐱 part!
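To give a flavour of what the cat side could look like, here is one very naive check in Python. It is not the algorithm I just mentioned, only a sketch built on the observation that a 1x1 pixel has no business weighing 4.8MB. The function names and the 64 KB allowance for metadata are assumptions, and a legitimate PNG with unusually large metadata would trigger a false alarm.

```python
from __future__ import annotations

import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def declared_png_size(data: bytes) -> tuple[int, int]:
    """Read width and height from the IHDR chunk, the first chunk of a PNG."""
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE) or data[12:16] != b"IHDR":
        raise ValueError("not a PNG with a leading IHDR chunk")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def looks_oversized(data: bytes, slack_bytes: int = 64 * 1024) -> bool:
    """Flag files far larger than their declared pixels could ever need."""
    width, height = declared_png_size(data)
    # Largest raw PNG pixel is 8 bytes (16-bit RGBA), plus one filter byte per row.
    max_raw_bytes = width * height * 8 + height
    # slack_bytes is an arbitrary allowance for headers and metadata chunks.
    return len(data) > max_raw_bytes + slack_bytes
```

A file that declares 1x1 pixels can need at most 9 raw bytes under this bound, so anything in the megabyte range gets flagged immediately.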
I’d like to give some kudos to Sibelius Seraphini, who took the time to review the article, as well as my “Tech Crew” (Matthieu Bulté, Axel Soll), who reviewed the article and gave me ideas and motivation to write the next two articles, which will be a bit more technical.
If you liked the article, you can follow me on Twitter