

The Tale of HTTPDirFS

HTTPDirFS is a FUSE filesystem which allows you to mount an HTTP directory listing. It has a very interesting beginning.

The story starts with a conversation with the admin of the-eye.eu, which is a website containing a lot of questionable content in terms of copyright. Below is the chat log copied from Discord.

fangfufu 07/15/2018
could you enable webdav please? lol
i know i asked this before and got denied
httpfs2 doesn't work
i want to browse your collection
i don't want to have to download everything
or install the server side script for this? https://github.com/cyrus-and/httpfs
GitHub
cyrus-and/httpfs
httpfs - Remote FUSE filesystem via server-side script

-Archivist 07/15/2018
nope
why do you have to be the one in millions that doesn't want to view the site like a normal person

fangfufu 07/15/2018
is there any ways to "stream" the website?
well because i don't want to download the whole website
it would be nice if i can mount it locally

-Archivist 07/15/2018
it's a website, its provided as is, we already do enough extra shit

fangfufu 07/15/2018
ok nvm then, oh well :frowning:

I thought it would be funny to actually write software that allows me to mount an HTTP directory listing locally, and throw that in the Archivist's face. The project turned out to be fairly difficult, mainly because libfuse is multithreaded, and I got about 40% in the concurrency coursework in Principles of Programming Languages, back when I was an undergraduate at York. I am reliably told by a computer science postdoc that nobody likes dealing with race conditions.

Obviously, I wrote in the README of the project that I dedicated the project to the Archivist, and people on Reddit found it funny.

It is kind of crazy how far this project has come - this software is now available in Debian. It was interesting enough to attract a Debian Developer who packaged and uploaded it.

Finally, researchers in Germany have decided to incorporate HTTPDirFS in their research software framework, for importing data. Their publication record so far suggests that the project is primarily used for biomedical research.

I really don't know what to feel or what to say about this one - this project was originally designed to annoy someone on the Internet. It was not meant to be useful or helpful. It feels really strange that some researchers on the Internet are taking it seriously. Because I am in the UEA Triathlon Club, I have a lot of friends who study medicine, and I do enjoy being around them. But I find it highly weird that HTTPDirFS has somehow wound up helping out with biomedical research - when will people who are somehow related to medicine leave me alone? (Only joking of course!) My dad does biomedical research, so I suppose it feels great to indirectly contribute to the field. :-)

So overall, I am not sure if this project has been a success or a failure, in the sense of whether it fulfilled its original purpose. I am not sure if the Archivist is annoyed. However, I believe I have provided ample entertainment for the Redditors in his own subreddit.

What is certain is that I am really proud of this project - it feels great that people on the Internet take your toy project seriously, especially when it wasn't meant to be serious at all. Using badly learnt knowledge from my undergraduate days in real life brought me great satisfaction. Thank you for teaching me about concurrency, Professor Alan Burns.

  • Last modified: 2019/04/30 09:53
  • by fangfufu