I recently created Full Hacker News, a website that displays the full content of all Hacker News articles on a single page. I created it because I needed a simple way to read interesting content during my daily commute. I live in Paris, and there’s no Internet connection in the metro. With Full Hacker News, I can load the page before going underground and read it while offline. I wanted something very simple, and here are some of the design decisions I made in the process.
Use good old HTML
Web performance
- create a bucket
- configure it to serve files
- you can even use a nice domain name if the bucket is named after it and you create a CNAME pointing to the bucket's website endpoint (in my case, www is a CNAME for www.fullhn.com.s3-website-eu-west-1.amazonaws.com)
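The naming rule behind that CNAME can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and it assumes the region uses the dash-style `s3-website-<region>` endpoint form, as eu-west-1 does (some regions use a dot instead).

```python
def website_endpoint(bucket, region):
    """Return the S3 static-website hostname for a bucket."""
    # Assumption: dash-style endpoint ("s3-website-<region>"), as in eu-west-1.
    # Some regions use "s3-website.<region>" instead.
    return f"{bucket}.s3-website-{region}.amazonaws.com"


def cname_record(hostname, region):
    """Return (record name, record target) for a bucket named after the hostname."""
    # The bucket carries the same name as the hostname it serves.
    return hostname, website_endpoint(bucket=hostname, region=region)


name, target = cname_record("www.fullhn.com", "eu-west-1")
print(f"{name} CNAME {target}")
```

The bucket name and the hostname are deliberately the same value here, which is what lets S3 find the right bucket from the request's host name.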
RaspberryPi + Amazon S3
As S3 only serves static files, another machine has to do the job of generating those files. In my case, this job is done by a RaspberryPi, running at home on my fiber Internet connection.
The RaspberryPi seems to be appropriate for this job:
- it runs Debian and all the packages I needed were available (PHP)
- it’s small, silent and consumes little power. I can plug it into my router and leave it alone.
The RaspberryPi is powerful enough to download a few pages, process them and merge them into a single HTML document. The task runs every 10 minutes, which leaves enough time for each run to finish. When it’s done, the RaspberryPi uploads the document to S3.
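To make the merge step concrete, here is a minimal sketch in Python. It is not the app's actual code (which runs on PHP): the function name, the tuple layout and the markup are assumptions, and downloading and extracting the articles is left out.

```python
from html import escape


def merge_articles(articles):
    """Merge (title, url, body_html) tuples into one standalone HTML page."""
    sections = []
    for index, (title, url, body_html) in enumerate(articles, start=1):
        # Titles and URLs are escaped; body_html is assumed to be clean HTML already.
        sections.append(
            f'<article id="article-{index}">\n'
            f'<h1><a href="{escape(url, quote=True)}">{escape(title)}</a></h1>\n'
            f"{body_html}\n"
            "</article>"
        )
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>Full Hacker News</title></head>\n'
        "<body>\n" + "\n".join(sections) + "\n</body></html>\n"
    )


page = merge_articles([
    ("First story", "http://example.com/1", "<p>Body of the first story.</p>"),
    ("Second & last", "http://example.com/2", "<p>Body of the second story.</p>"),
])
```

The result is one string that can be written to a file and uploaded as a single object, so the browser only needs one request to have everything available offline.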
The app uses a filesystem cache to avoid unnecessary computations. However, it might cause problems in the long term. The cache never expires, so it will probably fill up the available space eventually. The many writes will also probably wear out the SD card.
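One way to keep such a cache from growing forever is to delete old entries periodically. The sketch below is a hypothetical Python illustration, not the app's code: the `FileCache` class, its method names and the one-day limit are all assumptions. It only addresses disk usage, not SD card wear.

```python
import hashlib
import os
import tempfile
import time


class FileCache:
    """Tiny filesystem cache keyed by URL (hypothetical, for illustration)."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, url):
        # Hash the URL so any URL maps to a safe, fixed-length file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest)

    def get(self, url):
        try:
            with open(self._path(url), encoding="utf-8") as handle:
                return handle.read()
        except FileNotFoundError:
            return None

    def set(self, url, content):
        with open(self._path(url), "w", encoding="utf-8") as handle:
            handle.write(content)

    def prune(self, max_age_seconds, now=None):
        """Delete entries older than max_age_seconds; return how many were removed."""
        now = time.time() if now is None else now
        removed = 0
        for name in os.listdir(self.directory):
            path = os.path.join(self.directory, name)
            if now - os.path.getmtime(path) > max_age_seconds:
                os.remove(path)
                removed += 1
        return removed


with tempfile.TemporaryDirectory() as tmp:
    cache = FileCache(tmp)
    cache.set("http://example.com/1", "<p>cached</p>")
    hit = cache.get("http://example.com/1")
    miss = cache.get("http://example.com/2")
    # Pretend two days have passed, then drop anything older than one day.
    removed = cache.prune(max_age_seconds=86400, now=time.time() + 2 * 86400)
```

Calling `prune` at the start of each run would bound the cache to roughly the articles seen during the chosen time window.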
Fortunately, if the RaspberryPi goes down, I can easily git clone the code on another machine, add my S3 credentials and run the app again.
Cost
The cost of Amazon services is always hard to predict. I’ll see in a month how much the whole setup costs. If it’s too expensive, I might move away from S3 and migrate to cheap shared hosting with a fixed price. All I need is to serve static files, so it shouldn’t be too difficult to find.
One reply on “Building Full Hacker News”
Interesting project, I like it.
S3 doesn’t have any logic at all, so it can’t gzip things on the fly. What you can do, though, is gzip your files before uploading them and add the appropriate metadata on S3: “Content-Type: text/javascript” or “Content-Type: text/css” and “Content-Encoding: gzip”.
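A minimal sketch of that pre-compression step, in Python with only the standard library. The helper name is hypothetical, and the actual upload call is left out since it depends on the S3 client being used.

```python
import gzip


def prepare_gzipped_upload(body, content_type):
    """Return (compressed_body, metadata) ready to hand to an S3 client."""
    compressed = gzip.compress(body)
    metadata = {
        "Content-Type": content_type,
        # Tells the browser to decompress the body transparently.
        "Content-Encoding": "gzip",
    }
    return compressed, metadata


html = b"<html><body>" + b"<p>Hello Hacker News</p>" * 200 + b"</body></html>"
compressed, metadata = prepare_gzipped_upload(html, "text/html; charset=utf-8")
print(len(html), "bytes before,", len(compressed), "bytes after")
```

With boto3, for instance, the two metadata values would be passed as the `ContentType` and `ContentEncoding` arguments of `put_object`. Note that S3 then always serves the gzipped bytes, even to clients that did not ask for gzip.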
From my experience, S3 is very cheap. The most expensive thing is the fee my bank charges to pay in US currency (2-3€).