Building Full Hacker News

I recently created Full Hacker News, a website that displays the full content of all Hacker News articles on a single page. I created it because I needed a simple way to read interesting content during my daily commute. I live in Paris, and there’s no Internet connection in the metro. With Full Hacker News, I can load the page before going underground and read it offline. I wanted something very simple, and here are some of the design decisions I made along the way.

Use good old HTML

I wanted to be able to use the app even with a spotty 3G connection. The navigation is very simple: the document starts with a table of contents whose entries are anchors to the corresponding articles. To go back to the table of contents, simply tap the status bar on an iPhone, which scrolls the page to the top. It’s basic, but it works even if the JavaScript or the stylesheets fail to load.
When the browser manages to load the JavaScript files, the app offers better navigation, with “Index”, “Previous” and “Next” buttons that let you skip through the articles quickly.
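
To make the anchor-based navigation concrete, here is a minimal sketch in Python of how such a page could be assembled. It is not the code that runs fullhn.com: the function name `build_page`, the `article-N` id scheme and the exact markup are assumptions chosen for illustration.

```python
from html import escape


def build_page(articles):
    """Return one HTML document: a table of contents, then every article.

    `articles` is a list of (title, body_html) pairs. The body is assumed
    to be HTML that has already been cleaned up, so it is inserted as-is.
    """
    toc = []
    sections = []
    for number, (title, body_html) in enumerate(articles, start=1):
        anchor = f"article-{number}"  # hypothetical id scheme
        # One table-of-contents entry per article, linking to its anchor.
        toc.append(f'<li><a href="#{anchor}">{escape(title)}</a></li>')
        # The article itself carries the matching id.
        sections.append(
            f'<article id="{anchor}"><h2>{escape(title)}</h2>{body_html}</article>'
        )
    return (
        "<!DOCTYPE html>\n<html><body>\n"
        '<ol id="index">\n' + "\n".join(toc) + "\n</ol>\n"
        + "\n".join(sections)
        + "\n</body></html>\n"
    )
```

Because the links are plain `href="#..."` anchors, this navigation needs neither scripts nor stylesheets. Buttons such as “Previous” and “Next” can then be layered on top by a script that looks up the same ids.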

Web performance

As I mainly use the site from my mobile phone, I needed it to be fast. One way to achieve that is to serve only static files, and that’s exactly what I did. The whole site is a single static HTML page that is regenerated every 10 minutes. Another benefit of this decision is that the whole website can be served by Amazon S3.
Serving files with Amazon S3 is really easy:
  • create a bucket
  • configure it to serve files
  • you can even have a nice domain name if you create a CNAME that matches the bucket name (in my case www is a CNAME for http://www.fullhn.com.s3-website-eu-west-1.amazonaws.com)
All files are served from Amazon’s servers in Ireland (the closest location to France), except jQuery, which is served from Google’s CDN. I noticed that S3 doesn’t gzip transfers, which is a pity. I tried minifying the HTML, but the difference was too small to matter.
I haven’t optimized the CSS and JS files yet, because the app is still usable if they fail to load.
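
On the missing gzip support: S3 serves back exactly the bytes it was given, so one workaround (also suggested in the reply at the end of this post) is to compress each file before uploading it and to label it with `Content-Encoding: gzip`. The sketch below, in Python, only prepares the compressed body and the headers. The helper name `prepare_for_upload` is hypothetical, and the upload call itself is left out because it depends on the S3 client in use.

```python
import gzip
import mimetypes


def prepare_for_upload(filename, data):
    """Compress `data` and return (body, headers) for a static-host upload."""
    # Guess the media type from the file extension, e.g. text/html for .html.
    content_type, _ = mimetypes.guess_type(filename)
    headers = {
        "Content-Type": content_type or "application/octet-stream",
        # Tells the browser to decompress the body before using it.
        "Content-Encoding": "gzip",
    }
    return gzip.compress(data), headers
```

One trade-off to keep in mind: the file is stored compressed, so every client receives the gzip version, including the rare ones that did not ask for it.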

RaspberryPi + Amazon S3

The RaspberryPi that runs fullhn.com, next to my Airport Express

As S3 only serves static files, another machine has to do the job of generating those files. In my case, this job is done by a RaspberryPi, running at home on my fiber Internet connection.

The RaspberryPi seems to be appropriate for this job:

  • it runs Debian, and all the packages I needed were available (PHP)
  • it’s small, silent and consumes little power. I can plug it into my router and leave it alone.

The RaspberryPi is powerful enough to download a few pages, process them and merge them into a single HTML document. The task runs every 10 minutes, which leaves enough time for each run to finish. When it’s done, the RaspberryPi uploads the document to S3.

The app uses a filesystem cache to avoid unnecessary computations. However, it might cause problems in the long term. The cache never expires and will probably fill up the available space. The many writes will probably also wear out the SD card.
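
One possible way to keep such a cache from growing forever is to delete entries that have not been written for a while. The Python sketch below illustrates the idea. It is not what the app currently does, and the one-week retention, the SHA-1 file naming and the function names are all assumptions.

```python
import hashlib
import os
import time

MAX_AGE_SECONDS = 7 * 24 * 3600  # assumed retention: one week


def cache_path(cache_dir, url):
    """Map a URL to a stable file name inside the cache directory."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest)


def prune_cache(cache_dir, max_age=MAX_AGE_SECONDS, now=None):
    """Delete cache files older than `max_age` seconds; return how many were removed."""
    now = time.time() if now is None else now
    removed = 0
    # Take a snapshot of the directory first, then delete from the snapshot.
    for entry in list(os.scandir(cache_dir)):
        if entry.is_file() and now - entry.stat().st_mtime > max_age:
            os.remove(entry.path)
            removed += 1
    return removed
```

Calling `prune_cache` once per run bounds the disk usage. It does not reduce the number of writes, so it would not help with SD card wear.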

Fortunately, if the RaspberryPi goes down, I can easily git clone the code onto another machine, add my S3 credentials and run the app again.

Cost

The cost of Amazon services is always hard to predict. I’ll see in a month how much the whole setup costs. If it’s too expensive, I might move away from S3 and migrate to cheap shared hosting with a fixed price. All I need is to serve static files, so a suitable host shouldn’t be too difficult to find.

Conclusion

So far, I’m happy with what I built. It fills my need for interesting stuff to read offline. It seems to be efficient and runs well. Let’s see if it lasts. If you like it, feel free to use the website, but don’t expect it to work reliably. You can also get the code from GitHub and run your own.

One reply on “Building Full Hacker News”

Interesting project, I like it.

S3 doesn’t have any logic at all, so it can’t gzip things on the fly. What you can do, though, is gzip your files before uploading them and add the appropriate metadata on S3: “Content-Type: text/javascript” or “Content-Type: text/css”, and “Content-Encoding: gzip”.

In my experience, S3 is very cheap. The most expensive part is the fee my bank charges for paying in US currency (2-3€).
