23 Jul 14


A few months ago a friend of mine (Mike Miller - CSO of Cloudant) had a great idea for a project. Cloudant does a lot of outreach via Hacker News, so Mike wanted to be able to analyze the performance of their posts over time - kind of like you would do in Google Analytics. I had just re-learned how to code, finished my first project, and was looking for something to work on, so it was a good match.

The idea was simple - scrape HN’s front page to get time series data. Store that data in a Cloudant database (with a public REST interface). Build a few visualizations to kick-start the project. Then turn it over to the Hacker News community and see what cool stuff people come up with.

The result is hind-cite.com. It’s got a lot of deficiencies and it only scratches the surface of what can be done with the data, but I think it’s a good start.

I’d like to thank Cloudant for sponsoring the project and hosting the data, and Mike for the initial idea, impetus, and feedback along the way.

Open Source

Assuming that people care, I think hind-cite is just a beginning. I’m a novice, self-taught programmer, so I could use more help than most. Specifically, here are some things I’d particularly like help with:

  • Visual design & CSS - this is UGLY. If someone wants to redo the look and feel, that would be great. But even just some good CSS would help and I’d be happy to implement it.
  • Code review - I tried my best to write good code for this project, but I’m new at this and I’m the only one who has looked at the code. So I’d love some feedback (but be nice - if I did something stupid or bad, I just don’t know better). Pull requests are great, but even just constructive feedback with guidance would be appreciated.
  • General usability feedback - The main post page has a fairly complicated set of controls. I thought they were necessary, but I recognize they could also be confusing. Let me know if you have suggestions for improvement. (I suspect some layout / design work could help this a lot).
  • New charts - This is where it gets exciting. Have at the data, do some cool visualizations, and we’ll add them to the site.
  • General analysis and blog posts - The data is yours. I can imagine all sorts of cool analyses and blog posts that come out of this. Go for it.

Repositories




I wanted to have two different blogs on my site (one for business topics and one focusing on development). Jekyll provides categories to make this work, but figuring out exactly how to move through posts in a given category was non-trivial.

A few key points:

  • Looping over an object with for gives you an array for each item, where the first element is the name. So you generally need to treat that element differently.
  • { % for category in site.categories % } gives you: [[cat0name, cat0posts], [cat1name, cat1posts]…] (or something like that!)
  • { % for catposts in category % } steps through one of those pairs: first the category name, then the array of all the posts in the category: [post1, post2, post3]
  • { % for catpost in catposts % } gives you the actual post objects, so you can do things like output catpost.title.
  • Note: Above I have an extra space between the { and % - that’s because otherwise Jekyll tries to process the Liquid tag! (Below I put my code in a { % raw % } … { % endraw % } block.)
{% for category in site.categories %}
    {% if category[0] == page.rcategory %}
        {% for catposts in category %}
            {% if forloop.first %}
                {% continue %}
            {% endif %}
            {% for catpost in catposts %}

                <p>{{ catpost.title }}</p>

            {% endfor %}
        {% endfor %}
    {% endif %}
{% endfor %}
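
If you only need the posts for a single category, a shorter variant may also work. This is a sketch rather than what the site actually uses: it assumes page.rcategory (the same front-matter variable as in the code above) holds a category name, and it looks that category up directly instead of looping over every category and skipping the name element. The variable name catposts is reused here purely for illustration.

```liquid
{% comment %} Assumption: page.rcategory is set in the page's front matter {% endcomment %}
{% assign catposts = site.categories[page.rcategory] %}
{% for catpost in catposts %}

    <p>{{ catpost.title }}</p>

{% endfor %}
```

If page.rcategory is missing or does not match any category, the lookup returns nothing and the loop simply produces no output.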

Other Jekyll tidbits

  • jekyll serve --watch: If you change your _config.yml, be sure to restart jekyll serve --watch (the watcher does not reload _config.yml) - I can’t tell you how many times I wasted minutes tracking down fake bugs because I forgot!
  • ‘content’ vs. ‘page.content’: Uggh - this was a big pain. In a layout file use { { content } } and not { { page.content | markdownify } }. If you do the latter, it will mostly work, but you’ll get strange errors, like { % highlight % } blocks not working. Like most bugs, this one was obvious - ex post!

    Content - In layout files, the rendered content of the Post or Page being wrapped. Not defined in Post or Page files.

  • pipe (|): Weird - a pipe causes problems in the markdown. Type &#124; or \| instead.
  • {% .. %}: These are interpreted as Liquid tags. So either put a space between the { and the % or wrap the text in { % raw % } … { % endraw % }.
  • GitHub Jekyll: If you are going to host your Jekyll blog on GitHub Pages, you need to be using the same version of Jekyll as GitHub to make sure what you are testing locally is what they are displaying. Follow the instructions here: Using Jekyll with Github Pages
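
To illustrate the ‘content’ tidbit above, here is a minimal sketch of a layout file. The HTML skeleton and the use of page.title are assumptions for illustration only; the part that matters is that the wrapped post or page is pulled in with the content variable, which Jekyll has already rendered by the time the layout runs.

```liquid
<!DOCTYPE html>
<html>
  <head>
    <!-- page.title is assumed to be set in the post's front matter -->
    <title>{{ page.title }}</title>
  </head>
  <body>
    <h1>{{ page.title }}</h1>

    <!-- The rendered post or page goes here; no markdownify filter needed -->
    {{ content }}
  </body>
</html>
```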




23 Jul 14