The hackeRnews package was created to simplify
getting data from Hacker News. Hacker News is
a user-generated content website that focuses on stories related to
computer science. The website consists of user-submitted stories,
each of which links to its original source. Users
can upvote a story they find
interesting, and each story has a comment section where users can
discuss the subject. Besides news stories, Hacker News
has sections for ask, show, and job stories.
The Hacker News API is described in its official documentation. The API serves data
in JSON format. The hackeRnews package lets you retrieve
this data in the form of convenient R objects. Each object (story, comment,
…) has a unique id and can be retrieved using this id. The API also
provides a way to fetch up to 500 top and new stories, as well as the latest best
stories, ask stories, show stories, and job stories.
Examples of using the hackeRnews package to retrieve
data from the official Hacker News API are presented below:
To fetch best/new/top stories, use the corresponding
get_*_stories function. Each function takes one optional
argument, max_items, that limits the number of returned
stories.
For example, to fetch the 5 best stories:
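A minimal sketch of such a call, assuming the hackeRnews package is installed and the Hacker News API is reachable. Here get_best_stories is taken as one instance of the get_*_stories pattern, and the variable name is only illustrative:

```r
library(hackeRnews)

# Fetch stories from the "best" list; max_items caps the result at 5.
best_stories <- get_best_stories(max_items = 5)

# Number of stories actually returned (at most 5).
length(best_stories)
```

Swapping in get_top_stories or get_new_stories should follow the same pattern.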
The raw ids of best/new/top stories can also be fetched on their own
with the get_*_stories_ids functions.
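As a sketch under the same assumptions (package installed, API reachable), the ids of the top stories could be fetched like this. get_top_stories_ids is taken as one instance of the get_*_stories_ids pattern, and the variable name is illustrative:

```r
library(hackeRnews)

# Fetch only the ids of the current top stories, without the story content.
top_stories_ids <- get_top_stories_ids()

# Inspect the first few ids.
head(top_stories_ids)
```

Fetching only ids is useful when the stories themselves will be retrieved later or selectively.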
Ask, show, and job stories work similarly to news stories. The get_latest_*_stories
functions return the latest * stories, and the get_latest_*_stories_ids
functions return the ids of the latest * stories.
For example, to fetch the 3 latest ask stories:
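A minimal sketch, assuming get_latest_ask_stories accepts the same optional max_items argument as the news story functions (the variable name is illustrative):

```r
library(hackeRnews)

# Fetch the latest ask stories, limited to 3 results.
latest_ask_stories <- get_latest_ask_stories(max_items = 3)

# Number of stories actually returned (at most 3).
length(latest_ask_stories)
```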
To fetch data about user ‘jl’ just use the
get_user_by_username function:
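A minimal sketch of the lookup, assuming the package is installed and the API is reachable. The structure of the returned object is not described here, so the example only prints it:

```r
library(hackeRnews)

# Look up the Hacker News user with the username "jl".
user <- get_user_by_username("jl")

# Show whatever the package returns for this user.
print(user)
```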
It’s possible to iterate over the latest items by fetching the id of the
most recent item with the get_max_item_id function and then
walking backwards through the ids. Using that method, it’s
possible to fetch all items on Hacker News.
For example, to fetch the 10 latest items:
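The walk backwards can be sketched as follows. This assumes get_max_item_id returns a single numeric id and uses the package's get_item_by_id function to fetch each item; the variable names are illustrative:

```r
library(hackeRnews)

# Id of the most recently created item (story, comment, ...).
max_item_id <- get_max_item_id()

# The 10 most recent ids, newest first: max, max - 1, ..., max - 9.
latest_ids <- seq(from = max_item_id, by = -1, length.out = 10)

# Fetch each item by its id.
latest_items <- lapply(latest_ids, get_item_by_id)
```

Using seq with length.out avoids the off-by-one that a range such as (max_item_id - 10):max_item_id would introduce, since that range contains 11 ids.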
comments
The discussion in story threads is represented as a system of comments. Each story has the ids of its top-level comments stored under the
kids property. Each comment can have its own set of comment ids under its kids property (sub-comments), and so on. To retrieve all of the comments of a specific story, use the get_comments function.
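A minimal sketch of retrieving a comment tree, assuming get_comments accepts a story item as returned by the package and that get_top_stories returns a list of such items. The current top story serves only as a convenient example, and the shape of the result is not specified here:

```r
library(hackeRnews)

# Take the current top story as an example item.
story <- get_top_stories(max_items = 1)[[1]]

# Retrieve the comments under that story, following the kids ids
# from top-level comments down through sub-comments.
comments <- get_comments(story)
```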