Intrepid first steps in R …

Here is a screenshot as proof that I completed the R language course:


I was hoping that it would be simple enough to install R and RStudio on my beloved Mac. Luckily (unlike the MS SQL Server installation), it proved to be straightforward once I followed the guidelines on CRAN.

Next I installed ggplot2 from within RStudio on my Mac and then loaded it. I installed the binary package and did not download the optional source package for the Mac.
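The install-and-load step can be sketched roughly as follows. This is a minimal sketch that assumes an internet connection and a configured CRAN mirror; it is not a record of the exact commands used.

```r
# Install ggplot2 only if it is not already available.
# On a Mac, install.packages() normally fetches the pre-built binary from CRAN,
# so there is no need to download or compile the source package.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Load the package for the current R session.
library(ggplot2)
```

The same thing can be done through the Packages pane in RStudio, which calls install.packages() behind the scenes.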

How I created a graphic based on a downloaded dataset

The first step was to find and download a data set from the CSO. I selected a data set that provided high-level house price data for the years 2005-15.

I saved it as a CSV file and kept only the annual data for the years 2005-15 inclusive (I deleted the monthly data). I removed the columns I didn’t need and kept the period, the RPPI base (Residential Property Price Index), and the percent change. I also removed all spaces from the column headers.

Moving to RStudio, I created one data frame called hprices and verified that the data from my CSV file had been read into the data frame as expected. That worked fine. Initially I created a very basic bar graph similar to one on the R-Cookbook website. The bars were drawn in black by default, but I made some changes to the chart later.
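A minimal sketch of this step is shown below. The column names (Period, RPPIBase, PercentChange) are assumptions based on the description above, and the values are made-up placeholders, not the CSO figures. The sketch writes a tiny CSV to a temporary file so that it runs on its own; with a real download you would pass your own file path to read.csv().

```r
library(ggplot2)

# Placeholder values only; these are NOT the CSO figures.
# Column names are assumed (headers with the spaces removed).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Period,RPPIBase,PercentChange",
  "2005,100.0,",
  "2006,110.0,10.0",
  "2007,104.5,-5.0",
  "2008,94.1,-10.0"
), csv_path)

# Read the CSV into a data frame; the empty 2005 field becomes NA.
hprices <- read.csv(csv_path)

# Check that the columns and types look as expected.
str(hprices)

# Very basic bar graph: one bar per year, height taken from the data.
p_basic <- ggplot(hprices, aes(x = factor(Period), y = PercentChange)) +
  geom_bar(stat = "identity")
```

Typing p_basic at the console (or calling print(p_basic)) draws the chart.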

I noticed that I got the following warning:

Removed 1 rows containing missing values (position_stack).

Upon further research I discovered that the warning was due to a missing value on the Y axis for the year 2005. Since I wanted to see percentage increases and decreases from the previous year, and 2005 had no previous year in my data, I removed 2005 from the bar chart. I didn’t want a value of zero to appear for 2005, as this would not be correct.
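One way to drop the year with the missing value is sketched below. The data frame is rebuilt here with made-up placeholder values (not the CSO figures) and an assumed column name, PercentChange, so that the snippet runs on its own.

```r
# Placeholder values only; these are NOT the CSO figures.
hprices <- data.frame(
  Period = 2005:2008,
  PercentChange = c(NA, 10, -5, -10)  # 2005 has no previous year to compare with
)

# Keep only the rows that have a percent change, which drops 2005.
hprices_plot <- hprices[!is.na(hprices$PercentChange), ]
```

Plotting hprices_plot instead of hprices means ggplot2 has no missing rows to remove, so the warning should no longer appear.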

I also added some cosmetic enhancements to the graph: I changed the colour of the bars, updated the titles of the X and Y axes, and gave the chart a title. I also changed the orientation of the text labels on the X axis to enhance readability.
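These cosmetic changes might look something like the sketch below. The fill colour, the title, and the axis labels are illustrative choices, not the ones used in the final graph, and the data are made-up placeholders rather than the CSO figures.

```r
library(ggplot2)

# Placeholder values only; these are NOT the CSO figures.
hprices_plot <- data.frame(
  Period = 2006:2008,
  PercentChange = c(10, -5, -10)
)

p <- ggplot(hprices_plot, aes(x = factor(Period), y = PercentChange)) +
  # Colour the bars (illustrative colour).
  geom_bar(stat = "identity", fill = "steelblue") +
  # Chart title and axis titles (illustrative wording).
  labs(title = "Annual change in house prices",
       x = "Year",
       y = "Percent change") +
  # Rotate the X axis labels so they do not overlap.
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
```

Calling print(p) draws the chart with the new colours and labels.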

There was a second warning: ‘Stacking not well defined when ymin != 0’.

I think this one relates to the negative values (in other words, deflation) for a number of the years between 2008 and 2012, so it can be disregarded.
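A possible way to avoid the stacking warning altogether, rather than disregarding it, is to ask geom_bar() not to stack the bars by setting position = "identity". The sketch below uses made-up placeholder values (not the CSO figures) and an assumed column name; whether the warning appears at all depends on the ggplot2 version installed.

```r
library(ggplot2)

# Placeholder values only; these are NOT the CSO figures.
hprices_plot <- data.frame(
  Period = 2006:2008,
  PercentChange = c(10, -5, -10)
)

# With one bar per year there is nothing to stack, so "identity" simply
# draws each bar from zero up (or down) to its value.
p_identity <- ggplot(hprices_plot, aes(x = factor(Period), y = PercentChange)) +
  geom_bar(stat = "identity", position = "identity")
```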

Information I gleaned from the dataset.

It is clear from the graph that Ireland as a whole suffered dramatic drops in house prices in the period 2008-12. In 2013 there was a small turnaround, and 2014 saw quite a dramatic pick-up in prices over the previous year.

Other ideas/concepts that could be represented via R graphics if there were more time.

If I had more time, it would be nice to include a data set with data for Dublin only versus the rest of the country. Anecdotally, house prices were slower to pick up outside of Dublin, and it would be interesting to see this in the diagram. I would also consider enhancing the ticks on the Y axis so that 5%, 10%, -5% and -10% are displayed. I would like to print the actual rate of inflation or deflation on the bar for each year. Finally, a line graphing the rate of the fall or rise in each year might also be of interest to a viewer.
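Three of these ideas (the Y axis ticks, the value printed on each bar, and the line) could be sketched as below. This is only an outline under assumptions: the column names are assumed, the values are made-up placeholders rather than the CSO figures, and the Dublin comparison is left out because it would need a second data set.

```r
library(ggplot2)

# Placeholder values only; these are NOT the CSO figures.
hprices_plot <- data.frame(
  Period = 2006:2008,
  PercentChange = c(10, -5, -10)
)

p_ideas <- ggplot(hprices_plot, aes(x = factor(Period), y = PercentChange)) +
  geom_bar(stat = "identity", position = "identity", fill = "steelblue") +
  # Line joining the yearly values; group = 1 is needed because x is a factor.
  geom_line(aes(group = 1)) +
  # Print the rate on each bar: above positive bars, below negative ones.
  geom_text(aes(label = paste0(PercentChange, "%"),
                vjust = ifelse(PercentChange >= 0, -0.5, 1.5))) +
  # Ticks at -10%, -5%, 0%, 5% and 10%.
  scale_y_continuous(breaks = seq(-10, 10, by = 5),
                     labels = function(b) paste0(b, "%"))
```

The break range here suits the placeholder values; with real data it would need to be widened to cover the largest rise and fall.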

Image of my Final Graph:



Published by

Data Hothead

Student of Data Analytics, and erstwhile Oracle Applications Consultant.
