R Programming: The Underdog or The Leader in the Statistical World?

Big data and statistical analysis are among today’s fastest-growing fields.  Experts say that if the 1990s were the era of computer science, the coming decade will belong to analytics.  It is a fast-emerging industry that carries a lot of hopes, aspirations and solutions.  Companies in every sector now give weight to data analysis and encourage their employees to take it up.  Analyzing large volumes of data, such as the sales or profit of different products broken down by parameters like region, cost and availability, helps companies sustain both growth and profit.  In short, analytics plays a vital role in business decisions.  This work calls for capable software tools and languages, and SAS, R, SPSS and Stata are among the most prominent and powerful of them.

R is a free software environment for statistical computing and graphics.  It compiles and runs on a wide variety of UNIX platforms, Windows and macOS.  R is essentially an implementation of the S programming language.  It was designed by Ross Ihaka and Robert Gentleman in 1993 at the University of Auckland in New Zealand.  R itself is written primarily in C, Fortran and R, and it is similar in use to software such as MATLAB and, visually, to LaTeX.  More advanced R code often links to code written in C, C++, .NET, Java or Python.

R code is easy to write for statistical tests, graphical representations, text analytics, data visualization, machine learning and much more.  Its wide variety of packages sets R apart in a crowd of statistical software.  Functions and packages are available for both linear and nonlinear regression, and for hypothesis tests such as the z-test, ANOVA, the chi-square test and other parametric and nonparametric tests.  Packages like ggplot2, googleVis and rCharts help in designing high-quality graphs.
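As a rough illustration of the regression and tests mentioned above, here is a minimal sketch in base R.  It assumes the built-in `mtcars` dataset purely for demonstration; the choice of variables is ours, not something prescribed by any particular analysis.

```r
# Built-in dataset (an assumption for illustration): specs for 32 cars.
data(mtcars)

# Linear regression: fuel economy (mpg) modeled on weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)
print(summary(fit))

# One-way ANOVA: does mean mpg differ across cylinder counts?
aov_fit <- aov(mpg ~ factor(cyl), data = mtcars)
print(summary(aov_fit))

# Chi-square test of independence: cylinder count vs. transmission type.
# R may warn that the approximation is rough because some cells are small.
cyl_by_am <- table(mtcars$cyl, mtcars$am)
chi <- chisq.test(cyl_by_am)
print(chi)

# Nonparametric counterpart to the one-way ANOVA (Kruskal-Wallis rank test).
kw <- kruskal.test(mpg ~ factor(cyl), data = mtcars)
print(kw)
```

All four functions ship with a standard R installation, so no extra packages are needed to run this.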

Because it is open source, R has a large following, including organizations such as Facebook, Google, the FDA and The New York Times.  Text analytics is one of R’s main specialties.  Suppose there is a large body of text: Facebook comments, tweets from the micro-blogging site Twitter, or lengthy articles about a global leader’s speech.  Nobody can sit and study each line to work out what the article or speech emphasizes; that would be impractical.  With a few lines of R code, one can pick out the key themes of the speech, see what the speaker associates with a particular word, and so on.  For Facebook comments, one can find out what the majority of people think about a topic.  One can also create a word cloud from the same data.

[Image: word cloud generated with R]

This word cloud was produced in R from a large body of text.  Words shown in larger type appear more often in that text.  Looking at this collection of words, one can easily make out the topics the article stresses most.  Here, the most repeated words are social, media, networking, sites, information and private.  So we can say that the article in this example is about social media and aspects of it such as networking, photos and the privacy of the site.  The cloud is essentially a summary of the whole text in a single picture.
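The counting behind such a cloud can be sketched in a few lines of base R.  The sample sentences and the short stop-word list below are hypothetical, made up for illustration; the final step assumes the optional `wordcloud` package and is skipped if it is not installed.

```r
# Hypothetical sample text; in practice this would be comments, tweets or a speech.
comments <- c(
  "Social media sites share private information",
  "Networking sites make social networking easy",
  "Keep private information private on social media"
)

# Lowercase everything, then split on any run of non-letter characters.
words <- unlist(strsplit(tolower(comments), "[^a-z]+"))
words <- words[nchar(words) > 0]

# Drop a small hand-picked list of filler words (an assumption;
# real analyses use a much fuller stop-word list).
stop_words <- c("on", "make", "keep", "easy", "share")
words <- words[!words %in% stop_words]

# Count occurrences and sort from most to least frequent.
freq <- sort(table(words), decreasing = TRUE)
print(head(freq, 5))

# Optional: draw the cloud if the 'wordcloud' package is available.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(names(freq), as.numeric(freq), min.freq = 1)
}
```

The frequency table is the real result here; the cloud only maps each count to a font size, which is why the most repeated words appear largest.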

R is an extremely powerful, free statistical software package.  However, many companies still prefer SAS and its macro applications, which they find friendlier and easier to use than R’s code-heavy approach.  R nevertheless keeps growing, both because it is available for free and because it can connect to SAS and many other tools from within code.  R also has some of the most advanced graphical capabilities of any statistical tool, provided by its numerous graphics packages.  Clearly, there is no winner in this race yet.  Let’s hope R overcomes its difficulties and becomes the king of the statistical software dynasty.