CRAN Task View: Statistics for the Social Sciences

Maintainer: John Fox
Contact: jfox at

Social scientists use a wide range of statistical methods. To make the burden carried by this task view lighter, I have suppressed detail in some areas that are well covered by related task views (e.g., the Spatial task view for spatial statistics), and have pointed to those task views instead.

Most statistical data analysis in the social sciences is covered by the facilities in the base and recommended packages, which are part of the standard R distribution. In the package descriptions below, I identify base and recommended packages on first mention; packages that are not specifically identified as "R-base" or "recommended" are contributed packages.

One area of central interest to social scientists that I do not cover here is statistical graphics, even though this is one of the great strengths of R: Basic R graphics, trellis graphics (in the recommended lattice package), dynamic 3D graphs (via the rgl package), and the many packages that include facilities for various statistical graphs are just too extensive to detail here. Fortunately, a Graphics task view is currently in preparation.

If I have omitted something of importance, or if a new package or function should be mentioned here, please let me know.

Linear and Generalized Linear Models:

Univariate and multivariate linear models are fit by the lm function, generalized linear models by the glm function, both in the R-base stats package. Beyond summary and plot methods for lm and glm objects, there is a wide array of functions that support these objects:
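As a minimal sketch of the two base functions named above, the following fits a univariate linear model, a multivariate linear model, and a binomial GLM. The choice of the built-in mtcars data set and of the predictors is an assumption made purely for illustration; it is not taken from this task view.

```r
# Illustrative only: mtcars ships with the R-base datasets package.

# Univariate linear model: fuel economy on weight and horsepower
fit_lm <- lm(mpg ~ wt + hp, data = mtcars)
summary(fit_lm)

# Multivariate linear model: two responses bound together with cbind()
fit_mlm <- lm(cbind(mpg, qsec) ~ wt + hp, data = mtcars)
coef(fit_mlm)   # one column of coefficients per response

# Generalized linear model: logit model for a 0/1 response (transmission type)
fit_glm <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)
summary(fit_glm)
```

Calling plot(fit_lm) or plot(fit_glm) produces the standard diagnostic plots for the fitted object.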

Analysis of Categorical and Count Data:

Binomial logit and probit models, as well as Poisson-regression and loglinear models for contingency tables (including models for "over-dispersed" binomial and Poisson data), can be fit with the glm function in the stats package. For over-dispersed data, see also the aod package and the glm.nb function in the recommended MASS package (associated with Venables and Ripley, Modern Applied Statistics in S, Fourth Ed., Springer, 2002), which fits negative-binomial GLMs. The multinomial logit model is fit by the multinom function in the recommended nnet package, and ordered logit and probit models by the polr function in the MASS package. Also see the MNP package for the multinomial probit model, and multinomRob for the analysis of over-dispersed multinomial data.
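The functions mentioned in this paragraph can be sketched roughly as follows. The quine and housing data sets (both distributed with MASS) and the model formulas are assumptions chosen for illustration, not recommendations from this task view.

```r
library(MASS)   # recommended package: glm.nb(), polr(), quine and housing data
library(nnet)   # recommended package: multinom()

# Over-dispersed counts, option 1: quasi-Poisson GLM from the stats package
fit_qp <- glm(Days ~ Eth + Sex + Age + Lrn, family = quasipoisson, data = quine)
summary(fit_qp)$dispersion   # estimated dispersion parameter

# Over-dispersed counts, option 2: negative-binomial GLM
fit_nb <- glm.nb(Days ~ Eth + Sex + Age + Lrn, data = quine)
summary(fit_nb)

# Ordered logit for a three-level ordered response, using frequency weights;
# method = "probit" would fit the ordered probit model instead
fit_polr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
                 Hess = TRUE)
summary(fit_polr)

# Multinomial logit for the same response, ignoring its ordering
fit_mnom <- multinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
                     trace = FALSE)
summary(fit_mnom)
```

Fitting both the ordered and the multinomial model to the same response, as here, is one way to see what the ordering assumption of polr costs or gains relative to the less restrictive multinom fit.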

There are other noteworthy facilities for analyzing categorical and count data:

Other Regression Models:

It is possible to fit a very wide variety of regression models with the facilities provided by the base and recommended packages, and a much wider variety of models with contributed packages:

Other Statistical Methods:

Here is a brief survey of implementations in R of other statistical methods commonly used by social scientists:

Collections of Functions:

There are some packages that are so heterogeneous that they are difficult to classify, yet contain functions (typically in multiple domains) that are potentially of interest to social scientists:

CRAN packages:

Related links: