About

The lyf so short, the craft so long to lerne.

– Geoffrey Chaucer

This book serves as a collection of

  • tools for package development,
  • good practices for programming,
  • and my most frequently used packages.

The material here certainly does not cover all of R. Rather, it is my personal list of essentials for programming in R that I have found useful to know. The document is partly opinionated and subjective, so feel free to open an issue if you think some parts should be clarified or reformulated.

This book is not an introduction to functional, procedural, or object-oriented programming, to writing code in general, or to speeding code up. For those topics, the interested reader is referred to:

  • Robert Martin: Clean Code,
  • Andrew Hunt: The Pragmatic Programmer,
  • Gang of Four: Design Patterns,
  • Colin Gillespie: Efficient R Programming,
  • Patrick Burns: The R Inferno,
  • Hadley Wickham: R Packages,
  • Hadley Wickham: Advanced R,
  • Dirk Eddelbuettel: Seamless R and C++ Integration with Rcpp,
  • Thomas Cormen: Introduction to Algorithms,
  • Dan Gusfield: Algorithms on Strings, Trees and Sequences,
  • Donald Knuth: The Art of Computer Programming,
  • a comment by Peter Norvig.

Much of R’s popularity is due to its fantastic ecosystem. For data analysis and statistics, R is a good choice for several reasons:

  • high-level,
  • easy to install packages,
  • can be extended with C++ without needing knowledge of linking or C++ build systems,
  • probably the best plotting facility of any language,
  • Bioconductor, machine learning and statistics libraries,
  • small standard library.

R also has a few weaknesses, however:

  • in general slow,
  • not really suited for large projects,
  • poor object orientation,
  • inconsistent function names,
  • very limited threading capabilities,