Showing posts from January, 2011

R for beginners and intermediate users 2: extracting subsets of data

For my second post on R, I will address how to extract subsets of data based on a selection criterion such as taxon names. For instance, I have a huge dataset of morphometric variables for at least 36 species of cats (living and fossil). Sometimes I'd like to do some stats on a subset of this dataset, such as all the living cats or just the species of the Panthera lineage (Panthera and Neofelis). Until recently, I did most of my dataset manipulation in Excel, filtering certain taxa out of the spreadsheet and copy-pasting the result into a text file, which I then read into R. However, you can select subsets of data by taxon name directly in R.

In my dataset, which I call cat, I have a column labelled Taxa that contains all my taxon names. So typing cat$Taxa is the way to call up my taxon names.
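To make the cat$Taxa call concrete, here is a minimal sketch using a tiny stand-in data frame. Only the names cat and Taxa come from the post; the SkullLength column and every value in it are invented for illustration, since the real dataset is not shown.

```r
# Hypothetical stand-in for the real dataset, which is not shown in the post.
# Only the names `cat` and `Taxa` come from the text; the rest is invented.
cat <- data.frame(
  Taxa = c("Panthera_leo", "Panthera_leo", "Panthera_tigris", "Neofelis_nebulosa"),
  SkullLength = c(35.1, 33.8, 32.4, 18.9),  # made-up measurements
  stringsAsFactors = FALSE
)

# The $ operator pulls out one column by name: here, one taxon name per row.
cat$Taxa

# unique() lists each taxon name once, a quick check of what is in the data.
unique(cat$Taxa)
```

Running unique(cat$Taxa) before subsetting is a handy way to confirm the exact spelling of each name, underscores included.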

Let's say I want to extract from my dataset cat just the data for the lion, Panthera leo. The associated taxon name in cat$Taxa would be Panthera_leo. So to extract that portion of th…
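The text is cut off at this point, so what follows is not necessarily the approach the post goes on to describe. It is a minimal sketch of common base R ways to pull out rows by taxon name, again using an invented stand-in data frame (the SkullLength column, its values, and the object names is_lion, lion, lion_alt and pantherines are all hypothetical).

```r
# Hypothetical stand-in data; only `cat`, `Taxa` and "Panthera_leo" come from the post.
cat <- data.frame(
  Taxa = c("Panthera_leo", "Panthera_leo", "Panthera_tigris",
           "Neofelis_nebulosa", "Felis_catus"),
  SkullLength = c(35.1, 33.8, 32.4, 18.9, 9.2),  # made-up measurements
  stringsAsFactors = FALSE
)

# Logical vector: TRUE for rows whose taxon name is exactly "Panthera_leo".
is_lion <- cat$Taxa == "Panthera_leo"

# Keep the TRUE rows and every column (the trailing comma means "all columns").
lion <- cat[is_lion, ]

# subset() is another base R way to express the same row selection.
lion_alt <- subset(cat, Taxa == "Panthera_leo")

# Several taxa at once: match names that start with "Panthera_" or "Neofelis_".
pantherines <- cat[grepl("^(Panthera|Neofelis)_", cat$Taxa), ]
```

The comparison has to match the name exactly as stored, so "Panthera leo" with a space would select no rows in this example.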