######################### Data wrangling in R ######################
#### Based on the Data Carpentry ecology lessons:
####       http://www.datacarpentry.org/R-ecology-lesson/03-dplyr.html

#installing packages
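# A minimal sketch of the setup, assuming dplyr and tidyr are the only packages
# this script needs (that package list is an assumption, not from the lesson).
# install.packages(c("dplyr", "tidyr"))   # run once, then leave commented out
library(dplyr)   # select(), filter(), mutate(), group_by(), summarize()
library(tidyr)   # spread(), gather()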

#loading data 
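# One possible way to load the survey data. The file path is an assumption;
# point it at wherever your copy of the CSV lives.
surveys_path <- "data/portal_data_joined.csv"   # assumed location
if (file.exists(surveys_path)) {
  # Treat empty strings as missing so is.na() works on text columns too.
  surveys <- read.csv(surveys_path, na.strings = c("", "NA"))
  str(surveys)   # check column names and types
}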



############################## The Verbs! ################################

### Select
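# A small sketch of select(), which keeps only the named columns.
# surveys_demo is a made-up stand-in; use `surveys` once it is loaded.
library(dplyr)
surveys_demo <- data.frame(plot_id = c(2, 3), species_id = c("NL", "DM"),
                           weight = c(40, 48), year = c(1977, 1995))
selected <- select(surveys_demo, plot_id, species_id, weight)
selected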


### Filter
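# A small sketch of filter(), which keeps only the rows matching a condition.
# surveys_demo is a made-up stand-in; use `surveys` once it is loaded.
library(dplyr)
surveys_demo <- data.frame(species_id = c("NL", "DM", "PF"),
                           year = c(1977, 1995, 1995))
from_1995 <- filter(surveys_demo, year == 1995)
from_1995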
 

### Pipes
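# A sketch of chaining steps with the pipe (%>%): the result of each step
# becomes the first argument of the next one.
# surveys_demo is a made-up stand-in; use `surveys` once it is loaded.
library(dplyr)
surveys_demo <- data.frame(species_id = c("NL", "PF", "PF"),
                           sex = c("M", "F", "M"),
                           weight = c(40, 4, 7),
                           year = c(1977, 1995, 1996))
surveys_sml <- surveys_demo %>%
  filter(weight < 5) %>%              # keep the light animals
  select(species_id, sex, weight)     # then keep three columns
surveys_sml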



##### Exercise 1
### Using pipes, subset the survey data to include individuals collected 
###     before 1995 and retain only the columns year, sex, and weight.
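# One possible answer, shown on a made-up stand-in (surveys_demo);
# swap in `surveys` to run it on the real data.
library(dplyr)
surveys_demo <- data.frame(year = c(1990, 1994, 1995, 2000),
                           sex = c("F", "M", "M", "F"),
                           weight = c(40, 48, 35, 29),
                           plot_id = 1:4)
before_1995 <- surveys_demo %>%
  filter(year < 1995) %>%
  select(year, sex, weight)
before_1995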



               ### OR ###




### Mutate
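# A small sketch of mutate(), which adds a column computed from existing ones.
# The column name weight_kg is only illustrative; surveys_demo is made up.
library(dplyr)
surveys_demo <- data.frame(species_id = c("NL", "DM"), weight = c(40, 48))
with_kg <- surveys_demo %>%
  mutate(weight_kg = weight / 1000)   # grams to kilograms
with_kg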



##### Exercise 2 
#Create a new data frame from the survey data that meets the following criteria: 
#1. Contains only the species_id column and a new column called hindfoot_half
#2. hindfoot_half contains values that are half the hindfoot_length values.
#3. Only include records from 1990 and after

#Hint: think about how the commands should be ordered to produce this data frame!
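# One possible answer, shown on a made-up stand-in (surveys_demo).
# Order matters: filter on year and compute hindfoot_half before select()
# drops the columns those steps need.
library(dplyr)
surveys_demo <- data.frame(species_id = c("NL", "DM", "PF"),
                           hindfoot_length = c(32, 36, 16),
                           year = c(1985, 1990, 1997))
surveys_hindfoot_half <- surveys_demo %>%
  filter(year >= 1990) %>%
  mutate(hindfoot_half = hindfoot_length / 2) %>%
  select(species_id, hindfoot_half)
surveys_hindfoot_half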



################# Summarizing data by categories ###############################
#Overall mean weight
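# A sketch of summarize() collapsing a whole column to one value.
# na.rm = TRUE ignores missing weights; surveys_demo is a made-up stand-in.
library(dplyr)
surveys_demo <- data.frame(weight = c(40, NA, 48))
overall <- surveys_demo %>%
  summarize(mean_weight = mean(weight, na.rm = TRUE))
overall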


#mean weight by sex
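# A sketch of group_by() followed by summarize(): one row per group.
# surveys_demo is a made-up stand-in; use `surveys` once it is loaded.
library(dplyr)
surveys_demo <- data.frame(sex = c("F", "F", "M", "M"),
                           weight = c(40, 44, 48, NA))
weight_by_sex <- surveys_demo %>%
  group_by(sex) %>%
  summarize(mean_weight = mean(weight, na.rm = TRUE))
weight_by_sex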



#mean weight by sex and species id
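# The same idea with two grouping columns: one row per sex/species pair.
# surveys_demo is a made-up stand-in; use `surveys` once it is loaded.
library(dplyr)
surveys_demo <- data.frame(sex = c("F", "F", "M", "M"),
                           species_id = c("DM", "DM", "DM", "PF"),
                           weight = c(40, 44, 48, 8))
weight_by_sex_species <- surveys_demo %>%
  group_by(sex, species_id) %>%
  summarize(mean_weight = mean(weight, na.rm = TRUE))
weight_by_sex_species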


################### Removing missing values with filter ########################

#is.na() returns a TRUE/FALSE (logical) vector
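# A tiny illustration with made-up values: is.na() marks each missing entry.
demo_weights <- c(40, NA, 48)
missing_flags <- is.na(demo_weights)
missing_flags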


#Can be used as input to filter()
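# A sketch of passing that logical vector to filter(): only rows where the
# condition is TRUE are kept, so this returns the rows with a missing weight.
# surveys_demo is a made-up stand-in; use `surveys` once it is loaded.
library(dplyr)
surveys_demo <- data.frame(species_id = c("NL", "DM", "PF"),
                           weight = c(40, NA, 8))
missing_weight <- filter(surveys_demo, is.na(weight))
missing_weight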



#The not operator (!) changes true to false and false to true
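# A tiny illustration with made-up values: ! flips every element.
flipped <- !c(TRUE, FALSE)
not_missing <- !is.na(c(40, NA, 48))   # TRUE where a value is present
flipped
not_missing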


#filter out missing values rather than ignoring them
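# A sketch of dropping the missing weights before summarizing. Groups whose
# weights are all missing then disappear from the result entirely.
# surveys_demo is a made-up stand-in; use `surveys` once it is loaded.
library(dplyr)
surveys_demo <- data.frame(sex = c("F", "F", "M"),
                           species_id = c("DM", "DM", "PF"),
                           weight = c(40, 44, NA))
mean_no_na <- surveys_demo %>%
  filter(!is.na(weight)) %>%
  group_by(sex, species_id) %>%
  summarize(mean_weight = mean(weight))
mean_no_na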


#print only first 15 lines
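# One way to show only the first 15 rows, using head() on made-up data.
# For a tibble (for example a group_by() %>% summarize() result),
# print(n = 15) is an alternative.
library(dplyr)
surveys_demo <- data.frame(record_id = 1:20, weight = seq(10, 200, by = 10))
first_15 <- surveys_demo %>% head(15)
first_15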


#calculate multiple summary statistics
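# A sketch of computing several statistics in one summarize() call; each
# named argument becomes a column. surveys_demo is a made-up stand-in.
library(dplyr)
surveys_demo <- data.frame(sex = c("F", "F", "M"), weight = c(40, 44, 48))
weight_stats <- surveys_demo %>%
  group_by(sex) %>%
  summarize(mean_weight = mean(weight),
            min_weight = min(weight))
weight_stats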

       

#Tally: get a count of records in each category
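# A sketch of tally(): after group_by(), it counts the rows in each group
# and stores the count in a column named n. surveys_demo is made up.
library(dplyr)
surveys_demo <- data.frame(sex = c("F", "F", "M"))
sex_counts <- surveys_demo %>%
  group_by(sex) %>%
  tally()
sex_counts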



################################# Exercise 3 ##################################
#1. How many individuals were caught in each plot_type surveyed?
#2. Use group_by() and summarize() to find the mean, min, and max hindfoot
#   length for each species (using species_id).
#3. What was the heaviest animal measured in each year? Return the columns
#   year, genus, species_id, and weight.
#4. You saw above how to count the number of individuals of each sex using a
#   combination of group_by() and tally(). How could you get the same result
#   using group_by() and summarize()? Hint: see ?n.

#Individuals per plot type
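# One possible answer on made-up data. The second version uses summarize()
# with n(), which gives the same counts as tally().
library(dplyr)
surveys_demo <- data.frame(
  plot_type = c("Control", "Control", "Rodent Exclosure"))
per_plot_tally <- surveys_demo %>%
  group_by(plot_type) %>%
  tally()
per_plot_n <- surveys_demo %>%
  group_by(plot_type) %>%
  summarize(n = n())
per_plot_tally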


#hindfoot length by species
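# One possible answer on made-up data. The output column names (mean_hfl,
# min_hfl, max_hfl) are illustrative choices.
library(dplyr)
surveys_demo <- data.frame(species_id = c("DM", "DM", "PF", "PF"),
                           hindfoot_length = c(34, 38, 16, NA))
hfl_by_species <- surveys_demo %>%
  filter(!is.na(hindfoot_length)) %>%
  group_by(species_id) %>%
  summarize(mean_hfl = mean(hindfoot_length),
            min_hfl = min(hindfoot_length),
            max_hfl = max(hindfoot_length))
hfl_by_species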


#heaviest animal measured in each year
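# One possible answer on made-up data: within each year, keep the row(s)
# whose weight equals that year's maximum. Ties would return several rows.
library(dplyr)
surveys_demo <- data.frame(
  year = c(1990, 1990, 1991, 1991),
  genus = c("Neotoma", "Dipodomys", "Dipodomys", "Perognathus"),
  species_id = c("NL", "DM", "DM", "PF"),
  weight = c(200, 45, 48, NA))
heaviest <- surveys_demo %>%
  filter(!is.na(weight)) %>%
  group_by(year) %>%
  filter(weight == max(weight)) %>%
  select(year, genus, species_id, weight) %>%
  arrange(year)
heaviest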

########################### Spread ############################################
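# A sketch of spread(): the values of the key column become new column names.
# surveys_gw is a made-up table of mean weights per plot and genus.
# (tidyr now recommends pivot_wider() for new code; spread() still works.)
library(dplyr)
library(tidyr)
surveys_gw <- data.frame(
  plot_id = c(1, 1, 2, 2),
  genus = c("Dipodomys", "Neotoma", "Dipodomys", "Neotoma"),
  mean_weight = c(45, 180, 47, 190))
surveys_spread <- surveys_gw %>%
  spread(key = genus, value = mean_weight)
surveys_spread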

############################### Gather ######################################### 
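# A sketch of gather(), the reverse of spread(): column names go into a key
# column and their values into a value column. -plot_id leaves that column
# as it is. surveys_wide_demo is made up.
# (tidyr now recommends pivot_longer() for new code; gather() still works.)
library(dplyr)
library(tidyr)
surveys_wide_demo <- data.frame(plot_id = c(1, 2),
                                Dipodomys = c(45, 47),
                                Neotoma = c(180, 190))
surveys_gather <- surveys_wide_demo %>%
  gather(key = genus, value = mean_weight, -plot_id)
surveys_gather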


################### Exercise 4
#Goal: look at the relationship between mean values of weight and hindfoot 
#length per year in different plot types. 

#Step 1: Use gather() to create a dataset where we have a key column called 
#measurement and a value column that takes on the value of either 
#hindfoot_length or weight. 


#Step 2: Calculate the average of each measurement in each year 
#for each different plot_type. 


       
#Step 3: spread() them into a data set with a column 
#for hindfoot_length and weight. 
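# One possible answer to the three steps above, run on made-up data.
# The intermediate names (surveys_long, mean_by_group, surveys_wide) are
# illustrative; use `surveys` in place of surveys_demo for the real data.
library(dplyr)
library(tidyr)
surveys_demo <- data.frame(
  year = c(1990, 1990, 1991, 1991),
  plot_type = c("Control", "Control", "Control", "Rodent Exclosure"),
  hindfoot_length = c(30, 34, 20, 26),
  weight = c(40, 44, 30, 20))
# Step 1: one row per measurement
surveys_long <- surveys_demo %>%
  gather(key = measurement, value = value, hindfoot_length, weight)
# Step 2: mean of each measurement per year and plot_type
mean_by_group <- surveys_long %>%
  group_by(year, measurement, plot_type) %>%
  summarize(mean_value = mean(value, na.rm = TRUE)) %>%
  ungroup()
# Step 3: back to one column per measurement
surveys_wide <- mean_by_group %>%
  spread(key = measurement, value = mean_value)
surveys_wide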


########################### Data Cleaning ################################################################

# remove incomplete records
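# A sketch of dropping incomplete records, assuming "incomplete" means a
# missing weight, hindfoot_length, or sex (that choice of columns is an
# assumption). surveys_demo is a made-up stand-in.
library(dplyr)
surveys_demo <- data.frame(species_id = c("DM", "DM", "PF", "NL"),
                           sex = c("F", NA, "M", "M"),
                           hindfoot_length = c(36, 35, NA, 32),
                           weight = c(44, 48, 8, NA))
surveys_complete <- surveys_demo %>%
  filter(!is.na(weight), !is.na(hindfoot_length), !is.na(sex))
surveys_complete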



## Extract the most common species_id 



## Only keep the most common species 
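# A sketch of the two steps above on made-up data: count records per
# species, keep the species at or above a cutoff, then keep only their rows.
# min_records is a hypothetical cutoff; pick one that suits the real data.
library(dplyr)
surveys_demo <- data.frame(
  species_id = c("DM", "DM", "DM", "PF", "PF", "NL"))
min_records <- 2
species_counts <- surveys_demo %>%
  count(species_id) %>%
  filter(n >= min_records)
surveys_common <- surveys_demo %>%
  filter(species_id %in% species_counts$species_id)
surveys_common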


################################# Exporting data #######################################################
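# A sketch of writing a data frame to CSV with base R. The output path is
# hypothetical (a temporary folder here); choose your own folder and file
# name. row.names = FALSE leaves out the row-number column.
surveys_demo <- data.frame(species_id = c("DM", "PF"), weight = c(44, 8))
out_file <- file.path(tempdir(), "surveys_complete.csv")
write.csv(surveys_demo, file = out_file, row.names = FALSE)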

