Classify behaviours using Expectation-Maximization Binary Clustering.
In this post, you will learn how to:
- Classify behavioural
states during foraging trips from test data from Masked boobies.
Load data. This test data is from masked boobies.
To access the
data you have to install the package sula:
devtools::install_github(“MiriamLL/sula”)
#devtools::install_github("MiriamLL/sula")
library(sula)
GPS_raw<-as.data.frame(GPS_raw)
GPS_ind<-subset(GPS_raw,GPS_raw$IDs=='GPS01')
The data I am using does not include locations when the animal was on the nest.
To remove the nest locations, in the package sula there is this function recortar_periodo which allows you to filter your data and keep the period when the bird was at sea.
For the example, I know the time when this individual (GPS_01) started and ended its trip and keep the locations inside this time range.
If you dont have this information, you might need to first identify the start and end of trips.
GPS_ind<-recortar_periodo(GPS_data=GPS_ind,
inicio='02/11/2017 18:10:00',
final='05/11/2017 14:10:00',
dia_col='DateGMT',
hora_col='TimeGMT',
formato="%d/%m/%Y %H:%M:%S")
ES: El track original contenia 1038 filas y el track editado contiene 986 filas
EN: The original track had 1038 rows, and the edited track has 986
Next, I create a new column named tStamp and transformed it to the corresponding class.
GPS_ind$tStamp<-paste(GPS_ind$DateGMT,GPS_ind$TimeGMT)
GPS_ind$tStamp <- as.POSIXct(strptime(GPS_ind$tStamp,"%d/%m/%Y %H:%M:%S"),"GMT")
class(GPS_ind$tStamp)
[1] "POSIXct" "POSIXt"
Check that you have the right class for your columns.
GPS_ind$lon<-as.numeric(GPS_ind$Longitude)
GPS_ind$lat<-as.numeric(GPS_ind$Latitude)
GPS_ind$id <- as.factor(GPS_ind$IDs)
Then I select only columns that will be important in the analyses.
id tStamp lon lat
17 GPS01 2017-11-02 18:11:57 -109.4530 -27.20092
18 GPS01 2017-11-02 18:16:04 -109.4531 -27.20096
19 GPS01 2017-11-02 18:20:10 -109.4530 -27.20096
20 GPS01 2017-11-02 18:24:16 -109.4530 -27.20094
21 GPS01 2017-11-02 18:28:24 -109.4530 -27.20096
22 GPS01 2017-11-02 18:32:29 -109.4531 -27.20099
I strongly recommend to read this article by Garriga et al. 2016 and to check the vignette.
As in vignette: ‘The Expectation-maximization binary clustering (EMbC) is a general purpose, unsupervised, multi-variate, clustering algorithm, driven by two main motivations: (i) it looks for a good compromise between statistical soundness and ease and generality of use - by minimizing prior assumptions and favouring the semantic interpretation of the final clustering - and, (ii) it allows taking into account the uncertainty in the data. These features make it specially suitable for the behavioural annotation of animal’s movement trajectories.’
You need to have the package installed otherwise you need to install it using the install.packages(‘EMbC’).
library(EMbC)
In the package EMbC, the function stbc performs the standard velocity/turn clustering of the trajectory.
In the script, [2:4] means that you are including from the 2nd to the 4th column. These columns should be 2th - tStamp, 3th - lon, and 4th - lat.
The -1 in info means that any iteration is supress.
Data_bc<-stbc(Data[2:4],info=-1)
[1] 0 -0.0000e+00 4 986
[1] ... Stable clustering
In the package EMbC, the function stts offers a compact view of the parameter set.
kn is the marginal distribution of the cluster in absolute and percentage values
stts(Data_bc)
You can also check the mimimum and maximum velocity and radians (turning angle).
Data_bc@R
Here is important that you check that these values make sense. Turning angle should be between 0 and 3.14. Velocity varies according to your species.
For boobies this information can be translated to:
You can also compare your results with Table 2 in Mendez et al. 2017, or in methods from Lerma et al. 2020
To see this information graphically, you can use the function sctr from the package EMbC.
sctr(Data_bc)
To see how the track looks according to the classification you can use the view function from the package EMbC.
EMbC::view(Data_bc)
To export the delimiters you can extract information into a data frame and rename the columns.
Class_bc<-as.data.frame(Data_bc@R)
names(Class_bc)<-c('velocity.min','radian.min','velocity.max','radian.max')
Originally, the values of velocity are in m/s, therefore you can convert them to km/h to compare with the literature more easily.
Class_bc$velomin_km<-Class_bc$velocity.min*3.6
Class_bc$velomax_km<-Class_bc$velocity.max*3.6
You can use the function write_csv to export the delimiters.
In the example below, I use the function here and define the folder where I want the file to be.
You can add the information of velocity and turning angle into you GPS data.
Data$Beha<-Data_bc@A
It returns numbers which correspond to:
If you want to add a column explaining what this numbers mean you can use the following functions.
or
You can add the velocity in a column of your GPS data frame.
head(Data_bc@X)
Velocity is in the first column, therefore we use the [,1]. Also, to transform it to km/h multiply per 3.6.
Data$Velocity_ms<-Data_bc@X[,1]
Data$Velocity_kmh<-Data$Velocity_ms*3.6
Heading direction is on the second column, therefore we use the [,2].
Data$HeadingDirection<-Data_bc@X[,2]
Finally you can check the range to coincide with your delimiters.
You can plot the GPS tracking data according to their behaviours using the function ggplot
If you are not satisfied with the classification and the delimiters, you can set your own behavioural classification using the turning angle and velocity.
For example, using data from Mendez et al 2017.
Data<-mutate(Data, New_beha = ifelse(Data$Velocity_kmh >= 0 &
Data$Velocity_kmh <= 4 &
Data$HeadingDirection >= 0 &
Data$HeadingDirection <= 0.30,
"Resting",
ifelse(Data$Velocity_kmh >= 0 &
Data$Velocity_kmh <= 14 &
Data$HeadingDirection >= 0.30 &
Data$HeadingDirection <= 3.14,
"IntenseForaging",
ifelse(Data$Velocity_kmh >= 4 &
Data$Velocity_kmh <= 90 &
Data$HeadingDirection >= 0 &
Data$HeadingDirection <= 0.31,
"Travelling",
ifelse(Data$Velocity_kmh >= 14 &
Data$Velocity_kmh <= 90 &
Data$HeadingDirection >= 0.31 &
Data$HeadingDirection <= 3.14,
"Relocating",
"Unknown")))))
EMbC
Boobies
Other Species
Alternative options