Chapter 15 2) Variable importance plot
Although machine learning algorithms are often regarded as black boxes, with RF it is possible to plot a sample tree (selected at random) to analyse its structure and investigate how decisions were made. In addition, RF provides two metrics for assessing the importance of each variable in the model: the mean decrease in accuracy (MDA) and the mean decrease in Gini index. For both metrics, higher values indicate more important variables.
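# Print the importance table
# (the MDA column is only available if the model was fitted with importance = TRUE)
importance(RF_LS)
# Display the plot with the relative importance of each variable
varImpPlot(RF_LS)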
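The two metrics can also be read separately through the `type` argument of `importance()`. The following is a minimal, self-contained sketch that fits a small forest on R's built-in `iris` data as a stand-in for `RF_LS`. The object name `rf_demo`, the seed, and the choice of data are assumptions made for illustration and are not part of the landslide workflow.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only for reproducibility of the sketch

# Toy stand-in for RF_LS: built-in iris data, NOT the landslide data.
# importance = TRUE is needed so that the permutation-based MDA is computed.
rf_demo <- randomForest(Species ~ ., data = iris, ntree = 200,
                        importance = TRUE)

# type = 1: mean decrease in accuracy (MDA)
mda <- importance(rf_demo, type = 1)
# type = 2: mean decrease in Gini index (node impurity)
mdg <- importance(rf_demo, type = 2)

# Rank the variables, most important first
mda[order(mda[, 1], decreasing = TRUE), , drop = FALSE]
mdg[order(mdg[, 1], decreasing = TRUE), , drop = FALSE]
```

Applied to the model of this chapter, the same calls would be `importance(RF_LS, type = 1)` and `importance(RF_LS, type = 2)`. The two rankings do not have to agree, so it is worth looking at both before drawing conclusions.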
15.1 2.1) Partial dependence plot
In addition, the partial dependence plot (PDP) shows, for each single variable, how the predicted probability of a class changes over the range of values taken by that variable. It gives a graphical depiction of the marginal effect of the variable on the class probability, over continuous or discrete values, with the effect of the other variables averaged out. Positive values are associated with the occurrence of the phenomenon (i.e., landslide presence), while negative values indicate its absence.
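The sign convention follows from how `partialPlot` in the randomForest package defines the plotted quantity for classification. As a short sketch based on the package documentation, with $x$ the variable of interest, $x_{iC}$ the values of the other variables for observation $i$, $K$ the number of classes, $k$ the class given in `which.class`, and $p_j$ the proportion of votes for class $j$:

```latex
\tilde{f}(x) = \frac{1}{n} \sum_{i=1}^{n} f(x, x_{iC}),
\qquad
f(x) = \log p_k(x) - \frac{1}{K} \sum_{j=1}^{K} \log p_j(x)
```

Assuming a two-class model (presence and absence, which the use of `RF_LS$classes[2]` suggests but does not state), $f(x)$ reduces to $\tfrac{1}{2}\log\big(p_k(x)/(1 - p_k(x))\big)$, i.e. half the log-odds of the selected class. Each term is therefore positive when the selected class receives more than half of the votes and negative otherwise, so the y-axis is on a logit scale and not a probability.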
# Slope
partialPlot(RF_LS, LS_train, x.var = slope, rug = TRUE,
            which.class = RF_LS$classes[2], xlab = "Slope [°]",
            main = "", ylab = "PDP")
# Elevation
partialPlot(RF_LS, LS_train, x.var = DEM, rug = TRUE,
            which.class = RF_LS$classes[2], xlab = "Elevation [m]",
            main = "", ylab = "PDP")
# Profile curvature
partialPlot(RF_LS, LS_train, x.var = profCurv, rug = TRUE,
            which.class = RF_LS$classes[2], xlab = "Profile curvature [1/m]",
            main = "", ylab = "PDP", xlim = c(-0.1, 0.1))
# Plan curvature
partialPlot(RF_LS, LS_train, x.var = planCurv, rug = TRUE,
            which.class = RF_LS$classes[2], xlab = "Plan curvature [1/m]",
            main = "", ylab = "PDP", xlim = c(-0.1, 0.1))
# Distance to road
partialPlot(RF_LS, LS_train, x.var = distRoad, rug = TRUE,
            which.class = RF_LS$classes[2], xlab = "Distance to road [m]",
            main = "", ylab = "PDP")
# Topographic wetness index
partialPlot(RF_LS, LS_train, x.var = TWI, rug = TRUE,
            which.class = RF_LS$classes[2], xlab = "TWI [-]",
            main = "", ylab = "PDP")
# Geology
partialPlot(RF_LS, LS_train, x.var = geology, rug = TRUE,
            which.class = RF_LS$classes[2], xlab = "Geology",
            main = "", ylab = "PDP")
# Land cover
partialPlot(RF_LS, LS_train, x.var = landCover, rug = TRUE,
            which.class = RF_LS$classes[2], xlab = "Land cover",
            main = "", ylab = "PDP")