Indicator Species Analysis
By- Rounak Choudhary
Indicator species are:
“A species whose status provides information on the overall condition of the ecosystem and of other species in that ecosystem. They reflect the quality and changes in environmental conditions as well as aspects of community composition.” - United Nations Environment Programme (1996)
Prerequisites: Prepare the data as a table with one row per site. Column 1 holds the site IDs, column 2 holds the environmental variable (for example aridity or built-up percentage), and columns 3 onward hold the abundance of each species at each site.
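As an illustration of this layout, the sketch below builds a tiny table by hand. The column names, site IDs, and counts are hypothetical and only show the expected shape; note that read.csv() replaces spaces in column names with dots by default.

```r
# Hypothetical example data: site IDs, one gradient column, then species counts.
pc_example <- data.frame(
  Site         = c("S1", "S2", "S3", "S4"),
  arid         = c(0.12, 0.35, 0.58, 0.81),  # environmental gradient (column 2)
  Black.Drongo = c(6, 4, 1, 0),              # species abundances (column 3 onward)
  Spotted.Dove = c(0, 1, 3, 5)
)

# Columns 3 onward form the site-by-species abundance table.
abund_example <- pc_example[, 3:ncol(pc_example)]
str(abund_example)

# Abundances should be numeric and free of missing values before the analysis.
stopifnot(all(sapply(abund_example, is.numeric)), !anyNA(abund_example))
```

Running the same two checks on your own table before the analysis catches the most common import problems, such as counts read in as text.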
# --------------------------------------------
# Indicator Species Analysis and Visualization
# --------------------------------------------
# 1. Install and load required packages
install.packages(c("indicspecies", "ggplot2", "dplyr", "tidyr", "vegan")) # install once
library(indicspecies)
library(ggplot2)
library(dplyr)
library(tidyr)
library(vegan) # for Hellinger transform and ordination
# 2. Read your data
pc <- read.csv("C:/Users/Rounak Choudhary/Desktop/build.csv", header = TRUE)
# 3. Extract the environmental gradient (aridity or built-up values)
# Convert it into a numeric vector
build <- as.numeric(pc$arid) # replace 'arid' with 'BuiltP' if needed
# 4. Extract species abundance data (assume species columns start from column 3 onward)
abund <- pc[, 3:ncol(pc)]
# 5. Create groups along the environmental gradient using quantiles
# Cut into 3 groups: Low, Medium, High
build_cat <- cut(build,
                 breaks = quantile(build, probs = c(0, 0.33, 0.66, 1), na.rm = TRUE),
                 labels = c("Low", "Medium", "High"),
                 include.lowest = TRUE)
# 6. Run Indicator Species Analysis
# 'r.g' is the group-equalized point-biserial correlation; it corrects for unequal group sizes
# 999 permutations set the smallest attainable p-value to 1 / (999 + 1) = 0.001
set.seed(123)
inv <- multipatt(abund, build_cat, func = "r.g", control = how(nperm = 999))
# 7. Summarize the results
summary(inv)
The output will look like this
Explanation:
This output is the result of running an Indicator Species Analysis using the multipatt() function from the indicspecies package in R. It's used to find which species are associated with specific environmental groups or habitat types, like "Low", "Medium", or "High" aridity or built-up area.
🔹 Association function: r.g
The species-group relationships are measured with "r.g", the group-equalized point-biserial correlation coefficient.
It measures how strongly a species' abundance is associated with one group compared to the others, correcting for unequal group sizes.
🔹 Significance level (alpha): 0.05
You're using a 5% threshold to decide what's statistically significant.
Any species with a p-value below this threshold is reported as having a statistically significant association with a group.
🔹 Total number of species: 119
This is how many different species were included in your analysis.
🔹 Selected number of species: 5
Out of 119 species, only 5 showed a statistically significant preference for one of the groups.
The rest didn’t show strong or consistent patterns across groups.
🔹 Number of species associated to 1 group: 5
All 5 species are clearly associated with just one group (either Low or High), not spread across multiple.
No species were found to be shared indicators of a combination of groups (like both Low and Medium, or Medium and High).
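The counts in this header can also be recovered from the result object itself. The sketch below is a minimal example that assumes the indicspecies package is installed and uses its bundled wetland data with the three-group site split from the package examples, not the bird data analysed here. The names res and sig are illustrative.

```r
library(indicspecies)

data(wetland)                                    # example data shipped with indicspecies
groups <- c(rep(1, 17), rep(2, 14), rep(3, 10))  # site grouping used in the package examples

set.seed(123)
res <- multipatt(wetland, groups, func = "r.g",
                 control = permute::how(nperm = 999))

# res$sign holds one row per species: group membership flags, index, stat and p.value.
sig <- res$sign[!is.na(res$sign$p.value) & res$sign$p.value <= 0.05, ]

nrow(sig)                                             # species selected at alpha = 0.05
sig[order(-sig$stat), c("index", "stat", "p.value")]  # strongest associations first
```

With the default summary() settings, nrow(sig) should correspond to the "Selected number of species" line.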
List of species associated to each group
This is the heart of the output. It tells you which species prefer which environment.
🟩 Group: Low
(Species more commonly or strongly found in Low built-up/arid areas)
Species                    stat    p-value  Signif.  Note
Black Drongo               0.555   0.001    ***      Very strong indicator
White-bellied Minivet      0.405   0.046    *        Moderate but significant
Interpretation:
Black Drongo is the strongest indicator of low built-up/arid conditions (stat = 0.555, p = 0.001).
White-bellied Minivet is also associated with low conditions, but less strongly (stat = 0.405, p = 0.046).
🟥 Group: High
(Species more strongly found in High built-up/arid areas)
Species                    stat    p-value  Signif.  Note
Spotted Dove               0.449   0.005    **       Strong indicator
Streak-throated Swallow    0.426   0.036    *        Moderate significance
Plain Prinia               0.422   0.040    *        Moderate significance
🧠 Interpretation:
These 3 species are more associated with high urbanization/aridity.
Spotted Dove shows a stronger signal than the other two.
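What does stat mean?
With func = "r.g", stat is the group-equalized point-biserial correlation coefficient, not the indicator value (IndVal) statistic.
As a correlation it can range from -1 to 1; the values reported here are all positive.
A higher value = the species' abundance is more strongly concentrated in that group relative to the other groups.
For example:
0.555 = the strongest association in this analysis.
0.405 = a weaker, but still significant, association.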
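To make the stat column concrete, here is a minimal sketch in base R. It assumes that, for a single group, r.g can be read as a Pearson correlation between abundance and group membership in which sites are weighted so that every group carries the same total weight. The helper r_g_sketch and the toy data are hypothetical; this illustrates the idea and is not the indicspecies implementation.

```r
# Hypothetical helper: group-equalized point-biserial correlation for one target group.
r_g_sketch <- function(x, groups, target) {
  groups <- as.factor(groups)
  y   <- as.numeric(groups == target)          # 1 inside the target group, 0 elsewhere
  n_k <- table(groups)[as.character(groups)]   # size of each site's own group
  w   <- as.numeric(1 / n_k)                   # every group sums to the same weight
  w   <- w / sum(w)                            # normalize weights to sum to 1
  mx  <- sum(w * x)                            # weighted means
  my  <- sum(w * y)
  sum(w * (x - mx) * (y - my)) /
    sqrt(sum(w * (x - mx)^2) * sum(w * (y - my)^2))
}

# Toy data: 12 sites in three equal groups, one species that favours "Low".
groups    <- rep(c("Low", "Medium", "High"), times = c(4, 4, 4))
abundance <- c(5, 6, 4, 7,  2, 1, 3, 2,  0, 1, 0, 0)

r_low  <- r_g_sketch(abundance, groups, "Low")   # positive: concentrated in "Low"
r_high <- r_g_sketch(abundance, groups, "High")  # negative: scarce in "High"

# Unequal group sizes: the weights stop the large group from dominating.
groups_unbal    <- rep(c("Low", "Medium", "High"), times = c(8, 2, 2))
abundance_unbal <- c(5, 6, 4, 7, 5, 6, 4, 7,  2, 1,  0, 1)
r_low_unbal <- r_g_sketch(abundance_unbal, groups_unbal, "Low")
```

When all groups have the same number of sites, the weights are equal and the sketch reduces to the ordinary correlation cor(abundance, as.numeric(groups == "Low")).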
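What does p-value mean?
The p-value is the probability of obtaining an association at least as strong as the observed one if the species had no real preference for any group. Here it is estimated by randomly permuting the sites among groups 999 times.
A p-value < 0.05 means the association is statistically significant at the 5% level (unlikely to arise by chance alone).
Significance codes:
*** Very strong (p ≤ 0.001)
** Strong (p ≤ 0.01)
* Moderate (p ≤ 0.05)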
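The logic behind a permutation p-value can be sketched in a few lines of base R. The example below uses simulated toy data and a plain point-biserial correlation as the test statistic, both assumptions made for illustration; it mirrors the general idea, not the exact procedure inside multipatt().

```r
set.seed(123)  # make the permutations reproducible

# Hypothetical toy data: 30 sites in three equal groups, one species.
groups    <- rep(c("Low", "Medium", "High"), each = 10)
abundance <- c(rpois(10, 6), rpois(10, 2), rpois(10, 2))  # more abundant in "Low"

# Test statistic: correlation between abundance and membership in "Low".
assoc <- function(x, g) cor(x, as.numeric(g == "Low"))

observed <- assoc(abundance, groups)

# Shuffle the group labels many times to see what chance alone produces.
nperm    <- 999
permuted <- replicate(nperm, assoc(abundance, sample(groups)))

# One-sided p-value; the +1 counts the observed arrangement itself,
# so the smallest possible value is 1 / (nperm + 1) = 0.001.
p_value <- (sum(permuted >= observed) + 1) / (nperm + 1)
p_value
```

This also shows why a result cannot be reported as smaller than 0.001 with 999 permutations: more permutations are needed to resolve smaller p-values.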
In short:
This analysis shows that 5 bird species are good indicators of either Low or High aridity/urbanization.
These species could help you monitor environmental changes or define conservation priorities for those habitat types.