300135 VO+UE Beginner's guide to analyzing complex datasets in (microbial) community ecology using R (2021S)
Continuous assessment of course work
Registration/Deregistration
Note: The time of your registration within the registration period has no effect on the allocation of places (no first come, first served).
- Registration is open from Th 11.02.2021 08:00 to Th 25.02.2021 18:00
- Deregistration possible until We 31.03.2021 18:00
Details
max. 12 participants
Language: English
Lecturers
Classes
Lectures and final presentations will be held on site if possible (seminar room Unit Limnology; UZA 1, room no. 3.112), or alternatively will be organized digitally as online events if necessary (time slots will stay the same regardless of the mode of the course). Students will be informed in due time if the course changes from on-site to digital mode.
Course schedule:
- Lectures:
April 12, 2021; 2:00-4:00 PM
April 19, 2021; 2:00-4:00 PM
April 26, 2021; 2:00-4:00 PM
May 3, 2021; 2:00-4:00 PM
May 10, 2021; 2:00-4:00 PM
- Independent work on final assignments:
May 11 to May 30, 2021
- Oral presentations for final assessment:
May 31, 2021; 2:00-4:00 PM
Information
Aims, contents and method of the course
Assessment and permitted materials
Students will showcase their acquired skills in a final assignment by independently analyzing a microbial community dataset from real-life case studies using the introduced techniques. To this end, students will work in small groups (~3 students per group). The final assessment will consist of an oral presentation (~15 minutes per group) during which each group will present their results. In addition, each group will hand in their R-script used for the analysis. In the script, students will have to show their ability to properly document the steps carried out for the analyses and demonstrate a basic understanding of the functions they used. Students will have access to the lecture slides and are free to consult any type of resource (R package help pages, online tutorials, books, scientific publications, etc.) to help them with their analyses. In addition, students can reach out to the course instructor for help at any time.
Minimum requirements and assessment criteria
Regular attendance at the lectures is required. The evaluation will be based on the final group assignments, i.e. the oral presentation and the documentation of the analyses in the R-script. A total of 100 points (pts) can be achieved, with the individual elements weighted as follows:
- Oral presentation (80 pts in total):
* Appropriateness and creativity of the analysis (50 pts).
* Appropriateness and clarity of figures (20 pts).
* Presentation skills and discussion (10 pts).
- Documentation of the analysis in the R-script (20 pts in total):
* Logical organization (10 pts).
* Use of comments to explain what is done at each step (10 pts).
Final grades will be given based on the total number of points:
- 1 (excellent): 100-90 pts.
- 2 (good): 89-81 pts.
- 3 (satisfactory): 80-71 pts.
- 4 (sufficient, pass): 70-61 pts.
- 5 (unsatisfactory, fail): 60-0 pts.
All students from the same group will receive the same grade, assuming that each student contributed equally to the group’s work. It will be the students’ responsibility to organize the groups and see to it that workloads are divided evenly. Students will also be responsible for reaching out to the course instructor if an individual student strongly violates group agreements and does not meet his or her responsibilities within the group.
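As an illustration only, the point scheme and grade bands above can be written as a short calculation. The sketch below is in Python rather than R, and the names `MAX_POINTS` and `final_grade` are hypothetical, not part of the course materials. It also assumes that points are whole numbers, since the grade bands are stated for whole points only.

```python
# Maximum points per component, as listed in the assessment criteria.
# The dictionary keys are hypothetical labels chosen for this sketch.
MAX_POINTS = {
    "analysis": 50,       # appropriateness and creativity of the analysis
    "figures": 20,        # appropriateness and clarity of figures
    "presentation": 10,   # presentation skills and discussion
    "organization": 10,   # logical organization of the R-script
    "comments": 10,       # use of comments in the R-script
}


def final_grade(scores):
    """Return (total points, grade) for a dict of component scores.

    Assumes whole-number points; each lower band limit is treated
    as the threshold for the corresponding grade.
    """
    total = 0
    for name, maximum in MAX_POINTS.items():
        value = scores[name]
        if not 0 <= value <= maximum:
            raise ValueError(f"{name} must be between 0 and {maximum}")
        total += value

    if total >= 90:
        grade = 1  # excellent
    elif total >= 81:
        grade = 2  # good
    elif total >= 71:
        grade = 3  # satisfactory
    elif total >= 61:
        grade = 4  # sufficient, pass
    else:
        grade = 5  # unsatisfactory, fail
    return total, grade


example = {
    "analysis": 42,
    "figures": 17,
    "presentation": 8,
    "organization": 9,
    "comments": 7,
}
print(final_grade(example))  # (83, 2)
```

In this made-up example, a group scoring 67 points for the oral presentation and 16 points for the R-script documentation reaches 83 points in total, which falls in the band for grade 2 (good).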
Examination topics
All concepts and tools that are to be applied to the analysis of the case study datasets for the final assignments will be introduced during the lectures, i.e. organizing and documenting data projects and analyses, calculation of community diversity and differences in community composition, statistical tests including multivariate analyses of relationships between community composition and environmental parameters, and designing appropriate figures. Particular emphases will be highlighted for each individual dataset during the introduction of the case studies.
Reading list
The lecture slides and accompanying scripts will be the primary resource for this course. Useful, but not mandatory, literature for further reading:
- Borcard, D., Gillet, F., and Legendre, P. (2011). Numerical Ecology with R. Springer, New York, NY, USA. ISBN: 978-1-4419-7975-9.
- Buttigieg, P.L., and Ramette, A. (2014). A Guide to Statistical Analysis in Microbial Ecology: a community-focused, living review of multivariate data analyses. FEMS Microbiol. Ecol. 90:543-550. (GUSTAME; https://mb3is.megx.net/gustame).
- Chang, W. (2013). R Graphics Cookbook. O'Reilly Media, Sebastopol, CA, USA. ISBN: 978-1-449-31695-2.
- Greenacre, M., and Primicerio, R. (2013). Multivariate Analysis of Ecological Data. Fundación BBVA, Bilbao, Spain. ISBN: 978-84-92937-50-9.
- Legendre, P., and Legendre, L. (2012). Numerical Ecology. Elsevier, Amsterdam, NL & Oxford, UK. ISBN: 978-0-444-53868-0.
- Quinn, G.P., and Keough, M.J. (2002). Experimental Design and Data Analysis for Biologists. Cambridge University Press, Cambridge, UK. ISBN: 978-0-521-00976-8.
Association in the course directory
MEC-5
Last modified: Fr 26.02.2021 13:09
This course invites all students interested in community ecology and data analysis. While prior knowledge of R is an advantage, it is not a prerequisite for attending this course! Although concepts, examples, and case studies will mainly revolve around microbial community data derived from high-throughput marker gene sequencing, the introduced methods are also directly applicable to other types of ecological communities. The principal goal of this course is to familiarize students with R and the analysis of larger datasets (and show them that this is nothing to be afraid of). Students will learn how to organize data projects, document their analyses, and create publication-quality figures for effective and efficient data visualization. In addition, students will be introduced to standard concepts and tools for analyzing community data in ecology. (Note: in this course, we will not deal with the bioinformatic processing of raw sequence data.)
For the first 5 consecutive weeks of the course, each week will start with a 2-hour lecture, followed by time for self-study during the rest of the week so that students can familiarize themselves with the introduced topic. As a final assignment, students will independently analyze real-life case study datasets in small groups and present their results in oral presentations. The course structure is as follows:
- Week 1: Introduction to R and RStudio (software interface, data types, basic commands, etc.); setting up and organizing data projects and documenting the steps of the analyses.
- Week 2: Introduction to univariate and bivariate statistical analyses (correlation, regression, location tests, ANOVA, etc.).
- Week 3: Introduction to data types frequently used in community ecology (taxa abundance tables, phylogenetic trees, accompanying environmental data); overview and calculation of measures of community diversity and differences in community composition; visualization of multidimensional differences in community composition using ordination plots.
- Week 4: Multivariate statistical tests for analyzing differences in community composition and effects of environmental parameters.
- Week 5: Data visualization and figure design using ggplot2.
- Weeks 6-7: Independent analysis of case study datasets for final assignment.
- Week 8: Final student presentations.