Research IT

Data Mining Student Engagement Patterns

Back in 2019, two Research Software Engineers (RSEs) were co-authors on a paper which aimed to motivate users to reflect on their search behaviour, and to experiment with different search functionalities. This work has now been extended using data mining.


Research Software Engineers (RSEs) Ann Gledson and Aitor Apaolaza are co-authors on a recently published paper which looks at using data mining to characterise student engagement patterns and predict learning outcomes on online learning platforms. This work was recently accepted for UMAP 2021 (the 29th Conference on User Modeling, Adaptation and Personalization).

Previous research has used high-level user activities such as posts in forums, downloads of learning materials and time spent watching videos, to categorise users into known engagement types. Ann and Aitor worked with Markel Vigo from the Department of Computer Science (University of Manchester), Sabine Barthold and Franziska Günther (Technical University of Dresden), to extend such detection methods by clustering sequences of low-level events from user interaction logs, such as mouse and keyboard use, to isolate meaningful interactive behavioural markers, indicative of state-of-the-art engagement metrics.

To do this, low-level user interaction events were grouped into 'n-grams' (short interaction sequences), which were then hierarchically clustered to classify students into learning types. This methodology was applied to a MOOC (massive open online course) that ran for four weeks and involved 224 students. The classifications found using this new technique supported those found using traditional methods, with the added benefit that the technique is applicable to any online learning site, as it needs no prior knowledge of site content or structure. This means that the same process can be used by other researchers on other online learning sites.
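To make the idea concrete, here is a minimal Python sketch of the general approach: turn each student's event log into a profile of n-gram frequencies, then group students by agglomerative (hierarchical) clustering. It is an illustration only, not the pipeline used in the paper. The event names, the toy logs, the choice of bigrams, the cosine distance and the average-linkage rule are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
import math


def ngrams(events, n):
    """Return every contiguous run of n events as a tuple."""
    return [tuple(events[i:i + n]) for i in range(len(events) - n + 1)]


def ngram_profile(events, n):
    """Relative frequency of each n-gram in one student's event log."""
    counts = Counter(ngrams(events, n))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def cosine_distance(p, q):
    """1 - cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return 1.0 - dot / norm if norm else 1.0


def agglomerative_clusters(profiles, k):
    """Merge the two closest clusters (average linkage) until k remain."""
    clusters = [[name] for name in profiles]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(cosine_distance(profiles[x], profiles[y])
                    for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical interaction logs (invented for illustration).
logs = {
    "s1": ["click", "scroll", "scroll", "click", "scroll", "scroll"],
    "s2": ["click", "scroll", "scroll", "click", "scroll"],
    "s3": ["keydown", "keydown", "click", "keydown", "keydown"],
    "s4": ["keydown", "keydown", "keydown", "click", "keydown"],
}
profiles = {student: ngram_profile(events, 2) for student, events in logs.items()}
groups = agglomerative_clusters(profiles, 2)
print(groups)
```

Because the profiles are built only from generic interface events, nothing in the sketch depends on what the pages contain, which is the property that lets the approach transfer between learning sites.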

In practical terms, modelling interactive behaviours using streams of events enables the detection of attrition, thus allowing interventions to prevent student drop-outs. Because these behaviours are highly interpretable through n-grams, this opens up new research avenues into detecting such behaviours in real time. Scripts could be injected into the MOOC to track particular sequences of user interface events and deliver an intervention when they detect a behavioural pattern that might indicate a certain (dis)engagement mode.
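The kind of real-time tracking described above could look something like the following sketch, which watches a stream of events and fires a callback whenever the most recent events match a known n-gram. It is written in Python purely to show the logic; a script injected into a web page would use the browser's own scripting language. The class name `PatternWatcher`, the callback and the example pattern are hypothetical and are not taken from the paper.

```python
from collections import deque


class PatternWatcher:
    """Fire a callback when the latest events match a watched n-gram."""

    def __init__(self, patterns, on_match):
        # patterns: non-empty iterable of non-empty event sequences.
        self.patterns = {tuple(p) for p in patterns}
        self.on_match = on_match
        # Only the last max-pattern-length events ever need to be kept.
        self.window = deque(maxlen=max(len(p) for p in self.patterns))

    def feed(self, event):
        """Record one event and check every watched pattern against the tail."""
        self.window.append(event)
        recent = tuple(self.window)
        for pattern in self.patterns:
            if recent[-len(pattern):] == pattern:
                self.on_match(pattern)


# Hypothetical pattern: two scrolls followed by the window losing focus.
interventions = []
watcher = PatternWatcher([("scroll", "scroll", "blur")], interventions.append)
for event in ["click", "scroll", "scroll", "blur", "focus"]:
    watcher.feed(event)
print(interventions)
```

In a real deployment the callback would trigger the intervention itself, and the watched patterns would come from the clusters associated with disengagement rather than being written by hand.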

Research IT expertise was invaluable to the research: Ann and Aitor contributed to the design of the study, performed the data pre-processing, applied the machine learning methods used to visualise user engagement patterns, and contributed to the evaluation methods used.

This research was undertaken as part of the MOVING EU Project. The vision of the MOVING project is to develop an innovative training platform that enables people from all societal sectors (companies, universities, public administration) to fundamentally improve their information literacy by training them to choose, use, reflect on and evaluate data and text mining methods in connection with their daily research tasks.

If you are interested in having a Research Software Engineer work with you on your research project please get in touch with us for an initial consultation.