    GIR Member Viewpoint - June 2012

    Learning Analytics: The Possibilities and the Unknowns for Medical Education

    Dr. Janet Corral, Faculty, Educational Informatics, Academy of Medical Educators and Department of Internal Medicine, University of Colorado School of Medicine

    Data analytics are widely used in business and in computing, but they have only recently been applied to education. Learning analytics is an emerging field for “the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs” (Siemens et al., 2011, p. 4). The general approach is to mine data from one or more digital systems that record student learning activities, and to provide visual reports that guide or improve student performance. Because the systems from which learning analytics data are drawn, such as learning management systems (LMS) or learning content management systems (LCMS), may be used for both learning and administrative purposes, learning analytics must be carefully distinguished from academic analytics. Academic analytics is the analysis of administratively strategic data sets, such as student retention and graduation (Campbell & Oblinger, 2007).

    The allure of analytics is that we might predict student performance using data readily available in digital learning systems. While such an approach is conceptually attractive, several issues require careful examination before implementation. This GIR viewpoint examines learning analytics, including challenges particular to our medical education community as well as those that apply broadly within education.

    One Case of Learning Analytics: Purdue University’s ‘Course Signals’

    A popular example of analytics in practice is the Course Signals system at Purdue University. By mining data on student activities and grades in the LCMS, Course Signals assigns green, amber or red signals to students, visible on faculty and student dashboards. The signals serve as flags that prompt faculty to intervene. Initial results indicate that student grades do improve:

    “Overall, students in courses using Course Signals receive more Bs and Cs and fewer Ds and Fs than previous sections of the course that did not utilize Course Signals. As and Bs have increased by as much as 28% in some courses. In most cases, the greatest improvement is seen in students who were initially receiving Cs and Ds in early assignments, and pull up half a letter grade or more to a B or C” (Purdue University, 2011).

    Notably, the visual report of the analytical result is only a trigger; the key intervention is the email that instructors send to students to open a discussion about the reasons for a particular green, amber or red rating. In order to meet the aim of guiding or improving student performance, learning analytics need to be accompanied by pre-selected intervention strategies and administrative support.
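
    To make the traffic-light idea concrete, here is a minimal sketch of how a signal might be derived from a grade and an activity measure. It is not Purdue's algorithm, which is not described here; the field names, the thresholds (60%, 75%, two logins per week) and the helper names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StudentActivity:
    student_id: str
    grade_pct: float        # current course grade, 0-100 (assumed measure)
    logins_per_week: float  # LMS activity measure (assumed measure)


def assign_signal(s, grade_cut=(60.0, 75.0), min_logins=2.0):
    """Return 'red', 'amber' or 'green' using illustrative thresholds."""
    low, high = grade_cut
    if s.grade_pct < low:
        return "red"
    # A middling grade OR low activity is enough to raise an amber flag.
    if s.grade_pct < high or s.logins_per_week < min_logins:
        return "amber"
    return "green"


def students_to_contact(students):
    """List the students whose signal should prompt an instructor email."""
    return [s.student_id for s in students if assign_signal(s) != "green"]


cohort = [
    StudentActivity("s1", grade_pct=52.0, logins_per_week=4.0),
    StudentActivity("s2", grade_pct=82.0, logins_per_week=1.0),
    StudentActivity("s3", grade_pct=90.0, logins_per_week=5.0),
]
flagged = students_to_contact(cohort)
```

    In this sketch the signal only produces a list of students to contact; the conversation that follows, which is the actual intervention, remains the instructor's work.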

    The Course Signals system focuses on pre-medical programs of study. This approach to learning analytics has been replicated elsewhere (García, Romero, Ventura, & Castro, 2009; Romero, Ventura, & Garcia, 2008). Within undergraduate medical education, and arguably within a more limited number of graduate and continuing medical education programs, LMS and LCMS systems might be leveraged in similar ways to Purdue’s pre-med programs. However, where medical education curricula and evaluation are distributed across systems (e.g. clinical evaluation system, examination system, logbooks, etc.), other ways of achieving learning analytics need to be considered.

    Limitations of Learning Analytics

    There are several limitations to learning analytics, at least in the forms most commonly implemented today. First, learning analytics presents a deterministic view that learners’ web navigation and time spent on pages equate to intentional activity. One problem with this view is that the same data may also represent accidental movements, interest in “seeing what is there” within the web, or less thoughtful clicking. A related concern is the difficulty of capturing and describing the affective states, interests, or dispositions of learners. Independent data points do not reflect the “individual actions within a sequence of interactions (that) are often highly dependent upon one another” (Shute, 2011, p. 509). Particularly in the training of health sciences professionals, team activities and interprofessional experiences are not well captured by what is essentially navigational data.
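
    The ambiguity of navigational data can be illustrated with a short sketch that derives time on page from a click log. The log format, the cut-offs (5 seconds and 30 minutes) and the labels are assumptions made for this example, not an established method; the point is that the labels are guesses about intent that the data alone cannot confirm.

```python
from datetime import datetime


def dwell_times(events):
    """events: list of (iso_timestamp, page) for one learner, sorted by time.

    Returns (page, seconds) for every event except the last, since the
    time spent on the final page cannot be derived from the log.
    """
    out = []
    for (t0, page), (t1, _next_page) in zip(events, events[1:]):
        delta = datetime.fromisoformat(t1) - datetime.fromisoformat(t0)
        out.append((page, delta.total_seconds()))
    return out


def label_views(events, min_seconds=5.0, max_seconds=1800.0):
    """Attach a tentative label to each page view (illustrative cut-offs)."""
    labelled = []
    for page, seconds in dwell_times(events):
        if seconds < min_seconds:
            label = "possibly accidental"
        elif seconds > max_seconds:
            label = "possibly idle"
        else:
            label = "plausibly engaged"
        labelled.append((page, seconds, label))
    return labelled


log = [
    ("2012-06-01T10:00:00", "syllabus"),
    ("2012-06-01T10:00:02", "module-1"),
    ("2012-06-01T10:05:02", "quiz"),
]
views = label_views(log)
```

    Even a view labelled "plausibly engaged" here says nothing about what the learner was thinking or feeling, or whether the page was read at all.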

    Second, as a relatively new field drawing upon pre-existing approaches from business (e.g. customer retention, sales) and social network analysis, learning analytics has few tools and methods of its own. Some of the existing analytics services are free (e.g. Google Analytics), others are open source software (e.g. MERC, NodeXL, ProM, Cloudera), and others are commercial (e.g. Knewton, StatsMix). Each may use different terminology and definitions to interpret and report online actions; each may also use different underlying data mining approaches and different mathematical formulas for analysis (Siemens et al., 2011). While medical education research has well-established research methods, learning analytics is too new to have much standardization. The responsibility therefore falls to the teams involved in the analysis to be certain of what the data truly mean, and then to define, analyze and report the data in ways that honor teachers, learners, and learning.

    Third, monitoring and profiling raise concerns about the power differential between students and the administrative staff who run the digital systems for learning, and invite debate about the true pedagogical usefulness of tracking students’ activities online (Johnson et al., 2011; Lazakidou & Retalis, 2010; Retalis, Papasalouros, Psaromiligkos, Siscos, & Kargidis, 2006).

    Lastly, medical education data are derived from a relatively small number of users, which limits the size of the data sets available for mining. In comparison, large corporations (e.g. Facebook, Google) have sufficient numbers to identify common patterns and aberrant activities. While the business world is wading in petabytes of data, medical education would have to combine data across schools in order to better understand the emerging picture of what defines “success” in medical education. Open learning analytics has been proposed as a way to pool data in a shared repository for further mining (Siemens et al., 2011).

    The “Learning” in Learning Analytics

    The learning management system is but one source of data on student learning. Students’ navigations across the web and social media leave a trail of breadcrumbs which may be gathered to build a composite picture of learning activity (Retalis, Papasalouros, Psaromiligkos, Siscos, & Kargidis, 2006). Privacy issues notwithstanding, new open source systems such as Tin Can may solve some of the data collection issues across web and mobile applications by providing the means to capture and aggregate learner activities across user names and web-based activities.
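
    As a rough illustration of aggregating activity across user names, the sketch below uses simplified actor-verb-object records in the spirit of Tin Can statements. These are not complete or validated statements, and the account names, the identity map and the aggregate function are hypothetical, invented for this example.

```python
from collections import Counter, defaultdict

# Simplified actor-verb-object records (illustrative data only).
statements = [
    {"actor": "jdoe@lms.example.edu", "verb": "completed",
     "object": "cardiology-module-1"},
    {"actor": "jane.doe@mobile.example", "verb": "attempted",
     "object": "ecg-quiz"},
    {"actor": "asmith@lms.example.edu", "verb": "completed",
     "object": "cardiology-module-1"},
]

# Hypothetical mapping from each account name to one learner identifier.
identity_map = {
    "jdoe@lms.example.edu": "learner-001",
    "jane.doe@mobile.example": "learner-001",
    "asmith@lms.example.edu": "learner-002",
}


def aggregate(statements, identity_map):
    """Count verbs per learner after resolving account names."""
    per_learner = defaultdict(Counter)
    for st in statements:
        # Unmapped accounts are kept under their own name.
        learner = identity_map.get(st["actor"], st["actor"])
        per_learner[learner][st["verb"]] += 1
    return {learner: dict(verbs) for learner, verbs in per_learner.items()}


summary = aggregate(statements, identity_map)
```

    The identity map is where the privacy questions concentrate: linking accounts is what makes the composite picture possible, and it is also what makes a learner traceable across systems.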

    Underlying these discussions is an acknowledgement that learning is a complex and messy task. Analytics could reduce learning to information gathered and web pages viewed, ignoring the rich mentorship, resilience, critical thinking and collaboration skills that make up the constellation of behaviours and activities associated with learning. Foundational work to more clearly define the types of learning, what is learned, and the protocols and infrastructure connections required is underway (Buckingham Shum, 2012; Siemens et al., 2011). These authors argue that learning analytics should be explored as subsets of skills, knowledge, and aptitudes, layered with metrics that identify challenges, difficulties, failures, feedback and other learning elements, in order to reflect the complexity, adaptation and perseverance of learners. The complexity of learning belies the simplicity of data mining.


    Conclusions and Next Directions

    The role and true impact of learning analytics in medical education are, at present, unclear. However, as granting agencies such as the Bill and Melinda Gates Foundation increasingly focus on analytics as a way to improve the quality of education, learning analytics is a trend we should at least attempt to understand, if not proactively define for our community. At the same time, the ethics of data mining and reporting require further investigation, including how best to report analytics in support of student success. As our medical education community navigates learning analytics, the extent to which we can claim success should depend in significant part on the depth of the questions we choose to ask about what it means to be a learner and on how we choose to define and measure learning. Let us not look back years from now and realize that we let our excitement over mining data distract us from our real role in supporting the educational excellence of physicians.


    Buckingham Shum, S., & Deakin Crick, R. (2012). Learning Dispositions and Transferable Competencies: Pedagogy, Modelling and Learning Analytics. Paper presented at the 2nd International Conference on Learning Analytics & Knowledge, 29 Apr-2 May, 2012, Vancouver, BC.

    García, E., Romero, C., Ventura, S., & Castro, C. (2009). An architecture for making recommendations to courseware authors using association rule mining and collaborative filtering. User Modeling and User-Adapted Interaction, 19(1), 99-132.

    Purdue University. (2011). Course Signals. Retrieved November 7, 2011, from https://dl.acm.org/citation.cfm?id=2330666

    Retalis, S., Papasalouros, A., Psaromiligkos, Y., Siscos, S., & Kargidis, T. (2006). Towards Networked Learning Analytics – A concept and a tool. Paper presented at the Networked Learning Conference, Lancaster, UK.

    Romero, C., Ventura, S., & Garcia, E. (2008). Data mining in course management systems: Moodle case study and tutorial. Computers & Education, 51(1), 368-384. doi: 10.1016/j.compedu.2007.05.016

    Shute, V. J. (2011). Stealth assessment in computer-based games to support learning. In S. Tobias & J. D. Fletcher (Eds.), Computer games and instruction (pp. 503-524). Charlotte, NC: Information Age Publishers.

    Siemens, G., Gasevic, D., Haythornthwaite, C., Dawson, S., Shum, S. B., Ferguson, R., . . . Baker, R. S. J. D. (2011). Open Learning Analytics: an integrated & modularized platform. Retrieved from SOLAR Resources website: https://www.solaresearch.org/resources/