In accordance with the research missions of HarvardX and MITx, both Harvard and MIT have convened groups charged with advancing research. Harvard University Provost Alan Garber convened the HarvardX Research Committee, a body composed of 15 faculty members from across the university. MIT President Rafael Reif announced the Office of Digital Learning, whose mission includes a commitment to research on digital learning. Together, these two bodies have been working to understand, organize, analyze, interpret, and disseminate the data that edX delivers to its founding institutions.

With hundreds of thousands of registrants, open registration, asynchronous use of course resources, and a rapidly evolving platform, familiar educational variables like “enrollment” and “achievement” were challenging to operationalize. Registering for a HarvardX or MITx course hosted on the edX platform requires nothing more than a few keystrokes and a few clicks. Course registrants are accountable to no one and use course resources whenever and however they wish. Terms like “student,” “grade,” and “course” nonetheless bring to mind conventional analogs in higher education, and related terms like “enrollment” and “completion” similarly trigger specific interpretations. We emphasize and demonstrate that this educational data context differs substantially from that of any conventional course, where registration costs far more than a few seconds and zero dollars. This perspective is consistent with some of the growing literature on MOOCs. [1]

With these challenges in mind, we offer caution in the form of four common fallacies that we perceive as particular threats to the interpretation of data from large open online courses.

1) We have all the data we could want.

The edX platform collects a large amount of data, approximately 20 GB per course. However, many variables that may interest researchers were not collected systematically in this first year. Examples include socioeconomic status, prior knowledge, motivations for enrolling in particular courses, detailed video interaction behaviors, and externally validated assessments of student learning. Such variables are rarely collected in on-campus college courses either, and one promise of HarvardX and MITx is the potential to enable more rigorous research on learning in on-campus courses as well. Online systems may make it easier to collect relevant data for research; however, the ability to log detailed online interactions does not necessarily confer upon the data any educational or policy relevance.

2) A small percentage is a small number.

In a new data context, interpreting the magnitude of numbers is challenging and subject to “framing”: the tendency for interpretations to differ depending upon an initial frame of reference. If the number of certificate earners in an open online course is 1,000, is that a large or small number? From an on-campus frame of reference, a professor may take years or decades to teach 1,000 students. From an online frame of reference, 1,000 is vanishingly small compared to the sizes of many online populations.

Percentages seem to address this problem by providing a frame of reference for comparison. If 100,000 students register, then one might expect that 100,000 have the opportunity to become certified. If 1,000 are ultimately certified as completers, then 1,000/100,000 = 1%, and this seems small. We argue that this is misleading. There is no doubt that course certification numbers are important indicators of the impact of an open online course offering; however, the diversity of possible uses of open online courses makes certification percentages problematic.
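To make the framing problem concrete, the following minimal sketch contrasts the two frames of reference using the hypothetical counts above; the residential section size and teaching load are our own assumptions for illustration, not values from this report:

```python
# Hypothetical counts from the text: 1,000 certificate earners
# among 100,000 registrants.
certified = 1_000
registrants = 100_000

# Online frame: certification as a share of all registrants.
print(f"{certified / registrants:.0%} of registrants earned certificates")  # 1%

# Residential frame: how long would one professor take to teach the same
# number of students? Section size and load are assumed, not reported values.
section_size = 50
sections_per_year = 2
years = certified / (section_size * sections_per_year)
print(f"Roughly {years:.0f} years of teaching {section_size}-person sections")  # 10 years
```

The same 1,000 learners look negligible in one frame and substantial in the other, which is precisely the interpretive hazard.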

As one of many anecdotes that illustrate the problem with certification percentages, consider the evening of July 24, 2013, when Anant Agarwal, the president of edX, appeared on the Colbert Report, a satirical news show hosted by the comedian Stephen Colbert on the Comedy Central television network. Figure 1 plots day-to-day registration cohorts as a thick solid line and shows that enrollment in HarvardX courses [2] more than tripled after the broadcast, rising from 406 registrations on Wednesday, July 24 (UTC), to 1,356 registrations on Thursday, after the Colbert Report broadcast. The numbers of these registrants who ultimately became certified in a course are shown as a thin solid line. The five-day average before the broadcast was 12 certified registrants per day, and the five-day average after the broadcast was 24 certified registrants per day, a doubling of certification numbers.

Of course, if certification doubles but registration triples, certification rates will drop. The bottom half of Figure 1 illustrates this slight drop, from 3.2% to 2.5% in the five-day average. Clearly the courses did not suddenly change in quality; rather, the audience changed in average composition. Yet we do not think that any instructor, researcher, or policymaker should begrudge Stephen Colbert for tripling registration and doubling certification. An increase in the number of registrants who are not ultimately certified can decrease certification rates, but if it is accompanied by an increase in the absolute number of registrants who learn, we argue that it should be regarded positively.

Figure 1. Daily number of registrations in HarvardX courses from June 24 to August 23, with the broadcast date of Anant Agarwal’s appearance on The Colbert Report shown (July 24, 2013). The number and percentage of these registrants who become certified are also shown.

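The arithmetic behind these five-day averages is simple to reproduce. The sketch below uses invented daily cohorts chosen only to match the averages reported above (12 versus 24 certified registrants per day, and rates of 3.2% versus 2.5%); the actual day-by-day HarvardX values are not reproduced here:

```python
from statistics import mean

# Illustrative daily cohorts: (registrations, eventually certified).
# Values are invented to match the reported five-day averages, except for
# the registration counts on July 24 and 25, which appear in the text.
daily = {
    "2013-07-20": (380, 12),
    "2013-07-21": (360, 12),
    "2013-07-22": (365, 11),
    "2013-07-23": (364, 13),
    "2013-07-24": (406, 12),   # day of the broadcast (UTC)
    "2013-07-25": (1356, 30),  # first full day after the broadcast
    "2013-07-26": (1000, 25),
    "2013-07-27": (900, 24),
    "2013-07-28": (800, 21),
    "2013-07-29": (744, 20),
}

dates = sorted(daily)          # ISO dates sort chronologically
windows = {"before": dates[:5], "after": dates[5:]}

for label, window in windows.items():
    regs = mean(daily[d][0] for d in window)
    certs = mean(daily[d][1] for d in window)
    print(f"{label}: {regs:.0f} registrations/day, "
          f"{certs:.0f} certified/day, rate {certs / regs:.1%}")
# before: 375 registrations/day, 12 certified/day, rate 3.2%
# after:  960 registrations/day, 24 certified/day, rate 2.5%
```

Certified registrants per day double while the certification rate falls, because the denominator grows faster than the numerator.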

3) Certification indicates learning.

While certificates are easy to count, certification is a poor proxy for the amount of learning that happens in a given course. Many registrants engage with courseware without choosing to complete the assessments required for certification. And certification is difficult or impossible for registrants who register late or after the course closes. This is part of the explanation for the low certification rates shown in Figure 1, particularly in the August timeframe, when certification for most courses was no longer possible. That so many registrants enroll and participate in courses without hope of earning a certificate illustrates how limited certification and certification rates are at describing learning.

Noncertified registrants may have learned a great deal from a course, and certified registrants may have learned little. Some registrants may already be experts and may merely wish to earn certification, a situation that is rare in residential education because of the larger monetary and opportunity costs students bear when registering for a course. More generally, instructors have limited assessment capabilities and grading options compared to those in residential courses. And protections against academic dishonesty are still limited. These challenges do not render the assessment of learning impossible, but they should limit interpretations of certification or certification rates as proxies for registrant or course-wide learning.

4) A course is a course is a course.

This report reviews courses that differ dramatically on multiple dimensions. Beyond the most obvious difference of course content, there are structural differences in the design and duration of courses. There are essential contrasts in the philosophy of the instructors and the expectations of the registrants. Instructors took dramatically different approaches to video design and distribution. Approaches to assessments and criteria for certification differed widely. Although MITx courses are structurally more similar to one another than HarvardX courses are, we emphasize that the diversity among HarvardX and MITx courses is considerable and reflects the diversity of the curricula of their parent institutions.

We intend comparisons of certification rates, gender ratios, grade distributions, and relative activity to reflect the variation in course content and design, as well as variation in registrant background and intention. These metrics should not be misinterpreted to indicate that a course, its instructors, or its registrants are somehow “better” than others on any dimension. Such comparisons are at best unsupported by the data and at worst obviously incorrect.