Our research question, ‘What is the topic of students’ posts and the role assumed by posters on an open discussion forum?’ provided the conceptual framework for this study. Our approach entailed classifying the topics into meaningful categories so that they could be quickly located and clustered in ways that would facilitate future research. To accomplish this, we applied codes that described the primary themes of the posts.

Our first task was to gain perspective regarding the appropriate number and types of categories that should be used to describe the posts. This inductive methodology, referred to as open coding, is commonly used at this stage to identify, describe, or categorize phenomena found in qualitative data (Corbin & Strauss, 1990). A committee of five researchers performed open coding on a series of randomly selected threads to gain a general awareness of the type of communication that occurred in this space. Our findings from this examination were wide-ranging; we noted student exchanges containing advice for studying course material, questions and responses related to understanding course content or accessing course materials, commiseration with fellow students about difficulties experienced in solving homework problems or in using the course technology, requests for group formation in other online sites, e.g., Facebook, and even comments related to students’ physiological states, indicating hunger or sleep deprivation.

Following this first review, we developed a tentative list of descriptive codes that would encompass the topics and roles we had noted. Creation of codes that possess conceptual and structural order is important to the viability of a coding schema. Miles and Huberman (1994) emphasize that codes should relate to one another in coherent, study-important ways. We chose codes that we posited to be associated with our student outcomes of interest—persistence and achievement. We began with four codes for topic—content, user interface/course structure, community-building/interpersonal, and nonspecific affective—and two codes for role—help-seeker and help-giver.

After defining or operationalizing our codes, we trialed them on randomly selected posts. It immediately became apparent that we needed to expand this framework and refine our definitions. We added the code ‘tangential topic’ to describe posts in which students appeared to be transferring their understanding of circuits and electronics to other contexts. We also combined two codes, community-building/interpersonal and nonspecific affective, into a single code, social/affective, because we found these posts to be closely related. We split user interface/course structure into two separate codes, course website/technology and course structure/policies, because there appeared to be two distinct types of posts and we hypothesized that each type might show a different association with our outcomes of interest. We developed the codes ‘missing data,’ for posts that had been removed by course staff, and ‘non-English,’ for posts that appeared in other languages. For role of the poster, we added ‘other’ as a code to be applied to posts in which students did not ask for or receive help or information, but instead expressed an opinion.

We elaborated upon our former and newly added definitions, and then trialed the revised framework on an additional 500 posts. This trial was arranged so that each of the five researchers coded a total of 200 posts: 100 of those posts were also coded by a second researcher, and the remaining 100 were also coded by a third researcher. This type of procedure, known as check-coding, allows researchers to identify code definitions that need elaboration, or even codes that need to be redefined (Miles & Huberman, 1994). It also allowed us to evaluate inter-coder reliability, or the degree of agreement between coders. For this round of coding, the inter-coder reliability, calculated via percent agreement, ranged from 75% to 91% across the five pairings of researchers for topic of the post, and from 76% to 86% for role of the poster.
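The check-coding arithmetic described above can be illustrated with a short sketch. It assumes a ring-style overlap (each block of 100 posts is shared by two neighboring coders), which is one arrangement consistent with the description but is not confirmed by our records here; the function names are hypothetical. It also compares a single code per post, whereas the handling of dual-coded posts in the agreement calculation is not specified above.

```python
from collections import Counter


def ring_assignment(n_coders=5, block_size=100):
    """Assumed design: block b is coded by coder b and coder (b + 1) mod n."""
    blocks = []
    for b in range(n_coders):
        start = b * block_size
        post_ids = range(start, start + block_size)
        blocks.append((post_ids, (b, (b + 1) % n_coders)))
    return blocks


def workload(blocks):
    """Count how many posts each coder is responsible for."""
    load = Counter()
    for post_ids, coders in blocks:
        for coder in coders:
            load[coder] += len(post_ids)
    return load


def percent_agreement(codes_a, codes_b):
    """Percentage of posts to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same posts")
    if not codes_a:
        raise ValueError("no posts to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


blocks = ring_assignment()
print(workload(blocks))  # posts per coder under the assumed ring design

# Toy labels for illustration only; these are not study data.
coder_a = ["content", "other", "content", "social/affective"]
coder_b = ["content", "other", "other", "social/affective"]
print(percent_agreement(coder_a, coder_b))
```

Percent agreement does not correct for agreement expected by chance, so it is best read as a descriptive check on the code definitions rather than as a chance-adjusted reliability statistic.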

Following this trial, we each met with the two other researchers who coded the same posts to discuss discrepancies in our coding. We continued to refine our code definitions until we reached agreement on a framework that contained well-defined codes for all of the types of posts we encountered. During this process, we changed the ‘tangential topic’ code to ‘other coursework’ to categorize posts in which students made references to courses other than 6.002x. We found that identifying posts that discussed topics tangential to course content was difficult; we reasoned that identifying posts in which students referred to other courses might still allow us to detect some transfer of course understanding to other contexts. To further clarify the meaning of our codes, we found specific examples of posts related to each one and incorporated all of the information into a printed document that we referred to as our ‘codebook.’

The product of our efforts was a two-dimensional coding framework that classifies posts according to the topic of the post and the role that the poster assumed. We defined eight subcategories for topic and three subcategories for role of poster, as shown in Tables 1 and 2 (see Appendix A for further detail).

Table 1. Topic of Post

| Code | Definition |
|---|---|
| 1. Content | Posts specifically addressing circuits and electronics material |
| 2. Other coursework | Posts discussing courses other than circuits and electronics |
| 3. Social/affective | Posts addressing social, emotional, or community-building aspects of the class |
| 4. Course website/technology | Posts addressing the online interface |
| 5. Course structure/policies | Posts regarding the course organization, guidelines, or requirements |
| 6. Other | Posts conveying anything not related to class content, other courses, social aspects of the class, course website or technology, or course requirements |
| 7. Missing data | Posts in which data had been censored by course staff (in these cases, particular thread numbers could not be located) |
| 8. Non-English | Posts written in other languages |


Table 2. Role of Poster

| Code | Definition |
|---|---|
| 1. Help-seeker (or Information-seeker) | Posts in which the poster asked for help, information, pointers, etc. |
| 2. Help-giver (or Information-giver) | Posts in which the poster gave help, insight, or provided information |
| 3. Other | Posts in which the poster was not explicitly seeking, declaring, or providing information (for example, posts expressing an opinion) |

As we trialed each iteration of the coding framework, we had to make several important decisions about our coding protocol. First, we found it necessary to allow dual codes to encompass the complexities of the topic and role dimensions. This meant that up to two codes could be applied to categorize the topic of one post, and up to two codes could be applied to describe the roles assumed by the poster. In a single post, it was not uncommon to find a response to another student’s question about course content, an expression of emotion about understanding or not understanding that content, and a general question to other students regarding whether they would give a different answer to the question. In that illustration, the student’s response to another student’s question about content would be coded as ‘help-giver’ for role and ‘content’ for topic; their expression of emotion would be coded as ‘social/affective’ as a second topic; and their question about alternate answers would be coded as ‘help-seeker’ as a second role.
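For readers who wish to store codes in this two-dimensional form, the following is a minimal sketch of one possible record structure. The class and field names are hypothetical and were not part of our codebook; the only rules enforced are the ones stated above (codes drawn from Tables 1 and 2, and at most two codes per dimension), plus an assumed rule that the same code is not applied twice to one post.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Topic(Enum):
    CONTENT = 1
    OTHER_COURSEWORK = 2
    SOCIAL_AFFECTIVE = 3
    COURSE_WEBSITE_TECHNOLOGY = 4
    COURSE_STRUCTURE_POLICIES = 5
    OTHER = 6
    MISSING_DATA = 7
    NON_ENGLISH = 8


class Role(Enum):
    HELP_SEEKER = 1
    HELP_GIVER = 2
    OTHER = 3


MAX_CODES_PER_DIMENSION = 2  # dual coding: at most two topics and two roles


@dataclass(frozen=True)
class CodedPost:
    post_id: str  # hypothetical identifier
    topics: Tuple[Topic, ...]
    roles: Tuple[Role, ...]

    def __post_init__(self):
        checks = (("topics", self.topics, Topic), ("roles", self.roles, Role))
        for name, codes, kind in checks:
            if len(codes) > MAX_CODES_PER_DIMENSION:
                raise ValueError(f"{name}: at most two codes are allowed")
            if len(set(codes)) != len(codes):
                # Assumption: a code is not repeated within one post.
                raise ValueError(f"{name}: duplicate code")
            if not all(isinstance(code, kind) for code in codes):
                raise TypeError(f"{name}: unexpected code type")


# The illustration from the text: an answer about content, an expression of
# emotion, and a follow-up question to other students.
example = CodedPost(
    post_id="example-post",
    topics=(Topic.CONTENT, Topic.SOCIAL_AFFECTIVE),
    roles=(Role.HELP_GIVER, Role.HELP_SEEKER),
)
print(example)
```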

Our second important decision was that we would code posts independently of the thread in which they appeared. When students began a conversation about a particular topic, they started a new ‘thread’ on the forum. Each thread contained a series of posts, in which other students commented on or responded to the original question or statement. Our preliminary work involved coding only the first post of each thread because we believed this would identify the topic of that particular conversation. However, we found that one thread often contained multiple topics and that the topics were not always accurately identified in the first post. Often students’ original statement of the problem was a misdiagnosis. The most common examples of this were threads in which students attributed difficulties with homework to technology errors that were actually their own errors in calculation or in understanding course content. In order to gain a more accurate sense of the number of posts in each of our coding categories, we elected to code at the individual post level rather than at the thread level. This decision also allowed us to gain more information regarding help- or information-givers because the first posts in threads were typically initiated by students asking questions, and the help- or information-giving role became more evident in subsequent posts within the same thread.
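The consequence of this decision can be seen in a small tally. The sketch below uses two invented threads (not study data) and hypothetical field names to contrast counts taken from first posts only with counts taken over every post; a dual-coded post contributes to each of its codes.

```python
from collections import Counter

# Invented toy threads for illustration only; each post lists its codes.
threads = {
    "thread-1": [
        {"topics": ["course website/technology"], "roles": ["help-seeker"]},
        {"topics": ["content"], "roles": ["help-giver"]},
        {"topics": ["content", "social/affective"],
         "roles": ["help-giver", "help-seeker"]},
    ],
    "thread-2": [
        {"topics": ["course structure/policies"], "roles": ["help-seeker"]},
        {"topics": ["course structure/policies"], "roles": ["help-giver"]},
    ],
}


def tally(posts, dimension):
    """Count every code applied on one dimension ('topics' or 'roles')."""
    counts = Counter()
    for post in posts:
        counts.update(post[dimension])
    return counts


first_posts = [posts[0] for posts in threads.values()]
all_posts = [post for posts in threads.values() for post in posts]

thread_level_roles = tally(first_posts, "roles")
post_level_roles = tally(all_posts, "roles")

print("first posts only:", dict(thread_level_roles))
print("all posts:", dict(post_level_roles))
```

In this toy example the first posts contain no help-giver codes at all, and the thread that opens as a technology complaint turns out to be about content in later posts, which mirrors the two problems described above.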