★ Research and applications ★

Utilization of two web-based continuing education courses evaluated by Markov chain model

Hao Tian , Jin-Mann S Lin , William C Reeves
DOI: http://dx.doi.org/10.1136/amiajnl-2011-000287 489-494 First published online: 1 May 2012


Objectives To evaluate the web structure of two web-based continuing education courses, identify problems and assess the effects of web site modifications.

Design Markov chain models were built from 2008 web usage data to evaluate the courses' web structure and navigation patterns. The web site was then modified to resolve identified design issues and the improvement in user activity over the subsequent 12 months was quantitatively evaluated.

Measurements Web navigation paths were collected between 2008 and 2010. The probability of navigating from one web page to another was analyzed.

Results The continuing education courses' sequential structure design was clearly reflected in the resulting actual web usage models, and none of the skip transitions provided was heavily used. The web navigation patterns of the two different continuing education courses were similar. Two possible design flaws were identified and fixed in only one of the two courses. Over the following 12 months, the drop-out rate in the modified course significantly decreased from 41% to 35%, but remained unchanged in the unmodified course. The web improvement effects were further verified via a second-order Markov chain model.

Conclusions The results imply that differences in web content have less impact than web structure design on how learners navigate through continuing education courses. Evaluation of user navigation can help identify web design flaws and guide modifications. This study showed that Markov chain models provide a valuable tool to evaluate web-based education courses. Both the results and techniques in this study would be very useful for public health education and research specialists.

  • CDC
  • computer science
  • continuing education
  • Markov chain
  • online education
  • public health
  • user behavior
  • web usage mining

The present study used a Markov chain model to evaluate the usage and design efficiency of two chronic fatigue syndrome (CFS) web-based continuing education courses offered through the US Centers for Disease Control and Prevention (CDC) CFS web site, course WB1032: CFS—Diagnosis and management (http://www.cdc.gov/cfs/education/wb1032/), intended for physicians, nurse practitioners and physician assistants, and course WB3151: CFS—A primer for allied health professionals (http://www.cdc.gov/cfs/education/wb3151/), intended for non-physicians. Our objectives in the present study were to evaluate the web structure of the two CFS continuing education courses and assess the effects of web site modifications.


Healthcare providers' continuing professional development requires they understand advances in medicine and changes in delivery of care. Continuing medical education (CME) traditionally involved attendance at conferences and courses, but various distance learning technologies and methods have become increasingly important (eg, audio teleconferencing, compressed videoconferencing and CD-ROM). More recently, web-based continuing education has been found to provide a convenient, cost-effective and self-directed learning opportunity,1–3 and may be superior to live interactive workshops.4 There is evidence that web-based CME enhances physicians' professional competence, performance and patient outcomes.5,6 However, most evaluative research on web-based education has focused on participant satisfaction evaluated by questionnaires, telephone interviews and email feedback.7–9 Some studies have used controlled trials to evaluate the effectiveness of internet-based CME activities.3,4,10 These focused on knowledge and/or performance improvement in physicians after they completed certified CME activities.

Data in the 2009 Accreditation Council for Continuing Medical Education annual report11 show that web-based enduring materials accounted for approximately 30% of certified CME activities. Equally important, many physicians use online CME materials without seeking formal credits.12 A limitation of satisfaction-based or controlled trial-based evaluation methods is that they focused on the participants who usually sought CME credits, with the implicit assumption that those taking a course for a credit spend more effort learning than the non-credit-seeking visitors to the CME materials. In the case of open access web-based continuing education courses, participants can be anyone who is interested in the content. Therefore, it is particularly important to include all types of visitors, both credit and non-credit seeking visitors, in the evaluation. Students' use of a course is reflected by the manner in which they navigate through it, which reflects their learning behavior and is recorded during web education sessions. Knowledge concerning how the structure and content of a web-based continuing education course can sustain the interest of visitors is important.

While substantial research has measured the impact of web-based CME materials and activities on physicians' performance, only a few studies have considered the design and usability of web-based medical continuing education courses. Optimal web design and usability can improve a course's usefulness by facilitating the communication of information to learners. The present study concentrated on understanding how readers actually use a course (ie, learning behavior). We also applied a new approach to measure and improve the usability of open access web-based continuing education courses. A study with a similar focus assessed the usage of a German CME web site by 59 general practitioners in terms of visiting frequency, preferred days and hours and most-viewed content,13 but it did not evaluate their navigation paths through the web site.

How visitors use online continuing education courses can be described by modeling their behavior during sessions. These models should help course providers better present course content and will assist them in selecting the appropriate level of interactivity, and determining the needs for developing more complex learning environments such as adaptive and personalized continuing education courses. Markov chain models provide such a description14 by identifying the states of a sequential process at successive times. Markov chain models have been used to describe navigation patterns,15 to predict web page access in web recommendation systems,16 to identify groups of visitors with similar navigation behaviors,17 to represent the conceptual structure of a web site,18 and to test and evaluate the reliability of web applications.19 However, we are not aware of any published study utilizing a Markov chain model to examine web structural design and to evaluate the improvement of web modifications for web-based continuing education courses in public health.

Methods

Continuing education courses

The two CFS continuing education web-based courses we evaluated were released on the CDC CFS web site (www.cdc.gov/cfs/education/index.html) in December 2007, course WB1032: CFS—Diagnosis and management, and course WB3151: CFS—A primer for allied health professionals. They were open access; no registration was required. Both courses had a simple static linear instructional design. They had the same web class structure (figure 1) comprising eight components: syllabus, introduction, three core content chapters, appendix, case studies and references. Those seeking continuing education credits need to take a test hosted on a separate CDC web site that does not record web usage data from its users. As a result, we could not link the two sites and determine who took the course for general interest, specific information, or formal credit.

Figure 1

Web class structure of chronic fatigue syndrome continuing education (CE) courses.

This study included two phases. First, we utilized a Markov chain model to evaluate the web structure of the two courses and to identify the main navigation patterns. Findings identified issues amenable to modification and improvement. In the second phase, we quantitatively assessed the effects of modifications in the first-order Markov chain models, by comparing the transition probability changes, and explored the detailed effects in second-order Markov chain models.

Data collection

Web usage data were collected by the Omniture web tracking system (Adobe Systems Incorporated, San Jose, California, USA). We initially evaluated navigation paths over the first year following the release of two continuing education courses (January 1, 2008 to December 31, 2008). Based on this evaluation we modified one course accordingly in February 2009. Then, we analyzed the navigation paths for the next 12 months (March 1, 2009 to February 28, 2010). We chose 12-month time periods to minimize possible short-term impacts of other events on web site traffic.20

One limitation is that we could not distinguish returning visitors from new visitors because persistent cookies are not allowed on the target continuing education web pages.

Markov chain model

A Markov chain model can be represented as a quadruple M=(qinitial, qfinal, Q, T), where qinitial and qfinal are the initial and final states of all sequential processes, respectively; Q = {qinitial, qfinal} ∪ {q1, q2, …, qn} is a set of states representing the steps of a sequential process; and T is a transition probability matrix, where tij = P(St = qj | St−1 = qi) is the probability of moving from state qi to state qj for all time points t>0. As the two target continuing education courses, WB1032 and WB3151, shared the same web structure, we defined the same Q, the set of states, for both of them.

The CFS continuing education Markov chain model was defined as M=(qstart , qexit , Q, T), where

  • Q={qstart, qexit} ∪ {qhomepage, qsyllabus, qintroduction, qchapter1, qchapter2, qchapter3, qappendices, qcasestudy, qreference},

  • qstart and qexit are virtual states that indicate the start and end of a visit, respectively. They do not map to any specific web page on the site,

  • qhomepage represents the Homepage of the CFS CE web site, and

  • qsyllabus, qintroduction, qchapter1, qchapter2 , qchapter3, qappendices , qcasestudy , and qreference represent the eight components in each course, each of them including one or more web pages

  • For each tij in transition matrix T,

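$$t_{ij} = \frac{1}{n}\sum_{m=1}^{n} \frac{\text{Number of views to } q_i \text{ followed by } q_j \text{ immediately in the } m\text{th month}}{\text{Total number of views to } q_i \text{ in the } m\text{th month}}$$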

In this model, each web navigation path was considered as a sequential process, which always started from qstart and ended at qexit (ie, exit of a continuing education course). A transition in the Markov chain model corresponded to a pair of pages visited in sequence. In other words, when a visitor on web page A navigated to page B, the process moved from the state for web page A to the state for page B. For example, an instance of the transition qintroduction → qchapter1 meant that a visitor moved from the introduction web page to a web page of chapter 1. The probability of a transition was calculated as the monthly average proportion of views to the current state, qi, immediately followed by the next state, qj. By introducing the qexit state, we defined the drop-out probability, Pdropout, of a state as the transition probability from it to qexit (eg, Pdropout(qchapter1) = P(St = qexit | St−1 = qchapter1)).
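The monthly-average estimate of the transition probabilities can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy visits, and the decision to let a transition that is unseen in a month contribute zero for that month are all assumptions, and the paths are assumed to be already mapped from web pages to states.

```python
from collections import Counter, defaultdict

START, EXIT = "start", "exit"  # virtual states, as in the model definition


def monthly_probs(paths):
    """Transition proportions for one month of navigation paths."""
    pair_counts, state_counts = Counter(), Counter()
    for path in paths:
        states = [START, *path, EXIT]
        for cur, nxt in zip(states, states[1:]):
            pair_counts[(cur, nxt)] += 1   # views to cur followed by nxt
            state_counts[cur] += 1         # all views to cur
    return {pair: c / state_counts[pair[0]] for pair, c in pair_counts.items()}


def first_order_model(paths_by_month):
    """Average the monthly proportions over all n months."""
    n = len(paths_by_month)
    totals = defaultdict(float)
    for paths in paths_by_month:
        for pair, p in monthly_probs(paths).items():
            totals[pair] += p
    # Assumption: a transition unseen in a month contributes 0 for that month.
    return {pair: total / n for pair, total in totals.items()}


def dropout(model, state):
    """Drop-out probability of a state: its transition probability to EXIT."""
    return model.get((state, EXIT), 0.0)


# Hypothetical toy data: two months of visits, already mapped to states.
january = [["homepage", "syllabus", "introduction"],
           ["homepage"],
           ["homepage", "syllabus"]]
february = [["homepage", "syllabus", "introduction"],
            ["homepage", "syllabus", "introduction"]]

model = first_order_model([january, february])
print(model[("syllabus", "introduction")])  # (1/2 + 2/2) / 2 = 0.75
print(dropout(model, "syllabus"))           # (1/2 + 0/2) / 2 = 0.25
```

Because every view of a state is followed by exactly one next state (possibly the virtual exit state), the outgoing proportions of each state sum to 1 within a month.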

The model defined above is a first-order Markov chain model, in which the next state depends only on the current state and is independent of past states. This property limits the analytical power when the history information (ie, web pages visited before reaching the current page) is important, and thus higher-order Markov chain models are needed to examine a transition in detail with regard to its past states. In this study, we defined a second-order Markov chain model M2 with the capability to infer the next state based on both the current and past states. M2 has exactly the same states as the first-order Markov chain model, but each element t((i,j),k) in the transition matrix T2 ((Q × Q) × Q) of M2 is defined as follows:

$$t_{(i,j),k} = \frac{1}{n}\sum_{m=1}^{n} \frac{\text{Number of transitions } (q_i \to q_j) \text{ followed by } q_k \text{ immediately in the } m\text{th month}}{\text{Total number of transitions } (q_i \to q_j) \text{ in the } m\text{th month}}$$

In the formula above, qi, qj, and qk represent the past, current, and next states, respectively.
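A second-order estimate conditions on the previous transition rather than on the current state alone. The sketch below computes the proportions for a single month; averaging over months would follow the same pattern as in the first-order formula. The function name and the toy paths are hypothetical.

```python
from collections import Counter


def second_order_probs(paths, start="start", exit_="exit"):
    """Proportion of transitions (past -> current) followed by each next state."""
    triple_counts, pair_counts = Counter(), Counter()
    for path in paths:
        states = [start, *path, exit_]
        for past, cur, nxt in zip(states, states[1:], states[2:]):
            triple_counts[(past, cur, nxt)] += 1
            pair_counts[(past, cur)] += 1
    return {t: c / pair_counts[t[:2]] for t, c in triple_counts.items()}


# Hypothetical toy visits, already mapped to states.
paths = [
    ["homepage", "syllabus", "introduction"],
    ["homepage", "syllabus"],
    ["syllabus", "introduction"],
    ["homepage", "syllabus", "homepage"],
]
t2 = second_order_probs(paths)

# Visits that reached the syllabus from the homepage split three ways.
print(t2[("homepage", "syllabus", "introduction")])  # 1 of 3
print(t2[("homepage", "syllabus", "exit")])          # 1 of 3
print(t2[("homepage", "syllabus", "homepage")])      # 1 of 3
# Visits that started directly on the syllabus page are a separate subgroup.
print(t2[("start", "syllabus", "introduction")])     # 1 of 1
```

Keeping the past state in the key is what allows syllabus visits arriving from the homepage to be separated from those that started directly on the syllabus page.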

Evaluation of web structure and navigation patterns

For examining user navigation patterns, we applied the first-order Markov chain model to all navigation paths collected in 2008. We then compared the web structure of courses to the model to determine: (1) if the learning style defined by course design was reflected in the actual usage model; (2) the extent to which skip hyperlinks in the left navigation panel were used; and (3) if unexpected movements occurred. We then investigated unexpected patterns in the Markov chain model to determine how they might be explained and if modifications to the courses would improve the fit to the model.

Effects of web modifications

To assess the effects of the modifications, we built post-modification Markov chain models from the navigation paths recorded between March 1, 2009 and February 28, 2010 and compared the transition probabilities of the pre-modification and post-modification models (both the first-order and second-order Markov chain models). We used a two-sample t test to compare the mean of the 12 monthly transition probabilities (ie, the overall transition probability) before modification with the mean after modification. The statistical tests were run within WB3151 and WB1032 independently. The significance level was set at 0.05 for a two-tailed test with 22 degrees of freedom.
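The comparison of 12 monthly probabilities before and after the modification can be sketched as follows. The 22 degrees of freedom (12 + 12 − 2) suggest a pooled-variance Student's t test, which is what this sketch assumes; the monthly values are invented for illustration and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical value of Student's t for alpha = 0.05 and df = 22.
T_CRIT_DF22 = 2.074


def pooled_t(before, after):
    """Two-sample t statistic with pooled variance, plus its degrees of freedom."""
    n1, n2 = len(before), len(after)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(before) + (n2 - 1) * variance(after)) / df
    t = (mean(after) - mean(before)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical monthly probabilities of one transition (12 months each).
before = [0.31, 0.35] * 6
after = [0.41, 0.45] * 6

t, df = pooled_t(before, after)
print(df, abs(t) > T_CRIT_DF22)  # 22 True
```

With real data, a library routine that also returns the p value would normally be preferable to a hand-coded statistic.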

Results

Evaluation of web structure and navigation patterns

The two courses had the same web structure (figure 2A); their central sequential structure is represented by solid arrows and the support for skip transitions by dashed arrows. We built Markov chain models from the navigation paths of 10 157 visits. To focus on major movements, we did not report the transitions with probabilities less than 0.05 (ie, the 75th percentile of all transition probabilities) in the models (figure 2). Figures 2B and 2C portray the results from the first-order Markov chain models for WB1032 and WB3151, respectively. The central sequential design was clearly reflected by the up and down arrows in both models. None of the skip transitions (displayed as the dashed arrows in figure 2A), however, was captured by either model. In other words, skip transitions were not heavily used by the visitors, and as a result their transition probabilities were below the cut-off (ie, <0.05) and are not shown in figure 2. Ninety per cent of skip transitions had probabilities less than 0.03 and the rest were between 0.03 and 0.04.
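The cut-off and the distinction between sequential and skip transitions can be illustrated with a short sketch. The component order follows the course structure described earlier; the helper names, the toy probabilities, and the choice to label every non-adjacent move between components as a skip are assumptions made for illustration.

```python
# Components of a course in their designed order.
COURSE_ORDER = ["syllabus", "introduction", "chapter1", "chapter2", "chapter3",
                "appendices", "casestudy", "reference"]
CUTOFF = 0.05  # transitions below this probability are not reported


def major_transitions(model, cutoff=CUTOFF):
    """Keep only transitions whose probability reaches the cut-off."""
    return {pair: p for pair, p in model.items() if p >= cutoff}


def kind(src, dst):
    """Label a transition relative to the sequential course design."""
    if src not in COURSE_ORDER or dst not in COURSE_ORDER:
        return "other"  # involves the homepage or the virtual start/exit states
    step = COURSE_ORDER.index(dst) - COURSE_ORDER.index(src)
    if step == 0:
        return "within"   # another page of the same component
    if step == 1:
        return "forward"
    if step == -1:
        return "backward"
    return "skip"         # assumption: any non-adjacent move counts as a skip


# Hypothetical first-order probabilities for transitions out of one state.
toy_model = {
    ("introduction", "chapter1"): 0.60,
    ("introduction", "syllabus"): 0.15,
    ("introduction", "chapter3"): 0.02,
    ("introduction", "exit"): 0.23,
}

for (src, dst), p in major_transitions(toy_model).items():
    print(src, "->", dst, p, kind(src, dst))
```

In this toy model the only skip transition falls below the cut-off, so it drops out of the reported model in the same way the skip hyperlinks did in figure 2.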

Figure 2

Chronic fatigue syndrome (CFS) continuing education web infrastructure and web usage models. (A) Web infrastructure of CFS continuing education courses. (B) First-order Markov chain model of WB1032. (C) First-order Markov chain model of WB3151.

The top three entrance pages were the homepage (71%), the WB1032 syllabus page (11%) and the WB3151 syllabus page (12%), which indicated that most visits started the course from the beginning. Only 6% of visits entered a course in the middle, which is represented in the model by a transition from qstart to a middle state (for example, qchapter1). This kind of transition can be explained intuitively by assuming that some users saved the URL of the page where they left off and later re-entered the course from there. However, none of these transitions appears in figures 2B and 2C because of their very low probabilities (<0.01). There was a 53% chance that a visitor would exit the course from the homepage. Once on the homepage (state qhomepage), 20% moved on to the physician course (WB1032) and 25% to the allied provider course (WB3151). Navigation patterns were similar in the two courses (figures 2B and 2C) and were sequential, consisting of forward and backward transitions between neighboring states. For the seven states from qintroduction to qreference, the probabilities of moving forward to the next component were between 52% and 72%, while 10–21% of views moved back to the previous state. In comparison, the forward transition from the syllabus page, qsyllabus → qintroduction, had relatively low probabilities (31% for WB1032 and 33% for WB3151), and Pdropout(qsyllabus) was high in both courses (47% and 41%, respectively).

The high drop-out rates and low moving-forward probabilities at state qsyllabus prompted us to review the syllabus pages in both courses, and we found two possible explanatory design flaws. First, the syllabus page contained more than 120 lines of mandatory CDC continuing education text that was largely irrelevant to those desiring information on CFS. Second, the hyperlink to the course introduction (figure 3A) appeared at the very bottom of the syllabus page buried under the faculty credentials and the copyright statement, which probably caused visitors to get lost or to just give up and exit the course unintentionally. It was possible that some visitors could drop out after viewing the content of the syllabus page because of losing interest in the course. We believe, however, that design flaws, especially the poor position of the hyperlink to the introduction page, caused a significant loss of prospective visitors because they could not readily find the way to continue the course. To resolve this, we added a hyperlink to the introduction page near the top of the syllabus page, immediately following some short important text (figure 3B), so that the new visitors to the syllabus page could more readily transition to the introduction page. In order to compare effects, we made the change only to course WB3151 (allied health professionals).

Figure 3

Screenshots of the allied health professionals syllabus web page before and after the modification. (A) WB3151 syllabus page before modification: it has more than 120 lines of text, and the hyperlink to the next page (ie, the introduction page) is located at the very bottom of the page, where it is difficult for users to find. (B) WB3151 syllabus page after modification: although it still has more than 120 lines of text, an additional hyperlink to the introduction page has been added near the top of the page, which makes it easier for users to find.

Effects of web modification

We modified the syllabus page of the allied health professionals course (WB3151) in February 2009. To assess the effects of the modification we rebuilt first-order and second-order Markov chain models from the 5523 navigation paths recorded between March 1, 2009 and February 28, 2010. Comparing the overall effects shown by the first-order Markov chain models (figure 4), the transition from the syllabus to the introduction page improved significantly (from 33% to 43%) and drop-outs decreased significantly (from 41% to 35%) in the allied health professionals course (figure 4A), but both remained unchanged in the course for physicians, nurse practitioners and physician assistants (figure 4B). No behavior change occurred on the allied health professionals introduction page following the modification.

Figure 4

First-order transition probability changes following web modification. (A) WB3151 with web modification. *p<0.01; **p<0.00001. (B) WB1032 without web modification.

Because it did not consider past states, the first-order Markov chain model in figure 4 could not detect the detailed effects of the modification on the subgroups of syllabus page visits with different previous states: qstart, qhomepage, qintroduction and qreference. This detailed information is important for verifying the overall effects indicated by the first-order Markov chain model because we expected the four subgroups of visits to experience different outcomes from the modification as a result of their different visiting histories. For example, visitors arriving at the syllabus page from the reference page would probably get little or no help from the modification because they had already finished the course. In contrast, visitors from the homepage were likely to be visiting the syllabus page for the first time and would benefit from the additional hyperlink. As expected, only the subgroup of visits from the homepage showed significant improvements in all three directions: (1) decreased drop-out rate; (2) increased moving-forward probability; and (3) decreased backward move probability (table 1). The subgroups qintroduction and qreference showed no significant change in any of them. Regarding the visits starting directly from the syllabus page, qstart → qsyllabus, we observed an increase in transitions to qintroduction (from 0.38 to 0.45) that was not statistically significant (p=0.07), a significant decrease in moving backward (to qhomepage), and an unchanged drop-out rate.

Table 1

The second-order transition probability changes after web modification

As with the first-order WB1032 Markov chain model, the second-order WB1032 model did not show any positive behavior changes similar to those observed in WB3151. The only significant change in WB1032 occurred in (qhomepage → qsyllabus) → qhomepage, but it was in the wrong direction, which indicated that the situation on the syllabus page of WB1032 was getting worse: more visitors went back to the homepage.

Discussion

Our findings indicated that users followed the recommended order in taking the web-based courses. The scant utilization of short-cut transitions (skip-hyperlinks in the left navigation panel) suggests that users were generally satisfied with the sequential class design style. The match of navigation patterns and similarity of probabilities between the two continuing education courses implies that the difference in course content had less impact than the web structure on how users navigate through the course.

Re-design of the course WB3151 syllabus page significantly decreased the drop-out rate. Comparison of the two first-order Markov chain models showed a significant improvement in the drop-out and moving-forward probabilities on the course WB3151 syllabus page and no change in course WB1032, which was not modified. Our findings also indicated that the probabilities of the three transitions outbound from the introduction page (qintroduction → qsyllabus, qintroduction → qexit, and qintroduction → qchapter1) stayed at the same level. This shows that the increased probability of visitors moving from the syllabus to the introduction page was not offset by side effects such as more backward traffic or more drop-outs on the introduction page. Therefore, once people entered the introduction page, they were as likely as before to continue the course, rather than immediately going back or dropping out.

The four subgroups of visits to the syllabus page all showed the expected outcomes from the modification: there was no significant change in any second-order transition whose past state qi was qintroduction or qreference; visitors from the homepage, the targets of the modification, benefited significantly in all possible ways; and the effects on visits starting directly from the syllabus page varied. Regarding the non-significant increase in (qstart → qsyllabus) → qintroduction (from 0.38 to 0.45, p=0.07), one interpretation is that people who started the course directly from the syllabus page probably had clearer motivations and a clearer idea of what they needed. They either quit immediately because of a lack of interest or tried harder to find the way to continue. Therefore, they were only partly affected by the modification. This interpretation also explains why the probability of (qstart → qsyllabus) → qexit was higher and not significantly changed by the modification. The fact that the probability of (qstart → qsyllabus) → qhomepage decreased significantly after modification could be a consequence of the increase in transitions to qintroduction: the more visitors entered the introduction page, the fewer went back to the homepage. In short, the results of the second-order Markov chain model support the conclusion that the probability changes in the first-order Markov chain model resulted from the web modification.

In this study, we used Markov chain models offline to analyze the design and utilization of two simple, linear, static web-based continuing education courses. Through this analysis we identified and fixed design flaws and quantitatively assessed the improvement that resulted from the web modification. We applied the models offline because the target continuing education courses have a simple web structure. The same model can also be applied online to provide real-time recommendations in a more complex education environment such as an adaptive and personalized continuing education course. The latter application will be explored in the future when the target continuing education courses are upgraded to a more interactive learning structure.

As a federal public health agency, CDC's primary goal in creating the two CFS continuing education courses was to disseminate the most current findings and knowledge about CFS relevant to healthcare providers. An open access mode was chosen so that all professionals interested in CFS can access the courses at any time from anywhere. Although this design makes it impossible to identify the learners' professional disciplines, our approach focused on how the web site was used, and individual user information is not required in the evaluation because the Markov chain model used the navigation paths of all actual visitors to the continuing education web sites.

As noted in the Methods section, one limitation of this study is that we were not able to distinguish new visitors from returning visitors (who may have saved the URL of a page so they could re-enter where they left off) because the continuing education web site could not use persistent cookies. Treating the navigation paths of returning and new visitors alike could confound an evaluation of skip transitions because returning visitors should be more likely to skip pages that they had previously viewed. However, both resulting Markov chain models in this study showed no skip transitions with a probability above the cut-off value, which suggests that this limitation did not affect the results.

We were also unable to identify individuals who received credit from the course and thus could not relate navigation patterns and drop-out rates to the desire to obtain credits or to their responses to the short test that measured knowledge. During the first 6 months of 2008, the CFS continuing education web site received 7022 visits, and only 292 people took the post-test and earned credits. Therefore, we believe that only a very small portion of visitors to the CFS continuing education courses came for credit, and that their activities on the web site would not substantially change the navigation patterns and evaluation results.
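As a rough check of the scale involved, using the visit and credit counts reported above (visits and people are different units, so this is only an approximate ratio):

$$\frac{292}{7022} \approx 0.042$$

That is, credit earners corresponded to roughly 4% of visits in that period.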

Conclusion

In this paper, we demonstrated how to use a Markov chain model to evaluate the match between the designed web structure and the actual usage models, how to identify design issues, and how to assess quantitatively the overall effects of web modifications. In cases where the overall change in a transition probability might be reduced or canceled by opposite changes from different web traffic routes, we strongly recommend applying a higher-order Markov chain model to verify the assessment of effects. Both the results and techniques in this paper would be very useful for public health education and research specialists.

Competing interests


Provenance and peer review

Not commissioned; externally peer reviewed.

