Elisa Knebel and Jaime L. Jarvis also contributed to this post. 

Until recently, nearly every professional training session ended with a post-session evaluation that invited participants to provide their immediate reactions to classic questions: Were the objectives clear? Were the trainers knowledgeable? Were activities as engaging as they were informative? Were the facilities conducive to learning? Would you recommend this training to your colleagues?

Most training evaluations intentionally stopped there.

A new way of evaluating training

Many training teams followed that traditional approach, using what Lesley Ann Harrison calls “smile sheets,” again and again, even though learning industry experts generally agreed that it did little to measure whether training fulfilled its actual purpose: equipping adult learners with the knowledge, skills, and attitudes they need to contribute to organizational effectiveness.

However, expectations for training evaluation are changing. For example, the Office of Personnel Management (OPM) is now asking U.S. government agencies to evaluate how participants actually improve their attitudes, skills, and knowledge as a result of training, and whether and how they ultimately apply what they learned on the job. These regulations form the foundation of the OPM Training Evaluation Field Guide, which uses the New World Kirkpatrick Model of Evaluation and advocates moving beyond smile sheets (Level 1, or “Reaction”) to applying Levels 2, 3, and even 4 of the model.

Tips for implementing the Kirkpatrick Model

In this post, we share tips for helping clients implement Levels 2, 3, and 4 to obtain a more complete understanding of how their training courses affect participants’ individual learning and behaviors, and the impact on the organizations where they work.

[Graphic: The four levels in the Kirkpatrick Model]

Level 2: Choose evaluation methods that truly match the desired learning objective

At Level 2 (Learning), as a training evaluator, you are evaluating the extent to which participants acquired the intended knowledge, skills, attitude, confidence, and commitment from the training. To do this, it’s important to make sure the assessments, tests, or qualifications used to measure change align as closely as possible with the training objectives.

Evaluators often gravitate toward multiple-choice tests administered before and after the training because these have been used for decades and are fairly easy to write. While such tests have their place in measuring knowledge gains, they reveal little about most skills and attitudes, so other methods are needed. These may include observation rubrics to assess a participant’s display of a list of desired attributes (for example, to evaluate public speaking training), checklists to assess written outputs against predetermined criteria (for example, to evaluate training in technical writing), or responses to a case study (for example, to measure attitudinal shifts after diversity training).
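To make the rubric idea concrete, here is a minimal sketch in Python. The course, criteria, four-point scale, and function are all hypothetical illustrations of how a rubric might be scored, not a prescribed instrument; the key point is that each criterion restates a stated learning objective.

```python
# Hypothetical observation rubric for a public-speaking course; each
# criterion maps to one of the training's learning objectives and is
# rated 1 (not observed) to 4 (consistently demonstrated).
CRITERIA = [
    "organizes content logically",
    "maintains eye contact with the audience",
    "uses visual aids effectively",
    "handles audience questions confidently",
]

def score_observation(ratings: dict[str, int]) -> float:
    """Average one observer's 1-4 ratings across all rubric criteria."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"unrated criteria: {missing}")
    return sum(ratings[c] for c in CRITERIA) / len(CRITERIA)

# Example: one observer's ratings for one participant
print(score_observation({
    "organizes content logically": 3,
    "maintains eye contact with the audience": 4,
    "uses visual aids effectively": 2,
    "handles audience questions confidently": 3,
}))  # -> 3.0
```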

Level 3: Take a whole-systems evaluation approach

At Level 3 (Behavior), you are measuring the degree to which participants apply what they learned during training when they are back on the job. To do this, you will need to take a whole-systems evaluation approach. This means not only focusing on whether targeted skills, attitudes, and/or knowledge increased because of the training, but also understanding how the entire environment in which the participants work—including the context, people, processes, structures, technology, and patterns of interaction—has been affected. Failing to take a whole-systems evaluation approach can result in focusing on improving the capacities of participants for a system that ultimately might not be able to support these improvements.

Taking a whole-systems evaluation approach means using assessment tools that capture colleagues’ perspectives (e.g., interviews with peers or supervisors about the participant’s performance) or that review documents (e.g., a checklist review of a participant’s written work on the job). It also means (a) establishing baseline data on target participants’ ability to demonstrate key skills, attitudes, and knowledge with a pre-training learning needs assessment survey, and (b) following up with the participants, as well as their peers or supervisors, with another ability survey about the same set of skills at least 3 months post-training.
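As a rough illustration of the baseline-versus-follow-up comparison, here is a minimal Python sketch; the skill names, 1–5 rating scale, and scores are hypothetical, not drawn from any actual evaluation.

```python
# Minimal sketch: compare baseline needs-assessment ratings to 3-month
# follow-up ratings, skill by skill. All ratings are hypothetical mean
# scores on a 1-5 ability scale.
baseline = {"coaching": 2.1, "delegation": 2.8, "feedback": 2.4}
followup = {"coaching": 3.6, "delegation": 3.1, "feedback": 3.5}

def skill_gains(pre: dict[str, float], post: dict[str, float]) -> dict[str, float]:
    """Return the change in mean rating for each skill measured at both points."""
    return {skill: round(post[skill] - pre[skill], 2)
            for skill in pre if skill in post}

print(skill_gains(baseline, followup))
# -> {'coaching': 1.5, 'delegation': 0.3, 'feedback': 1.1}
```

Comparing participants’ own ratings with those of their peers or supervisors on the same items is what turns this from a self-report exercise into a whole-systems check.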

Lastly, before embarking on a Level 3 evaluation, and preferably during the training design phase, it is necessary to establish a rationale for choosing to evaluate a training at this level. Typically, training courses that are “mission critical” and/or require considerable monetary resources should be considered for Level 3 evaluation.

Level 4: Be realistic about what can be measured and seek to understand institutional supports and challenges to performance

At Level 4 (Results), you are measuring the degree to which the training has contributed to the targeted outcomes. Tracking Level 4 results is extremely difficult and requires the collaboration of course owners, organizational leaders, and HR analytics systems.

Isolating the contribution of training to the organization’s overall performance is almost impossible. An organization’s outcomes may stem from shifts in context, new processes, new staffing, or other changes that coincide with the training.

Measuring a training’s contribution to organizational performance also requires extensive planning. Appropriate metrics must be selected and baselined before the training occurs, then tracked over an acceptable period. Sometimes, the contribution to the organization’s performance may not appear for 6 to 12 months. And as more time passes, it becomes harder to isolate the training’s impact on the selected metrics.
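To illustrate why the metric must be chosen and measured in advance, here is a minimal sketch; the metric, monthly figures, and time window are invented for the example, and the comparison is suggestive rather than causal.

```python
# Illustrative only: compare monthly values of one pre-selected
# organizational metric before a training with values measured
# afterward. The metric name and numbers are hypothetical.
pre_training = [68, 70, 69, 71]    # e.g., monthly on-time reporting rate (%)
post_training = [72, 75, 78, 77]   # same metric, months 3-6 after the course

def mean(values: list[float]) -> float:
    return sum(values) / len(values)

change = mean(post_training) - mean(pre_training)
print(f"average change: {change:+.1f} percentage points")
# -> average change: +6.0 percentage points (suggestive, not proof of
#    attribution, since other changes may share the same window)
```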

However, if you can measure improvements in your key organizational metrics and participants report that their training relates to those metrics, then you have established at least a plausible link between the training and organizational outcomes. Even though it may be a proxy or estimated impact, you can capture a “sense” of training’s impact on key goals, which is better than nothing.

Supporting Level 3 and 4 evaluations for EnCompass’ clients

EnCompass’ clients work to reduce poverty, improve gender equality and social inclusion, promote a strong civil society, and keep people healthy and safe. Doing these things means creating meaningful and lasting changes in behavior. As training and facilitation experts supporting our clients toward these aims, we ask training participants and their supervisors: What has changed on the ground as a result of the training? How has the training helped you reach organizational goals?

Doing Level 3 and 4 evaluations also provides an opportunity to evaluate the institutional supports and accountability processes that underpin training implementation. EnCompass examines whether participants are able to apply new skills and integrate new behaviors in their day-to-day work on a sustained basis. If so, what is supporting them? If not, what is challenging them? To answer these questions, we look at a combination of behavioral and environmental factors that influence individual performance, such as participants’ opportunities to apply new learning on the job, level of support from colleagues, and incentives that are in place to apply learning.

For example, during our first Level 3 evaluation of a supervisor development course, we discovered that across both the learning needs assessment and Level 3 survey responses, participants reported their lowest ability with “handling unacceptable performance.” Through additional interview and survey data, we were able to identify a variety of organizational factors that could be contributing to participants’ challenges with this skill. These included a lack of adequate incentives to supervise, too little time to supervise properly, a weak culture of accountability, ineffective performance management systems, and uncertainty about whom to go to for support.
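The analytic step behind that finding can be sketched in a few lines of Python; the competency names and ratings below are hypothetical stand-ins, not the client’s data.

```python
# Hypothetical numbers, real pattern: flag a competency that scores
# lowest in BOTH the pre-training needs assessment and the Level 3
# follow-up survey (mean ratings on a 1-5 ability scale).
needs_assessment = {
    "setting expectations": 3.4,
    "giving feedback": 3.1,
    "handling unacceptable performance": 2.2,
}
level3_followup = {
    "setting expectations": 3.9,
    "giving feedback": 3.6,
    "handling unacceptable performance": 2.5,
}

lowest_pre = min(needs_assessment, key=needs_assessment.get)
lowest_post = min(level3_followup, key=level3_followup.get)
if lowest_pre == lowest_post:
    print(f"persistent weak spot: {lowest_pre}")
# -> persistent weak spot: handling unacceptable performance
```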

As a result of these findings, EnCompass and its client identified specific ways that the training and the organizational environment needed to change to address the challenges head-on. These included more intentional communication with participants’ managers about the goals of the training and their role in it, more tools and resources within the training on handling unacceptable performance, inviting the client’s HR and Employment Law representatives to present at the training, and developing e-learning on the client’s employee and labor relations rules and regulations.

As for organizational improvements, the suggested changes included adding supervision and HR management skills to competency categories for all hiring mechanisms, expanding this type of training to higher-level managers as a way to ensure a growing culture of good supervision and management at the organization, and finally, sharing the Level 3 evaluation report with all supervisors and their managers to encourage further application of supervision skills.

An exciting time of change in training evaluation

When our clients move beyond the standard smile sheet and multiple-choice questions and start evaluating sustained changes in knowledge, skills, and behaviors, as well as the impact training has within the wider organization, great things happen for them and the people they serve. Our most courageous clients are using training evaluation findings to identify and address the institutional factors that impede performance or enable success, so that training happens not in isolation but as part of a wider whole-systems approach to producing the best results.

Graphic created by Julie Harris.