OBJECTIVE: To review and analyze the evaluation methods currently used in health emergency preparedness exercises (HEPE). DESIGN: This study is part of a larger scoping review that systematically collected and reviewed published evidence on the benefits of HEPE, and it provides a further analysis of the evaluation methods used in such exercises. We analyzed discussion-based and operation-based exercises separately, according to their purpose. This addresses a methodological limitation: the poorly understood relationship between an evaluation method and the purpose and context in which it is selected. RESULTS: In the 64 studies reviewed, a variety of evaluation methods were used for HEPE, including observations, participant surveys, and post-exercise debriefs. At present, the selection and use of these methods are not guided by any methodology and appear rather arbitrary. No evaluation method was identified as specific to any exercise type. CONCLUSIONS: The purpose of the evaluation should guide the selection of evaluation methods for HEPE, and these methods are not context specific. If evaluation is for accountability purposes, such as testing organizational capability to respond, participant feedback should be collected in addition to objective data on performance in the exercise. The advantages of routinely collecting data from exercise participants to study their reactions (exercise feedback, perceptions, satisfaction with the exercise) and of routinely conducting post-exercise debriefs (both hot and cold debriefs) are discussed to support evaluation for development or learning purposes in any context.