Background: Nurse-led rounding checklists are a common strategy for facilitating evidence-based practice in the intensive care unit (ICU). To streamline checklist workflow, some ICUs have the nurse or another individual listen to the conversation and customize the checklist for each patient. Such customizations assume that individuals can reliably assess whether checklist items have been addressed.
Objective: To evaluate whether 1 critical care nurse can reliably assess checklist items on rounds.
Methods: Two nurses performed in-person observation of multidisciplinary ICU rounds. Using a standardized paper-based assessment tool, each nurse indicated whether 17 items related to the ABCDEF bundle were discussed during rounds. For each item, generalizability coefficients were used as a measure of reliability, with a single-rater value of 0.70 or greater considered sufficient to support its assessment by 1 nurse.
Results: The nurse observers assessed 118 patient discussions across 15 observation days. For 11 of 17 items (65%), the generalizability coefficient for a single rater met or exceeded the 0.70 threshold. The generalizability coefficients (95% CIs) of a single rater for key items were as follows: pain, 0.86 (0.74-0.97); delirium score, 0.74 (0.64-0.83); agitation score, 0.72 (0.33-1.00); spontaneous awakening trial, 0.67 (0.49-0.83); spontaneous breathing trial, 0.80 (0.70-0.89); mobility, 0.79 (0.69-0.87); and family (future/past) engagement, 0.82 (0.73-0.90).
Conclusions: Using a paper-based assessment tool, a single trained critical care nurse can reliably assess the discussion of elements of the ABCDEF bundle during multidisciplinary rounds.
Notice to CE enrollees:
This article has been designated for CE contact hour(s). The evaluation demonstrates your knowledge of the following objectives:
Develop an assessment tool for measuring the discussion of topics related to the ABCDEF bundle during rounds.
Determine how many critical care nurses are needed to reliably measure the discussion of evidence-based practices during rounds.
Describe situations in which an independent assessment of the discussion of evidence-based practices is desirable.
To complete the evaluation for CE contact hour(s) for activity A2332, visit https://aacnjournals.org/ajcconline/ce-articles. No CE fee for AACN members. See CE activity page for expiration date.
The American Association of Critical-Care Nurses is accredited as a provider of nursing continuing professional development by the American Nurses Credentialing Center’s Commission on Accreditation, ANCC Provider Number 0012. AACN has been approved as a provider of continuing education in nursing by the California Board of Registered Nursing (CA BRN), CA Provider Number CEP1036, for 1.0 contact hour.
Efficient translation of evidence into practice is an important challenge in the intensive care unit (ICU).1 One strategy for improving translation is the use of reminders,2 such as a checklist of evidence-based practices used during daily rounds.3 Rounding checklists can improve numerous ICU patient outcomes,4 with greater checklist completion associated with better outcomes.5 Unfortunately, checklists are often not completed as intended because they disrupt workflows and burden the rounding team.6
To reduce checklist burden, some ICUs customize rounding checklists to each patient.7-10 Customization can occur in different ways; an example is having an individual serve as the “checklist prompter” who listens to the conversation, eliminates elements as they are addressed, and then prompts the team to consider any remaining unaddressed elements.9,10 These customized approaches can help make checklists more palatable to the team, but they assume that individual prompters can reliably assess whether checklist elements have been addressed.11 If individuals disagree on whether a checklist item is indicated or not yet discussed, then customized checklist applications will not lead to improved quality. To address this issue, we sought to evaluate whether a single individual can reliably assess checklist items on rounds.
We first created an assessment tool for documenting the discussion of information related to the ABCDEF bundle.12 The ABCDEF bundle is an evidence implementation strategy for the ICU that often involves a rounding checklist related to pain, agitation, delirium, ventilator care, and family engagement.13 Then, we enlisted 2 off-duty critical care nurses to observe rounding discussions and use the paper-based assessment tool. Finally, we applied generalizability theory14 to estimate whether a single nurse could serve as a reliable prompter when determining which ABCDEF bundle elements are addressed during a multidisciplinary rounding discussion. We focused on the ABCDEF bundle because its elements are evidence-based, inconsistently adopted in many ICUs, and the focus of many ICU checklists.12,15 We employed nurses instead of other care providers because they are universally present on rounds and are a cornerstone of delivery of evidence-based practice in the ICU.16,17
Project Design and Setting
This project was performed at 2 ICUs at UPMC, a tertiary referral center in Pittsburgh, Pennsylvania, and an academic affiliate of the University of Pittsburgh. One ICU has 12 beds and specializes in abdominal transplant patients, and the second ICU has 22 beds and specializes in surgical trauma patients. Both ICUs conduct multidisciplinary rounds that typically include some combination of the following provider types: attending intensivists, fellows, residents, critical care nurses, pharmacists, nutritionists, respiratory therapists, physical therapists, and students. The data for this project were collected as part of an ongoing effort to improve hospital performance; thus, the project was designated as a quality improvement initiative by UPMC’s Quality Improvement Review Committee (approval #1792). Additional contributions by D.L.M. were determined to be non–human subject research by the University of Pennsylvania institutional review board (#833938).
Can “checklist prompters” reliably assess whether checklist elements have been addressed?
Development of the Assessment Tool
A paper-based assessment tool was developed for the observers to use when assessing the discussion of various evidence-based practices during rounds. The practices considered were those included in the ABCDEF bundle12: A = assess, prevent, and manage pain; B = both spontaneous awakening trial (SAT) and spontaneous breathing trial (SBT); C = choice of analgesia and sedation; D = delirium: assess, prevent, and manage; E = early mobility and exercise; and F = family engagement and empowerment.
To develop a tool that met our project needs, we convened an in-person development workshop to split each of the ABCDEF bundle elements into 1 or more binary (yes/no) statements representing whether that element (or portion of an element) was discussed during rounds. For example, element A (assess, prevent, and manage pain) was split into 3 nested questions: Was pain discussed? If so, was pain present? If so, was a plan for addressing pain made? We used an iterative consensus process to draft an initial version of the tool, followed by an in-person workshop where attendees used the draft tool to assess 2 transcribed discussions of patients. The tool was then revised on the basis of feedback from the workshop. The workshop was followed by a field pilot test where 2 nurses used the revised tool during rounds for 1 pilot observation day. Final revisions to the tool were made, resulting in the assessment tool shown in Figure 1.
After the workshop, 2 nurses used the tool during 118 rounding discussions.
For nested items, we created 6 rules to determine agreement and disagreement. Observers agreed when (1) both observers reached the item and selected yes; (2) both observers reached the item and selected no; (3) neither observer reached the item; or (4) 1 observer reached the item and selected no and the other observer did not reach the item. Conversely, observers disagreed when (5) both observers reached the item and disagreed on whether it was yes or no; or (6) 1 observer reached the item and selected yes and the other observer did not reach the item.
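The 6 rules can be expressed as a small decision function. The sketch below (in Python, with illustrative names not taken from the project) treats each observer's response to a nested item as "yes", "no", or None when the observer did not reach the item:

```python
def nested_item_agreement(a, b):
    """Apply the 6 agreement rules for a nested checklist item.

    a, b: one observer's response each -- "yes", "no", or None
    (None means the observer did not reach the item).
    Returns True when the observers agree, False when they disagree.
    """
    if a is None and b is None:          # rule 3: neither observer reached the item
        return True
    if a is not None and b is not None:  # rules 1, 2, and 5: both reached it
        return a == b
    reached = a if a is not None else b  # rules 4 and 6: only one reached it
    return reached == "no"               # "no" vs not reached counts as agreement
```

Under these rules, the only mixed case counted as disagreement is a "yes" from one observer paired with an unreached item from the other.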
Two nurse observers (K.M.P. and J.B.S.) performed in-person observations of multidisciplinary morning rounds on 15 observation days in the fall of 2021. One unit and rounding team was chosen for observation each day, with no requirements made for the number or type of team members present. By design, 4 of the observation days occurred on the weekend, when rounds are usually shorter and involve fewer people. Handoff discussions were not observed. During an assessment, both nurse observers were instructed to watch and listen to the rounding discussion while independently completing the assessment tool. To isolate the content of the discussions, they were instructed not to look at electronic health record data or into patient rooms for visual cues (eg, delivery of invasive mechanical ventilation). To maintain independence of their responses, they were also asked to distance themselves from each other and not confer. All of a rounding team’s discussions during a data collection day were assessed except when the nurse observers were asked by the care team to not observe a discussion (eg, a difficult end-of-life discussion with the patient’s family), a discussion occurred in a patient’s room (rather than in the hallway), or either of the nurse observers had to leave the unit before rounds ended because of scheduling constraints.
On the basis of the measured reliability of our 2 nurse observers, we used generalizability theory to estimate what the reliability would be for each item if there were only 1 nurse observer. Generalizability theory holds that the observed variance in reliability can be divided into different component parts in order to make inferences about how reliability changes when altering 1 of the parts.14 This approach enables determination of the expected reliability with various numbers of hypothetical raters, assuming that each rater has similar training and skill.14 For example, generalizability theory has been used to determine how many radiologists would be needed to evaluate each image when creating a reliable reference standard for evaluating a machine learning model that extracts information from chest radiographs.18
We used a single-facet crossed design (discussion × observer), enabling separation of the observed variance into 3 component parts: discussion (σ²discussion), observer (σ²observer), and residual (σ²residual).14 The σ²discussion is equal to the true variance. The sum of σ²observer and σ²residual equals the variance due to error. We then used σ²discussion and σ²residual to calculate the generalizability coefficient for a single observer (ρ1).18 Consistent with past work, we considered a generalizability coefficient of 0.70 or greater to be sufficient support for having a single rater perform the assessments alone in the future.18,19 In addition to the generalizability coefficient, we also calculated the Cohen κ statistic,20 which is a more commonly reported reliability statistic, but it may underestimate the reliability of the observers.18
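In a crossed design of this kind, the relative generalizability coefficient for the mean of n observers depends only on the discussion and residual variance components. A minimal sketch, using illustrative variance values rather than the project's actual estimates:

```python
def g_coefficient(var_discussion, var_residual, n_observers=1):
    """Generalizability coefficient for the mean of n observers in a
    crossed discussion x observer design:
        rho_n = var_discussion / (var_discussion + var_residual / n)
    The observer variance component contributes to absolute, not
    relative, error, so it does not appear in this coefficient."""
    return var_discussion / (var_discussion + var_residual / n_observers)

# Illustrative values only (not the project's estimates):
rho_1 = g_coefficient(var_discussion=0.15, var_residual=0.05)  # 0.75: one observer suffices
```

Adding observers shrinks the residual error term, so the coefficient for 2 or more observers is always at least as large as the single-observer value.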
All statistical analyses were conducted in R.21 We describe patient characteristics using standard summary statistics based on data obtained from an existing registry of ICU patients. The Cohen κ and 95% CIs were calculated in R version 3.6.3 using the Kappa.test function from version 0.7.2 of the "fmsb" package. Generalizability coefficients were calculated using R version 3.5.3 with the gstudy and dstudy functions from version 0.1.2 of the "gtheory" package. Confidence intervals for κ and the generalizability coefficient were produced using the fmsb package and a nonparametric bootstrap with 1000 repetitions,22 respectively.
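The percentile bootstrap behind the generalizability coefficient CIs can be sketched generically as follows; this is an illustration in Python, not the project's R code, and the statistic shown (a simple mean over toy data) stands in for the gtheory estimate recomputed in each repetition:

```python
import random

def bootstrap_ci(values, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Nonparametric percentile bootstrap: resample the observations with
    replacement, recompute the statistic each time, and report the
    empirical alpha/2 and 1 - alpha/2 percentiles as the CI."""
    rng = random.Random(seed)
    stats = sorted(
        statistic([rng.choice(values) for _ in values]) for _ in range(n_boot)
    )
    return stats[int(n_boot * alpha / 2)], stats[int(n_boot * (1 - alpha / 2)) - 1]

# Toy example: a 95% CI for the mean of 5 illustrative agreement scores
low, high = bootstrap_ci([0.6, 0.7, 0.8, 0.7, 0.9],
                         statistic=lambda xs: sum(xs) / len(xs))
```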
ICU Population Characteristics
The characteristics of the ICU patients whose rounding discussions were observed during this project are shown in Table 1. In total, 53 different patients were observed, with 33 of them receiving invasive mechanical ventilation. The patients were split almost evenly between the surgical trauma and transplant ICUs. Because ICU admissions often last for multiple days, many patients had multiple discussions observed. In total, the nurse observers used the assessment tool when assessing 118 patient discussions. These dually observed discussions are the basis for calculating reliability and agreement.
Reliability and Agreement
The 17 items on the assessment tool each relate to 1 or more elements of the ABCDEF bundle. Each item's reliability is shown in Table 2, along with an estimate of the number of observers needed to reach the targeted generalizability coefficient of 0.70. Eleven items (65%) had single-observer reliability (defined as a single-observer generalizability coefficient ≥0.70), and 14 items (82%) had substantial agreement (defined as κ ≥ 0.61).20 The reliable items broadly cover the bundle's elements, with the partial exception of "B: Both SAT and SBT": the single-observer generalizability coefficient for the "SAT discussed" item fell below our predefined threshold. That is, a single observer with similar training and experience using this assessment tool cannot reliably assess whether an SAT was discussed (responses to this item are unacceptably inconsistent across hypothetical observers).
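An "observers needed" estimate of this kind follows from inverting the generalizability formula to find the smallest number of observers that reaches the target. A hedged sketch, with illustrative variance components rather than the project's estimates:

```python
import math

def observers_needed(var_discussion, var_residual, target=0.70):
    """Smallest number of observers whose averaged ratings reach the
    target generalizability coefficient, obtained by solving
        target <= var_discussion / (var_discussion + var_residual / n)
    for n and rounding up."""
    n = (target / (1 - target)) * (var_residual / var_discussion)
    return max(1, math.ceil(n))

# Illustrative values only:
n_equal_var = observers_needed(1.0, 1.0)  # 3 observers when the variances are equal
```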
One nurse can reliably assess adherence to most of an ABCDEF bundle rounding checklist.
The distribution of responses across all items is shown in Figure 2. Although agreement occurred for the majority of responses, the most common response type (both observers = yes or both observers = no) varied across items. For example, items involving pain were often discussed, whereas “agitation score discussed” was almost never answered yes. The conditional branching on the assessment tool is also clearly represented in the data, where some items were applicable only if the preceding item was responded to with a particular answer. For example, the item “pain present” was applicable only if the parent item “pain discussed” was answered yes.
We found substantial agreement between 2 independent nurses as to whether elements of an ABCDEF bundle rounding checklist were addressed during rounds. These results suggest that a single independent observer with similar training and experience can reliably assess adherence to an ABCDEF bundle rounding checklist during multidisciplinary ICU rounding discussions. Our findings have several important implications for performance improvement and quality measurement in the ICU.
First, our results provide supporting evidence that nurses can identify when a rounding checklist element has been addressed and, therefore, might not need to be repeated during a readout of the checklist. This added flexibility enables a shorter, more precise checklist for each patient, which might make using a checklist for every patient more palatable to the team.6
Second, just as bedside nurses can be empowered to perform checklist customizations, our results suggest that critical care nurses are well suited to serve as independent checklist prompters during rounds.9 Although some quality improvement projects that included prompts for evidence-based practices have found reductions in mortality,10 others have not.23 The difference may be due in part to how prompting was implemented: during rounds versus later in the day.24
Third, our results support a novel approach for measuring and improving performance with the ABCDEF bundle via audit and feedback.2 The assessment tool we created can be used as the basis for occasional strategic use of an independent nurse observer to measure team performance. Similar tools may also be useful when measuring team performance during other events where team communication is essential, such as emergency response and bedside shift handoff.
The generalizability of these results is limited by a few factors. First, the nurse observers were involved in the creation of the assessment tool, which may have given them more insight into how to apply it than an outside nurse would have. Therefore, additional work may be needed to understand the need for user training and the validity of the tool when applied in practice. This issue is particularly important because conclusions drawn from generalizability theory assume that observers have similar training and experience.14 These concerns were partly countered by the clarifying instructions that the observers added to the assessment tool after the field pilot testing and by instructing the observers not to discuss the tool with each other after it was finalized.
Using a checklist prompter adds flexibility that enables use of a shorter, more precise checklist for each patient.
Second, although the items on the assessment tool were inspired by the ABCDEF bundle, our assessment tool divides the 6 bundle elements into 17 items that were relevant to our local needs. Using only the 6-element bundle as the assessment tool was considered but rejected because doing so would reduce the granularity of the data collected. Future users of this tool or similar tools should give careful thought to the information desired when selecting items for inclusion. This could mean including additional items, such as infection control measures,25,26 that are not part of the ABCDEF bundle.
Third, only rounding discussions that occurred in the hallway were observed. Owing to both institutional norms and the presence of COVID-19, most rounding discussions occurred in the hallway rather than the patient rooms, but the results may differ for units with different rounding practices.
A final consideration is that the units observed in this project were heterogeneous in their use of rounding checklists. One unit uses a custom rounding checklist that includes the ABCDEF bundle elements plus 9 additional elements. The other unit has a unitwide, nurse-led evidence-based practice checklist, but it is not typically referenced during multidisciplinary rounds. Our sample size did not permit adequately powered subanalyses for each unit; however, the diversity of rounding practices can be seen as a strength.
Using a paper-based assessment tool, 1 trained ICU nurse can reliably assess the discussion of elements of the ABCDEF bundle during multidisciplinary rounds. The presence of a few less-reliable items, such as “SAT discussed,” indicates room for improving the assessment tool and a need to further investigate how evidence-based practices are conceptualized among members of the multidisciplinary ICU team.
This article is followed by an AJCC Patient Care Page on page 100.
The work reported in this publication was supported in part by the National Heart, Lung, and Blood Institute of the National Institutes of Health under award number R35 HL144804 and the Pittsburgh Health Data Alliance.
For more about rounding and the ABCDEF bundle, visit the Critical Care Nurse website, www.ccnonline.org, and read the article by Lancaster et al, “Using a Standardized Rounding Tool to Improve the Incidence of Spontaneous Awakening and Breathing Trials” (April 2022).
To purchase electronic or print reprints, contact American Association of Critical-Care Nurses, 27071 Aliso Creek Road, Aliso Viejo, CA 92656. Phone, (800) 899-1712 or (949) 362-2050 (ext 532); fax, (949) 362-2049; email, email@example.com.