Calibrating a chief complaint list for low resource settings: a methodologic case study

Background The chief or presenting complaint is the reason for seeking health care, often in the patient’s own words. In limited resource settings, a diagnosis-based approach to quantifying burden of disease is not possible, partly due to limited availability of an established lexicon or coding system. Our group worked with colleagues from the African Federation of Emergency Medicine building on the existing literature to create a pilot symptom list representing an attempt to standardize undifferentiated chief complaints in emergency and acute care settings. An ideal list for any setting is one that strikes a balance between ease of use and length, while covering the vast majority of diseases with enough detail to permit epidemiologic surveillance and make informed decisions about resource needs. Methods This study was incorporated as a part of a larger prospective observational study on human immunodeficiency virus testing in Emergency Departments in South Africa. The pilot symptom list was used for chief complaint coding in three Emergency Departments. Data was collected on 3357 patients using paper case report forms. Chief complaint terms were reviewed by two study team members to determine the frequency of concordance between the coded chief complaint term and the selected symptom(s) from the pilot symptom list. Results Overall, 3537 patients’ chief complaints were reviewed, of which 640 were identified as ‘potential mismatches.’ When considering the 191 confirmed mismatches (29.8%), the Delphi process identified 6 (3.1%) false mismatches and 185 (96.9%) true mismatches. Significant chief-complaint clustering was identified with 9 sets of complaints frequently selected together for the same patient. “Pain” was used 2076 times for 58.7% of all patients. A combination of user feedback and expert-panel modified Delphi analysis of mismatched complaints and clustered complaints resulted in several substantial changes to the pilot symptom list. Conclusions This study presented a systematic methodology for calibrating a chief complaint list for the local context. Our revised list removed/reworded symptoms that frequently clustered together or were misinterpreted by health professionals. Recommendations for additions, modifications, and/or deletions from the pilot chief complaint list we believe will improve the functionality of the list in low resource environments.


Introduction
The chief or presenting complaint is the reason for seeking health care, often in the patient's own words. While the utility of recording and analyzing chief complaints is well accepted in resource-rich and highly developed emergency care systems, diagnosis-based research has remained the predominant standard when evaluating/quantifying burden of disease (BOD) in most health care settings [1]. In limited resource settings, a standard approach to quantifying BOD is not possible, partly due to both limited availability of an established lexicon or coding system and more limited diagnostics than in resource-rich settings [2]. Furthermore, diagnosisbased research strategies fail to capture an essential element of emergency care: the sorting of patients with undifferentiated symptom-based chief complaints into diagnostic categories and levels of acuity, which in turn guides decision-making on behalf of these patients based on limited symptom-based information and diagnostics.
The patient's chief complaint is a key piece of information that helps direct this process. Additionally, the chief complaint can further be stratified based on acuity and those complaints designated as 'high acuity' have been found to independently predict mortality; enhanced attention and increased resources being made available to those with high risk chief complaints can improve patient health outcomes, and is the standard of care in many resource-rich settings [3]. This inherent value of the chief complaint has led to the development of ontologies of emergency care presenting complaints of varying degrees of sophistication; these have been predominantly derived in wealthy settings and not validated through most of the world [4][5][6]. One intrinsic barrier to researchers' use of chief complaints, rather than diagnoses, has lain in the absence of standardization of chief-complaint nomenclature, terminology and taxonomy for the recording, translating, and cataloguing of complaints among emergency patients presenting to care in these global settings [7]. The few efforts in this direction that do exist lack the imprimatur and validation standards of an international body [8]. Our group worked with colleagues from the African Federation of Emergency Medicine building on work completed by Rice et al. [3] to create a pilot symptom list for use in low resource settings, representing an attempt by an international criterion-setting body in emergency care to standardize this type of collection of undifferentiated chief complaints in the emergency and acute care setting.
The challenge of translating free-text chief complaints to support syndromic surveillance, operational needs and research work has been extensively addressed in high-income settings [4,9,10]. However, and as noted, lists derived in resource-rich settings often have not been validated across national boundaries, nor in health care systems characterized by substantially different patterns of care-seeking, resource-availability, and degrees of development of emergency care. In addition, while chief complaints do not perfectly map to disease burden, a standardized language for recording and analyzing chief complaints allows actors across the spectrum of acute and emergency care, including community, pre-hospital, and hospital-based health-care providers, to effectively communicate and develop system-level priorities based around the signs and symptoms most often experienced by the patients they serve. A straightforward and universal chief complaint list, tested and validated in a global setting, would be of profound benefit to clinicians, researchers, and policymakers world-wide as it would allow for the use of chief complaint data in the quantification, analysis, and evidence-based planning that emergency care in low-resource settings is urgently in need of. This paper presents methodological strategy that can be exported to other settings to refine a local chief complaint list. The authors piloted a draft symptom list against traditional free-text chief compliant recording and thereby sought to calibrate and improve the functionality of the pilot symptom list within an exemplar emergency system in South Africa.

Overview
Incorporated as a part of a larger study on human immunodeficiency virus (HIV) testing in Emergency Departments in South Africa, the pilot symptom list (Appendix 1) was used for chief complaint coding in a large multi-center Emergency Department (ED) based observational study in South Africa wherein study staff (predominantly HIV counsellors or nurses, with research training), collected both free text chief complaints and then made a good-faith effort to match chief complaints to one of the predetermined symptoms from the pilot symptom list [11].
The original prospective observational study, conducted between June 2017 and July 2018, was embedded in the larger Walter Sisulu Infectious Diseases Screening in Emergency Departments (WISE) Study that implemented point-of-care HIV testing in the ED and collected extensive demographic data on ED patients. This study collected data across three EDs in the Eastern Cape Province, South Africa, where each of the three EDs was sampled for a period of 6 weeks. Data was collected on convenience sample of 3357 patients, who enrolled in the WISE study, from across these three hospitals using paper case-report forms (CRFs). For the purposes of this analysis, the free-text chief complaints were then coded using the Medical Dictionary for Regulatory Activities (MedDRA©) nomenclature. Free-text chief complaints were then compared to those identified by study staff using the pilot symptom list and analyzed for clustering using factor analysis. The primary outcome of interest was to assess the adequacy and accuracy of the pilot symptom list in capturing and reflecting patients' presenting complaints. The secondary outcome of interest was to assess redundancy in the pilot symptom list by observing categories that were never selected or selected significantly frequently. A modified Delphi methodology was used to review the outcomes and observations using these to make recommendations for modifications and amendments to the pilot symptom list.

Setting
The WISE study was conducted in the Eastern Cape Province in three hospital-based emergency departments. Nelson Mandela Academic Hospital (NMAH) and Mthatha Regional Hospital (MRH) are located in the rural town of Mthatha, and Livingstone Hospital (LH) is in the city of Port Elizabeth. NMAH and LH are both tertiary care centers, they receive referred patients from regional and district hospitals in addition to providing 24-h trauma care. MRH provides 24-h services for walk-in patients and ambulances, while trauma cases are transferred to NMAH. All hospitals maintain 20-50 beds in the ED and are staffed by 1-2 doctors, but are not staffed by physicians or other providers specializing in Emergency Medicine. Patients are seen on a first-come-first-serve basis unless determined to be critically ill or requiring immediate care. Handwritten logbooks and paper medical files are used to track all patients.

Recruitment and enrolment
Patients presenting for care to the hospital ED during the study period, aged 18 years and older, fully conscious, and clinically stable were eligible for enrollment in the study. Patients who met the inclusion criteria were approached by trained HCT staff as soon as they completed the triage process and were informed of the ongoing study and offered a point-of-care HIV test. Data was also collected on patient demographics, presenting complaint, presenting symptoms, past medical history, and reasons for accepting or declining the HIV test. Written informed consent was sought for all patients. Patients were enrolled 24 h a day throughout the duration of the study.

Data collection
Data were recorded using CRFs. Responses to demographic information, past medical history, and reasons for accepting or declining the HIV test were recorded using predetermined categorical options or as free text, presenting complaint was recorded as free text and later coded using the Medical Dictionary for Regulatory Activities (MedDRA©, MedDRA MSSO, Virginia). Chief complaints were recorded using the pilot symptom list. CRFs were scanned and entered using intelligent character recognition (ICR) DataFax software (DataFax©, Clinical DataFax Systems Inc., Hamilton, Ontario, Canada) and centrally double-verified by independent data technicians.

Data analysis and statistics
Data were analyzed using STATA v.15© (StataCorp, LLC, TX). The pilot symptom list was checked for accuracy against MedDRA-coded chief complaints for each patient. MedDRA-coded chief complaint terms were reviewed by two study team members to determine the frequency of concordance between the MedDRA term and the selected symptom(s) from the pilot symptom list. A 'match' was defined as a patient record with a MedDRA term that matched with the symptom(s) selected on the pilot symptom list. A 'mismatch' was defined as a patient record with a MedDRA term that was either not present in the pilot symptom list or did not align with the symptom(s) selected on the pilot symptom list. These mismatches were further defined as 'true mismatches' aka list errors (when the appropriate symptom(s) matching the MedDRA term did not exist on the pilot symptom list/needed to be added) and 'false mismatches' aka rater errors (when the appropriate symptom(s) matching the MedDRA term was available but not selected from the pilot symptom list). Clusters of chief complaints were identified using an exploratory factor analysis of the chief complaint list (48 complaints). Factorability of the chief complaints was determined by inspecting the correlation matrix (correlations > 0.4), the Kaiser-Meyer-Olkin measure of sampling adequacy (KMO > 0.6), and Bartlett's test of sphericity (p < 0.05). The chief complaints were then subjected to factor analysis with an oblique rotation (oblimin), producing as simple a structure as possible while permitting correlations among factors. Factors were retained based on the Scree test (Cattell, 1966). Factor analysis is a statistical data reduction and analysis technique that strives to explain correlations among multiple outcomes as the result of one or more underlying explanations, or "factors." The technique involves data reduction, as it groups a set of variables based on frequency of concurrence.

Modified Delphi process
Our algorithm for matching the free-text chief complaint (coded using MedRA ©) and the boxes ticked on the pilot symptom list identified potential chief complaints to be discussed further during a modified Delphi process (Fig. 1).
From this review, final decisions regarding changes to the list were reached systematically using the modified Delphi method. Using this method, each reviewer shared reflections from their independent review, in a round robin fashion, which was recorded and reflected on a whiteboard to the entire group, until no new ideas were forthcoming. Thereafter, reviewers had the opportunity to discuss and clarify each comment/idea shared until group consensus was reached. Notes were   kept on rationale for response to each of the mismatches and a detailed review of each of the mismatches, the supporting discussion and resulting recommendations for changes is provided below.

Results
The intent of this analysis was to address adequacy and functionality of this data set to achieve chief complaint data capture in a real-world resourceconstrained emergency medicine setting. There were three distinct activities that were undertaken by the study team to achieve this; firstly, functionality was assessed using staff surveys and feedback via interviews; secondly descriptive data-analysis was used to identify complaints not used, clusters, and mismatches; and thirdly, a modified Delphi approach to discuss observations and recommend changes.

End-user feedback
Functionality was approached through staff surveys and interviews with the HIV counsellors and nurses who completed the CRFs. These staff reports revealed four key concerns. Firstly, patients and staff struggled with the meaning of some of the terminology used-e.g., "GU" referring to genital or urinary complaints. Secondly, the pilot symptom list was presented as an unordered list-the absence of an interpretable sequence to the list such as alphabetic or body-system based ordering made it difficult for staff to rapidly find complaints. Thirdly, staff reported that they found many of the complaints on the pilot symptom list to be too broad (pain being a primary example) and applied to many patients-this meant that multiple boxes were often ticked. Lastly, staff found it difficult to identify if something was an injury or not, including being unsure, for example, how to classify a patient with an injury 2 weeks ago now presenting for a wound complication. This final concern was raised nearly universally across staff, and we suggest that a major modification to the pilot symptom list should be that the condition/symptom sections include both medical and traumatic chief complaints, and staff should collect further trauma data if an injury is present. In discussions of this topic, staff also reported that they found it easier to think in an ordered fashion about trauma, addressing first mechanism, then location and then intent for injured patients-for example, as in a fall downstairs, causing head injury, occurring due to an assault.

Descriptive analysis to identified chief complaints for potential modification
The adequacy and accuracy of the pilot symptom list to capture the richness of the data present in the free-text chief complaint fields, ideally in a single category was assessed by undertaking a descriptive analysis of the chief complaints, factor analysis, and matched analysis against the free-text chief complaint coded using the MedRA dictionary. A total of 3537 patients' chief complaints were reviewed. Several complaints were selected by staff less than 0.5% of visits and were discussed further, these were foreign body from injury, suspected flu/ cold, suspected HIV, blood in urine, confusion/ AMS, dental, and ENT. Two categories were never used (1) "suspected malaria" and (2) "foreign body inhaled." "Pain" was used 2076 times for 58.7% of all patients, therefore was of limited utility in providing discrete additional data. An exploratory factor analysis using the pilot symptom list categories to assess correlation patterns between individual symptoms as selected from the pilot symptom list was completed. Significant chief-complaint clustering was identified with 10 sets of chief complaints that were frequently selected together for the same patient (Table 1).
Based on the match of the chief complaint with the MedRA code for the free-text chief complaint, 640 were identified as 'potential mismatches'. Of these 'potential mismatches,' 191 (29.8%) were confirmed as true mismatches rather than algorithmic errors (i.e., due to coding errors or misreading of data via ICR) when reviewed by the authors. After the Delphi process 6 (3.1%) of the 191 confirmed mismatches were identified as false mismatches aka user errors where an appropriate symptom was available but not checked and 185 (96.9%) were identified as true mismatches aka list errors (Table 2).

Modified delphi
The end-user feedback and descriptive analysis (i.e., lowand-high frequency chief complaints, clusters (Table 1) and mismatches (Table 2) informed the expert-panel modified Delphi analysis and resulted in several substantial changes to the pilot symptom list, as presented in Appendix 2. A total of 10 complaints were removed, 6 were added, and eleven modifications were made. A detailed summary of the observations made and surrounding discussion from the modified Delphi process is available in Appendix 3.

Discussion
Chief complaints are essential to the practice of emergency care and chief complaint data contains a wealth of information to inform clinicians, researchers, and policy-makers as to the nature and diversity of emergency condition presentations, as well as the emergency care resource and training needs associated with them. As such, the absence of a standard chief complaint naming convention, minimum data set, and organizational strategy that retains functionality in a diversity of settings represents a critical tool gap in global emergency care. By classifying chief complaints, the most challenging portion of the labor of taxonomy is complete. Sorting the chief complaints expressed in the patients' own words into an established classification system using lay terminology thereby becomes much easier and with this ease comes reliability and consistency, improving data quality and helping fill the data gap in global emergency care.
As described, the absence of a unified, valid, and functional chief complaint short-list with tested utility in global resource-constrained acute care settings has had significant adverse effects on clinical care, research, and informed public. In this paper, we present a systematic approach to refining a chief compliant list for limited resource settings. Themes that were predominant in this analysis included issues of taxonomy, nomenclature, and frequency. Chief complaints that were removed were often taken out because they were overly generic and were selected by staff in preference to more specific and accurate chief complaints available on the list. Elimination of these complaints will improve the frequency with which a specific single accurate complaint is chosen. Inaccurate selection of chief complaint also occurred frequently on the basis of nomenclature confusion. The initial list included United States-centric jargon and abbreviations e.g., "ENT" and "GU" to refer to otolaryngologic and genitourinary complaints, respectively. This naming convention was confusing, opaque, and incomprehensible to the staff completing patient triage as it was not commonly used terminology/nomenclature in that population and culture. The elimination of jargon and acronyms and the use of simpler language/lay terminology to reflect similar sign/ symptom complexes address this issue and thereby improve legibility and identification of accurate chief complaints but does not address potential translation errors when used in a non-English-speaking setting. An additional taxonomic problem identified in factor analysis was the grouping of chief complaints that have substantial overlap (such as pain and swelling) or associate as a part of cardinal presentations of illness (such as chest pain and cough). This was addressed in some cases by combining chief complaints with substantial overlap into a single chief complaint while allowing individual elements of cardinal presentations to remain as standalone complaints using frequency data to inform these changes. Finally, frequency data and mismatch data were used to sub-divide extant chief complaints as well as to include chief complaints not currently captured in the original list of conditions and symptoms but responsible for notable fractions of presentations. One example of this pattern would be "back pain" which was the presenting concern in 2% of all presentations but was not well captured by any existing complaint. This final process and review of the 185 true mismatches resulted in adding 11 new complaints. Examination of frequency data also resulted in the identification of chief complaints with low frequency related to rarity and unlikely to enhance capture rates substantially through inclusion on the chief complaint list, such as "inhaled foreign body." These latter chief complaints were eliminated from the list.
Strengths of this study include that this is, to our knowledge, the first prospective piloting develop a minimum set of chief complaints intended for use in lowand middle-income settings. One limitation is that establishing this as a minimum set generalizable across geographic and cultural boundaries will require a validation strategy beyond South Africa, including in settings with less mature emergency medical systems and non-English-speaking settings. In recognition of the limitation of testing the pilot symptom list in a single country, the researchers recognize that some of the chief complaints with a low frequency, such as "suspected malaria, " were likely secondary to geographical biases of the database and thus would be beneficial to keep. Additional strengths include the development of a concise 47-item list; the goal of producing a minimum set that will allow for easy aggregation and categorization of patient presentations would not be possible with a more lengthy or more technical list of complaints, such as the Canadian ED Diagnosis Shortlist which captures 99% of all presenting complaints but requires 837 individual chief complaints to do so [10]. The goal of the analysis was to present a rapidly "scan-able" list of complaints that could be used by a triage or check-in provider with little-to-no specialty training in emergency care.
An additional limitation of this study is that there was no process in place to test the impact of the presentation of the pilot symptom list in its original form, including impact of individual item findability due to ordering effects and overall ease of use. We propose changes to the ordering and organization of the chief complaint list based on functionality data provided by in-depth interviews with study staff. However, testing unique variations of the organization of the pilot symptom list in each of the three sites in addition to interviews about use would have provided more robust data on the impact of user experience effects on functionality of the list overall. Lastly, we used similar data from interviews to inform revisions of the trauma categorization but formal usability testing strategies of these recommendations would add strength to claims of improved functionality. Suggested groupings of the chief complaints by body system for usability follow as Figs. 2 and 3.

Conclusion
This study presents a systematic methodological sound approach to refine a chief complaint list for low resources settings, namely user feedback, with descriptive analysis to assess frequency of use of the various chief complaints and factor analysis to assess for grouping. Further studies will require validation of suggested recommendations and application of these changes to improve functionality. Additionally, further research will need to determine generalizability of the proposed list across other LMICs outside of South Africa. A patient's chief complaint is a vital tool, that when easily accessible from a validated list to be wielded by trained emergency medical personnel may help triage patients, streamline emergency care delivery, and improve patient outcomes.