Challenging script concordance test reference standard by evidence: do judgments by emergency medicine consultants agree with likelihood ratios?

Ahmadi, Seyed-Foad; Khoshkish, Shahin; Soltani-Arabshahi, Kamran; Hafezi-Moghadam, Peyman; Zahmatkesh, Golara; Heidari, Parisa; Baba-Beigloo, Davood; Baradaran, Hamid R; Lotfipour, Shahram

doi:10.1186/s12245-014-0034-3

Brief Research Report
Open access
Published: 26 September 2014

Seyed-Foad Ahmadi^1,2,
Shahin Khoshkish^1,3,
Kamran Soltani-Arabshahi^1,
Peyman Hafezi-Moghadam^4,
Golara Zahmatkesh^1,
Parisa Heidari^1,5,
Davood Baba-Beigloo^6,
Hamid R Baradaran^1 &
Shahram Lotfipour^7

International Journal of Emergency Medicine volume 7, Article number: 34 (2014)

Abstract

Background

We aimed to compare the clinical judgments of a reference panel of emergency medicine academic physicians against evidence-based likelihood ratios (LRs) regarding the diagnostic value of selected clinical and paraclinical findings in the context of a script concordance test (SCT).

Findings

An SCT with six scenarios and five questions per scenario was developed. Subsequently, 15 emergency medicine attending physicians (the reference panel) took the test, and their judgments regarding the diagnostic value of those findings for given diseases were recorded. The LRs of the same findings for the same diseases were extracted from a series of published systematic reviews. Then, the reference panel judgments were compared to the evidence-based LRs. To investigate test-retest reliability, five participants took the test again one month later, and the correlation between their first and second judgments was quantified using the Spearman rank-order coefficient.

In 22 out of 30 (73.3%) findings, the expert judgments were significantly different from the LRs. The differences included overestimation (30%), underestimation (30%), and judging the diagnostic value in an opposite direction (13.3%). Moreover, the score of a hypothetical test-taker was calculated to be 21.73 out of 30 if his/her answers were based on evidence-based LRs.

The test showed an acceptable test-retest reliability coefficient (Spearman coefficient: 0.83).

Conclusions

Although SCT is an interesting test to evaluate clinical decision-making in emergency medicine, our results raise concerns regarding whether the judgments of an expert panel are sufficiently valid as the reference standard for this test.

Introduction

The script concordance test (SCT) has become a recognized tool to assess clinical reasoning in various fields including emergency medicine [1]-[14]. This case-based test consists of short clinical scenarios followed by questions regarding diagnosis or management. The questions are presented in three parts: A) a diagnostic or management option, B) a clinical finding, and C) a scale to capture the examinee's decision (Figure 1) [15]. The test measures the concordance of test-takers' judgments with those of a reference panel of experts [15]. Expert physicians usually organize their knowledge of diseases in 'illness scripts', and when they encounter patients, they effortlessly recall the relevant scripts and promptly recognize the most appropriate courses of action [16]. The SCT is an effort to capture how close the scripts of test-takers are to those of experts; the rationale is that the closer a test-taker's scripts are to the experts' scripts, the better the test-taker's decision-making. However, expert judgments are reported to be frequently incorrect [17], and therefore the reference standard of the test, which is the expert judgments, is not necessarily valid. The test is mainly used to assess reasoning in uncertain situations [15], in which robust evidence is usually limited. Even so, the possibility that the test's reference standard is not valid remains a critical issue and should be carefully investigated.

The diagnostic value of clinical and/or paraclinical findings is an appropriate context in which expert opinions can be compared with the best current evidence. On the one hand, findings can be presented to experts, and the extent to which such findings modify the experts' diagnostic judgments regarding the likelihood of particular diseases can be measured. On the other hand, the expected effect of the presented findings on the likelihood of the same diseases can be sought from the best current evidence. According to Bayes' theorem, the likelihood ratio (LR) of any diagnostic finding is a precise indicator of the expected change in the likelihood of a disease when an individual suspected of having it has that particular finding [18]. Fortunately, LRs for a wide variety of clinical and paraclinical findings are either available or calculable based on robust studies [19]. Hence, we aimed to seek the judgments of a panel of emergency medicine experts regarding the diagnostic value of select clinical and paraclinical findings, acquire the evidence-based LRs for the same findings, and finally compare the judgments against the LRs.
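The role of the LR described above can be illustrated with the odds form of Bayes' theorem (post-test odds = pre-test odds × LR). The following Python sketch is illustrative only: the function name `post_test_probability` and the 20% pre-test probability are assumptions chosen for the example and are not taken from the study.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Apply the odds form of Bayes' theorem: post-test odds = pre-test odds * LR."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("a likelihood ratio must be positive")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    # Convert the odds back to a probability.
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical pre-test probability of 20%; the three LRs span the
# rule-out (0.1), uninformative (1) and rule-in (10) thresholds.
for lr in (0.1, 1.0, 10.0):
    print(f"LR = {lr:>4}: post-test probability = {post_test_probability(0.20, lr):.3f}")
```

With these assumed inputs, a finding with LR = 1 leaves the probability unchanged, whereas LRs of 10 and 0.1 move it strongly towards ruling in and ruling out, respectively.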

Methods

Study design and setting

We invited all emergency medicine attending physicians (consultants) of the main teaching hospitals of two academic universities (Iran University of Medical Sciences and Tehran University of Medical Sciences) to participate in our study. The two teaching hospitals have an ED yearly census of over 90,000 patients. Participating consultants consented to be enrolled after receiving detailed explanations regarding the purpose and the design of the study. The required sample size was 15 according to the SCT development guidelines [15].

Test development

We developed a test containing six clinical scenarios on the following emergent conditions: 1) meningitis, 2) myocardial infarction, 3) pneumonia (in a child), 4) thoracic aortic dissection, 5) appendicitis, and 6) congestive heart failure. Each scenario was followed by five questions, and each question was intended to measure the judgments of our panel of consultants regarding the diagnostic value of the presence or absence of a clinical, laboratory or imaging finding. To develop the test, two investigators (SK and SFA) studied a series of systematic reviews containing a wide variety of clinical scenarios and the corresponding evidence-based LRs for the related clinical and paraclinical findings [19]. The investigators selected the scenarios and findings that could properly represent diagnostic challenges in the emergency room. Afterwards, they designed an SCT based on those scenarios and findings according to the recommended guidelines [15]. The only variation from the ordinary SCT was the use of 10-cm visual analog scales (VASs) instead of five-point Likert scales: both tools yield comparable measurements [20],[21], and the VAS additionally allowed the judgments of the reference panel to be quantified. In addition to the main test, a separate sample scenario with two questions was developed and used to familiarize the participants with the test-taking process, so that they understood how the test works before taking the main test (Figure 1).

Test validation

Prior to the main experiment, the content validity of the test was evaluated and confirmed by an emergency medicine expert (PHM) and a medical education expert (KSA). In addition, we invited five expert participants to take the test again one month after the main experiment in order to measure the test-retest reliability. The correlation between the two sets of responses was measured using the Spearman rank-order coefficient.
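As a rough illustration of the test-retest measure, the Spearman rank-order coefficient can be computed as the Pearson correlation of the ranks of two sets of responses. The sketch below is a minimal pure-Python version; the helper names `average_ranks` and `spearman_rho` and the example VAS readings are hypothetical, and the text does not state how the responses of the five consultants were pooled before the coefficient was calculated.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(first, second):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    rx, ry = average_ranks(first), average_ranks(second)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread_x = sum((a - mx) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - my) ** 2 for b in ry) ** 0.5
    if spread_x == 0 or spread_y == 0:
        raise ValueError("coefficient is undefined when one sequence is constant")
    return covariance / (spread_x * spread_y)


# Hypothetical first and second VAS readings for six questions.
first_sitting = [3.5, -1.0, 4.2, 0.5, -3.0, 2.0]
second_sitting = [3.0, -0.5, 4.5, 1.0, -2.5, 1.5]
print(round(spearman_rho(first_sitting, second_sitting), 2))
```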

Data collection

After a brief orientation, the consultants received the main test in their private offices and answered the questions without access to any medical resources. To answer each question regarding the diagnostic value of a finding, they marked a point on the VAS. The numbers equivalent to the VAS markings were read off using an ordinary ruler.

Analytical approach

The numbers representing the consultants' judgments were multiplied by 2 in order to rescale the original range of −5 to +5 to a range of −10 to +10. For the LR values, LRs >10 or <0.1 were considered 10 and 0.1, respectively, as we needed to establish a bounded LR range. This conversion seemed rational as an LR = 10 is considered sufficiently large to rule in a disease and an LR = 0.1 is considered sufficiently small to rule out a disease [22], and whether an LR is 10 or higher, or whether it is 0.1 or lower, is not substantially different for clinical reasoning purposes. Subsequently, LR values were converted to 10 × log(LR) in order to convert their naturally geometric scale to an arithmetic scale. We used a one-sample t test to compare the transformed mean judgments with the corresponding transformed LRs. In addition, we calculated the score that a hypothetical test-taker would obtain if his/her answers were based on the evidence-based LRs and were scored against the judgments of our consultants as the reference standard. The calculation is described elsewhere [15]. For statistical analysis, IBM SPSS Statistics 19 was used. A P < 0.05 was considered significant.
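The transformations and the comparison described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the analysis code used in the study (which was run in SPSS): the logarithm is assumed to be base 10, since that maps the bounded LR range of 0.1 to 10 onto −10 to +10, and the function names and the example panel readings are hypothetical. The sketch returns only the t statistic and its degrees of freedom; the P value would be obtained from a t distribution.

```python
import math
from statistics import mean, stdev


def transform_judgment(vas_reading: float) -> float:
    """Rescale a VAS reading from the -5..+5 range to -10..+10."""
    return 2.0 * vas_reading


def transform_lr(likelihood_ratio: float) -> float:
    """Bound the LR to [0.1, 10] and map it to 10 * log10(LR) (base 10 assumed)."""
    bounded = min(max(likelihood_ratio, 0.1), 10.0)
    return 10.0 * math.log10(bounded)


def one_sample_t(sample, reference_value):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    spread = stdev(sample)
    if spread == 0:
        raise ValueError("t statistic is undefined for a constant sample")
    t_statistic = (mean(sample) - reference_value) / (spread / math.sqrt(n))
    return t_statistic, n - 1


# Hypothetical VAS readings of five panelists for one finding, and a
# hypothetical evidence-based LR of 25 for the same finding.
panel_readings = [3.0, 4.0, 2.5, 3.5, 4.5]
judgments = [transform_judgment(r) for r in panel_readings]
reference = transform_lr(25.0)  # bounded to 10 before the log transform
t_statistic, degrees_of_freedom = one_sample_t(judgments, reference)
print(reference, round(t_statistic, 2), degrees_of_freedom)
```

Under the base-10 assumption, an LR of 1 (an uninformative finding) maps to 0, the midpoint of the judgment scale.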

Findings

Participant characteristics

Fifteen emergency medicine consultants consented to participate in our study, of whom 13 were board certified in emergency medicine and the other two were board certified in internal medicine and pediatrics, respectively, with additional fellowship training in emergency medicine. The mean age, clinical practice experience, and emergency medicine experience were 35.9, 10.3, and 6.6 years, respectively. The Spearman coefficient was 0.83 for the two sets of answers from the subset of five consultants who retook the test.

Comparison of the reference panel judgments against the evidence-based LRs

The results of comparing the values representing the consultants' judgments with the evidence-based LRs are summarized in Table 1. In 22 out of 30 (73.3%) findings, the mean judgments were significantly different from the corresponding LRs. The consultants overestimated the value of 9 (30%) findings and underestimated the value of another 9 (30%) findings. In addition to these discrepancies in the magnitude of the diagnostic value, the consultants chose a different direction (ruling in versus ruling out) from the evidence-based LRs for 4 (13.3%) findings.

Table 1 Comparison of the participants' judgments with likelihood ratios (LRs)

Subgroups of positive and negative findings

When positive and negative findings (presence or absence of findings) were considered separately, we noted a significant difference between the consultants' judgments and the LRs in 17 out of 20 (85%) positive findings and 5 out of 10 (50%) negative findings. The diagnostic values of 8 (40%) positive and 1 (10%) negative findings were overestimated, and the values of 7 (35%) positive and 2 (20%) negative findings were underestimated by the consultants. Moreover, the judgments were in the opposite direction to the LRs in 2/20 (10%) of the positive and 2/10 (20%) of the negative findings.

Subgroups of history, physical examination, and laboratory findings

When we calculated the percentage of significantly different consultants' judgments from the corresponding LRs in subgroups of findings from history, physical examination, and laboratory/imaging findings, we observed comparable percentages for findings of history (6 out of 8: 75.0%), physical examination (10 out of 14: 71.4%), and laboratory/imaging (6 out of 8: 75%). However, physical examination findings were more frequently overestimated (25%, 35.7%, and 25% for history, physical examination, and laboratory findings, respectively) and less frequently underestimated (37.5%, 21.4%, and 37.5% for history, physical examination, and laboratory findings, respectively).
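The subgroup percentages above can be converted back into counts and cross-checked against the overall totals (22 significant differences: 9 overestimated, 9 underestimated, 4 in the opposite direction). The short Python sketch below does this arithmetic; the per-subgroup counts it derives are back-calculated from the reported percentages and are an assumption, since the text reports only the percentages.

```python
# Reported in the text: number of findings per subgroup and how many of
# them differed significantly from the corresponding LRs.
n_findings = {"history": 8, "physical examination": 14, "laboratory/imaging": 8}
n_significant = {"history": 6, "physical examination": 10, "laboratory/imaging": 6}

# Reported percentages of over- and underestimated findings per subgroup.
pct_over = {"history": 25.0, "physical examination": 35.7, "laboratory/imaging": 25.0}
pct_under = {"history": 37.5, "physical examination": 21.4, "laboratory/imaging": 37.5}


def implied_count(percentage: float, n: int) -> int:
    """Back-calculate a whole-number count from a rounded percentage."""
    return round(percentage / 100 * n)


over = {k: implied_count(pct_over[k], n) for k, n in n_findings.items()}
under = {k: implied_count(pct_under[k], n) for k, n in n_findings.items()}
# Whatever remains of the significant differences is an opposite-direction judgment.
opposite = {k: n_significant[k] - over[k] - under[k] for k in n_findings}

for name in n_findings:
    print(f"{name}: over={over[name]}, under={under[name]}, opposite={opposite[name]}")
print("totals:", sum(over.values()), sum(under.values()), sum(opposite.values()))
# totals: 9 9 4
```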

The score of the hypothetical test-taker

The calculated score of a hypothetical test-taker was 21.73 out of 30 based on the consultants' judgments as the reference standard. The categorization of LRs, the score of each category, and the calculated score for each answer are summarized in Table 2.
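For readers unfamiliar with SCT scoring, the sketch below illustrates the aggregate scoring method commonly used for SCTs, in which an answer earns credit equal to the number of panelists who chose it divided by the number who chose the modal answer. It is offered as an assumption about the general approach, not as a reproduction of the calculation in Table 2: the answer categories (−2 to +2), the panel responses, and the function names are all hypothetical.

```python
from collections import Counter


def item_credit(panel_answers, examinee_answer) -> float:
    """Credit = panelists choosing the examinee's answer / panelists choosing the mode."""
    counts = Counter(panel_answers)
    modal_count = max(counts.values())
    return counts.get(examinee_answer, 0) / modal_count


def total_score(panel_by_item, examinee_by_item) -> float:
    """Sum the per-item credits; the maximum equals the number of items."""
    return sum(item_credit(p, e) for p, e in zip(panel_by_item, examinee_by_item))


# Hypothetical panel of 15 for one item, answers categorized from -2 to +2:
# eight chose +1, four chose +2 and three chose 0.
panel = [1] * 8 + [2] * 4 + [0] * 3
for answer in (1, 2, 0, -1):
    print(answer, item_credit(panel, answer))
```

Under this scheme, an LR-based answer that falls in a category chosen by few or no panelists earns little or no credit, which is how an evidence-based test-taker can lose points against an expert-derived key.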

Table 2 Calculation of the score of a hypothetical test-taker

Discussion

In summary, we observed that in a considerable proportion of the questions, the consultants' judgments regarding the value of the findings were significantly different from the corresponding evidence-based LRs; the differences included discrepant magnitude (over/underestimation) and also discrepant direction. Moreover, the value of the physical examination findings was more frequently overestimated and less frequently underestimated than that of history or laboratory findings. This is possibly due to a popular attitude that objective clinical findings are more valuable than paraclinical findings in the diagnostic process [23]. Furthermore, we showed that if a hypothetical test-taker had answered the test based on evidence-based LRs and his/her answers were evaluated using the consultants' judgments as a reference standard, the test-taker would obtain 21.73 out of 30, or approximately 72% of the total score.

Previous studies have investigated aspects of SCT such as comparing the answer keys obtained from panels with different levels of expertise [24], optimizing the answer keys [25], improving the development of the scoring key [26], investigating the effect of variability within the reference panel [27], and validating the test in different clinical fields [1]-[14]. However, to our knowledge, no previous study has challenged the reference standard of SCT with evidence. Clinical decision-making is a complex process influenced by both clinical knowledge and experience. As physicians accumulate experience by practicing medicine, their knowledge may become outdated [28],[29]. Therefore, while the judgments of expert physicians benefit from valuable experience, they may suffer from outdated knowledge and also from cognitive biases [30]-[32]. A recent review has discussed the potential pitfalls of using SCT as a valid tool to measure clinical reasoning competencies, among which is that the test implicitly discourages seeking empirical evidence for the scoring key, since it assumes no single correct answer for any item [33].

Limitations

Despite the novel idea and methods, this study had the following limitations: A) Although the transformations of the judgment numbers and the LRs made these two entities comparable, the transformations could have introduced bias in the results. Knowing this limitation, we found no alternative approach to compare the consultants' judgments with evidence-based LRs. B) The optimal number of clinical scenarios and of questions per scenario is reported to be 20 and 3, respectively [15]. However, we used six scenarios and five questions per scenario because this test structure needed less time and could better address the time limitations of our consultant participants. C) The results were derived from only two centers in Tehran and therefore cannot be easily generalized to all settings. D) As this study was carried out in an emergency medicine context, which differs substantially from other specialties, our findings cannot be directly extrapolated to other fields of clinical medicine.

Conclusions

SCT is an interesting tool to score the clinical decision-making practices of novice trainees based on the judgments of expert physicians. However, experts' judgments may occasionally be inconsistent with evidence. This should raise concerns regarding the validity of the experts' judgments as a reference standard for SCT. We suggest that future investigators explore alternative evidence-based approaches to establish more robust reference standards for clinical reasoning tests such as the script concordance test in the field of emergency medicine.

Authors' contributions

SFA conceived of the study, participated in the design of the study, performed the statistical analysis, and drafted the manuscript. SK conceived of the study, participated in the design of the study, and contributed to the data collection, statistical analysis, and drafting of the manuscript. KSA contributed to the conception and design of the study and critically reviewed and revised the manuscript. PHM participated in the design of the study and critically reviewed and revised the manuscript. GZ contributed to the data collection and drafting of the manuscript. PH contributed to the data collection and data analysis. DBB contributed to the data collection and drafting of the manuscript. HRB contributed to the conception of the study and critically reviewed and revised the manuscript. SL critically reviewed and revised the manuscript. All authors read and approved the final manuscript.

Authors' information

SL is a professor of emergency medicine and public health at the University of California, Irvine's School of Medicine. PHM is an associate professor of emergency medicine and deputy for education at Iran University of Medical Sciences School of Medicine. KSA is a professor of medicine and medical education and head of the Department of Medical Education at Iran University of Medical Sciences. HRB is an associate professor of clinical epidemiology and evidence-based medicine at Iran University of Medical Sciences. The other authors were medical students at the time of conducting this study.

Abbreviations

LR: likelihood ratio
SCT: script concordance test
VAS: visual analog scale

References

1. Boulouffe C, Doucet B, Muschart X, Charlin B, Vanpee D: Assessing clinical reasoning using a script concordance test with electrocardiogram in an emergency medicine clerkship rotation. Emerg Med J 2013, 31: 313–316. 10.1136/emermed-2012-201737
2. Humbert AJ, Besinger B, Miech EJ: Assessing clinical reasoning skills in scenarios of uncertainty: convergent validity for a script concordance test in an emergency medicine clerkship and residency. Acad Emerg Med 2011, 18: 627–634. 10.1111/j.1553-2712.2011.01084.x
3. Carriere B, Gagnon R, Charlin B, Downing S, Bordage G: Assessing clinical reasoning in pediatric emergency medicine: validity evidence for a script concordance test. Ann Emerg Med 2009, 53: 647–652. 10.1016/j.annemergmed.2008.07.024
4. Park AJ, Barber MD, Bent AE, Dooley YT, Dancz C, Sutkin G, Jelovsek JE: Assessment of intraoperative judgment during gynecologic surgery using the script concordance test. Am J Obstet Gynecol 2010, 203(240): 240. e241-246
5. Mathieu S, Couderc M, Glace B, Tournadre A, Malochet-Guinamand S, Pereira B, Dubost J-J, Soubrier M: Construction and utilization of a script concordance test as an assessment tool for dcem3 (5th year) medical students in rheumatology. BMC Med Educ 2013, 13: 166. 10.1186/1472-6920-13-166
6. Duggan P, Charlin B: Summative assessment of 5th year medical students' clinical reasoning by script concordance test: requirements and challenges. BMC Med Educ 2012, 12: 29. 10.1186/1472-6920-12-29
7. Nouh T, Boutros M, Gagnon R, Reid S, Leslie K, Pace D, Pitt D, Walker R, Schiller D, MacLean A, Hameed M, Fata P, Charlin B, Meterissian SH: The script concordance test as a measure of clinical reasoning: a national validation study. Am J Surg 2012, 203: 530–534. 10.1016/j.amjsurg.2011.11.006
8. Piovezan RD, Custódio O, Cendoroglo MS, Batista NA, Lubarsky S, Charlin B: Assessment of undergraduate clinical reasoning in geriatric medicine: application of a script concordance test. J Am Geriatr Soc 2012, 60: 1946–1950. 10.1111/j.1532-5415.2012.04152.x
9. Bursztejn AC, Cuny JF, Adam JL, Sido L, Schmutz JL, de Korwin JD, Latarche C, Braun M, Barbaud A: Usefulness of the script concordance test in dermatology. J Eur Acad Dermatol Venereol 2011, 25: 1471–1475. 10.1111/j.1468-3083.2011.04008.x
10. Humbert AJ, Johnson MT, Miech E, Friedberg F, Grackin JA, Seidman PA: Assessment of clinical reasoning: a script concordance test designed for pre-clinical medical students. Med Teach 2011, 33: 472–477. 10.3109/0142159X.2010.531157
11. Kania RE, Verillaud B, Tran H, Gagnon R, Kazitani D, Huy PTB, Herman P, Charlin B: Online script concordance test for clinical reasoning assessment in otorhinolaryngology: the association between performance and clinical experience. Arch Otolaryngol Head Neck Surg 2011, 137: 751–755. 10.1001/archoto.2011.106
12. Lambert C, Gagnon R, Nguyen D, Charlin B: The script concordance test in radiation oncology: validation study of a new tool to assess clinical reasoning. Radiat Oncol 2009, 4: 7. 10.1186/1748-717X-4-7
13. Lubarsky S, Chalk C, Kazitani D, Gagnon R, Charlin B: The script concordance test: a new tool assessing clinical judgement in neurology. Can J Neurol Sci 2009, 36: 326–331.
14. Meterissian SH: A novel method of assessing clinical reasoning in surgical residents. Surg Innov 2006, 13: 115–119. 10.1177/1553350606291042
15. Fournier JP, Demeester A, Charlin B: Script concordance tests: guidelines for construction. BMC Med Inform Decis Mak 2008, 8: 18. 10.1186/1472-6947-8-18
16. Bowen JL: Educational strategies to promote clinical diagnostic reasoning. N Engl J Med 2006, 355: 2217–2225. 10.1056/NEJMra054782
17. Oxman AD, Guyatt GH: The science of reviewing research. Ann N Y Acad Sci 1993, 703: 125–133. Discussion 133-124 10.1111/j.1749-6632.1993.tb26342.x
18. Zehtabchi S, Kline JA: The art and science of probabilistic decision-making in emergency medicine. Acad Emerg Med 2010, 17: 521–523. 10.1111/j.1553-2712.2010.00739.x
19. Simel DL, Rennie D: Rational clinical examination: Evidence-based clinical diagnosis. McGraw-Hill, Chicago; 2009.
20. van Laerhoven H, van der Zaag-Loonen HJ, Derkx BHF: A comparison of Likert scale and visual analogue scales as response options in children's questionnaires. Acta Paediatr 2004, 93: 830–835. 10.1111/j.1651-2227.2004.tb03026.x
21. Guyatt GH, Townsend M, Berman LB, Keller JL: A comparison of Likert and visual analogue scales for measuring change in function. J Chronic Dis 1987, 40: 1129–1133. 10.1016/0021-9681(87)90080-4
22. Straus SERW, Glasziou P, Haynes RB: Diagnosis and screening. In Evidence-based medicine: how to practice and teach EBM. 3rd edition. Elsevier, London; 2005:67–99.
23. Hampton JR, Harrison MJ, Mitchell JR, Prichard JS, Seymour C: Relative contributions of history-taking, physical examination, and laboratory investigation to diagnosis and management of medical outpatients. Br Med J 1975, 2: 486–489. 10.1136/bmj.2.5969.486
24. Petrucci AM, Nouh T, Boutros M, Gagnon R, Meterissian SH: Assessing clinical judgment using the script concordance test: the importance of using specialty-specific experts to develop the scoring key. Am J Surg 2013, 205: 137–140. 10.1016/j.amjsurg.2012.09.002
25. Gagnon R, Lubarsky S, Lambert C, Charlin B: Optimization of answer keys for script concordance testing: should we exclude deviant panelists, deviant responses, or neither? Adv Health Sci Educ Theory Pract 2011, 16: 601–608. 10.1007/s10459-011-9279-2
26. Charlin B, Gagnon R, Lubarsky S, Lambert C, Meterissian S, Chalk C, Goudreau J, van der Vleuten C: Assessment in the context of uncertainty using the script concordance test: more meaning for scores. Teach Learn Med 2010, 22: 180–186. 10.1080/10401334.2010.488197
27. Charlin B, Gagnon R, Pelletier J, Coletti M, Abi-Rizk G, Nasr C, Sauve E, van der Vleuten C: Assessment of clinical reasoning in the context of uncertainty: the effect of variability within the reference panel. Med Educ 2006, 40: 848–854. 10.1111/j.1365-2929.2006.02541.x
28. Straus SE, Glasziou P, Richardson WS, Haynes RB: Introduction. In Evidence-based medicine: How to practice and teach it. 4th edition. Churchill Livingstone, Edinburgh; 2010:1–12.
29. Ramos K, Linscheid R, Schafer S: Real-time information-seeking behavior of residency physicians. Fam Med 2003, 35: 257–260.
30. Graber M, Gordon R, Franklin N: Reducing diagnostic errors in medicine: what's the goal? Acad Med 2002, 77: 981–992. 10.1097/00001888-200210000-00009
31. Norman GR, Eva KW: Diagnostic error and clinical reasoning. Med Educ 2010, 44: 94–100. 10.1111/j.1365-2923.2009.03507.x
32. Nendaz M, Perrier A: Diagnostic errors and flaws in clinical reasoning: mechanisms and prevention in practice. Swiss Med Wkly 2012, 142: w13706.
33. Lineberry M, Kreiter CD, Bordage G: Threats to validity in the use and interpretation of script concordance test scores. Med Educ 2013, 47: 1175–1183. 10.1111/medu.12283

Acknowledgements

We would like to acknowledge Dr. Amir Nejati for his contributions in collecting data for this study.

Sources of funding

This study was the M.D. thesis of SK and was funded by Iran University of Medical Sciences. The authors received no funding from any other source.

Author information

Authors and Affiliations

Center for Educational Research in Medical Sciences, Iran University of Medical Sciences, Tehran, 14496, Iran
Seyed-Foad Ahmadi, Shahin Khoshkish, Kamran Soltani-Arabshahi, Golara Zahmatkesh, Parisa Heidari & Hamid R Baradaran
Program in Public Health, Department of Population Health and Disease Prevention, University of California Irvine, 653 E. Peltason Dr, Irvine, 92697, CA, USA
Seyed-Foad Ahmadi
Klinik für Innere Medizin III, Kardiologie, Angiologie und Internistische Intensivmedizin, Universitätsklinikum des Saarlandes, Homburg/Saar, 66421, Germany
Shahin Khoshkish
Department of Emergency Medicine, Iran University of Medical Sciences, Tehran, 14496, Iran
Peyman Hafezi-Moghadam
Department of Neurology, Saarland University Medical Center, Homburg/Saar, 66421, Germany
Parisa Heidari
Kamyar Clinic, Tehran, 51406, Iran
Davood Baba-Beigloo
Department of Emergency Medicine, School of Medicine, University of California Irvine, Irvine, 92697, CA, USA
Shahram Lotfipour

Corresponding author

Correspondence to Shahram Lotfipour.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ahmadi, SF., Khoshkish, S., Soltani-Arabshahi, K. et al. Challenging script concordance test reference standard by evidence: do judgments by emergency medicine consultants agree with likelihood ratios?. Int J Emerg Med 7, 34 (2014). https://doi.org/10.1186/s12245-014-0034-3

Received: 16 February 2014
Accepted: 30 August 2014
Published: 26 September 2014
DOI: https://doi.org/10.1186/s12245-014-0034-3
