In accordance with best practice in medical assessment, the examination for Membership of the UK Faculty of Sexual & Reproductive Healthcare (MFSRH) has evolved over recent years, with the test being refined to be as reliable and valid as possible and to satisfy the UK regulatory body, the General Medical Council (GMC). New evidence-based developments that enhance the robustness of tests are incorporated into the MFSRH examination on the advice of educationalists.1–4
This article introduces the Part 1 Single Best Answer Question (SBA) paper. To assist membership examination candidates further, a more detailed discussion may be found in the examination section of the FSRH website (http://www.fsrh.org).
Achieving the MFSRH qualification shows that the successful candidate is capable of providing sophisticated community sexual and reproductive healthcare (cSRH) at a senior level. With the advent of the new cSRH training programme in 2010, the MFSRH qualification became integral to progression through the various stages of the 6-year ‘run through’ programme. In this, trainees progress from junior levels to a more senior level in a wide-ranging, full-time programme that covers women's health, obstetrics, contraception, sexually transmitted infections, public health, service delivery, ethics and the law, among other topics. The Part 1 examination is a requirement for progression from training programme Year 3 (Specialist Training Year 3 – ST3) to Year 4, while the Part 2 examination is required for progression from Year 5 to Advanced Training in Year 6. Doctors practising in disciplines allied to SRH may also sit for the qualification, and are encouraged to do so, but they should be aware that the assessments are structured to determine the knowledge, skills and attitudes of the cSRH trainee, and the syllabus is derived from the cSRH training programme curriculum.
The SBA question
The Part 1 MFSRH is a written multiple-choice examination in which the ‘single best answer’ format has replaced the former ‘multiple true/false’ format. An example of an SBA question appears below.
The questions are based around a stem or clinical vignette, in red in the example. The ‘lead-in statement’ is a question, black in the example, and the ‘options’ are five possible answers, shown in blue. Basic sciences are also tested in this way.
The options will all be plausible and all in the same broad category. The four ‘wrong’ (not best) answers are ‘distractors’, and a good SBA question should be correctly answerable by a capable candidate without seeing the options. This is known by educationalists as the ‘cover test’, and examiners are instructed to write questions suited to such a strategy. Examiners are also trained and asked to write clear and fair questions at an appropriate level, and to put the detail in the stem and not in the answers. New questions are submitted by examiners and worked on in groups at question-writing sessions using the Delphi technique, in which questions are modified and improved by iterative peer review.
It should be noted that the distractors are not in the literal sense ‘wrong’. In fact they may have a certain degree of appropriateness along a continuum from correct to incorrect. In more difficult questions the distractors may lie relatively close to the correct answer but distinct from it. This therefore is the sense of the SBA, in that the sought-for answer is ‘more correct’ than the others. The candidate marks the answer sheet and this is scanned and computer marked automatically. There is no ‘negative marking’.
In the Part 1 MFSRH, the SBA:
Is based on an evaluation of symptoms, signs, results of investigations or basic science
Is designed to test reasoning skills rather than straightforward recall of facts
Uses cognitive processes similar to those used in clinical practice.
Examiners are also trained to avoid writing questions that aid the ‘test-wise’ candidate, that is, one who can guess the correct answer from the format or wording of a question rather than from knowledge of the subject.
Case and Swanson put the issue of multiple true/false questions versus SBA rather succinctly:
“While many item writers believe the true/false items are easier to write than one-best-answer items, it is found that they are often more problematic. The item writer had something particular in mind when the question was written, but careful review commonly reveals subtle difficulties that were not apparent to the item author. Often the distinction between ‘true’ and ‘false’ is not clear, and it is not uncommon for subsequent reviewers to alter the answer.”4
Further, there is a reason that is even more compelling: to avoid ambiguity, multiple true/false question writers tend to assess recall of an isolated fact – something the SBA can actively avoid. Application of knowledge, integration, synthesis and judgement issues can better be assessed by the SBA format. The clinical vignette is also used for this very reason – it invokes higher-order thinking with application of knowledge similar to the clinical situation, as opposed to simple recall.
Blueprinting, standard setting and psychometric analysis
Each Part 1 examination is ‘blueprinted’ to ensure adequate testing of topics and domains in the training programme curriculum and in clinical practice (Table 1), and to ensure that an appropriate spread of each syllabus module is represented. The domains of the blueprint represent the processes inherent in clinician–patient interaction and therefore test the thought processes and other actions gone through by the clinician in day-to-day practice.
In practice, the degree of difficulty of the resulting paper will vary. This is inevitable in any assessment and there is therefore no fixed pass mark for the Part 1 examination, nor does a fixed percentage of candidates pass. Instead, the paper is ‘standard set’, as is done nowadays in most medical examinations. The pass mark is not an arbitrary figure, but is empirically justified following completion of the final draft of the paper. Of the various strategies available to set the pass mark, the modified Angoff method is used in the Part 1 MFSRH. It requires the assembly of a group of subject matter examiners who are asked to evaluate each item and estimate the proportion of minimally competent examinees who would answer the item correctly. Rarely, there may be marked disagreement between examiner scores, and this is then moderated by discussion under the guidance of a facilitator. The ratings are then averaged across raters for each item to obtain a panel-recommended raw ‘cut score’: this becomes the pass mark for that session of the examination.4 The level of expertise of a practising cSRH consultant becomes irrelevant here. The standard set for the examination is not about high performance in all candidates, but about assessing the level of knowledge of the candidate who is just capable of providing a safe standard of clinical practice.
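The arithmetic behind the modified Angoff method described above can be sketched in a few lines of Python. This is an illustration only and not the Faculty's actual procedure: the function name `angoff_cut_score`, the three raters and their ratings are all hypothetical, and the sketch assumes the common convention that the per-item averages are summed to give the raw cut score for the whole paper.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Return (raw cut score, per-item mean ratings).

    ratings[r][i] is rater r's estimate (0.0 to 1.0) of the proportion of
    minimally competent candidates who would answer item i correctly.
    Assumption: the raw cut score is the sum of the per-item means.
    """
    n_items = len(ratings[0])
    if any(len(row) != n_items for row in ratings):
        raise ValueError("every rater must rate every item")
    # Average across raters for each item.
    item_means = [mean(row[i] for row in ratings) for i in range(n_items)]
    # Sum across items to get the expected score of a borderline candidate.
    return sum(item_means), item_means


# Hypothetical panel: three raters, four items (made-up numbers).
ratings = [
    [0.60, 0.80, 0.50, 0.90],
    [0.70, 0.70, 0.40, 0.80],
    [0.65, 0.75, 0.45, 0.85],
]

cut, item_means = angoff_cut_score(ratings)
print(f"Raw cut score: {cut:.2f} out of {len(item_means)}")  # Raw cut score: 2.70 out of 4
```

With these made-up ratings, a borderline candidate would be expected to score about 2.7 out of 4, so the pass mark for this toy paper would be set at roughly 68%. A harder set of items would attract lower ratings and hence a lower pass mark, which is the point of standard setting.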
Psychometric analyses as applied to examinations are well-recognised techniques used as part of good assessment practice and quality control. They comprise a question-by-question analysis, conducted after marking, of the marks gained by the candidates on each question. Various properties of a question may be tested, including its difficulty, its capacity to discriminate the knowledgeable candidate from the less knowledgeable, whether the group of candidates as a whole tended to guess the answer, and other instances of bias. The analysis also enables the overall reliability of the examination to be calculated (‘Cronbach's α’). The purpose of this analysis is two-fold. First, it enables a decision to be made on whether any question is performing so poorly that its removal would significantly improve the reliability of the examination. Second, it field-tests the examination's question bank by assessing a sample of the bank, the ‘sample’ being the particular cohort of questions used at that examination session. The identification of poorly performing questions is uncommon as the original question-writing process is rigorous, but any questions identified in this way may be modified to render them better performing and returned to the bank for use in future examinations, or removed from the bank if they are judged irredeemably poor.
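As a rough illustration of the kind of post-marking analysis described above, the following Python sketch computes three classical statistics for each question: difficulty (taken here as the proportion of candidates answering correctly), discrimination (taken here as the correlation between the item score and the candidate's score on the remaining items) and Cronbach's α with that question removed. The article names only Cronbach's α, so the choice of the other two statistics, the function names and the score matrix are assumptions made for the example, not a description of the Faculty's analysis.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(scores):
    """Cronbach's alpha for scores[c][i] (1 = correct, 0 = incorrect)."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("all candidates have the same total score")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def item_analysis(scores):
    """Per-item difficulty, discrimination and alpha-if-deleted."""
    report = []
    for i in range(len(scores[0])):
        item = [row[i] for row in scores]
        rest = [sum(row) - row[i] for row in scores]    # score on other items
        reduced = [row[:i] + row[i + 1:] for row in scores]
        report.append({
            "item": i,
            "difficulty": mean(item),                   # proportion correct
            "discrimination": pearson(item, rest),      # item-rest correlation
            "alpha_if_deleted": cronbach_alpha(reduced),
        })
    return report


# Hypothetical marks: five candidates (rows) by five questions (columns).
scores = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]

alpha = cronbach_alpha(scores)
report = item_analysis(scores)
for row in report:
    print(row)
```

In this made-up data set the last question is answered correctly mainly by the weakest candidates, so its discrimination is negative, and removing it raises α from about 0 to 0.8. A real analysis would use a far larger cohort and would normally be run by a statistician with dedicated software.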
Examiner training, psychometric analysis, peer review and the review of bank questions and papers by external examiners, educationalists and statisticians all contribute to quality assurance. This quality assurance, present at all levels in the examination, together with other evidence-based assessment strategies, aims to deliver a test that candidates, their future employers, the GMC and also patients will see as fair, valid and reliable.
The author wishes to thank Dr Patricia Revest for the question in Example 1 and for Table 1, and Dr Mary Jensen, convenor of the Part 1 MFSRH Examination group, for scrutinising this article and for her suggestions. Finally, thanks go to the members of that group who do so much to ensure that the examination is robust and fair.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.