I. INTRODUCTION
The International Civil Aviation Organization (ICAO) has endorsed language proficiency requirements for international civil aviation since 2003, on the grounds that the causes of aircraft accidents and incidents are mainly related to the inadequate English language skills of the crew (ICAO, 2010). In compliance with ICAO standards, the central government of the Republic of Korea implemented EPTA (English Proficiency Test for Aviation) in 2006 as a requirement that pilots and air traffic controllers be able to communicate in English (Choi and Kim, 2020).
First, this paper briefly reviews the theoretical foundations of EPTA. The theoretical review covers test methods, proficiency levels, test ratings, and qualifications for test raters. The paper then takes a qualitative approach, collecting data primarily through observation and interview. A survey was conducted of aviation experts who had taken overseas tests in aviation English, and observations were based mostly on comparing their overseas test scores with their EPTA scores in Korea. Finally, the results and their implications are discussed in order to propose ways to improve the validity and processes of EPTA exams in Korea.
II. THEORETICAL BACKGROUND
Whether direct or semi-direct testing methods are used, it is important that test-takers are evaluated on their use of language in routine as well as unexpected or complicated situations, as evidence of their level of proficiency.
Both direct and semi-direct tests, if well constructed, can elicit speech samples that may be assessed for proficiency in speaking and listening. Each test method has advantages and disadvantages.
Proficient speakers shall communicate effectively in voice-only (telephone or radiotelephone) and in face-to-face situations. The inflexibility arising from the use of standardized, pre-recorded prompts may result in an important limitation in the scope of evaluation available to semi-direct tests. This limitation may be particularly critical in the ability of the test to assess the full range of abilities covered by the “interactions” descriptors of the ICAO Rating Scale. Role-plays and simulations conducted in this mode may be short, unnatural and restricted to the most routine aspects of language use (ICAO Doc. 9835, 6.2.7.7).
The language proficiency of aeroplane, airship, helicopter and powered-lift pilots, flight navigators required to use the radiotelephone aboard an aircraft, air traffic controllers and aeronautical station operators who demonstrate proficiency below the Expert Level (Level 6) should be formally evaluated at intervals in accordance with an individual’s demonstrated proficiency level, as follows:
- those demonstrating language proficiency at the Operational Level (Level 4) should be evaluated at least once every three years.
- those demonstrating language proficiency at the Extended Level (Level 5) should be evaluated at least once every six years.
Formal evaluation is not required for applicants who demonstrate expert language proficiency, for example, native and very proficient non-native speakers with a dialect or accent intelligible to the international aeronautical community.
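The re-evaluation intervals above can be expressed as a small lookup. The following is a minimal sketch in Python, not part of ICAO Doc. 9835 or of EPTA itself; the function name `next_evaluation_due` and the handling of 29 February are assumptions made for illustration, and levels below 4 are deliberately not covered because the text above gives no interval for them.

```python
from datetime import date
from typing import Optional

# Re-evaluation intervals in years, taken from the list above:
# Operational Level 4 -> every three years, Extended Level 5 -> every six years.
REEVALUATION_YEARS = {4: 3, 5: 6}


def next_evaluation_due(level: int, last_evaluated: date) -> Optional[date]:
    """Return the latest date of the next formal evaluation.

    Returns None for Expert Level 6, for which formal re-evaluation
    is not required. (Hypothetical helper, for illustration only.)
    """
    if level == 6:
        return None
    if level not in REEVALUATION_YEARS:
        # The text states intervals only for Levels 4 and 5.
        raise ValueError(f"no re-evaluation interval stated for level {level}")
    target_year = last_evaluated.year + REEVALUATION_YEARS[level]
    try:
        return last_evaluated.replace(year=target_year)
    except ValueError:
        # Assumption: a 29 February test date falls back to 28 February
        # when the target year is not a leap year.
        return last_evaluated.replace(year=target_year, day=28)
```

For example, a pilot rated at Level 4 on 15 March 2021 would need to be evaluated again no later than 15 March 2024 under this sketch.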
To fulfill licensing requirements, rating should be carried out by a minimum of two raters. A third expert rater should be consulted in the case of divergent scores. Ideally, an aviation language test will have two primary raters — one language expert and one operational expert — and a third rater who can resolve differences between the two primary raters’ opinions (ICAO Doc. 9835).
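The two-rater rule with a third expert rater can be illustrated as follows. This is a minimal sketch under stated assumptions: the function name `final_rating` is hypothetical, and treating the third rater's score as the final one is only one possible reading of "resolve differences", since the source does not specify the resolution rule.

```python
from typing import Optional


def _check_level(score: int) -> None:
    # The ICAO Rating Scale runs from Level 1 to Level 6.
    if not 1 <= score <= 6:
        raise ValueError(f"rating must be between 1 and 6, got {score}")


def final_rating(language_rater: int,
                 operational_rater: int,
                 third_rater: Optional[int] = None) -> int:
    """Combine the two primary ratings into one final level.

    Assumption: when the primary raters diverge, the third expert
    rater's score is taken as final.
    """
    _check_level(language_rater)
    _check_level(operational_rater)
    if language_rater == operational_rater:
        return language_rater
    if third_rater is None:
        raise ValueError("divergent scores: a third expert rater must be consulted")
    _check_level(third_rater)
    return third_rater
```

Under this sketch, primary ratings of 4 and 5 cannot produce a result until a third rating is supplied.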
Whether rating is conducted “live” during the assessment or afterwards using recordings of the test performance, the rating process should be documented, and the documentation should include instructions on the extent and nature of evidence that raters should collect (ICAO Doc. 9835). The score reporting process should also be documented, and scores should be maintained for the duration of the license. All proficiency tests of speaking ability involving interaction between the test-taker and interlocutor during the test should be recorded on audio or video media.
Initial and recurrent rater training should be documented. Rater training records should be maintained, and audits of raters should be conducted and documented periodically (ICAO Doc. 9835).
Raters should demonstrate language proficiency of at least ICAO Extended Level 5 in the language to be tested. If the test is designed to assess ICAO Level 6 proficiency, raters should demonstrate language proficiency at ICAO Expert Level 6. Interlocutors should demonstrate language proficiency of at least ICAO Extended Level 5 in the language to be tested and proficiency at Expert Level 6 if the test is designed to assess ICAO Level 6 proficiency (ICAO Doc 9835).
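The qualification rule for raters and interlocutors reduces to a single threshold that depends on whether the test is designed to assess Level 6. A minimal sketch, with hypothetical function names chosen for illustration:

```python
def minimum_required_level(test_assesses_level_6: bool) -> int:
    """Minimum ICAO level a rater or interlocutor should demonstrate."""
    # Expert Level 6 is needed only when the test is designed to assess
    # Level 6; otherwise at least Extended Level 5 is required.
    return 6 if test_assesses_level_6 else 5


def is_qualified(demonstrated_level: int, test_assesses_level_6: bool) -> bool:
    """Check a rater's or interlocutor's demonstrated level against the rule."""
    return demonstrated_level >= minimum_required_level(test_assesses_level_6)
```

A Level 5 rater would therefore qualify for a test covering Levels 4 and 5 but not for one designed to assess Level 6.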
In the case of semi-direct test prompts (which are pre-scripted and pre-recorded), there should be adequate versions to meet the needs of the population to be tested with respect to its size and diversity. It is not practical to prescribe the number of versions or test prompts required for any specific test situation. The determination of what is adequate in any situation is dependent on specific circumstances.
It is common in large testing initiatives for a testing service to use a version of a test only once before retiring it. In other cases, a testing service develops a number of versions, then recycles them randomly. Test-takers may then generally know the sorts of questions and prompts they will encounter during a test, but will be unable to predict the specific questions and prompts they will encounter during a particular testing interaction.
One security measure that testing organizations may take is to always include at least one completely new prompt or question in every version. A pattern of test-takers achieving high scores on most or all test prompts or questions, but failing the new prompt, may indicate a breach in test security (ICAO Doc 9835).
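The pattern described above, high scores on recycled prompts combined with failure on the new prompt, can be checked mechanically. The sketch below is illustrative only: the score scale (fractions between 0 and 1), the thresholds `high`, `fail`, and `most`, and the function name are all assumptions, since ICAO Doc 9835 does not prescribe numeric cut-offs.

```python
from typing import Mapping


def possible_security_breach(scores: Mapping[str, float],
                             new_prompt_id: str,
                             high: float = 0.8,
                             fail: float = 0.5,
                             most: float = 0.75) -> bool:
    """Flag a test-taker whose results match the suspicious pattern.

    scores maps prompt identifiers to scores on an assumed 0-1 scale.
    The result is True when at least `most` of the previously used
    prompts were scored at or above `high` while the new prompt was
    scored below `fail`.
    """
    if new_prompt_id not in scores:
        raise KeyError(f"no score recorded for new prompt {new_prompt_id!r}")
    known = [s for pid, s in scores.items() if pid != new_prompt_id]
    if not known:
        # Nothing to compare against, so no pattern can be detected.
        return False
    high_share = sum(s >= high for s in known) / len(known)
    return high_share >= most and scores[new_prompt_id] < fail
```

A flag from such a check would be a reason for review rather than proof of a breach, in line with the hedged wording "may indicate" in the source.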
It is acceptable that a test contains a scripted task in which phraseology is included in a prompt, but the test should not be designed to assess phraseology. An aviation language proficiency test has different aims than a phraseology test. While an aviation language test can include some phraseology as prompts or scene setters, the purpose of the test is to assess plain language proficiency in an operational aviation context.
It is acceptable that a test of plain language in a work-related context could contain a scripted test task or a prompt in which standardized phraseology is included. The test task may be used as a warm-up or as a means of setting a radiotelephony context in which to elicit plain language responses from the test-taker. If phraseology is included in a test prompt, care should be taken that it is used appropriately and that it is consistent with ICAO standardized phraseology.
Compliance with ICAO standardized phraseology is not fully harmonized worldwide. States publish differences with respect to ICAO Standards. Additionally, users, particularly expert speakers of a language, fail to adhere to ICAO standardized phraseology for all sorts of respectable reasons, such as pressure of work, and less respectable reasons, such as carelessness and insensitivity, thereby creating possibilities for misunderstanding in a busy international environment.
As part of the 2006 Aviation Safety Act, MOLIT (Ministry of Land, Infrastructure and Transport) required pilots and air traffic controllers involved in aeronautical communications for operations between at least two countries to obtain a certificate of proficiency in aviation English (MOLIT, 2021). Table 1 displays EPTA's progress. MOLIT began testing pilots and air traffic controllers with EPTA in 2006 using two private testing agencies, G-TELP (General Test of English Language Proficiency) and IAES (International Aviation English Service).
To address the problems that arose in operating the aviation English certification system introduced in 2006, MOLIT reviewed and analyzed the EPTA system. The resulting improvement plan called for increased communications with ATC (air traffic control), CBT (computer-based testing) for levels 4 and 5, integration of listening and speaking competencies, and continued interview procedures for level 6 (MOLIT, 2018).
Following a review of the test system and communications with air traffic controllers, pilots, and consultants, MOLIT has been conducting a CBT EPTA since 2019 (MOLIT, 2021).
Qualitative research refers to a method of inquiry in which the researcher, acting as the data collection instrument, seeks to answer questions about how or why a particular phenomenon occurs (Rahman, 2016). Questions about what a phenomenon comprises may also guide qualitative research.
The most fundamental assumption underlying qualitative research is that reality is something socially constructed on an individual basis. Varied methods of qualitative research exist (Leech et al., 2011). Examples of qualitative methods employed in academic research include grounded theory, phenomenology, ethnography, and qualitative description. Each method has its own assumptions and purposes, and an appropriate method is chosen based on the research question. For example, a researcher investigating the process involved in the occurrence of a phenomenon would likely choose grounded theory, while a researcher who is interested in the meaning of the phenomenon would use phenomenology. Regardless of method, participants are purposefully enrolled based on their familiarity with the phenomenon (Donath et al., 2011).
Data are generally collected via one or a combination of three mechanisms: interviews, observation, or theory review. Data are analyzed inductively via specific, rigorous techniques and then organized in the manner that best answers the research question (Anyan, 2013; Wang, 2006). The objective of qualitative research is not the accumulation of information but the growth of understanding about phenomena of concern, here the EPTA exam.
The specialists chosen by the authors were part of the core evaluation team and participated in all aspects of the qualitative research: design, implementation, data analysis, and dissemination of EPTA results. On major technical and implementation issues, the specialists were expected to work independently and to engage with a wide range of stakeholders at ICAO, airlines, and a central government agency at the field sites of the EPTA cross-sectional project. The qualitative research activities took place in selected research locations where the specialists had taken an EPTA exam.
Before the qualitative analysis was performed, the 15 experts listed in Table 2, including a few senior captains, were chosen and assigned to take overseas exams. The researchers analyzed the overseas aviation English scores submitted by the experts together with the scores the experts obtained on the EPTA exam in the Republic of Korea.
The test results in Table 2 are those of experts who had already taken overseas aviation English tests, converted into equivalent EPTA score levels.
The interviews with subject matter experts indicate that CBT EPTA has both benefits and areas that need improvement. Table 2 shows the advantages and areas for improvement. Because the test is operated by a national organization, it has improved in aspects such as fairness and public confidence. There is much less variation attributable to interviewers, since the recorded voices remain the same.
An expanded set of sample questions and exercises is also helpful for practicing for the test. The level of difficulty is adjusted appropriately to job performance. In addition, MOLIT's Preparation Guide for Aviation English Proficiency Test and Standard Handbook for EPTA have proved extremely useful in preparation. As a result of increased communication among ATC members, the scenarios develop in a realistic and valid way, and standardized terminology is used more often.
According to the expert interviews, the areas needing improvement are those listed in Table 3. The number of testing facilities needs to be increased, and promotion should be improved, including test guides, web pages, and scoring criteria. The test itself also needs improvement, as in Table 4: adding machine voices, standardizing phraseologies, allowing general English to be used in emergency situations, and allowing level 6 interviews to be administered at all levels of evaluation. Improvement in interaction is also necessary.
It is also critical to increase the number of sample questions and answers, with recorded voices, and to establish an educational institution focused on aviation English with certification and public trust for test takers who struggle with tests.
III. CONCLUSION
The purpose of this study was to review the improvements made to EPTA and its remaining needs. From interviews with experts on the effective use of EPTA, we reached the following conclusions regarding EPTA's needs and improvement measures.
Firstly, the subject matter experts agree that CBT EPTA has improved and has helped to improve aviation safety. CBT EPTA has been changed to reflect suggestions from the line and from the experts in order to achieve the safety objective. Secondly, fairness and public trust can be increased because the tasks are appropriately adjusted to job performance and the test is administered by a national organization. Thirdly, increasing the number of ATC-related questions will enhance ATC communication. In addition, the participants appreciated that MOLIT has published a Preparation Guide for Aviation English Proficiency Test and a Standard Handbook for EPTA to help with preparation. Lastly, certain improvements still need to be implemented, such as increasing the number of test facilities, improving the test, promoting the web pages, and expanding the sample questions. Educational institutions for aviation English also need to be added.
Based on the qualitative analysis, EPTA was found to be improved and helpful. This result is consistent with a previous quantitative, survey-based study that found EPTA had improved (Lee and Choi, 2019). Integrating ATC and aviation English capabilities would improve safety and communication skills and would continue the improvement of CBT EPTA expected of a safety-oriented ICAO member state.
The study is limited in that it is a qualitative study based on interviews with a small group of subject matter experts, because there are few aviation specialists on EPTA. Future research would benefit from a larger sample. The results nevertheless indicate that EPTA has improved. We hope that the continuous improvement of EPTA will lead to safer air traffic control communication and help to avoid aviation accidents.