Summer 1961

Walter Kroll


The purpose of this study was to perform an item analysis of a women's physical education test on speed-a-way. The analysis determined the difficulty and discrimination of each item in the test. From this analysis it can be determined: (1) which items discriminate between the high and low ranking groups, (2) which responses in the multiple-choice and matching items do not function properly, and (3) which items in the test should be revised or disregarded in further measurement processes.

METHODS OF RESEARCH: The subjects in this study were 279 women students enrolled in the Fundamentals of Physical Education classes at Fort Hays Kansas State College during the fall term of the 1960-61 school year. The speed-a-way test was administered to all 279 students. Following the procedures of the Flanagan Technique for item analysis, only the upper and lower twenty-seven percent of the total group were used in the item analysis. The test item coverage was compared with a table of specifications, which was drawn up after the test had been constructed. Three women physical education instructors assigned values to each subject-matter area that they felt should be covered in the test. The mean of these values was the figure used to determine the desired test content. After the "poor" items were omitted, the test content was again compared to the table of specifications. This was done to find the deviation of the actual test content from the desired table of specifications.

RESULTS: Of the thirty-two true-false items contained in the test, seventeen should be revised or omitted before future testing with this instrument. Of the forty-five multiple-choice items, fifteen should be revised or omitted before future testing with this instrument.
Of the seventeen matching items contained in the test, seven should be revised or omitted before future testing with this instrument. Of the eleven completion items, seven should be revised or omitted before future testing with this instrument. The mean index of discrimination improved from .35 to .41 with the omission of the "poor" items. The mean difficulty rating also improved, from 75 to 71 percent. These changes indicate a slight but positive improvement. Too many alternatives in the multiple-choice and matching items did not function properly. In most cases, an item with more than two alternatives that did not function properly also had a high difficulty rating, which partially explains the lack of acceptance of the alternative responses. The comparison between the actual test content and the desired content showed a large deviation. The area of position play contained no questions, while the area of rules contained too great a portion of the items. When the revised test was compared with the desired specifications, the deviation per area was found to have increased further, but the large deviations had decreased.

RECOMMENDATIONS: From the results obtained in this study, the following recommendations are made:
1. To ensure content validity for a test, a table of specifications must be used to give adequate coverage to each subject-matter area.
2. Before a testing program is carried out, the instructor should be familiar with item analysis procedures.
3. The true-false section of the test needs more questions of the proper discriminating power, so that the mean difficulty rating is brought closer to fifty percent.
4. Better discriminating items are needed among the multiple-choice items, especially in section three.
The mean index of discrimination for section three should be increased before the section is used again.
5. The completion items of this test could be dispensed with, and the subject-matter areas covered by these items could be covered by other types of questions.
6. The matching items should be revised and the response alternatives changed in order to obtain proper functioning of the alternative responses.
7. In the multiple-choice items there was a tendency for specific responses to be non-functioning. All of the responses should be relevant to the question to ensure that they are selected often enough to meet functioning limits.
8. Instructors of physical education classes should accept the responsibility of evaluating their tests if they are to gain valid and reliable measurements.
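
The upper and lower twenty-seven percent procedure described under METHODS OF RESEARCH can be illustrated with a minimal sketch. It is not the Flanagan Technique itself, which reads a correlation estimate from a prepared table. As a simpler stand-in, the sketch uses the plain difference in proportion correct between the two groups as the index of discrimination, and the average percent correct in the two groups as the difficulty rating. The function names, the 0/1 response coding, and the sample data are all hypothetical and do not come from the study.

```python
from typing import List, Tuple


def split_upper_lower(totals: List[float], fraction: float = 0.27) -> Tuple[List[int], List[int]]:
    """Return the student indices of the upper and lower scoring groups."""
    n = len(totals)
    k = max(1, round(n * fraction))  # group size, at least one student
    # Rank students from highest to lowest total score.
    order = sorted(range(n), key=lambda i: totals[i], reverse=True)
    return order[:k], order[n - k:]


def item_analysis(responses: List[List[int]], fraction: float = 0.27) -> List[Tuple[float, float]]:
    """Return (difficulty_percent, discrimination) for each item.

    responses[i][j] is 1 if student i answered item j correctly, else 0.
    Difficulty is the mean percent correct in the two extreme groups.
    Discrimination is the upper-minus-lower difference in proportion
    correct (an assumed simple index, not Flanagan's tabled coefficient).
    """
    totals = [sum(row) for row in responses]
    upper, lower = split_upper_lower(totals, fraction)
    results = []
    for j in range(len(responses[0])):
        p_upper = sum(responses[i][j] for i in upper) / len(upper)
        p_lower = sum(responses[i][j] for i in lower) / len(lower)
        difficulty = 100 * (p_upper + p_lower) / 2
        discrimination = p_upper - p_lower
        results.append((difficulty, discrimination))
    return results


# Hypothetical data: ten students, three items.
responses = [[1, 1, 1]] * 3 + [[1, 1, 0]] * 4 + [[1, 0, 0]] * 3
results = item_analysis(responses)
for number, (difficulty, discrimination) in enumerate(results, start=1):
    print(f"item {number}: difficulty {difficulty:.0f}%, discrimination {discrimination:.2f}")
```

In this made-up data the first item is answered correctly by every student, so it has a difficulty rating of 100 percent and a discrimination of zero. An item of that kind would be a candidate for revision or omission under the criteria applied in the study.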


