Measuring the impact of an instructional laboratory on the learning of introductory physics

We have analyzed the impact of taking an associated lab course on the scores on final exam questions in two large introductory physics courses. Approximately a third of the students who completed each course also took an accompanying instructional lab course. The lab courses were fairly conventional, although they focused on supporting the mastery of a subset of the introductory physics topics covered in the associated course. We compared the performance of students who did and did not take the lab course using final exam questions from the associated courses that related to concepts from the lab courses. The population of students who took the lab in each case was somewhat different, in terms of background and major, from the population that did not enroll in the lab course. Those differences were taken into account by normalizing performance on the lab-related questions with scores on the exam questions that did not involve material covered in the lab. When normalized in this way, the average score on lab-related questions of the students who took the lab, in both courses, was within 1% of the score of students who did not, with an uncertainty of 2%. This result raises questions as to the effectiveness of labs at supporting mastery of physics content.


PHYSICS EDUCATION RESEARCH SECTION
The Physics Education Research Section (PERS) publishes articles describing important results from the field of physics education research. Manuscripts should be submitted using the web-based system that can be accessed via the American Journal of Physics home page, http://ajp.dickinson.edu, and will be forwarded to the PERS editor for consideration.

I. INTRODUCTION
Instructional labs are a major part of undergraduate physics education, particularly at the introductory level. They involve substantial instructional resources due to the need for dedicated space and equipment and a relatively low student-to-instructor ratio. As such, it is important to measure their actual educational contribution to ensure these resources are being used wisely.
To many physicists, instructional labs are considered essential for a proper physics education, and they are required for AP physics courses and as part of many state K-12 science standards. There has been little research, however, on whether learning is promoted by such laboratories. America's Lab Report [1] reviewed the research on science instructional laboratories and found that there was little of it in general, that most of it is relatively old, and that what exists provides little evidence of effectiveness, particularly for stand-alone laboratory courses. [2][3][4][5] We distinguish lab courses (those held in different locations and at different times from the introductory course lectures, and either receiving a separate course grade or a grade that is a composite of the two parts of the course) from a workshop or studio approach. [6][7][8][9][10] In workshop or studio physics courses, there is complete integration of all aspects of the course, which includes a number of small experiments that are carried out in the classroom. There is solid evidence of the learning achieved in such courses, but because of their integrated nature it is impossible, and arguably meaningless, to try to isolate what component of the course is the "experimental" portion. Given the added resources and novel teaching methods needed for such courses, it is unlikely that most institutions will adopt them in the near future, so it remains important to examine the learning that is produced in stand-alone lab courses.
As noted in America's Lab Report, the biggest challenge with regard to research on the educational effectiveness of labs is the lack of clarity with regard to the educational goals. [1] The instructional lab community often emphasizes using experiments to present particular physics concepts, especially to reinforce material from lecture. Learning physics concepts through lab experiments might resemble learning through demonstrations. Although not all demonstrations improve learning, interactive lecture demonstrations, where students predict the outcome of a demonstration prior to seeing it and then reflect on the results, have been shown to improve students' conceptual understanding. [11][12][13] It is possible that seeing physical phenomena when carrying out a hands-on experiment might lead to similar gains in physics knowledge, especially if students are asked to make predictions and reflect on results in a similar way.
While many instructional labs, including the ones studied in this paper, present explicit goals for content mastery, there are often many additional goals for labs, though they are not always explicitly stated. If one judges the goals by what is required and counted in grading, it is customary to have a lab course also include the following goals: learning to keep a lab notebook; carrying out proper analysis of experimental uncertainty; learning to work with complex equipment; learning to take data in the optimum manner; learning to correctly interpret the comparison of uncertain data with mathematical models; learning to write up an experiment; and generally learning the process of science as an experimental activity. This diversity of goals was reflected in the American Association of Physics Teachers' (AAPT) published set of goals for introductory physics labs, [14] which were summarized as: the art of experimentation; experimental and analytical skills; conceptual learning; understanding the basis of knowledge in physics; and developing collaborative learning skills.
In recent years, there has been growing discipline-based education research (DBER) interest in instructional labs. Much of the research has explored student attitudes about labs, [15][16][17][18][19] student understanding of uncertainty, measurement, and data analysis, [20][21][22][23][24] and student development of scientific reasoning and experimentation skills. [25][26][27][28][29] The learning of the physics content, including the understanding and application of concepts, however, remains a common goal of physics labs. Presumably, this is because many physicists see experimental research, both currently and in the past, as how physics knowledge is established, and so it is believed that replicating the discovery process will support student learning of the knowledge. Nevertheless, the effectiveness of such learning remains poorly evaluated. Because students are typically exposed to physics content through lectures, labs, homework, and recitation sections, isolating the contribution from labs is challenging. Here, we took advantage of some particularly favorable circumstances to measure the impact of learning physics content from a lab course that was relatively standard in its structure, but optimized in a number of respects.
In this study, we examined two stand-alone introductory physics lab courses that were both closely coordinated with the standard first year mechanics and electricity and magnetism (E&M) courses for students planning to major in engineering or most of the sciences. The goals, design, and structure of these lab courses are unusually well defined as detailed below, and focused on increased understanding of the concepts covered in the mechanics and E&M courses. We examined the impact of taking the two lab courses on students' performance on the exams in the two respective lecture courses.

II. SETTING
The context for this study is the first and second term courses of the introductory calculus-based physics sequence at a large elite institution. The first course is the primary introductory mechanics course, "Physics 1," and the second is the primary introductory E&M course, "Physics 2." More than 500 students take each of these courses each year, most of whom are intending to major in various fields of engineering. About half the physics majors and some other science majors (such as chemistry) also take these courses, but they make up a small fraction of the enrollment. The Physics 1 course covers the standard mechanics topics, from vectors up through rotational motion. The Physics 2 course covers topics from electrostatics and simple ac and dc circuits up through Maxwell's equations. The Physics 1 course is prerequisite for Physics 2. The textbook for both courses is Knight, Introduction to Physics (volumes 1 and 4), a widely used textbook. Both instructors used interactive engagement methods in lecture, such as peer instruction (supported by clickers) and interactive demonstrations. The courses are also accompanied by a one-hour recitation session, which primarily focuses on solving problems related to the lecture material while working in small groups, supported by a teaching assistant (TA).
Accompanying both lecture courses is an associated laboratory course, also led by TAs. These are one-credit stand-alone courses with their own grade (pass/fail) taken by approximately one third of the students who take the respective lecture course (Physics 1 lab: 211 students out of 571 students in Physics 1; Physics 2 lab: 129 students out of 530 students in Physics 2). The Physics 1 lab is not a prerequisite for the Physics 2 lab, so students in either lab may or may not have had previous university lab experience. Whether a student takes the lab is primarily determined by the student's major and career goals. The labs are well coordinated with the lecture course in timing and content. Both lab courses involve nine experiments that cover conventional topics and utilize standard experimental design, with goals explicitly focused on improving understanding of a number of key concepts covered in the lecture courses, as shown in the examples below (the full list is provided in the Appendix).
• "The goals of this lab are to understand: The vector nature of forces; the interaction of multiple forces in two dimensions."
• "Energy is neither created nor destroyed but can be converted from one form to another. In this lab, you will explore two different forms of energy, kinetic and potential. (Knight sections 10.2, 10.4, 10.5)."
• "The purpose of this lab is to: understand how charges distribute on a conductor; map electric potentials for various charge configurations; and develop an intuitive understanding of the relationship between fields and potentials."

III. LAB STRUCTURE
Most of the student lab work takes place during the scheduled lab period, which is a two-hour block of time each week led by one TA. All TAs take part in a 1-unit "Teaching of Physics" seminar course prior to their first quarter of teaching. In addition to discussing interactive teaching and learning techniques, the TAs observe other TAs in lab and/or discussion sections, work through grading activities, and teach a lab or discussion section in which they are observed and receive feedback. The TAs meet weekly with the course instructor, a head TA, and the lab coordinator to work through the lab activities. They also attend lecture regularly and run office hours, which helps them stay on top of the material. In each lab section, the TAs prepare a short introductory lecture to go over the lab's concepts and equipment. For the remainder of the time, they visit individual groups to answer questions or probe their understanding through Socratic questioning techniques.
About three quarters of the students have their lab class on Wednesday or Thursday, with the remainder on Friday, with the timing set to coordinate coverage of topics in the lab with that in the lecture classes. Each lab includes a pre-lab activity, which is to be completed before students enter the lab. The pre-lab activity provides information to prepare students for the lab experiment, focusing on introducing students to the equipment they will be using (e.g., one activity has questions targeted at understanding the sampling technique used by Logger Pro), technical skills they will need (e.g., one activity has questions targeted at understanding the nature of log-log plots), and the physics concepts they will be using in the experiment. Depending on the experiment, pre-lab activities range from a page of text with one or two probing questions, to several pages of introductory material and lab instructions to be read, to five pages of pre-lab questions to be completed. Many of the pre-lab activities include sequences of questions where students explore the relevant physics concepts to make predictions about what they will encounter in the lab experiment (e.g., through sketching curves and graphs or completing simple calculations).
Some of the activities were adapted from the Tutorials in Introductory Physics [30] or materials from the University of Illinois at Urbana-Champaign introductory physics labs. [31] The equipment is standard (Pasco carts, force and motion sensors, etc.) and is used in typical experiments, but with considerable focus on sense making and reflecting the goals of the labs. For example, there are regular, explicit prompts for students to compare their observations to predictions, or to look for discrepancies or surprises in their data and to explain their causes. To allow students to focus on the physics concepts, the experiments in the lab are kept quite simple (without complex equipment) and closely tied to the material presented in the lecture and text. The lab instructions and pre-lab activities often refer to specific associated sections in the text. There is minimal or no error analysis, and the write-ups of results only involve students filling out a worksheet. Logger Pro is often used for data collection to make data analysis more efficient, allowing students to spend more time on the concepts. The lab manuals provide detailed instructions, hints, or additional information to avoid issues or complications. For example, in an elastic collisions experiment in the Physics 1 lab, students are instructed to keep the initial speed of the carts relatively low to avoid inelastic collisions. In short, these labs are doing just about everything one might hope for in creating an instructional lab course designed to support the learning of the concepts presented in an associated lecture course.

IV. METHODOLOGY
Our goal is to determine the effect of taking the lab course on the performance on the Physics 1 and Physics 2 final exams. This is arguably a proxy for the impact of the lab on the learning of the respective physics. If the two populations (those that take the lab and those that do not) were equivalent, and all of the exam questions related only to the material covered in the lab course, we could simply look at the average final exam score for each of the two populations. However, this approach has two problems. First, the two populations are different; they tend to have different majors, with associated differences in preparation and attitudes. And second, the lecture courses, and correspondingly their final exams, cover significantly more material than what is covered by the lab courses. This second fact allows us to use the scores on the non-lab-related exam questions to normalize for the differences in the two populations, and then compare their performance on final-exam questions that involved concepts covered in the lab course.
The two courses have different instructors (neither of whom is an author of this article) with different styles, but in both cases the exams are carefully vetted by both experienced TAs and another very experienced instructor in the department. The Physics 1 exam consisted of 15 multiple-choice and 5 open-response questions and the Physics 2 exam consisted of 20 multiple-choice questions and 6 short open-response questions. We coded the course exam questions according to whether or not they substantially involved a topic that was listed in the goals (Appendix) of one or more of the lab experiments. On five of the 20 final-exam questions for Physics 1, the primary concept involved in the question was a concept that was listed in one of the lab goals. On two of the questions, the concepts covered in the lab were half of what was necessary to solve the problem. In contrast, 11 of the 20 final exam multiple-choice questions for Physics 2 involved primary concepts that were listed in one of the lab goals. None of the Physics 2 open-response questions involved concepts from the lab goals, so these were excluded from the analysis. We coded the final exam questions independently and obtained over 80% agreement, and after brief discussion to clarify what exactly was done in a couple of the labs, converged on 100% agreement.
We then calculated the averages of the scores on lab-related and non-lab-related exam questions for each of the two population groups on each exam. The scores were from the original grading in the course, which was done by TAs using rubrics and with considerable care taken to ensure consistency. The grading was done blind as to whether or not students were enrolled in the lab. The two split questions for Physics 1 were each weighted by one half in the calculation of both averages. Finally, we took the ratio (average score on lab-related questions)/(average score on non-lab-related questions) for each student and found the average ratio for the two populations.
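The per-student ratio described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the four-question exam, its coding, and the scores are all hypothetical and are not taken from the actual exams. A split question is given weight 0.5 in both averages, as described in the text.

```python
from statistics import mean

def weighted_average(scores, weights):
    """Weighted mean of per-question fractional scores (0.0 to 1.0)."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

def lab_ratio(scores, lab_weights):
    """Ratio of the lab-related average to the non-lab-related average.

    lab_weights[i] is the fraction of question i counted as lab-related:
    1.0 for a lab-related question, 0.0 for an unrelated one, and 0.5 for
    a split question, which then contributes half weight to each average.
    """
    non_lab_weights = [1.0 - w for w in lab_weights]
    lr = weighted_average(scores, lab_weights)
    nlr = weighted_average(scores, non_lab_weights)
    return lr / nlr  # undefined if the student scored zero on all NLR items

# Hypothetical coding of a four-question exam (not the actual exam):
# question 0 is lab-related, 1 and 2 are not, and 3 is a split question.
lab_weights = [1.0, 0.0, 0.0, 0.5]

# Hypothetical fractional scores for two students in one group.
students = [
    [1.0, 0.5, 0.5, 1.0],
    [0.5, 0.5, 1.0, 0.5],
]

ratios = [lab_ratio(s, lab_weights) for s in students]
group_mean_ratio = mean(ratios)
```

The same calculation would be repeated for the group that took the lab and the group that did not, and the two group means compared.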
The hypothesis is that the lab course should improve a student's understanding of content covered in the lab, and this improvement should be reflected in a higher score on exam questions involving those concepts. Hence, those students who took the lab should have a higher ratio than the students who did not. If the lab had no added value, then the ratio for the students who took the lab should be the same as the ratio for the students who did not take the lab. This will be true even if the populations are different.

V. DATA
The results for the final exams are shown in Tables I and II. As has been consistently seen over the years, the students who took the lab scored higher, in general, on the overall exam than the students who did not take the lab: Physics 1, t(569) = 6.63, p < .001; Physics 2, t(488) = 4.74, p < .001. We attribute this to the self-selection of students to take the optional lab course in addition to the lecture course. However, when we compare the ratios of their performance on the lab-related and non-lab-related items, it is clear that the difference in the ratios between the students who took the lab and those who did not is much smaller than the uncertainty in both cases.
Although we believe that the scores on the final exam are the most relevant, as this was the most inclusive measure and reflects the extent of learning at the end of the course, we also carried out similar analyses for the two midterm exams in Physics 1. There were a total of 21 questions, half of which were lab-related. We found similar results; the ratio of scores was slightly higher for students who took the lab on one midterm and slightly lower on the second midterm, but in both cases the differences between lab takers and non-takers were small compared to the standard errors. So, just as on the final exams, there was no measurable impact of taking the lab on the midterm exam scores.
Ratios inherently have non-normal statistical distributions, requiring a more careful statistical analysis. For Physics 1, the distribution of the ratios is nearly normal, but for Physics 2, the distribution of the ratios has significant outliers that impact the analysis. Because the scores themselves have normal distributions, a simpler and more precise way to analyze the differences and calculate their statistical uncertainties is to calculate

$$\text{Mean lab benefit} = \left[\frac{1}{N}\sum_{k=1}^{N}\left(LR_k - NLR_k\right)\right]_{\text{lab}} - \left[\frac{1}{N}\sum_{k=1}^{N}\left(LR_k - NLR_k\right)\right]_{\text{no lab}}.$$

Here, $LR_k$ is the $k$th student's average score on the lab-related items, and $NLR_k$ their average on the non-lab-related items; the first set of square brackets corresponds to values from students who took the lab course and the second set from students who did not take the lab course, with $N$ being the number of students in the corresponding group. The mean lab benefit calculates the difference, for each student, between the percent correct score on the lab-related questions and that on the non-lab-related questions, takes the average of these differences for each group, and then subtracts the averages. This mean difference between the two groups gives the measure of the increased performance on lab-related questions for students who took the lab course. All the quantities in this calculation are linear and normally distributed, allowing standard analysis of the means and uncertainties and use of the t-test to evaluate the statistical significance. For Physics 1, this analysis gives a difference between groups of 0.6% with a standard error of the difference of 1.2%; t(569) = 0.47, p = .640. For Physics 2, the difference between groups was 0.15% with a standard error of 2%; t(488) = 0.08, p = .938. In both cases, the differences are clearly consistent with zero, each being a fraction of the uncertainty. Although this is a better way to calculate the uncertainty in the difference between groups than using the ratios, we have found that it is a less intuitive way to understand the normalization method.
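As a rough illustration of the mean lab benefit calculation described above, the following Python sketch computes the difference of group means, its standard error, and the t statistic. It assumes a pooled-variance (equal-variance) two-sample t-test, which is consistent with the reported degrees of freedom (for example, 571 students and t(569)) but is not stated explicitly in the text. The function name and the score pairs are hypothetical.

```python
import math
from statistics import mean, variance

def mean_lab_benefit(took_lab, no_lab):
    """Return (benefit, standard_error, t, degrees_of_freedom).

    Each argument is a list of (LR, NLR) pairs, one per student, giving the
    percent-correct score on lab-related and non-lab-related items.
    """
    # Per-student difference LR_k - NLR_k in each group.
    d_lab = [lr - nlr for lr, nlr in took_lab]
    d_no = [lr - nlr for lr, nlr in no_lab]
    n1, n2 = len(d_lab), len(d_no)

    # Difference of the two group means of those differences.
    benefit = mean(d_lab) - mean(d_no)

    # Pooled sample variance (assumed equal-variance t-test).
    dof = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(d_lab) + (n2 - 1) * variance(d_no)) / dof
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return benefit, se, benefit / se, dof

# Hypothetical percent-correct pairs (LR, NLR); not data from the study.
took_lab = [(80, 70), (60, 60), (90, 70)]
no_lab = [(70, 65), (50, 55), (60, 50), (80, 70)]

benefit, se, t, dof = mean_lab_benefit(took_lab, no_lab)
```

The t statistic is simply the benefit divided by its standard error. The values reported in the text follow the same relation up to rounding: for Physics 2, 0.15/2 is roughly 0.08.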
Because the ratio analysis is more intuitive, Tables I and II also present it. As an added check, we have used the nonparametric Wilcoxon-Mann-Whitney test to analyze the distributions of the ratios, and it leads to the same conclusion: the differences are small compared to the uncertainties.

Table I. Performance on the lab-related and non-lab-related questions on the Physics 1 final exam for students who took the lab and students who did not take the lab. The lab-related questions involve 5 + 2 × 0.5 items, and the non-lab-related questions are based on 13 + 2 × 0.5 items. Due to non-normal distribution of ratios, a Wilcoxon-Mann-Whitney test was used to determine the statistical significance of the differences in the ratios. See text for an alternative way to calculate the differences (less intuitive) and standard errors (more intuitive) for the two cases, shown in the bottom row.

Table II. Performance on the lab-related and non-lab-related questions on the Physics 2 final exam for students who took the lab and students who did not take the lab. The lab-related questions involve 11 items, and the non-lab-related questions involve nine items. Due to non-normal distribution of ratios, a Wilcoxon-Mann-Whitney test was used to determine the statistical significance of the differences in the ratios. See text for an alternative way to calculate the differences (less intuitive) and standard errors (more intuitive) for the two cases, shown in the bottom row.

Since many teaching methods have been shown to improve student performance on conceptual questions, but not on more traditional quantitative problem-solving items, [32] we wanted to also explore the type of question included in our analyses. For the Physics 2 exam, we coded the lab-related items as to whether they were primarily conceptual or primarily quantitative. We only used the Physics 2 exam, since this involved multiple-choice questions. The multi-part structure of the open-response items on the Physics 1 exam, as well as their subjective grading by different individuals, made these harder to code and less reliable to examine on a question-by-question basis. The question-by-question analyses for the Physics 2 questions are found in Table III. Note that since these items were multiple-choice, we provide the percent of students that got the item correct in each group. With a Bonferroni correction applied to account for the multiple comparisons (α = .003), only one of the final exam questions showed a statistically significant difference between the two groups of students. While this result is statistically significant, we do not feel that modest statistical significance on a single question out of 20 is of practical significance.
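The Bonferroni threshold used for the question-by-question comparison can be illustrated with a short Python sketch. The family-wise level of 0.05 is an assumption here, since the text does not state it; dividing 0.05 by the 20 multiple-choice items gives 0.0025, which is consistent with the reported α = .003 after rounding. The function names and p-values below are hypothetical.

```python
def bonferroni_threshold(family_alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return family_alpha / n_comparisons

def significant_items(p_values, family_alpha=0.05):
    """Indices of items whose p-value falls below the corrected threshold."""
    threshold = bonferroni_threshold(family_alpha, len(p_values))
    return [i for i, p in enumerate(p_values) if p < threshold]

# Assumed family-wise level 0.05 over 20 multiple-choice items.
threshold = bonferroni_threshold(0.05, 20)

# Hypothetical p-values for 20 items; not the values from Table III.
p_values = [0.40, 0.03, 0.75, 0.001, 0.20, 0.66, 0.09, 0.51, 0.33, 0.87,
            0.12, 0.04, 0.95, 0.28, 0.61, 0.07, 0.44, 0.19, 0.82, 0.56]

flagged = significant_items(p_values)
```

In this made-up example, several items fall below an uncorrected 0.05 level, but only one survives the corrected threshold, which mirrors the situation described in the text.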
Only four of the 11 lab-related questions were coded as being primarily conceptual. If we combine these four items and set up a ratio similar to that used in the previous analysis (the ratio of conceptual lab-related items to all other items; this time the ratios are relatively normally distributed), again, there is no significant difference between the two groups: t(488) = -0.08, p = 0.938. This demonstrates that the labs did not measurably affect conceptual learning.
Our methodology for evaluating the impact of the labs on student learning is based on two assumptions. First, that most or all of the exam questions are providing a meaningful test of the understanding of the material each question is intended to test. All of the questions were carefully vetted in advance of the exam by at least two very experienced instructors and numerous teaching assistants, and the questions are likely to be as good in this respect as any physics course exam questions.
The second assumption is that each question is primarily measuring a different, independent aspect of learning, so the performance on lab-related and non-lab-related questions would be expected to vary independently. If instead there were common factors, for example, if performance on all questions depended strongly and in a similar manner on the students' general interest in physics, and this general interest was improved by taking the lab course, this benefit would not be apparent in our analysis. This second assumption can be tested by the data. If there were additional variables such as general interest that significantly impacted the individual student's performance equivalently on all items, then this would be apparent as correlations between the responses to individual questions. We examined the correlations between student responses for the exam in Physics 2. The average correlation coefficients between each question and every other question are shown in the last column of Table III. The typical correlation coefficient is 0.2. The correlation between individual question responses and the score on the exam as a whole excluding that question gives similar values (average 0.25). A coefficient of 0.2 means that any such underlying common variables are responsible for no more than 4% of the variance in responses to that question. This shows that the questions are measuring quantities with a high degree of independence, and so it is implausible that whatever benefits the lab would provide to performance on a given question would correspond to similar benefits on other questions.
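The correlation check described above can be sketched in Python as follows. This is a minimal illustration, assuming that each response is scored 1 (correct) or 0 (incorrect) and that the Pearson coefficient is the correlation measure, which the text does not specify. The function names and the response matrix are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined if every student gave the same response

def average_item_correlations(responses):
    """Average correlation of each question with every other question.

    responses[s][q] is 1 if student s answered question q correctly, else 0.
    """
    n_items = len(responses[0])
    columns = [[row[q] for row in responses] for q in range(n_items)]
    averages = []
    for q in range(n_items):
        others = [pearson(columns[q], columns[j])
                  for j in range(n_items) if j != q]
        averages.append(sum(others) / len(others))
    return averages

# Hypothetical responses of four students to three questions.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 1],
]
averages = average_item_correlations(responses)

# Share of variance attributable to a common factor for r = 0.2.
shared_variance = 0.2 ** 2
```

Squaring the correlation coefficient gives the fraction of shared variance, so the typical value of 0.2 quoted in the text corresponds to 0.04, or 4%.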

VI. DISCUSSION AND CONCLUSION
Within uncertainties in the differences between the respective ratios, there is no observable effect on exam performance whether or not a student completes the associated lab course in two different undergraduate physics courses. This is somewhat unexpected, particularly given the high precision of the measurement due to the large sample size and the apparently optimal design and implementation of the lab to maximize learning of the physics content in both of the courses. This result indicates that, relative to the other means by which students learn content in a physics course, such as lectures (particularly the interactive lectures used in these cases), homework, recitation sections, and studying for exams, labs contribute very little. There are some caveats to this result. The exam questions were not written to specifically address what was covered in the labs, only to test students on their mastery of the material covered in the regular course. The assumption underlying our methodology is that if an exam problem involves correctly applying a concept or procedure covered by a lab experiment, students will perform better on that question if the lab resulted in their having a better understanding of that material. This may not be the case. As previously mentioned, it has been demonstrated that particular teaching methods can result in improved scores on concept inventories while having much less effect on scores on more traditional quantitative test questions. [32] However, examining primarily conceptual questions alone still did not demonstrate the added value of taking the lab, and many of the labs also involved calculations similar to those required on the exams.

Table III. Question-by-question analysis of the multiple-choice items on the Physics 2 final exam. Questions were coded as being related (LR) or not related (NLR) to the lab. Lab-related items were also coded as being primarily conceptual (CONC) or primarily calculational (CALC). The percentages of students who got each item correct in the groups of students who did and did not take the lab are compared, with a Bonferroni correction accounting for the multiple comparisons (α = 0.003). The average correlation coefficient between each item and each of the other items in the test is shown in the last column.
This study does not rule out the possibility that there are other things learned in the lab that are not being tested by the course exams. Indeed, the AAPT has recently published an updated set of goals for undergraduate physics lab curriculum, [33] which are much more skills-focused. These goals fall under the following themes: analyzing and visualizing data, communicating physics, constructing knowledge, designing experiments, developing technical and practical skills, and modeling. A striking difference between this report and the 1998 report [14] is the removal of conceptual learning as an outcome. It is possible that students were learning some of the skills listed in the AAPT document, even though they were not the stated goals of these individual labs.
While these caveats offer possible reasons that the impact of the lab course might not have been visible in the course exam scores, the results still should raise concern. Given the resources required for instructional laboratories and the amount of student time invested in them, one would hope it should be quite easy to observe their educational impact. We hope this work will inspire many institutions to examine carefully the learning objectives of their introductory lab courses, and the learning outcomes actually being achieved by those courses. This work indicates that, in the absence of evidence, it would be a mistake to assume that all that time and money is being well spent.

APPENDIX
electron beam and measure the charge to mass ratio for an electron.
• In one part, you will use the magnetic field of a pair of current-carrying coils to measure the earth's magnetic field.
(7) The goals of this lab are: To understand the concept of magnetic flux and to understand what happens when the magnetic flux through a wire loop is changed over time
(8) The goal of this lab is to
• Clarify the concept of "ground" in a circuit
• Learn about time-dependent LR (or RL) circuits. The main emphasis is on studying the response of these circuits to step changes in voltage, e.g., by flipping a switch (open or closed) and exposing the circuit elements to a constant voltage.
• If there is time left, your TA will suggest some experiments with sinusoidally varying voltages and LR circuits.
(9) As part of this lab you will:
• Continue to work with breadboards, oscilloscopes, and grounds
• Obtain a deeper understanding of the material presented in Chapter 35 of Knight
• Specifically, you should have a better understanding of AC circuits