Initial Publication Date: February 1, 2017

Instruments and Surveys Collection

Research in geoscience education depends on quality data collection. There are many existing instruments and surveys that have been developed and used in geoscience education or other discipline-based education research. The initial collection of instruments and surveys provided here began with those submitted and recommended by participants in the 2016 GER workshop and continued with submissions by participants in the 2017 GER workshop. All instruments and surveys included in this annotated list have been used in previous studies described in the published literature; citations are provided, as are comments from users. We envision this collection as being a resource especially for those new to GER or those who are interested in starting a new research direction in GER. The instruments and surveys in the collection may be useful in the research design of new studies, may be adapted to test new research questions, and can inform the development of new means of data collection. Please consider contributing to this collection by submitting your recommendations of instruments and surveys here.

Instruments and Surveys Used to Measure Perceptions, Attitudes, and Behavior

Behavioral Engagement Related to Instruction (BERI) Survey

Reference: Lane, E., and Harris, S. (2015). A New Tool for Measuring Student Behavioral Engagement in Large University Classes. Journal of College Science Teaching, 44(6), 83-91.

Quantitative data (but with some observer interpretation). Have used it with undergraduate populations at all levels. This is data about how many students are "engaged" during different times in a class period. What I like: The data give instructors feedback on what types of activities in class were engaging/not engaging, e.g., was that mini-lecture too long? Was the follow-up to that clicker question useful? Challenges: It takes some manual processing to get the data into a clear and useful form, so immediate feedback is a bit harder.

Beliefs About Reformed Science Teaching and Learning Survey (BARSTL)

Reference: Sampson, V., Enderle, P., & Grooms, J. (2013). Development and initial validation of the Beliefs About Reformed Science Teaching and Learning (BARSTL) questionnaire. School Science and Mathematics, 113(1), 3-15.

The BARSTL is a 32-item Likert-type self-report questionnaire designed to quantify an instructor's pedagogical beliefs relative to reform-based teaching of science. It aims to determine teaching beliefs and can be used for any population; the BARSTL was designed for a K-12 population, but has been used with current geoscience instructors as well as graduate students and post-doctoral scholars. Participants indicate whether they strongly disagree, disagree, agree, or strongly agree with each item, which is presented as a statement. Half of the BARSTL statements are phrased from the perspective of traditional instruction and are reverse scored. Possible BARSTL scores range from 32 to 128 points, with higher scores reflecting more reformed, student-centered beliefs. The BARSTL has four sub-categories with eight statements each. The four sub-categories are how people learn about science, lesson design and implementation, characteristics of teachers and the learning environment, and nature of the science curriculum.
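The scoring arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: responses are coded 1 (strongly disagree) to 4 (strongly agree), and the set of reverse-scored item positions is passed in by the caller. The published questionnaire defines which statements are reverse scored; the alternating pattern used in the example is hypothetical, as are the function and variable names.

```python
from typing import Iterable, Sequence

N_ITEMS = 32                   # number of BARSTL statements
SCALE_MIN, SCALE_MAX = 1, 4    # strongly disagree .. strongly agree


def score_barstl(responses: Sequence[int], reverse_items: Iterable[int]) -> int:
    """Sum 32 Likert responses, flipping the reverse-scored items.

    `reverse_items` holds 0-based positions of the traditionally phrased
    statements. Which positions these are must come from the published
    instrument; nothing here encodes the real item order.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    reverse = set(reverse_items)
    total = 0
    for idx, value in enumerate(responses):
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"item {idx}: response {value} is off the scale")
        # Reverse scoring maps 1<->4 and 2<->3 on a 4-point scale.
        total += (SCALE_MIN + SCALE_MAX - value) if idx in reverse else value
    return total


# Hypothetical layout: every second statement is traditionally phrased.
hypothetical_reverse = range(1, N_ITEMS, 2)

# A respondent who strongly agrees with every reformed statement and
# strongly disagrees with every traditional one reaches the maximum.
most_reformed = [4 if i % 2 == 0 else 1 for i in range(N_ITEMS)]
max_score = score_barstl(most_reformed, hypothetical_reverse)
```

With 16 reverse-scored items, answering "strongly agree" to everything yields 16 x 4 + 16 x 1 = 80, the midpoint of the 32 to 128 range, which is one reason reverse-scored items help flag uniform response patterns.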

Classroom Community Scale

Reference: Rovai, A. P. (2002). Development of an instrument to measure classroom community. Internet & Higher Education, 5(3), 197-211.

Very useful instrument for measuring the development of community in classroom (or field) centered events and courses.

Classroom Observation Checklist

Reference: Stearns, L. M., Morgan, J., Capraro, M., & Capraro, R. M. (2012). A Teacher Observation Instrument for PBL Classroom Instruction. Journal Of STEM Education: Innovations & Research, 13(3), 7-16.

We used a modified form of this to have researchers observe the classroom-based teaching and learning practice in different sections of an introductory geoscience course. The observation results were used to make comparisons between the learning outcomes in the traditional and blended formats of GSci101. Observations were made in in-class sessions for which both formats of the class (traditional and blended) cover similar content and activities.

Classroom Observation Protocol for Undergraduate STEM (COPUS)

Reference: Smith, M.K., Jones, F.H.M., Gilbert, S.L., and Wieman, C.E. (2013). The Classroom Observation Protocol for Undergraduate STEM (COPUS): A New Instrument to Characterize University STEM Classroom Practices. CBE-Life Sciences Education, 12(4), 618-627.

A protocol sheet in Excel format is available. Quantitative data. Have used it with undergraduate populations at all levels. This is data about what students and instructors are doing during class time. What I like: The data give instructors immediate feedback on what happened during the class time and an immediate indication of the level of student activity in the class.

Classroom Undergraduate Research Experience (CURE) Survey

References: Lopatto, D. (2004). Survey of undergraduate research experiences: First findings. Cell Biology Education, 3(4), 270-277; Denofrio, L.A., Russell, B., Lopatto, D., and Lu, Y. (2007). Linking student interests to science curricula. Science, 318, 1872-1873; and Lopatto, D. (2009). Science in Solution: The Impact of Undergraduate Research on Student Learning. Research Corporation for Science Advancement.

The instrument collects quantitative (Likert-scale) data using 22 items. My group uses this as a pre-post measure of student attitudes toward science and scientific research in studies of online undergraduate learning.

Common Instrument to Measure Interest and Engagement in Science


Simple and easy to use 10-item survey to assess self-reported interest and engagement with science. Useful for pre-post surveys for youth out-of-school time programs.

Course Experience Questionnaire (CEQ)

References: Ramsden, P. (1991). A performance indicator of teaching quality in higher education: The Course Experience Questionnaire. Studies in Higher Education, 16, 129-150. Wilson, K. L., Lizzio, A., & Ramsden, P. (1997). The development, validation and application of the Course Experience Questionnaire. Studies in Higher Education, 22(1), 33-53. Steele, G. A., West, S. A., & Simeon, D. T. (2003). Using a modified Course Experience Questionnaire (CEQ) to evaluate the innovative teaching of medical communication skills. Education for Health: Change in Learning & Practice, 16(2), 133-144. Jansen, E., van der Meer, J., and Fokkens-Bruinsma, M. (2013). Validation and use of the CEQ in The Netherlands. Quality Assurance in Education, 21(4), 330-345.

This is an instrument that has been used broadly in a wide range of fields, but I don't think it has been used very much yet in the geosciences. Based on our current use of this in an introductory interdisciplinary science course, I think this instrument has a lot of potential for geoscience education research for (1) summative evaluation, (2) adaptation to look at individual geoscience courses and whole geoscience programs, and (3) comparisons of geoscience courses with different pedagogies (e.g., we used it to compare two gen ed sections of an intro geoscience course, both taught by the same instructor, where one section was taught in a blended format and the other with traditional face-to-face meetings). Based on the sources provided here, the CEQ is used as a measure of perceived teaching quality in undergraduate degree programs in national annual surveys of all graduates in the Australian higher education system. It has been validated for samples of students in Australia, New Zealand, the UK, and the Netherlands. It is constructed to consider 6 factors of influence: good teaching practices, clear goals and standards, appropriate student workload, appropriate assessment, independence, and generic (transferable) skills.

Expectancy Violation

Reference: Gaffney (2013). Education majors' expectations and reported experiences with inquiry-based physics: Implications for student affect. Physical Review Special Topics - Physics Education Research, 9, 010112. See Appendix B: Expectancy Violation.

A short survey developed for education majors enrolled in a physics class but easily adaptable to other contexts. It can be administered in class or outside class, and as a pre- and post-test. Fifteen items were included in the survey, prefaced with the anchor, "Indicate how often you (expected to experience or experienced) the following during this semester's PAT, using the following scale" (ranging from 1, very infrequently, to 7, very frequently).

Instrument for Assessing Interest in STEM Content and Careers

Reference: Tyler-Wood, T., Knezek, G., & Christensen, R. (2010). Instruments for assessing interest in STEM content and careers. Journal of Technology and Teacher Education, 18(2), 345-368; and Knezek, G., Christensen, R., and Tyler-Wood, T. (2011). Contrasting perceptions of STEM content and careers. Contemporary Issues in Technology and Teacher Education, 11(1), 92-117.

Used to measure both pre-service teachers' and middle school students' attitudes towards STEM content and careers.

InTeGrate Attitudinal Instrument (IAI)


This is a partner instrument to the Geoscience Literacy Exam (GLE). It is used by many others developing and testing InTeGrate materials. The IAI is online, so it is not administered in a controlled environment. It is given at the beginning and end of the semester. The survey asks students "attitudinal" questions about their academic major, interests, why they are taking the class, their level of concern about certain environmental issues (e.g., climate change, water resources), and their conservation habits.

Online survey used pre and post instruction that probes students' interest in careers related to the Earth and environment and their motivation to contribute to solving problems related to environmental sustainability. The survey also asks for demographic information. It has been used in the InTeGrate project to measure change across instruction among students involved in testing of InTeGrate modules and courses. Subgroups that have been analyzed include minorities underrepresented in science, future teachers, and 2YC students. There is also a "toolkit" so that the developers of InTeGrate courses/modules can compare the results from students in the course or module they developed with a project-wide sample. The toolkit is in the form of an Excel spreadsheet that is pre-populated with the project-wide data. Following detailed directions, developers copy and paste in selected data from their own student sample, and the spreadsheet auto-generates tables and graphs for comparing and contrasting the two samples.

It is a Likert-scale survey that is given as a pre- and post-semester test to an introductory course. It asks about students' perceptions of the geosciences and their likelihood of taking environmentally oriented actions.

Motivated Strategies for Learning Questionnaire (MSLQ)

Reference: Pintrich, P. R., Smith, D. A. F., García, T., & McKeachie, W. J. (1991). A manual for the use of the Motivated Strategies for Learning Questionnaire (MSLQ). Ann Arbor: University of Michigan, National Center for Research to Improve Postsecondary Teaching and Learning.

The Motivated Strategies for Learning Questionnaire (MSLQ) is an attitudinal survey composed of a list of 81 statements encompassing 9 learning strategies scales and 6 motivation subscales that students rate on a 7-point Likert scale. I typically administer these questions (or a subset of the questions) at the end of the semester to introductory-level undergraduate students taking a class I am teaching using a "novel pedagogical approach" (such as online or flipped classes). I like the MSLQ because it provides insight about how the students learn and what they value in the class. However, if I give the students all 81 questions, I am concerned that either (1) they just won't complete the survey or (2) they will not accurately respond to the last half of the questions (i.e., they will just pick responses at random to end the survey).

We have used this in a general education course to learn about student perceptions of their own learning strategies and study skills for the class. Can use this early on vs later to see if instructional approaches can directly impact study skills. Can use this to compare across populations.

Place Attachment Inventory (PAI)

References: Williams, D. R., & Vaske, J. J. (2003). The measurement of place attachment: Validity and generalizability of a psychometric approach. Forest Science, 49, 830-840; and adaptation to GER: Semken, S., & Butler Freeman, C. (2008). Sense of place in the practice and assessment of place-based science teaching. Science Education, 92(6), 1042-1057.

We did not develop this instrument, but we adapted it for educational use and named it the PAI. It collects quantitative (Likert-scale) data using 12 items. We have used this instrument on groups of undergraduate geoscience students.

Possible Outcomes

Reference: Gaffney (2013). Education majors' expectations and reported experiences with inquiry-based physics: Implications for student affect. Physical Review Special Topics - Physics Education Research, 9, 010112. See Appendix C: Possible Outcomes.

A short survey developed for education majors enrolled in a physics class but easily adaptable to other contexts. It can be administered in class or outside class, and as a pre- and post-test. Twelve items were included in this survey. Students indicated how important they believed these potential goals were on a 7-point Likert scale, with 1 for items that were "very unimportant (trivial)" and 7 for items that were "very important (crucial)".

Reformed Teaching Observation Protocol (RTOP)

RTOP provides a standardized means for detecting the degree to which classroom instruction uses student-centered, engaged learning practice. This is a protocol for observing the classroom instructor.

Six Americas Survey

References: Maibach, E.W., Leiserowitz, A., Roser-Renouf, C., Mertz, C.K., & Akerlof, K. (2011). Global Warming's Six Americas screening tools: Survey instruments; instructions for coding and data treatment; and statistical program scripts. Yale University and George Mason University. Yale Project on Climate Change Communication, New Haven, CT; and Leiserowitz, A., Maibach, E., Roser-Renouf, C., Feinberg, G., & Rosenthal, S. (2014). Politics & Global Warming, Spring 2014. Yale Project on Climate Change Communication. Yale University and George Mason University, New Haven, CT.

This is a 36- or 15-item survey that categorizes people into six typologies based on their beliefs and attitudes about global warming. The reference given contains the survey questions and how to process the data. Quantitative data. I have used it with undergraduate populations, both introductory level and upper-level science. This is data about attitudes. What I like: It can give me a sense of how similar or different my student population is to the general US population. I can also communicate those differences to my students. What I don't like: It's US-centric (I've modified it for Canada).

Science Teaching Efficacy Belief Instrument (STEBI)

Reference: Riggs, I.M. and Enochs, L.G. (1990). Toward the Development of an Elementary Teacher's Science Teaching Efficacy Belief Instrument. Science Education, 74(6), 625-637.

This quantitative instrument was designed to measure pre-service elementary teachers' belief in their ability to teach science. It comprises 25 Likert-scale items (Strongly Agree to Strongly Disagree) divided into two subscales: Science Teaching Outcome Expectancy (STOE) and Personal Science Teaching Efficacy (PSTE). It is quick to administer (5-10 minutes), with lots of references to support data analysis.

Student Perceptions about Earth Sciences Survey (SPESS)

Reference: Jolley, A., Lane, E., Kennedy, B., and Frappe-Seneclauze, T-P, (2012). SPESS: A New Instrument for Measuring Student Perceptions in Earth and Ocean Science. Journal of Geoscience Education, 60:83-91.

SPESS is used for measuring student perceptions of earth sciences. It was designed following the CLASS survey developed for physics. It's primarily used in a pre-post format to see whether a learning experience shifts student perceptions more toward expert-like perceptions of the field. Quantitative data. I have used it with undergraduate populations at all levels. This is data about attitudes. What I like: I haven't used this in a while. We used it extensively across our department to see if we had differences among types of courses (levels, service/majors, distance ed/face-to-face) in terms of student perceptions, and shifts in student perceptions.

Teaching Dimensions Observation Protocol (TDOP)

Reference: Hora, M.T. and Ferrare, J.J., (2013). Instructional systems of practice: A multidimensional analysis of math and science undergraduate course planning and classroom teaching. The Journal of the Learning Sciences, 22 (2), 212-257.

TDOP is a classroom observation protocol with 5 dimensions (teacher behaviors, instructional technology use, cognitive engagement, pedagogical strategies, and student-teacher interactions). It can be customized. Data are recorded in real time (2-min intervals) by trained observers using the TDOP website. What I like: very fine grained, nuanced, used widely in college STEM classrooms, useful for measuring the impact of PD. What's hard: the training! Observers have to be trained and calibrated, which in the project I was part of took four or five 2-hr sessions. I have no idea what the data are like to analyze.

Teacher Beliefs Interview

Reference: Luft, J.A. and Roehrig, G.H. (2007). Capturing Science Teachers' Epistemological Beliefs: The Development of the Teacher Beliefs Interview. Electronic Journal of Science Education 11(2): 38-63.

Semi-structured, qualitative interview with a valid and reliable coding scheme. Interviews take anywhere from 20 minutes to over an hour, and can be conducted over the phone. Instructor responses are coded into one of five categories, ranging from less to more student centered (Traditional, Instructive, Transitional, Responsive, Reform-based).

Views of the Nature of Science

Reference: Lederman, N. G., Abd-El-Khalick, F., Bell, R. L., & Schwartz, R. S. (2002). Views of nature of science questionnaire: Toward valid and meaningful assessment of learners' conceptions of nature of science. Journal of Research in Science Teaching, 39(6), 497-521.

This instrument collects qualitative data about knowledge of the nature of science through a series of open-ended essay questions. I have used this for students in a graduate level STEM education course. I like that it elicits rich data about the construct, but analysis can be difficult and interpretation of understanding of particular aspects of the nature of science may not be possible without follow-up interviews (as recommended by the authors).

Instruments and Surveys Used to Measure Geoscience Skills and Content Knowledge

Geologic Block Cross-Sectioning Test

Reference: Ormand, Carol J., Thomas F. Shipley, Basil Tikoff, and Cathryn A. Manduca. Developing a Valid, Reliable Psychometric Test of Visual Penetrative Ability: The Geologic Block Cross-Sectioning Test. In preparation for submission to the Journal of Geoscience Education.

Quantitative measure of subjects' ability to select the correct cross-section through a geologic block diagram. I have used this instrument with undergraduate students in a wide variety of geoscience courses, undergraduate psychology majors, and professional geoscientists. Subjects' incorrect answers indicate what type(s) of penetrative thinking errors they are making.

Geoscience Concept Inventory

Reference: Libarkin, J.S. and Anderson, S.W. (2005). Assessment of Learning in Entry-Level Geoscience Courses: Results from the Geoscience Concept Inventory. Journal of Geoscience Education, 53(4), 394-401.

The Geoscience Concept Inventory (GCI) is a quantitative instrument used to diagnose conceptual understanding and assess learning in entry-level geoscience courses. I like that the GCI is easy to deliver and assess, that it focuses on conceptual understanding, that it has an active community working on re-evaluating and revising the questions, that it has equivalent but different questions available. I would like a broader coverage of earth science topics.

Concept inventory, closed response (some items have multiple correct answers), quantitative. I've used this with introductory students, preservice teachers, and novice-expert studies. Collects knowledge data (unidimensional, one factor). What I like: short, easy, validated questions, customizable to course content, high reliability, works with multiple populations. What I don't like: unclear how the new online version should be scored relative to the previous Rasch version, question bank dominated by geosphere, has a ceiling effect for novice populations (but not experts). I've never used the online interface, just paper & pencil.

I have used specific questions from the Geoscience Concept Inventory to make a standard pre-/post-test that I use in my introductory-level classes. I have added a few questions written in the same style as the GCI to address content areas I emphasize for which the GCI does NOT have questions. My modified instrument seems to work well with introductory students in community colleges, state universities, and private colleges, with equal efficacy across these populations. When used with upper-level students, the test is also a good measure of the background knowledge and misconceptions that even majors hold as sophomores and juniors. This instrument assesses content knowledge on core geoscience concepts as well as topics with common misconceptions. The instrument has the strengths and weaknesses one would expect in a multiple-choice pre-/post-test format. When used with a semester-long writing project built around analysis of a particular place, it provides a good method for normalizing and standardizing student learning. This is necessary to sort out the disparate geologic content found in the students' case studies, and it helps to measure how the details of the projects cause disparate learning of certain topics within geology over others. It is thus a good complement to the deeper but less easily quantified writing project. The biggest complaint I have with the GCI, beyond those commonly discussed with multiple-choice assessments, is the fact that the GCI is no longer being supported for modification and expansion. There are subfields of geology with important concepts that it would be helpful to be able to measure. The GCI, for example, does not address issues related to glaciation, which are very important in many parts of the world. My students' knowledge of geology is far deeper in issues related to glaciation than in coastal features, but the test does not measure this content, giving an inaccurate impression of their geoscience content knowledge and the particular misconceptions they are prone to. Getting ongoing funding to support and expand the GCI should be a priority of the community and of NSF.

The Geoscience Concept Inventory is a collection of validated earth science questions that I use in pre- and post-semester assessments to measure how much information my students retain throughout the semester. I typically administer these questions to introductory-level undergraduate students (either freshman geology majors or an intro-level general education class). As the questions are multiple choice and I can "grade" the assessments, this inventory collects data that I can then use to compare pre- and post-semester scores. I really like this list of questions because it is extensive and covers a variety of earth science topics (I can typically find a question addressing each learning objective in my class). However, it seems like the list has not been updated since 2011, and I am concerned that it might not be considered "cutting edge" for new geoscience education research.

Pre/post measure of changes to conceptual understanding. Quantitative. Used for both non-science majors and science majors. I like that this instrument can be used quickly and easily. The data are easy to interpret. Some of the items could use revision. Not sure how to submit new items to the GCI.

Geoscience Literacy Exam (GLE)


Includes multiple-choice and short essay questions about knowledge-based geoscience topics. They are administered in a controlled environment (i.e., with the instructor present). Administered at the beginning and end of the semester in classes that have used InTeGrate materials, as well as those that have not (control groups). Used (by me) on community college students in introductory Physical Geology, Earth Science, and Oceanography classes. Used by many, many others developing and testing InTeGrate materials. I especially like the two common essay questions that are administered at the end of the semester because they provide information about HOW students think and reason rather than what they know. They are given the option to answer these questions with an example situation of their choice. The way that I have been using these is in line with InTeGrate research interests, so in some cases I do not cover the multiple-choice topics and this confuses my students. I have collected both "control data" (classes where students are not receiving InTeGrate-based instruction) as well as data where the materials are being used.

Landscape Identification and Formation Test (LIFT)

Reference: Jolley, A., Jones, F., and Harris, S. (2013). Measuring Student Knowledge of Landscapes and Their Formation Timespans. Journal of Geoscience Education, 61, 240-251.

This is an assessment of student abilities to identify landscapes and their formation timespans. It also includes questions about students' confidence in their answers. Quantitative data. My student, Alison Jolley, used it with undergraduate students. This is data about knowledge. What I like: The test is easy to administer. It covers a broad range of relevant timespans. Challenges: Some of the landscape features may not be relevant for all instructors/students, but those can be changed to be relevant.

Landscape Perception Test

Reference: Iwanowska, K. and Voyer, D. (2013). Dimensional transformation in tests of spatial and environmental cognition. Memory and Cognition, 41, 1122-1131.

This paper references the test, but does not include it. Authors need to be contacted to find out if the assessment is available.

Moon Phase Assessment Instrument

Reference: Rivet, A.N., and Kastens, K.A. (2012). Developing a construct-based assessment to examine students' analogical reasoning around physical models in Earth Science. Journal of Research in Science Teaching, 49(6), 713-743.

This paper references the test, but does not include it. Authors need to be contacted to find out if the assessment is available. In addition to the moon phases assessment, the same authors developed parallel assessments on causes of seasons and sedimentary deposition. All three assessments target students' ability to carry out an analogical mapping between a physical model and the portion of the real Earth system represented by the model. All three assessments require that a physical model be set up and run at the front of the classroom by the experimenter. Published data are from 8th/9th graders, but assessment could also be used at the introductory undergraduate level.

Quantitative Reasoning for College Science Assessment (QuaRCS)

References: Follette, Katherine B.; McCarthy, Donald W.; Dokter, Erin; Buxner, Sanlyn; and Prather, Edward. 2015. "The Quantitative Reasoning for College Science (QuaRCS) Assessment, 1: Development and Validation." Numeracy 8 (2); and Follette, Katherine, Sanlyn Buxner, Erin Dokter, Donald McCarthy, Beau Vezino, Laci Brock, and Edward Prather. 2017. "The Quantitative Reasoning for College Science (QuaRCS) Assessment 2: Demographic, Academic and Attitudinal Variables as Predictors of Quantitative Ability." Numeracy 10 (1): 5.

Useful assessment for quantitative reasoning in introductory level science courses.

Revised Purdue Spatial Visualization Test (PSVT:R)

Reference: Yoon, S. Y. (2011). Revised Purdue Spatial Visualization Test: Visualization of Rotations (Revised PSVT:R)

This is a psychometric instrument. It is a revised version of the classic instrument, with corrections. Useful for measuring mental rotations and highly resistant to analytical solutions.

Spreadsheets Across the Curriculum Instrument

Reference: Lehto, H.L. and Vacher, H.L. Spreadsheets Across the Curriculum, 4: Evidence of Student Learning and Attitudes about Spreadsheets in a Physical Geology Course. Numeracy, 5(2), 1-21.

Pre/Post-test developed for a set of spreadsheet modules used to study the effectiveness of the modules. Quantitative data is collected. It was validated and used for a small population of undergraduates taking an introductory geology course. The survey worked well for our study, but I don't know if it would work for other studies.

Scale of Objects Questionnaire (SOQ)

Reference: Tretter, T. R., Jones, M. G., Andre, T. Negishi, A., and Minogue J. (2006). Conceptual boundaries and distances: Students' and experts' concepts of the scale of scientific phenomena. Journal of Research in Science Teaching 43 (3), 282-319.

Quantitative, with an option to write in a number for each item. The instrument is a list of 26 common objects; the subject selects the correct size range for each object (e.g., a grain of rice is in the "1 mm to 1 cm" size bin). The subject can also write in a size estimate. Allows for variable analysis; can rank order responses or score as correct/incorrect. What I like: not much, actually, but it's the only existing instrument that measures sense of scale that I know of. The design is clever, the instrument is quick to use (<10 min), and it's well grounded in theory. What I don't like: many of the items have ambiguous sizes (length of an ant? size of a shopping mall?), and some are distances rather than sizes. It's not clear to me whether the instrument is valid: whether it actually measures sense of scale, knowledge of the metric system, or trivial knowledge of how big each item is.
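The correct/incorrect scoring option mentioned above can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration: the bins are taken to be powers of ten in millimetres (consistent with the "1 mm to 1 cm" example, but not confirmed for the full instrument), and the two-object answer key and the function names are hypothetical rather than taken from the published questionnaire.

```python
import math


def decade_bin(size_mm: float) -> int:
    """Return k such that 10**k mm <= size_mm < 10**(k + 1) mm.

    Assumes power-of-ten bins; bin 0 is the "1 mm to 1 cm" range.
    Useful for placing a written-in size estimate into a bin.
    """
    if size_mm <= 0:
        raise ValueError("size must be positive")
    return math.floor(math.log10(size_mm))


def score_soq(selected: dict[str, int], key: dict[str, int]) -> dict[str, bool]:
    """Mark each object correct/incorrect; unanswered objects count as incorrect."""
    return {obj: selected.get(obj) == correct for obj, correct in key.items()}


# Hypothetical two-object answer key, built from rough typical sizes.
ANSWER_KEY = {
    "grain of rice": decade_bin(5.0),       # about 5 mm
    "adult human height": decade_bin(1700), # about 1.7 m
}

# One subject's selected bins: the second choice is one bin too small.
responses = {"grain of rice": 0, "adult human height": 2}
marks = score_soq(responses, ANSWER_KEY)
n_correct = sum(marks.values())
```

Keeping the bin index as an integer also makes the rank-order style of analysis straightforward, since the difference between the selected and keyed index is the number of orders of magnitude by which a response is off.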

Test of Scientific Literacy Skills (TOSLS)

Reference: Gormally, Cara, Peggy Brickman, and Mary Lutz. "Developing a test of scientific literacy skills (TOSLS): measuring undergraduates' evaluation of scientific information and arguments." CBE-Life Sciences Education 11.4 (2012): 364-377.

The validated test measures skills related to major aspects of scientific literacy: recognizing and analyzing the use of methods of inquiry that lead to scientific knowledge, and the ability to organize, analyze, and interpret quantitative data and scientific information. The test focuses on biology but is also valid for non-biology students, and it can be adapted to other fields. It is currently being used in geoscience to assess quantitative skills in REU students (Anne Gold). The full test is available for free with the paper.

Vandenberg and Kuse Mental Rotation Test


This is a well-established, quantitative, psychometric test of abstract mental rotation. I have used this test with undergraduate students in a wide variety of geoscience courses, undergraduate psychology majors, and professional geoscientists. Widely used to measure a type of spatial thinking; it is unfortunately often used as if it measured all of a person's spatial thinking skills.

Useful Collections from STEM Education and Spatial Learning Research

Compendium of Research Instruments for STEM Education

References: Part 1: Teacher Practices, PCK, and Content Knowledge; and Part 2: Measuring Students' Content Knowledge, Reasoning Skills, and Psychological Attributes

The compendium provides an overview of the status of STEM instrumentation commonly used in the U.S. and provides resources for research and evaluation professionals. It was developed through a review of the NSF Discovery Research K-12 (DR K-12) program. Part 1 of the two-part series aims to provide insight into the measurement tools available to generate efficacy and effectiveness evidence, as well as to understand processes relevant to teaching and learning. It is focused on instruments designed to assess teacher practices, pedagogical content knowledge, and content knowledge. Part 2 aims to provide an overview of the measurement tools available for studying student outcomes within the DR K-12 portfolio.

Spatial Intelligence and Learning Center (SILC)


A large number of instruments have been developed to measure domain-general spatial skills. I have used the following in various research projects: form board (ETS), paper folding (ETS), Vandenberg mental rotation, Santa Barbara sense of direction, perspective taking (Hegarty et al.), and hidden figures (ETS). There are many others (including some specific to geosciences). All are quantitative. All are widely used; I've used them with novice-expert populations of non-geologists, intro students, advanced students, and geology experts (mainly field geologists and structural geologists). What I like: highly valid and reliable, everybody uses them, easy to administer (most are group setting, pencil & paper, timed), easy to score, easy to interpret results. What I don't like: difficult to adapt to online survey research, there are so many tests it is hard to know which one(s) to use in a particular study, unclear whether these tests tell us anything about domain-specific spatial skills, unclear how many of these tests map onto specific spatial tasks in the geosciences.

Physics Education Research (PER) Assessments

This contains a collection of >80 assessments for PER, some of which are likely adaptable for GER.
