Measuring 21st-century science laboratory competence: development and validation of a contextual assessment tool
DOI:
https://doi.org/10.59923/sendja.v4i1.654
Keywords:
Laboratory Competency, Contextual Experiment, Instrument Validation, EFA, Rasch Model
Abstract
Mastery of science laboratory competencies grounded in contextual experiments is essential for strengthening scientific literacy, critical thinking, and data interpretation skills in the 21st century. However, standardized instruments that measure the conceptual, procedural, and interpretative dimensions together are lacking, particularly at the secondary education level. This study aimed to develop and validate an assessment instrument for Contextual Experiment-Based Biology Laboratory Competency using a limited research and development (R&D) design. Instrument development comprised theoretical analysis, blueprint construction, expert validation, pilot testing, Exploratory Factor Analysis (EFA), and Rasch model analysis. Items were developed from a literature review, national curriculum guidelines, and locally relevant experimental contexts integrated into students’ learning experiences. Content validation by eight experts, using the item-level and scale-level content validity indices (I-CVI and S-CVI), yielded high agreement (I-CVI ≥ 0.87; S-CVI = 0.95). Construct validity was examined using EFA with Maximum Likelihood extraction and Promax rotation on pilot data from 135 Madrasah Aliyah students. The Kaiser-Meyer-Olkin (KMO) measure was 0.75 and Bartlett’s test was significant (p < 0.001), indicating that the data were suitable for factor analysis. Three factors emerged, explaining 43.0% of the total variance and aligning with the initial construct framework. Calibration with the Rasch model showed good item fit, high reliability (0.86), and well-distributed item logits. These results indicate that the instrument is valid and reliable as a diagnostic assessment tool based on contextual experimentation. The findings support the implementation of authentic, context-based assessment; confirmatory factor analysis (CFA) and practical application in biology laboratory instruction are recommended as follow-up work.
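As an illustration of the content validity indices reported in the abstract, the following minimal sketch computes I-CVI and S-CVI from expert ratings. It rests on assumptions not stated in the abstract: a 4-point relevance scale on which ratings of 3 or 4 count as "relevant", and S-CVI taken as the average of the I-CVI values (S-CVI/Ave). The function names and the ratings matrix are hypothetical and are not the study's data.

```python
from typing import List


def item_cvi(ratings: List[int], relevant_min: int = 3) -> float:
    """I-CVI: proportion of experts who rate the item as relevant.

    Assumes a 4-point relevance scale where ratings >= relevant_min
    (3 or 4) count as relevant.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(r >= relevant_min for r in ratings) / len(ratings)


def scale_cvi_ave(matrix: List[List[int]], relevant_min: int = 3) -> float:
    """S-CVI/Ave: mean of the I-CVI values across all items (assumed variant)."""
    if not matrix:
        raise ValueError("matrix must not be empty")
    i_cvi_values = [item_cvi(row, relevant_min) for row in matrix]
    return sum(i_cvi_values) / len(i_cvi_values)


# Hypothetical ratings: one row per item, one column per expert (eight experts).
example_ratings = [
    [4, 4, 3, 4, 4, 3, 4, 4],  # all eight experts rate the item relevant
    [4, 3, 4, 2, 4, 4, 3, 4],  # seven of eight experts rate the item relevant
    [3, 4, 4, 4, 3, 4, 4, 4],  # all eight experts rate the item relevant
]

i_cvis = [item_cvi(row) for row in example_ratings]
s_cvi = scale_cvi_ave(example_ratings)

print("I-CVI per item:", i_cvis)
print("S-CVI/Ave:", round(s_cvi, 3))
```

With eight raters, an item endorsed by seven of them has an I-CVI of 7/8 = 0.875, which is consistent with the reported lower bound of 0.87 if that value was truncated rather than rounded.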
License
Copyright (c) 2026 Andina Nurul Wahidah, Inayah Dzil Izzati Hartono

This work is licensed under a Creative Commons Attribution 4.0 International License.










