Towards a Benchmark for Scientific Understanding in Humans and Machines

Minds and Machines 34 (1):1-16 (2024)

Abstract

Scientific understanding is a fundamental goal of science. However, there is currently no good way to measure the scientific understanding of agents, whether these be humans or Artificial Intelligence systems. Without a clear benchmark, it is challenging to evaluate and compare different levels of scientific understanding. In this paper, we propose a framework to create a benchmark for scientific understanding, utilizing tools from philosophy of science. We adopt a behavioral conception of understanding, according to which genuine understanding should be recognized as an ability to perform certain tasks. We extend this notion of scientific understanding by considering a set of questions that gauge different levels of scientific understanding, covering information retrieval, the capability to arrange information to produce an explanation, and the ability to infer how things would be different under different circumstances. We suggest building a Scientific Understanding Benchmark (SUB), formed by a set of these tests, allowing for the evaluation and comparison of scientific understanding. Benchmarking plays a crucial role in establishing trust, ensuring quality control, and providing a basis for performance evaluation. By aligning machine and human scientific understanding we can improve their utility, ultimately advancing scientific understanding and helping to discover new insights within machines.

Author Profiles

Henk W. de Regt
Radboud University
