Ethical Theories in Research Evaluation: An Exploratory Approach

Research evaluation encompasses the practices of assessing research quality and impact at various stages of research. The processes and criteria of research evaluation vary depending on the nature and objectives of the assessment. Different research evaluation systems influence the research strategies of universities and institutes. There are, however, some known issues of research evaluation with regard to peer review and, most prominently, the use of citation-based metrics, which have led to recent calls for the responsible use of metrics. In this paper, we argue that there is a need for ethical theories for considering research evaluation and that research evaluation ethics, as an overlapping area between research ethics and evaluation ethics, deserves its own treatment. The core of the article consists of a discussion of the most influential ethical theories in the context of research evaluation, including deontological ethics, consequentialist ethics and virtue ethics. The aim is to highlight the need to assume an ethical view that combines deontological and consequentialist concepts, adopting ‘common good’ as the most likely pillar for research evaluation procedures. We propose that the mixed approach would be useful for developing a framework for research evaluation ethics and for analysing ethical approaches and ethical dilemmas in research evaluation.


Introduction
Research evaluation encompasses the practices of assessing the quality and impact of scholarly works ex ante and ex post. Ex ante research evaluation usually refers to the evaluation of research proposals for grant funding, where the quality, feasibility and potential contributions of funding proposals are assessed. Ex post research evaluation, on the other hand, is used to assess scientific-scholarly and sometimes economic and societal impacts after a research project has been conducted. At the individual level, the assessments are often used in decisions on the hiring, promotion and career advancement of scholars, as well as in the evaluation of grant proposals and awards. At the university level, research evaluation is sometimes used for allocating block grants to universities. While research evaluation is considered necessary to assess the research performance of individuals and universities, there have been no ethical guidelines for the drafting of evaluation processes or criteria, notwithstanding the constitutive effects of evaluation (Dahler-Larsen, 2012). A survey of national evaluation systems (Ochsner, Kulczycki and Gedutis 2018) reveals that different systems have incompatible priorities for research evaluation. As a result, their evaluation criteria vary; metric and non-metric criteria, for example, require diverse approaches, including ethical ones. Whitley (2007) argues that universities and centers of research are often in competition for favorable assessment and that strong research evaluation systems can limit intellectual autonomy and the ability to implement research strategies that challenge current orthodoxies. Moreover, research evaluation may affect the development of disciplines and limit novelty and inventiveness (Whitley, Gläser and Laudel 2018).
In many scientific fields, highly cohesive scientific elites may influence the organization of strong Research Evaluation Systems (RES) in agreement with their conceptions of quality. Where elites hold the consensus on central topics of their disciplines, strong RES may reinforce their authority, as they might decide the quality standard for the discipline. Research evaluation plays a fundamental role in both the development of disciplines and the career advancement of researchers. It is expected to affect the development of scientific fields, as it may limit the novelty and inventiveness of emerging researchers, who must conform to the dominant elites to achieve academic consensus.
There are also known problems and issues concerning research evaluation, both in the peer review process and in the criteria used, including citation-based metrics. Assessment in the Social Sciences and Humanities (SSH), for instance, may be based on emotions and on the interactions between individuals; the social identity of reviewers and their membership of a scientific-scholarly community, rather than neutral judgment, can play a fundamental role (Lamont 2009). For one, bias in peer review has been discussed with respect to gender, race, language, career stage and interdisciplinarity (see, for example, Helmer, 2017; Lee, et al., 2013). Peer reviewers also tend to be conservative and risk-averse in their evaluation of innovative methods and approaches (Luukkonen, 2012), not to mention the inconsistent reliability and validity of peer review, notwithstanding the availability of innovative procedures and platforms (Bornmann, 2011; Horbach & Halffman, 2019).
Furthermore, studies have shown that the use of citation-based metrics has led to the misuse and gaming of evaluative metrics (see, for example, Biagioli & Lippman 2020) as well as to changes in research practices and knowledge production (de Rijcke et al., 2016). It is understood that the use of metrics induces competition, rather than collaboration, between researchers. The drive to publish in high-JIF journals also prompts researchers and scholars to publish in international journals, leading to a decreased number of publications in local and national languages, which are especially important for the SSH. Some have argued that the use of metrics, which has eventually led to 'misuses' and 'abuses' of metrics, is due to the audit culture, which has accountability at its core. Ma and Ladisch (2019) have suggested that evaluation complacency and evaluation inertia are a cause, as well as an effect, of the use of metrics in research evaluation. Recently, there has been increasing pressure on institutions to reconsider and reconfigure the use of citation-based metrics in response to DORA (ASCB, 2013), The Metrics Tide (Wilsdon et al., 2015), The Leiden Manifesto (Hicks, et al., 2015) and the Hong Kong Principles (Moher, et al., 2020).
Taking into account the complexities of research evaluation, we must consider whether to look "from above" and seek universal ethics, or to calibrate our optics for an empirical case study. At the initial stage of our endeavour it is more viable to seek a theoretical horizon than to limit ourselves to empirical case studies. Merton (1973) proposed a scientific ethos often known by the acronym CUDOS: communalism (originally communism), universalism, disinterestedness, and organized skepticism. While his conceptions are closely related to the goal of science and scientific method, they seem to lack ethical justification. CUDOS is therefore open to criticism for being too general, not reflective enough and rather inefficient when compared with the particular practices of scientific research in their diversity. For this reason, a more detailed study of the ethical field is needed. In the rest of the paper, we argue that the ethics of research evaluation lies in the overlapping area of research ethics and evaluation ethics, and then discuss three ethical theories: deontological ethics, consequentialist ethics, and virtue ethics. Finally, we propose that best practices of research evaluation can be based on a mixed approach.

Research Ethics and Evaluation Ethics: Guidelines and Principles
In this section, we review major documents concerning research ethics and integrity, on the one hand, and evaluation ethics, on the other, to situate ethics of research evaluation in the overlapping area of these two domains (Figure 1).

Research ethics and research integrity
The European Code of Conduct for Research Integrity (ALLEA 2017) is a comprehensive document illustrating the principles of research ethics, including reliability, honesty, respect, and accountability. It also describes good research practices in different scenarios. Of particular interest to this article is the section on reviewing, evaluating and editing, where it states:
• Researchers take seriously their commitment to the research community by participating in refereeing, reviewing and evaluation.
• Reviewers or editors with a conflict of interest withdraw from involvement in decisions on publication, funding, appointment, promotion or reward.
• Reviewers maintain confidentiality unless there is prior approval for disclosure.
• Reviewers and editors respect the rights of authors and applicants, and seek permission to make use of the ideas, data or interpretations presented.
Although the good practices prescribe what a reviewer should do, there is little guidance as to how to develop research evaluation processes and criteria, or how to deal with oft-debated issues of bias and conservatism in peer review and the negative impacts of the use of citation-based metrics. In other words, there is a lack of principles guiding the processes and criteria of research evaluation in and of itself.

Evaluation ethics
Evaluation ethics has been discussed and debated in the context of international development. The American Evaluation Association (AEA), Australian Evaluation Society (AES, AES2), Canadian Evaluation Society (CES), UK Department for International Development (DFID), and United Nations (UN), for example, have published guidelines and best practices of evaluation (Table 1). In Table 1 we list the topics in the ethics of evaluation drawn from the current ethical guidelines and good practices for evaluation issued by the aforementioned institutions. However, these guidelines and best practices are not usually justified or supported by ethical theories. For instance, Helen Simons, a plenary speaker at the Framing Ethics in Impact Evaluation workshop (Barnett & Munslow, 2014), argues that the current ethical guidelines are mostly principles of intention and are often about the methodology of evaluation and the quality of the evaluation product, whereas ethical guidelines should instead focus on whether research evaluation is good and right. This points to the need for an ethical theory to guide the behaviour and choices of evaluators. Further, the importance of socio-political contexts is highlighted by another speaker at the workshop, Laura Camfield: "Therefore, instead of having an absolute minimum standard, it seemed to be useful to put in place a process whereby standards may be arbitrated in relation to the specific socio-political context" (Barnett & Munslow (Eds.), 2014: 11).
Similarly, publications such as The Leiden Manifesto have drawn attention to important topics concerning the ethical entailments of research evaluation. However, it is worth noting that they do not present deep discussions of the issues and challenges. Groves Williams (2016) argues that ethical advice tends to differ for evaluation and for research, as the two differ in purpose and follow different processes; despite this, some consider evaluation a kind of research activity, as it generates knowledge. Hence, it is important to consider ethical theories for research evaluation, which deal specifically with evaluation ethics in the context of research.

Ethical Theories
To address ethical issues of research evaluation, it is necessary to look at the ethical theories and the instruments they provide. Studying traditional ethical approaches, Doris and Stich (2007) attempt to broaden the methodological scope of philosophical ethics. They argue that traditional ethical theories can benefit from an empirical approach to the solution of ethical issues. It is thus possible that new ethical approaches, as well as the introduction of implicit and tacit ethical knowledge, can provide us with the explanatory mechanisms needed in our pursuit of an ethics of research evaluation. Looking back at the theoretical tradition in ethics, which is applicable to research in general and research evaluation in particular, the general ethical choices are limited to a certain number of theoretical approaches. The 'hard core' of normative ethics comprises, first of all, deontological ethics and consequentialist ethics and, to a lesser extent, virtue ethics.

Deontological ethics
Deontological ethics places special emphasis on the relationship between duty and the morality of human actions. In deontological ethics an action is considered morally good because of some characteristic of the action itself, not because the product of the action is good. Deontological ethics holds that at least some acts are morally obligatory regardless of their consequences for human welfare. It might be even labeled as 'Duty for duty's sake'. The most typical examples: Thou shalt…, thou shalt not… (Old Testament); Love thy neighbor (New Testament); Good is to be done and evil is to be avoided (Thomas Aquinas); Act as if the maxim of your action were to become through your will a universal law of nature (Immanuel Kant).
Individuals are subject to absolute and universal rules and duties, which are defined independently of them. Individuals find themselves in the realm of duties given by extra-individual entities such as God, Humanity, Rationality, Weltgeist etc. These duties are claimed to be universal; therefore, they cannot be altered by an individual. Moral behavior is behavior that follows the rules without exception. Ethical instructions are reduced to a clearly enumerated and limited number of norms. These norms and their explanations are straightforward directions. They leave no space for moral ambiguity or further discussion. This type of ethical theory faces at least two challenges: as a rule, such theories are (i) too general and/or (ii) too rigid. Therefore, they are hardly applicable in concrete situations: one must know in advance what is good, and this kind of knowledge tends to neglect individual differences in complex situations or in ethically desired cases. What if a universal moral law cannot be applied in the everyday action of a peer reviewer? Or what if being a researcher became a universal norm? In the context of research evaluation, deontological ethics can guide the list of norms regulating the behavior of research evaluators. The norms are to be formulated as BEs, DOs or DON'Ts, for example: do not harm, respect, be objective.

Consequentialist ethics
Consequentialist ethics concerns universal values, for example, life, freedom, property, and so on. Moral behavior is defined by the values 'saved'. In other words, the only important criterion of moral action is the increase of the amount of common good in society. There is no defined list of norms, as predefined norms are not relevant in consequentialist ethics. Consequentialist ethics is a theory of morality that derives duty or moral obligation from what is good or desirable as an end to be achieved. A morally good action is the one that has the best possible consequences compared with other actions. The leading principle in consequentialism is the so-called social principle: the greatest good for the greatest number. Other consequentialist principles include the principle of consequences, the principle of utility, the principle of hedonism and the principle of universality.
In the context of research evaluation, we can ask questions such as: What greater social good is created during the evaluation process? What values does the evaluation process refer to? How can happiness in the scientific community be maximized? Consequentialist ethics has at least three unresolved issues: (i) the unpredictability of consequences, e.g. not every situation is simple and transparent enough to give clear-cut assurance of its outcomes; (ii) the hedonist approach, e.g. it would be wrong to suppress or censor the results of a research study merely because they would make some politicians unhappy; and (iii) the difficulty of measuring and comparing the consequences of ethical action, e.g. how to make a strong case when weighing the consequences of two or more conflicting values, such as truth and happiness, freedom and security, scholarly integrity and solidarity etc.

A mixed approach of ethics in research evaluation
In order to avoid the shortcomings of the ethical theories as discussed above, we propose a mixed approach to tackle the issues of research evaluation. We assume it is possible to combine deontology and consequentialism. This kind of middle-way approach has the capacity to transgress the boundaries of the rivalling theories and provide a basis needed for research evaluation ethics.
When one talks about principles, one may simultaneously be stating values. In this case the utilitarian (or similar) values might be expressed as norms in the manner of commandments: universalism as a value, for example, might be transformed into a commandment or imperative. Certain values and virtues, such as honesty, responsibility, respect etc., might be transformed into duties by adding verbs: be honest, stay responsible, respect others etc. In this sense both evaluation ethics and research ethics are mainly based on norms, rules and principles, manifested in statements, initiatives and manifestos. For example, the San Francisco Declaration on Research Assessment, also known as DORA (ASCB, 2013), The Leiden Manifesto for Research Metrics (Hicks, et al., 2015) and The Hong Kong Principles for Assessing Researchers (Moher, et al., 2020) are formulated as imperative deontological claims, for example, 'do not use metrics as a surrogate measure' or 'be open and transparent'. Table 2 presents some basic characteristics of the principal ethical theories along with some examples of their use in research evaluation.

Table 2. Basic characteristics of the principal ethical theories and examples of their use in research evaluation.

Deontological
The duty of the evaluator is to comply with the rules and norms, which are not under her control. Precise application of the rules is a priority of the evaluation procedures. Research is evaluated per se and not on its social consequences. As in Leiden Manifesto: protect excellence in locally relevant research; allow those evaluated to verify data and analysis; scrutinize indicators regularly etc.

Consequentialist
• Given values
• Morality is based on the consequences of action
• The notions of right and wrong are not clearly defined in advance
• Teleology: priority of consequences (e.g. common good) over rules
• Context-dependence
The evaluator seeks to maximise common good assessing potential impact of the research under scrutiny. Research is not evaluated per se, its importance is revealed rather via its consequences. As in objectives of Cardiff Statement (2019): "the first is to restate and champion the fundamental role that the SSH play in society and the second is to call for an expanded role for the social sciences and humanities in tackling problems through interdisciplinary research."

Mixed approach (of this paper)
• Norms and principles are related to common good
• Some notions of right and wrong are given but they might be changed by ethical subjects
• Common good means taking stakeholders into consideration
• Context-dependence
The evaluator needs to take into consideration all the stakeholders that research under evaluation deals with. The norms might be more flexible and not that rigid as in deontology. Still the norms are present (which is denied by traditional consequentialism).

One of the major issues in deontological ethics is the source and authority of the principles: Who issues them? And why are those subjected to them supposed to comply with the principles? Eliminating entities (God, karma, Kant's universal rationality), we have to deal with different types of rule-giver or try to justify the rules in different ways. In this case we can consider the notion of common good in utilitarian ethics, for it is not necessary to interrogate the question as to who exactly creates the principles. Rather, it is more crucial to elaborate such a set of principles that increases common good. For example, Collins and Evans (2017) argue that science is a moral enterprise, guided by values that matter to all, that is, the idea of common good. Thus, if a researcher participates in the creation of common good, then research evaluation is supposed to take part in it as well.
The mission of evaluators is, after all, to assure and control the quality of research, thereby fostering valuable knowledge along with societal impact. Evaluation procedures could thus both enable societally relevant research and block irrelevant research.
The next question is how to define common good. How can we avoid neglecting theoretical knowledge, which cannot provide impact immediately? A social contract is needed, meaning a discussion inter pares and not a dictate imposed by the powerful on the powerless. If one delegates certain rights, she must receive something important in return, e.g. security, freedom of thought, expression, research etc. As a temporary solution, the notion of the veil of ignorance as a precondition of the original position might be borrowed from John Rawls (1971, 2001). If a stakeholder drawing up a list of ethical principles does not know which part (including the least privileged one) she is going to play during the evaluation process, then the principles might be more just or justified than otherwise.
All of the above means that, in discussing the principles of research evaluation ethics, it is necessary to organise them around a notion of common good, or collective good (as Kitcher 2001 alternatively labels it), that does not fall solely under the utilitarian criterion of societal happiness. By identifying the stakeholders and the moral responsibility owed to them, research evaluation ethics gains a legitimate ground on which to construct further research evaluation principles and norms.

Conclusion: Ethical Considerations in Research Evaluation
One of the crucial issues of research evaluation ethics is the borders and scope of the field. In other words, it is the question of what is ethical, or of how to distinguish clearly between, say, the epistemological and the ethical. According to Mustajoki and Mustajoki (2017), the recognition of ethical questions comprises three steps: (a) identification of stakeholders (e.g., individuals, groups, communities, animals, ecosystems, future generations etc.), (b) understanding the rights and responsibilities of the stakeholders and for the stakeholders, and (c) definition of options, i.e. looking for a win-win situation, or at least proximity to it, for the stakeholders involved. Although Mustajoki and Mustajoki (2017) do not refer to the idea of common good, the three criteria of the ethical presuppose common good as a horizon of ethical deliberation.
If the ethical aim of any research is to increase the amount of common good or impact, then what is the ethical aim of research evaluation? Or, to paraphrase the latter question, to whom is the research evaluator responsible? An abstract notion such as common good serves as a horizon, which provides both a thematic and a problematic framework for ethical consideration in research evaluation. The Social Sciences and Humanities also contribute to the common good, although their contributions are usually difficult to trace, track, and measure. Therefore, in order to make it operational, 'common good' is to be divided into smaller realms inhabited by different collective stakeholders. In practice this means that every evaluator should consider the target groups (stakeholders) that the research is dealing with. The authors of the ENRESSH Policy Brief on Research Evaluation (Ochsner et al., 2020) claim that one can find four major categories (research production, research consumption and use, research policy and administration, evaluation services) and three intermediary categories of stakeholders, resulting in a taxonomy of twenty different types of stakeholders: from researchers to business, from cultural institutions to research councils, from taxpayers to learned societies, from funders to data providers, and so on. This diversity of potential stakeholders must be taken into consideration when ethical issues are discussed.
Evaluation procedures should start with acknowledging responsibility to the disciplinary and academic communities. Only truthful research of high quality can be of any societal value. Thus, at the initial stage of an evaluation, scientific integrity needs to be checked and evaluated. Research with considerable flaws is incapable of benefiting the collective good in any meaningful sense. If and only if the research under consideration is both epistemologically and methodologically plausible and valid does it have the potential to increase the common good in extra-academic communities or the larger society as a means of practical problem solving. Therefore, in setting the ethical principles for research evaluation, we must concentrate on those that increase the amount of common good for different groups of stakeholders. Schwandt (2015) calls for general guidelines for evaluation grounded in the requirement of 'critical thinking', which involves the absence of political, personal, cultural and disciplinary biases and rejects group-centered perspectives and prejudices. More recently, Schwandt (2018) has urged the necessity of a 'professionalism in evaluation', that is, an ethical culture of evaluation, which concerns interpersonal relationships, the right conduct of the evaluative process, social responsibility, the necessity to serve the common good, and respect for the dignity and cultural values of individuals and groups. Furner (2014) has suggested a conceptual framework for bibliometric ethics built around a set of essential tasks.
The different systems of assessment can affect scientific production in academic and research institutions (Whitley 2011). The crucial point concerns the potential limit that fear of assessment places on the universities' independence in pursuing research that follows unorthodox methodologies, or in developing innovative fields of research in disagreement with dominant approaches.
The assessment of research activities, either ex ante or ex post, entails very important ethical issues. Pursuing the 'common good' in research evaluation means that, in the case of multidisciplinary or interdisciplinary research, every stakeholder with which the research is dealing must be considered (ESF 2011).
In this article, we consider three ethical theories (deontological, consequentialist and virtue ethics) and propose a mixed approach for developing a framework for the design and development of research evaluation. Moreover, the ethical theories can be deployed in analysing empirical findings to understand the ethical approaches, as well as the ethical dilemmas, in research evaluation.