2004 Global: The Quality of Evaluations Supported by UNICEF Country Offices 2000-2001
Author: Watson, K., Rideau Strategy Consultants, Ltd.
UNICEF has a decentralised accountability structure for evaluation. The UNICEF country offices undertake evaluations of programmes and projects for oversight and learning. Although the UNICEF Regions and the Evaluation Office at New York headquarters also conduct evaluations, these were outside the scope of this study. The Evaluation Office at headquarters provides functional leadership and overall management of the evaluation system. The office reinforces UNICEF's evaluation capacity, with an emphasis on the requirements of country offices and capacity-building in countries, and maintains a database of evaluations and research studies. It also monitors and reviews the quality of UNICEF-sponsored evaluations and, in that context, commissioned this study.
Purpose/Objective:
This study evaluates the quality of UNICEF evaluations conducted and commissioned by its country offices in 2000-2001. It is UNICEF's second review of evaluation quality this decade. It also follows various thematic reviews, including a review of UNICEF-supported education evaluations. The objectives of this study are to assess the quality of evaluations supported by UNICEF's country offices, to see whether progress has been made since the last review, and to recommend how quality might be improved.
Methodology:
A 50% random sample of the UNICEF evaluation reports submitted to New York headquarters by the country offices in 2000 and 2001 was reviewed. Of those 97 reports, 75 were found to be evaluations, strictly defined. The assessment team also reviewed the quality of the Terms of Reference for about one third of the sample: 31 in all, which were all that were available at New York headquarters for this set of studies.
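The sample figures above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the function and variable names are hypothetical, the total number of submitted reports is inferred from the stated 50% sampling fraction rather than reported in the study, and the Terms of Reference share is computed against both the 97 reports and the 75 evaluations because the text does not say which base "about one third" refers to.

```python
def sample_shares(reports_sampled=97, evaluations=75, tors_reviewed=31,
                  sampling_fraction=0.5):
    """Return simple proportions derived from the figures quoted in the text.

    All names here are illustrative; only the four numbers come from the study.
    """
    return {
        # Share of sampled reports that met the strict definition of an evaluation
        "evaluations_of_reports": evaluations / reports_sampled,
        # Terms of Reference reviewed, relative to all sampled reports
        "tors_of_reports": tors_reviewed / reports_sampled,
        # Terms of Reference reviewed, relative to strictly defined evaluations
        "tors_of_evaluations": tors_reviewed / evaluations,
        # Inferred (not reported) number of reports submitted to headquarters
        "implied_reports_submitted": reports_sampled / sampling_fraction,
    }


if __name__ == "__main__":
    for name, value in sample_shares().items():
        print(f"{name}: {value:.3f}")
```

On these figures, "about one third" matches the 31 Terms of Reference taken as a share of the 97 sampled reports more closely than as a share of the 75 evaluations.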
A questionnaire containing 23 questions, some structured and some open-ended, was used to gather information from UNICEF country offices. In addition, a survey of staff in a 50% sample of UNICEF country offices was undertaken. The response was close to 100%.
The quality assessors were independent. The review was commissioned by competitive tender, and no reviewer had been involved with any of the evaluations. The research design for this study was not subject to ethical review. However, since the work was done entirely from existing UNICEF reports, and from information proffered by UNICEF staff in the course of their professional duties, no issues of informed and competent consent were expected to arise. The only ethical issue was the protection of informant confidentiality during this work.
This study had two significant limitations. First, it was based solely on documents, interviews at New York headquarters, and a written questionnaire for country offices. Country offices were not visited, and country office staff not interviewed in depth. Evaluation procedures were not observed directly, nor files reviewed. Partners and users of UNICEF evaluations were not interviewed. Our judgements were based solely on the quality of the evaluation reports and their written Terms of Reference. Consequently, we had a narrow base for our recommendations about how evaluation quality might be improved, and any recommendations on how better management of evaluation might improve quality should be read in this light.
The second important limitation of this study was that the evaluation reports examined were sampled from those submitted to New York headquarters by the country offices, and not all evaluations supported by country offices were submitted. We expect that poor reports are less likely to be submitted to headquarters. Respondents to the questionnaire indicated that their offices had completed approximately 75 evaluations during 2000 or 2001 that had not been submitted to New York headquarters. However, there may have been more. On the other hand, some reports not submitted may not have been evaluations by our definition. The reader should keep in mind that selective submission of evaluation reports by country offices may have resulted in bias in our sample.
Findings and Conclusions:
It was found that UNICEF evaluations are not consistent in quality. About one in five are excellent, but the worst third are sufficiently poor to constitute a serious problem. The five aspects of quality on which the UNICEF evaluations did best were:
- The objectives of many evaluations, and the questions to be answered, were often stated fully and clearly.
- Many evaluation reports were clear, transparent, and easily accessible to the reader. The best were concise, well-organised, and logical, with clearly written text, supported by tables, figures, and descriptive headings, and led by an executive summary.
- The objectives of the evaluation, and the questions to be answered, often reflected UNICEF's mission and approach to programming, including protection of children's rights, promotion of their welfare, and gender equality (gender being the weakest of these).
- Recommendations were often well grounded in evidence and analysis.
- The qualitative and quantitative information gathered by many evaluations was, in aggregate, adequate to answer the evaluation questions.
The five criteria of quality on which the UNICEF evaluations did worst were:
- Costs were not well-described, and were seldom compared with results.
- The "outputs" of the programme or project were often not adequately described or measured and, with this missing link, the causal chain from activities to outcomes was broken.
- Ethics review was seldom undertaken at the research design stage, and the topic of research ethics was seldom addressed in the reports. It is, of course, vital that the evaluation design be ethical and include ethical safeguards where appropriate, including protection of the dignity, rights, and welfare of human subjects, particularly children, and respect for the values of the beneficiary community. We have no opinion on whether there were any ethical problems with the research, such as competence or informed consent, but simply note that the evaluations seldom addressed the topic. The evaluators seldom made a statement about how their objectivity and independence were ensured.
- The evaluations were generally parochial. The degree to which the project, programme, or initiative might be replicable in other contexts often was not described.
- Lessons learned often were not generalised beyond the immediate intervention being evaluated to indicate what wider relevance to UNICEF there might be.
To improve the quality of its evaluations, UNICEF could focus on risk or excellence or, of course, both, if sufficient resources can be mobilised.
- Option 1: Maximise UNICEF's influence by focusing evaluation efforts on producing a relatively small number of excellent evaluations of intervention strategies in vital areas of intervention, and with wide replicability.
- Option 2: Minimise risk by upgrading the poorest third of UNICEF evaluations to minimum professional standards.
The tools to upgrade the worst third of evaluations should be appropriately simple. They might include generic frameworks for evaluation terms of reference (including guidelines for processes, time frames, and budgets), and standard Tables of Contents for the two main types of evaluations [performance evaluations and evaluations of intervention strategy].
At the other end of the quality spectrum, the problem and the appropriate response are quite different. UNICEF has produced many evaluations that are good but not excellent. Achieving excellence, starting from this base of good work, would be more difficult and expensive than the "minimum standard" option described above.
Three things would be important:
- Engaging the best evaluation professionals, which would probably be considerably more expensive than some country offices think they can afford.
- Insisting on rigorous research designs that are much less impressionistic than UNICEF evaluations often have been.
- Developing methodologies for evaluations of various strategies for rights-based interventions. The cultural, political, and legal dimensions of rights-based interventions make their evaluation particularly challenging, and UNICEF often would be breaking new ground in evaluation methodology.
Achieving excellence would require highly trained evaluation managers, as well as highly qualified consultants and larger budgets to enable more thorough primary data collection and analysis. All of this raises issues of how much UNICEF is able and willing to pay for evaluations. These issues are beyond the scope of this study. Perhaps the only way to achieve excellence within an affordable budget is to do a limited number of broadly relevant evaluations of intervention strategies in key topic areas each year, each involving several country offices. The design and coordination of such evaluations might require leadership from UNICEF regions and headquarters.
Another approach to excellence, within the constraints of economy, is to undertake more evaluations of intervention strategies jointly with other development agencies, including UNDAF partners, the multilateral development banks, and the major bilateral agencies. Cases where this was done for performance evaluations of jointly-funded interventions tended to be better quality than UNICEF-alone efforts. Since many areas of rights are of common interest, it might be possible to institute a series of joint evaluations of intervention strategies in key areas, with one or more partners among the international agencies.
The suggestions above relate to UNICEF's own evaluation capability and performance. They will not improve the evaluation capability of host-country agencies. Attempts to upgrade local capability by involving a few local staff and/or local consultants in UNICEF evaluations in minor roles may not be very effective. Most developing countries need more structural assistance to improve their evaluation capabilities, such as forming a national evaluation association, with affiliation with an international professional association. Again, other international agencies are interested and involved in enhancing the evaluation capacities of national partners, and UNICEF should undertake joint capacity-enhancement projects, wherever possible.
Our general recommendation is, of course, that everything possible be done to improve the quality of UNICEF's evaluations. However, we are aware that this study is only partly adequate as a basis for recommendations on how to do this. It does not cover all UNICEF evaluations, nor does it examine UNICEF's evaluation resources, systems, and practices. That said, we make the following main recommendations:
1. UNICEF Evaluation Office should formulate an action plan for evaluation quality improvement in response to this study. This study should be complemented by a study of the resources and organisation of UNICEF evaluation. Given the persistence of the same quality problems over a long period of time, some systemic changes might be in order. Pending the outcome of such a complementary study, consideration should be given to balancing UNICEF's decentralised country office-focused evaluation system with strengthened requirements for review of evaluation research designs outside the initiating country office at the time the terms of reference are being formulated. Better research designs are probably the single thing most likely to improve the quality of UNICEF evaluations. Each evaluation study should have a methodological review and challenge by a peer outside the country office. For each evaluation budgeted at over $25,000, this review should be based on a full evaluation framework.
2. Since virtually all UNICEF evaluations involve human subjects, the country office evaluations should be subject to stronger requirements for ethics review before implementation. We suggest that UNICEF Evaluation Office state an evaluation research ethics review policy, and that an appropriate system of evaluation research ethics review be established. This would include a policy on ethics review of Terms of Reference, subject competence and informed consent, and a policy on adverse event reporting. We do not believe that it is sufficient for the Evaluation Office to rely on other UNICEF policy statements, in this respect, or that the Technical Note extant is sufficient. In addition, each evaluation report should contain a statement of how the objectivity and independence of the evaluators were ensured.
3. Since country office evaluation reports continue to exhibit many of the deficiencies found in 1995 regarding evaluation reports in the early 1990s, despite improved evaluation policy, guidelines and technical notes, we recommend that certain things be made mandatory content for every evaluation study (this requirement reinforced by model Tables of Contents for evaluation reports). Evaluation reports should follow a standard format, unless there is a good reason for varying it. UNICEF terms of reference for evaluation studies should include a draft Table of Contents for the evaluation report, such as those models shown in Appendices 3 and 4. Standard content should include the following:
- There should be a clear description of what is being evaluated, including sufficient background and context to enable a reader unfamiliar with the country and programme to fully understand it, and explicitly describing the relevance and replicability in other contexts of the project/programme and its evaluation. All UNICEF evaluations of intervention strategy should address the wider relevance of lessons learned, and the replicability and scalability of the successful aspects of the project or programme. There should be a full profile of the intervention and its context, an attribution analysis where appropriate, and a consideration of replicability, scalability, sustainability, and environmental aspects. There should be a clear statement whether the study is a situation analysis, an evaluation of alternative intervention strategies, and/or an evaluation of UNICEF's or its partner's past performance in a programme or project.
- An analysis of costs and, where appropriate, of efficiency, should be part of all evaluations. UNICEF should have guidelines on how to estimate the full cost of a project or programme, including UNICEF staff time, contracts, costs of partners, and costs of participants.
- There should be detailed measurement of the programme/project outputs. To enable this, UNICEF contracts/agreements with implementing agents should be output performance-based. This base of clear output-performance agreements is essential to a results-based approach to good management, and to enabling good performance evaluation.
- If the project or programme is expected to be sustained, with or without continued UNICEF funding, then the mechanisms for ensuring this should be explained. If commitments from others are necessary, then these should be described. Self-sustainability through cost-recovery should always be one sustainability alternative assessed.
4. The UNICEF country office should record its decision on each evaluation recommendation. The evaluation manager (focal point) should prepare an action plan for approval by the country representative. The approved action plan should become an appendix to the evaluation report before it is submitted to NY headquarters for archiving. Where an evaluation recommendation requires action by another agency, UNICEF should ask that agency for a response. To facilitate action plans following evaluation reports, draft findings, lessons, and recommendations should be subjected to challenge in an "exit workshop" that involves all major stakeholders.
In addition to these main recommendations, we make the following suggestions:
General Format and Content
- Evaluation reports should be limited to a more or less standard length of presentation, say 50 pages + executive summary + appendices.
- Evaluation reports should discuss data quality, and explain the effects of reach constraints, attrition, and/or non-response.
- An executive summary of the evaluation report should be translated into the local language[s]. This should be part of the Terms of Reference, and appropriately budgeted.
We suggest that NYHQ Evaluation Office and Regional Offices develop a three-year rolling evaluation plan in conjunction with the country offices in order to coordinate evaluation activity and ensure adequate coverage of key issues. The plan should be updated annually. UNICEF needs some means of frequently updating country office awareness of what evaluations are being started and completed in other country offices. We suggest that the full text of completed evaluations from the current year and two previous years be accessible on the Internet in Adobe format, and keyworded for easy search and access. We believe that, to facilitate openness and to give an incentive to improve quality, all UNICEF evaluations that cost more than, say, $10,000 should be available to the public on the open website.
- UNICEF should require that an electronic copy of all Terms of Reference be submitted to the Regional Office and NYHQ before an evaluation is contracted and an electronic copy of the evaluation report afterwards.
- To assure that minimum quality standards are met, UNICEF should consider instituting more requirements for management, evaluation, and sector expert sign-offs, especially at the evaluation research design stage. Sign-offs should be acknowledged in the final report.