
Evaluation database

Evaluation report

2016 Global: UNICEF GEROS Meta-Analysis 2015

Author: Joseph Barnes, Susie Turrall, Sara Vaca

Executive summary

UNICEF GEROS Meta-Analysis 2015: An independent review of UNICEF evaluation report quality and trends, 2009-2015


This report presents the results of the independent quality review of UNICEF evaluation reports conducted through the Global Evaluation Reports Oversight System (GEROS) during 2015. It synthesizes the results of 90 evaluation reports, reviewed by two independent consultancy teams. It presents findings at the global level and highlights trends across regions, sectors and quality assessment criteria. The report contributes to a wider body of knowledge, joining the similar GEROS meta-analysis reports produced each year since GEROS began in 2010.
GEROS is underpinned by United Nations Evaluation Group (UNEG) norms and standards, the UN System-Wide Action Plan on gender equality (UN SWAP) and other UNICEF-adapted standards, including equity and human rights-based approaches. The system consists of rating evaluation reports commissioned by UNICEF Country Offices, Regional Offices and HQ divisions. All reports and the results of their quality assessments are made available in the UNICEF Global Evaluation and Research Database (ERDB), as well as publicly on the UNICEF external website. GEROS is an organization-wide system.


Quality reviews of evaluations completed in 2015 were carried out over a 14-month period, from January 2015 to February 2016. Two separate teams undertook reviews of 2015 evaluations: Universalia Management Group (for reports submitted to the ERDB from January to December 2015) and ImpactReady LLP (for reports submitted to the ERDB from January to February 2016). Both teams included evaluation experts with a broad range of relevant sectoral knowledge and linguistic capabilities (English, French and Spanish). Reviews were delegated within teams according to fit with thematic and language expertise.
Evaluation quality assessment was carried out for each report using 64 questions and a scale of four ratings: Outstanding, Highly Satisfactory, Mostly Satisfactory, and Unsatisfactory. The meta-analysis was conducted once all of the evaluation reports had been assessed, submitted to the UNICEF Evaluation Office and accepted. Quantitative data on scores for different aspects of the reports were compiled in Excel. Qualitative analysis of reviewer comments was used to explore causal links, and the reviews were searched for examples of good practice in the reports. Particular attention was given to SWAP reporting. Quantitative and qualitative data were triangulated and compared with longitudinal data from the four previous years to map key trends and patterns.
For the purposes of efficiency, the meta-analysis unit of assessment for quality is an evaluation report – a proxy of overall evaluation quality used by many UNEG members and SWAP. Whilst this serves the purpose of assessing quality, it is inevitably not a complete picture of evaluation quality and this limitation must be considered in interpreting the implications of findings.


The purpose of the meta-analysis is to contribute to achieving the three revised (2016) objectives of GEROS (particularly objective 1):
Objective 1: Enabling environment for senior managers and executive board to make informed decisions based on a clear understanding of the quality of evaluation evidence and usefulness of evaluation reports;
Objective 2: Feedback leads to stronger evaluation capacity of UNICEF and partners;
Objective 3: UNICEF and partners are more knowledgeable about what works, where and for whom.


Conclusion 1: The strengths and weaknesses of evaluation reports remain similar to previous years. Elements of the evaluation that are influenced by the ToR (purpose and objectives) are an organisational strength, whilst areas for improvement include improving theories of change, stakeholder participation, and lessons learned.
Conclusion 2: The variation in the proportion of reports rated as meeting UNEG standards is best explained by how diverse the overall evaluation portfolio is. Greater diversity in the types of evaluation being undertaken seems to slow – or even reverse – the rate at which the quality of reports improves over time. The long-term trend, however, remains one of improvement in quality.
Conclusion 3: UNICEF is approaching its UN SWAP commitments with regard to the integration of gender equality, but significant scope remains for enhancing the use of gender analysis in developing findings, conclusions and recommendations. This places UNICEF in a similar position to its comparable sister agencies.
Conclusion 4: Whilst there are recurrent shortfalls in overall report quality, a wide range of evaluative capacities are also evident from examples of high quality reports. However, these capacities seem to be available only in specific regions and strategic plan objective areas – suggesting a strong need for better internal learning and knowledge exchange in UNICEF.


Recommendation 1: Review the strategy for evaluation systems-strengthening to prioritise enhancements to the quality of less-frequent and complex types of decentralised evaluations.
Recommendation 2: Ensure evaluators clearly elaborate comprehensive stakeholder mapping, analysis of human rights roles (e.g. duty bearers and rights holders), and examination of an Object’s theory of change within the evaluation inception report.
Recommendation 3: Review the strategy for internal learning and knowledge sharing – especially for evaluation focal persons – to focus on addressing the persistent performance gaps identified through the GEROS process.
Recommendation 4: In order to meet SWAP standards by 2018, UNICEF needs to prioritise the inclusion of gender-responsive evaluation frameworks, methods and analysis in all evaluations by: 1) increasing awareness among UNICEF staff of the SWAP evaluation performance indicators, 2) specifying gender requirements in all evaluation ToRs, and 3) including assessment of SWAP in all regional evaluation helpdesks.

Full report in PDF




Report information



Management Excellence (Cross-cutting)



