Behind the findings - responding to the Innovation Fund Evaluation
8 Oct 2018, 10:10 a.m.
This piece by the GO Lab responds to the results of the Innovation Fund quantitative evaluation, which were mostly negative (in contrast with the earlier qualitative evaluation). We provide a summary of the findings, cast a critical eye over the methodology, and ask what we can learn.
A quantitative evaluation of the Innovation Fund (IF) pilot projects has just been published by the Department for Work and Pensions (DWP). The IF was a £30 million programme comprising 10 SIB pilot projects delivered between April 2012 and November 2015. The programme aimed to help disadvantaged young people re-engage in education, training and employment. Here are the findings:
Surveys of 891 project participants aged 14 to 18 were compared with an equal number of non-participants.* After one year, fewer young people who participated in the SIB projects were in education or employment than in the comparator group, though more were in training. The difference was most pronounced for over-16s: just 59% of the SIB group were in any type of education, employment or training, compared with 75% of the matched group.
Administrative data for 14 and 15 year-olds from the National Pupil Database found that the programme helped young people achieve NQF level 1 qualifications. However, up to three years after starting the programme, the proportions achieving NQF levels 2 and 3 were lower than in a comparator group, while persistent absences and exclusions from school were higher among those who participated in the SIB projects.
Social Return on Investment (SROI) analysis found the IF pilots did not achieve value for money when understood in relation to the programme’s (negative) impact. As the evaluation noted, the findings imply that many outcomes would have been achieved without the programme.
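The logic behind that SROI conclusion can be made concrete with a back-of-the-envelope calculation. The figures below (other than the £30 million programme cost, which is stated above) are entirely hypothetical, used only to show how a high level of deadweight drives the ratio below 1:

```python
# Illustrative SROI arithmetic with made-up numbers, not the evaluation's
# actual figures. SROI compares the monetised value of outcomes that are
# attributable to the programme against the money invested.
investment = 30_000_000            # IF programme cost (£30m, from the article)
gross_outcome_value = 25_000_000   # hypothetical monetised value of all outcomes
deadweight = 0.6                   # hypothetical share of outcomes that would
                                   # have been achieved without the programme

# Only the outcomes the programme actually caused count towards the return.
attributable_value = gross_outcome_value * (1 - deadweight)
sroi = attributable_value / investment
print(f"SROI ratio: {sroi:.2f}")   # a ratio below 1.0 means the programme
                                   # returned less social value than it cost
```

The key point is the deadweight term: as the evaluation suggests, if many outcomes would have occurred anyway, the attributable value shrinks and the ratio falls, regardless of how many outcome payments were triggered.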
Reflecting on negative findings
The results of the Innovation Fund pilot projects may appear negative. At first glance it is tempting to see the failures, scan the room for someone to blame, and choose alternative programmes next time. However, we would ask that you take a second glance at the evaluation and use a slightly stronger lens. In our latest evidence report we argue that ‘we must create an environment where policy makers, practitioners and researchers are open and frank about SIB failures as well as successes, and evaluations are published promptly’. This evaluation is not a good news story and the DWP could have buried these findings. They didn’t, and we commend them for that.
So, our response to these findings is to immediately ask ‘what can we learn?’ Why were young people who attended SIB activities less likely to be in education, training or employment? Why didn’t young people achieve more than their NQF 1 qualification? Why were they more likely to be excluded? We must seek to understand what happened and why it may have led to less inspiring findings.
Examining the methodology
When asking such questions, we first need to look at how the evaluation arrived at these results. Looking through this stronger lens, we must acknowledge that evaluations are not perfect. This report openly states its limitations and the need to exercise caution when reading. We draw on three examples to highlight these limitations.
Firstly, the sample and comparator groups may not be fully comparable, which would bias the results. Participants were matched with non-participants using a method known as propensity score matching (PSM). However, not all characteristics and circumstances could be observed in both groups, so the matching may be imperfect.
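To see why unobserved characteristics matter, it helps to look at what PSM actually does. The sketch below is a minimal illustration on synthetic data, not the DWP's actual procedure; the covariates, sample sizes and matching rule are all assumptions for demonstration:

```python
# Minimal sketch of propensity score matching (PSM) on synthetic data.
# Not the DWP's method; covariates and group sizes are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic covariates (e.g. age, prior attainment) for 200 young people;
# the first 100 are programme participants, the rest non-participants.
X = rng.normal(size=(200, 2))
treated = np.zeros(200, dtype=bool)
treated[:100] = True

# Step 1: model the probability of participating, given observed covariates.
model = LogisticRegression().fit(X, treated)
scores = model.predict_proba(X)[:, 1]  # propensity scores

# Step 2: pair each participant with the non-participant whose propensity
# score is closest (nearest-neighbour matching).
control_idx = np.where(~treated)[0]
matches = {}
for i in np.where(treated)[0]:
    j = control_idx[np.argmin(np.abs(scores[control_idx] - scores[i]))]
    matches[i] = j

# The limitation the evaluation flags: the model can only balance the
# characteristics it observes. Unmeasured differences (motivation, family
# circumstances, etc.) are not matched and can still bias the comparison.
```

Matched pairs are only as good as the covariates fed into step 1, which is exactly the caveat the evaluation raises about its comparator group.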
Secondly, the survey study took place after the first year of the pilot projects. Not only may these observations have been premature, especially for young people with more complex needs, but they also covered only participants who enrolled within the first six months, perhaps before all SIB projects had ‘bedded-in’. Having said this, similar results were shown by the part of the evaluation which was based on administrative data from 14/15-year-old participants and matched comparators, which covered all years of the programme and followed them for up to three years.
Thirdly, it is unclear whether the comparison group were involved in other educational programmes. If they were offered more support, this could mask the benefits for the participants.
So, when reflecting on the quantitative evaluation of the IF programme we know there are negative findings, but the evaluation itself is not watertight. Results may appear skewed and unrepresentative. This is brought into stark perspective when we take a sideways glance at the results of the qualitative evaluation. These results suggest the IF programme was a resounding success. Griffiths et al. state "Project deliverers, investors and intermediaries perceived the pilots to have been a great success, with targeted numbers of outcomes met or exceeded and investments largely repaid to social investors". It was also found that some projects actually reached the ‘cap’ for the maximum number of outcomes payments that they could claim within their contracts.
Whilst positive comments and counting the number of outcomes are not the same as showing impact, the results don’t seem to match up. We are left with a paradox and a need to gain a stronger grip on the findings. In order to do so we must look at qualitative and quantitative insights in relation to one another to understand the programme as a whole. We need to take a closer look at the data to learn more about the results, understand how the design of the payment mechanism may have led to expected and unexpected behaviours, and complete better evaluations. It would be interesting, for instance, to see whether providers over-recruited participants to achieve easier outcomes and hit maximum payment. Now that all the projects have been completed, it would be useful to have data about those involved and the outcomes that were paid for.
Acknowledging that we might hit some rocks along the way, we want to dig a little deeper. Evaluations are not there to reveal successes and bury failures – they are there to help us learn. We need to be transparent about weaknesses and not see them as a sign of failure as it would be a much greater failure not to learn from our mistakes. For if we do make mistakes, let’s make them original.
*The survey analysis drew on administrative data as well as the survey responses. The data came from the National Pupil Database and the individual learning records for people in further education. For clarity, the findings were divided into three parts.