Welcome to our Systematic Review of Outcomes Contracts Collaboration (SyROCCo) Machine Learning tool. This tool allows users to navigate and explore data extracted from nearly 2000 academic and grey literature publications related to outcomes-based contracting. It is based on the broadest and deepest evidence review undertaken in the field of outcomes contracting to date. SyROCCo is the product of a collaboration between the Government Outcomes Lab and machine learning experts from the University of Warwick.
The tool is currently a beta version and we would welcome any feedback.
How to cite this work: Fang, Z., Arana-Catania, M., van Lier, F.-A., Outes Velarde, J., Bregazzi, H., Airoldi, M., Carter, E. & Procter, R. (2024). SyROCCo: enhancing systematic reviews using machine learning. Data & Policy, 6, e39. DOI: 10.1017/dap.2024.33
The dataset behind SyROCCo can be found here.
This tool applies a default threshold of two mentions for countries. That is, the tool treats a paper as discussing a country only if the paper mentions that country at least twice. A country mentioned only once is not picked up by the machine learning algorithm. We designed the algorithm this way to narrow the list of papers and direct readers to the papers in which the country of interest is mentioned most often. Users can change the default threshold with this slider.
Users can adjust the similarity threshold using this slider.
Many practitioners and researchers are interested in the development of outcomes-based projects in particular countries. If users hover over a country, they will see how many papers have mentioned it. After double-clicking on a country, users will find a summary of all papers mentioning that country at least twice (the default threshold). Users can adjust this threshold with the 'Country threshold' slider.
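The threshold rule described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the tool: the function name `countries_discussed`, the input format (a flat list of country mentions extracted from one paper) and the sample data are all assumptions made for the example.

```python
from collections import Counter


def countries_discussed(mentions, threshold=2):
    """Return the countries mentioned at least `threshold` times in one paper.

    `mentions` is a hypothetical list with one entry per country mention.
    """
    counts = Counter(mentions)
    return sorted(country for country, n in counts.items() if n >= threshold)


# Hypothetical mentions extracted from a single paper.
mentions = ["Kenya", "India", "Kenya", "Peru", "India", "India"]

print(countries_discussed(mentions))               # default threshold of 2
print(countries_discussed(mentions, threshold=3))  # stricter threshold
```

With the default threshold, the single mention of Peru is ignored. Raising the threshold narrows the list further, which is the effect of moving the slider to the right.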
There are different ways to visualise the underlying data of this tool. Users can explore a set of bar charts that present the distribution of papers according to the year of publication, the policy sector and the income level of the countries that they mention. Users can click on the bars to see more details.
The similarity graph shows the semantic similarity between different articles. Each blue dot in this network map represents a paper or an article discussing outcomes-based contracts. If a blue dot is connected to other dots, it means that the machine learning algorithm considers those articles to be similar. If a paper is not similar to any other paper, it is not displayed in this visualisation. Users can click on the nodes to view the details of the corresponding articles.
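The way such a graph can be built from a similarity threshold is sketched below. This is a hedged illustration only: it assumes each paper is represented by a numeric vector and that similarity is measured with cosine similarity, neither of which is stated here. The function names, the toy vectors and the threshold value are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_edges(vectors, threshold):
    """Connect every pair of papers whose similarity reaches the threshold."""
    return [
        (a, b)
        for (a, u), (b, v) in combinations(vectors.items(), 2)
        if cosine(u, v) >= threshold
    ]


def displayed_nodes(edges):
    """Papers with at least one connection; isolated papers are not shown."""
    return sorted({paper for edge in edges for paper in edge})


# Toy vectors standing in for paper representations (assumed, not real data).
vectors = {
    "paper_a": [1.0, 0.0],
    "paper_b": [0.9, 0.1],
    "paper_c": [0.0, 1.0],
}

edges = similarity_edges(vectors, threshold=0.8)
print(edges)
print(displayed_nodes(edges))
```

In this toy example only `paper_a` and `paper_b` are connected, so `paper_c` would not appear in the visualisation. Lowering the threshold adds more connections and more dots.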
This bar chart shows the distribution of papers according to the year of publication.
This bar chart shows the distribution of papers according to their policy sector. Papers that are not about a particular policy sector were labelled as 'uncategorised'.
Most papers discuss outcomes-based contracts in different countries or regions. The machine learning algorithm identifies the countries mentioned and classifies them according to the World Bank Country Income Level Classification 2021-2022. This bar chart shows the distribution of papers according to the income level of the countries that they mention.
The Government Outcomes Lab team provided a list of keywords for the policy sectors that we usually use in our INDIGO datasets: employment and training, child and family welfare, health, education, homelessness, criminal justice, agriculture and environment, and poverty reduction. Using those keywords as a reference, a machine learning algorithm calculated the probability of each paper belonging to a particular policy sector category. The Government Outcomes Lab team then manually checked the list of papers and their probabilities to understand which papers the machine had classified correctly and which it had not, and identified the probability thresholds below which papers were no longer assigned to the correct category. All articles with a probability higher than the threshold were then labelled with the corresponding policy sector. Please note that the policy sector classification depends on both the machine learning algorithm and the Government Outcomes Lab team's decisions on thresholds. It is therefore possible that the tool does not allocate every paper to the correct policy sector.
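The thresholding step can be sketched as follows. This is an illustration under stated assumptions rather than the tool's actual code: the function name `assign_sectors`, the per-sector threshold values and the example probabilities are invented, and the sketch assumes that a paper may receive more than one sector label. The 'uncategorised' label is the one the tool uses for papers without a particular policy sector.

```python
def assign_sectors(probabilities, thresholds):
    """Label a paper with every sector whose probability exceeds its threshold.

    Both arguments are hypothetical dicts keyed by policy sector name.
    Papers that clear no threshold are labelled 'uncategorised'.
    """
    assigned = [
        sector
        for sector, probability in probabilities.items()
        if sector in thresholds and probability > thresholds[sector]
    ]
    return assigned if assigned else ["uncategorised"]


# Invented thresholds, standing in for those chosen after the manual check.
thresholds = {"health": 0.6, "education": 0.7}

# Invented sector probabilities for two papers.
paper_a = {"health": 0.82, "education": 0.35}
paper_b = {"health": 0.20, "education": 0.10}

print(assign_sectors(paper_a, thresholds))
print(assign_sectors(paper_b, thresholds))
```

Here the first paper is labelled with health only, while the second clears no threshold and falls into the 'uncategorised' group.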
A machine learning algorithm reads the papers and identifies geographic entities, disregarding irrelevant parts of the papers such as acknowledgments and bibliographies. Geographic entities include continents, countries, provinces, regions, etc. Under the summary results, the tool shows the five most frequently mentioned geographic entities, ordered from the most mentioned to the least mentioned.
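The "five most frequently mentioned" ranking can be sketched with Python's standard library. This is a minimal illustration that assumes the entity mentions have already been extracted into a flat list. The function name `top_entities` and the sample mentions are hypothetical, and the entity recognition step itself is not shown.

```python
from collections import Counter


def top_entities(mentions, n=5):
    """Return the n most frequently mentioned entities with their counts,
    ordered from the most mentioned to the least mentioned."""
    return Counter(mentions).most_common(n)


# Hypothetical geographic mentions extracted from one paper.
mentions = [
    "United Kingdom", "Europe", "United Kingdom", "London", "Africa",
    "Europe", "United Kingdom", "Kenya", "Asia",
]

for entity, count in top_entities(mentions):
    print(f"{entity}: {count}")
```

The same counting approach would apply to any of the other entity types listed in the summary results.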
A machine learning algorithm reads the papers and identifies legal entities, disregarding irrelevant parts of the papers such as acknowledgments and bibliographies. Legal entities include laws, acts, decrees, regulations, etc. Under the summary results, the tool shows the five most frequently mentioned legal entities, ordered from the most mentioned to the least mentioned.
A machine learning algorithm reads the papers and identifies organisational entities, disregarding irrelevant parts of the papers such as acknowledgments and bibliographies. Organisational entities include any type of organisation that works with outcomes-based contracts. These organisations can be outcome payers, commissioners, service providers, investors, evaluators, advisers, etc. Under the summary results, the tool shows the five most frequently mentioned organisational entities, ordered from the most mentioned to the least mentioned.
A machine learning algorithm reads the papers and identifies different types of financial mechanisms. Under the summary results, the tool shows the five most frequently mentioned financial mechanisms, ordered from the most mentioned to the least mentioned.
A machine learning algorithm reads the papers and identifies key terms related to the Sustainable Development Goals (SDGs). Under the summary results, the tool shows the five most frequently mentioned SDG terms, ordered from the most mentioned to the least mentioned.
As part of our INDIGO initiative, the Government Outcomes Lab hosts an open and collaborative Impact Bond Dataset. This dataset collects data on impact bond projects, one specific type of outcomes-based contract, from all around the world. Using the list of Impact Bond Dataset projects as a reference, a machine learning algorithm searches the papers for mentions of these projects. Under the summary results, the tool shows the five most frequently mentioned Impact Bond Dataset projects, ordered from the most mentioned to the least mentioned. Every project has a link to the corresponding Impact Bond Dataset record, so users can access the latest evidence and key data points at the same time.
These are the five most frequently mentioned geographic entities. Geographic entities are ordered from the most mentioned entity to the least mentioned.
These are the five most frequently mentioned legal entities. Legal entities are ordered from the most mentioned entity to the least mentioned.
These are the five most frequently mentioned organisational entities. Organisational entities are ordered from the most mentioned entity to the least mentioned.
These are the five most frequently mentioned financial mechanisms. Financial mechanisms are ordered from the most mentioned mechanism to the least mentioned.
These are the five most frequently mentioned SDG terms. SDG terms are ordered from the most mentioned term to the least mentioned.
These are the five most frequently mentioned countries. Countries are ordered from the most mentioned country to the least mentioned.
This is a list of all the impact bond projects that are mentioned in the paper and are also part of our Impact Bond Dataset. If you click here, you will be taken to the Impact Bond Dataset record for this project.