SkillsBank is a website that provides easy access to rigorous evidence on how to promote skills at different stages of the life cycle. This tool is expected to be used by policymakers, practitioners, and researchers. SkillsBank presents evidence on how to tackle six policy challenges that entail improving the following outcomes:

Early childhood cognition

Early childhood behavior

Learning in primary school

Enrollment in secondary school

Completion of secondary school

Learning in secondary school
For each policy challenge, such as improving learning in primary school, SkillsBank presents average effects of the main program types. For example, it shows that, based on the available evidence, providing lesson plans to teachers has improved learning by 9 learning points (on average). As a benchmark, an average third grader in the United States raises her scores in math by about 40 learning points in a year.
Moreover, SkillsBank provides key and detailed information about the rigorous evaluations used to estimate this average effect. That is, the website is structured so as to provide not only a quick, overall snapshot into what works, but also detailed information about the existing evaluations.
The information recorded in SkillsBank is the product of systematic reviews of the literature. For example, we performed a systematic review on how to improve learning of math and language in primary schools. Systematic reviews are increasingly recognized as the gold standard for summarizing scientific evidence in a particular area. This recognition stems from the emphasis on using explicit, objective, and standard procedures in systematic reviews.
Each systematic review performed to produce data for SkillsBank consisted of seven steps that are listed in the following table, and explained in more detail further on. For greater clarity, we have illustrated these steps for the systematic review performed on how to improve learning in primary school.
Steps involved for each systematic review
Step                         Description
1. Search papers             Identify papers describing evaluations that could potentially be included in the review
2. Filter evaluations        Keep only those evaluations that meet the inclusion criteria
3. Code variables            Extract and code relevant variables from the evaluations to be included
4. Categorize interventions  Classify interventions into groups called program types
5. Compute effect sizes      Compute effects that are comparable across evaluations
6. Combine effect sizes      Combine effects to produce one effect per evaluation
7. Generate averages         Produce an average effect by running a meta-analysis
Step 1: Search papers
This first step involved searching for studies that could potentially be included in the review. To find these papers, we used different approaches, including extracting references from previous reviews and performing keyword searches in bibliographic databases, such as Google Scholar. For the Learning in Primary review, this step resulted in a list of studies containing information on evaluations of interventions that sought to improve learning in primary schools.
Step 2: Filter evaluations
This step involved determining which of the studies identified in the previous step should be included in the review. To that end, we developed a set of inclusion criteria that listed all the requirements that a study had to meet in order to be included in the review. For example, in the Learning in Primary review, only studies using at least one of the following methodological designs were included: experimental evaluations, regression discontinuity, instrumental variables, and difference-in-differences. The objective of this step was to identify a reasonable number of high-quality studies that, together, could provide a good assessment of the expected effects of each program type.
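As an illustration, the design requirement described above can be sketched as a simple filter. This is a minimal sketch, not the procedure actually used for SkillsBank: the record layout, the field name "designs", and the three candidate studies are hypothetical, and a real set of inclusion criteria would contain further requirements beyond the evaluation design.

```python
# Designs named in the inclusion criteria of the Learning in Primary review.
ACCEPTED_DESIGNS = {
    "experimental",
    "regression discontinuity",
    "instrumental variables",
    "difference-in-differences",
}


def meets_design_criterion(study):
    """Return True if the study uses at least one accepted design."""
    return bool(ACCEPTED_DESIGNS & set(study["designs"]))


# Hypothetical candidate studies (illustrative only, not real evaluations).
candidates = [
    {"id": "study_a", "designs": ["experimental"]},
    {"id": "study_b", "designs": ["cross-sectional comparison"]},
    {"id": "study_c", "designs": ["difference-in-differences", "matching"]},
]

# Keep only the studies that pass the design requirement.
included = [s["id"] for s in candidates if meets_design_criterion(s)]
print(included)
```

In this sketch a study is kept as soon as one of its designs is on the accepted list, which mirrors the "at least one of the following" wording above.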
Step 3: Code variables
This step involved coding relevant variables. Some variables that were constructed contained information about the context of the evaluation (e.g., participants’ age), and about the features and nature of the intervention. Other variables contained methodological information, such as the design used to estimate effects (e.g., whether an experimental evaluation was employed), and measures used to identify the effects (e.g., type of tests used). Finally, there were variables that had information about the documented effects. All variables were coded using standard procedures to ensure comparability across evaluations.
Step 4: Categorize interventions
The fourth step in the review involved classifying interventions into groups called “program types.” The basic goal of this step was to group similar interventions together. In our analysis, we generated categories that were exhaustive and mutually exclusive. That is, all interventions were assigned to one, and only one, program type. In the Learning in Primary review, we defined 20 different program types. For example, one program type included all interventions that entailed providing teachers with lesson plans.
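The "one, and only one, program type" requirement can be checked mechanically. The sketch below is illustrative only: the function name, the intervention identifiers, and the program type labels other than lesson plans are hypothetical, and it is not the tool used to build SkillsBank.

```python
from collections import Counter


def check_classification(intervention_ids, assignments):
    """Check that every intervention has exactly one program type.

    assignments is a list of (intervention_id, program_type) pairs.
    Returns two lists: interventions with no program type (the scheme
    is not exhaustive) and interventions with more than one program
    type (the scheme is not mutually exclusive).
    """
    counts = Counter(intervention_id for intervention_id, _ in assignments)
    unassigned = [i for i in intervention_ids if counts[i] == 0]
    multiple = [i for i in intervention_ids if counts[i] > 1]
    return unassigned, multiple


# Hypothetical interventions and assignments (illustrative only).
interventions = ["int_1", "int_2", "int_3"]
assignments = [
    ("int_1", "lesson plans"),
    ("int_2", "hypothetical type A"),
    ("int_2", "hypothetical type B"),  # assigned twice: violates exclusivity
]

unassigned, multiple = check_classification(interventions, assignments)
print(unassigned, multiple)
```

A classification satisfies the requirement described above only when both returned lists are empty.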
Step 5: Compute effect sizes
This step involved computing effect sizes. An effect size is a quantitative measure of the effect of an intervention on a certain outcome that is comparable across evaluations. The goal is to generate standardized effects that can be used to estimate an average effect. For the Learning in Primary review, we used the “standardized mean difference” as the effect size. Technically, the standardized mean difference is computed by taking the difference between the mean of the outcome in the treatment group and in the control group, and dividing it by the standard deviation of the outcome. The advantage of this procedure is that it allows comparing effects that were measured using different tests. Moreover, expressing effects in this manner is standard practice in the literature.
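The computation described above can be sketched in a few lines. The function and parameter names are hypothetical, and the numbers are invented for illustration. The text does not say which standard deviation is used (for example, pooled or control group), so the sketch takes it as an input rather than assuming one.

```python
def standardized_mean_difference(mean_treatment, mean_control, sd_outcome):
    """Difference in group means divided by the outcome's standard deviation."""
    if sd_outcome <= 0:
        raise ValueError("sd_outcome must be positive")
    return (mean_treatment - mean_control) / sd_outcome


# Hypothetical evaluation A: test scored on a scale with SD 100.
smd_a = standardized_mean_difference(520.0, 500.0, 100.0)

# Hypothetical evaluation B: a different test, scored on a scale with SD 5.
smd_b = standardized_mean_difference(31.0, 30.0, 5.0)

print(smd_a, smd_b)
```

Although the two hypothetical tests use very different scales, both effects come out at about 0.2 standard deviations, which is what makes them comparable across evaluations.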
Step 6: Combine effect sizes
This step involved combining multiple effect sizes reported by one evaluation. For example, an evaluation could report multiple effects corresponding to different outcomes, such as learning in math and reading. Because the Learning in Primary review sought to generate an average effect of each program type on academic achievement, the effects on math and reading were combined. This essentially entailed averaging the different estimates to generate one effect for each evaluation.
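A minimal sketch of this combination step follows, assuming a simple unweighted average across outcomes. The text only says the estimates were averaged, so the equal weighting is an assumption; the function name and the effect values are hypothetical.

```python
from statistics import mean


def combine_effects(effects_by_outcome):
    """Average an evaluation's effect sizes into one effect.

    effects_by_outcome maps an outcome name to its effect size.
    Assumption: every outcome receives the same weight.
    """
    if not effects_by_outcome:
        raise ValueError("at least one effect size is required")
    return mean(effects_by_outcome.values())


# Hypothetical evaluation reporting effects on two outcomes.
evaluation = {"math": 0.10, "reading": 0.20}
combined = combine_effects(evaluation)
print(round(combined, 3))
```

The result is a single effect per evaluation, which is the unit that enters the averaging in the final step.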
Step 7: Generate averages
The final step involved generating an average effect for each program type. Specifically, we ran a random effects meta-regression for each program type, using as the unit of observation each study that measured the effects of interventions in that program type. For example, to compute the average effect of the “lesson plan” program type, we ran a random effects meta-regression involving all evaluations included in the review that measured the effects of this program type.
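To make the idea concrete, here is a sketch of a random effects average using the DerSimonian-Laird estimator of between-study variance. A random effects meta-regression with no covariates reduces to this kind of pooled average. The choice of estimator is an assumption made for illustration (the text above does not name one), and the effect sizes and variances are invented.

```python
def random_effects_average(effects, variances):
    """Random effects pooled estimate (DerSimonian-Laird, an assumption).

    effects: one effect size per evaluation.
    variances: the sampling variance of each effect size.
    Returns (pooled_effect, standard_error, tau_squared).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two effects with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    k = len(effects)

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random effects weights add tau2 to each sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    standard_error = (1.0 / sum_w_star) ** 0.5
    return pooled, standard_error, tau2


# Hypothetical evaluations of one program type (illustrative only).
effects = [0.05, 0.12, 0.20, 0.02]
variances = [0.004, 0.006, 0.010, 0.003]
pooled, se, tau2 = random_effects_average(effects, variances)
print(round(pooled, 3), round(se, 3), round(tau2, 4))
```

Compared with a fixed-effect average, the random effects weights are more even across evaluations when the effects differ by more than sampling error alone would explain, so no single large study dominates the average.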
Want to know more?
For more details, download our technical appendix.
Help us improve SkillsBank
Please contact us at skillsbank@iadb.org to let us know about any evaluations that should be included or of any information that needs to be corrected. We would like to improve SkillsBank and your help would be invaluable. Thanks!