IDEAL Coding Protocol
Comprehensive coding guidelines and protocols for IDEAL data extraction
The Survey
To operationalize data extraction for the minimum set of fields in the meta-data schema, the IDEAL team held a series of working group meetings and developed a set of survey fields that capture the relevant information from each individual paper.
Data extraction is currently conducted by human coders on a survey mask developed using Open Data Kit (ODK) tools in SurveyCTO. The initial dataset coded and checked by humans will serve as the ground truth data for future automated data extraction tools that would be integrated into IDEAL. To request a demo for the survey, please contact us.
Staged Data Extraction Workflow. IDEAL employs a three-stage data extraction workflow designed to manage complexity, ensure quality, and create logical dependencies between different types of information:
- Stage 1 extracts the structural characteristics of experiments.
- Stage 2 systematically locates treatment effects by matching outcomes, arm comparisons, and specifications identified in Stage 1 with the actual results reported in each exhibit.
- Stage 3 collects comprehensive details about experiments, interventions, outcomes, samples, and the identified treatment effects.
Quality checkpoints between stages ensure that only validated information flows forward, with supervisor review after Stages 1 and 2, and double-coding employed in Stage 3 where the bulk of detailed information is extracted.
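The gating between stages can be pictured with a small sketch. The stage descriptions and checkpoints follow the text above; the `Stage` class, the `WORKFLOW` list, and the `can_advance` helper are hypothetical names introduced only for illustration and are not part of the IDEAL tooling.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    number: int
    description: str
    checkpoint: str  # quality control applied to this stage


# Hypothetical representation of the three-stage workflow described above.
WORKFLOW = [
    Stage(1, "structural characteristics of experiments", "supervisor review"),
    Stage(2, "locating treatment effects in each exhibit", "supervisor review"),
    Stage(3, "comprehensive details and treatment effects", "double-coding"),
]


def can_advance(stage: Stage, checkpoint_passed: bool) -> bool:
    """Information flows to the next stage only after the checkpoint passes.

    The last stage has no successor, so it never advances.
    """
    return checkpoint_passed and stage.number < len(WORKFLOW)
```

In this sketch a Stage 1 entry that has not passed supervisor review cannot feed Stage 2, which mirrors the rule that only validated information flows forward.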
Stage 1: Set-Up, Fields & Table-by-table
Stage 1 Set-Up
Coder name
CODING INSTRUCTIONS
EXAMPLES
See the [section] used in the paper to extract
Implementer type
Coding Instructions
- Select the types of all the entities that implemented the experiment.
- Please select all that apply. If there are both government and an NGO involved, choose both options.
- NGOs include both non-profit and for-profit non-governmental organizations that are self-managed.
- If a government contracts a private firm within the public sector management system, the implementer should still be considered as "government".
Examples
Example 1 - Chong et al., 2015:
The implementer for the experiment was "Innovations for Poverty Action".
Example 2 - Özler et al., 2018:
"Under PECD, the Government implemented the following interventions – in partnership with Save the Children and UNICEF" (p.4).
Paper ID
CODING INSTRUCTIONS
EXAMPLES
See the [section] used in the paper to extract
Paper title
CODING INSTRUCTIONS
EXAMPLES
See the [section] used in the paper to extract
Paper correction
CODING INSTRUCTIONS
EXAMPLES
See the [section] used in the paper to extract
Multi-site study entry
CODING INSTRUCTIONS
EXAMPLES
Barrera-Osorio et al. 2011: Report effects from two different experiments in Bogota in different parts of the city (San Cristobal and Suba). We would like you to code the experiments in San Cristobal and Suba separately. Check this box only if you have already submitted the entry for San Cristobal and are now going to code the Suba experiment.
Request for review: fields
CODING INSTRUCTIONS
EXAMPLES
Request for review: detail
CODING INSTRUCTIONS
EXAMPLES
Example: If a coder selects the "Estimand is full sample ITT and LATE/TOT" field in the request-for-review section, they could add an explanation about their uncertainty in this field.
Stage 1 Fields
Number of experiments in the study (expNum)
CODING INSTRUCTIONS
EXAMPLES
Barrera-Osorio et al. 2011: Report effects from two different experiments in Bogota in different parts of the city (San Cristobal and Suba). We know that there are two different experiments because the paper declares "As required by the SED, the assessment of the treatments was divided into two separate experiments located in two very similar localities in Bogota, San Cristobal, and Suba." The paper also reports that eligible populations for the tested interventions are different across the two sites: "Eligible registrants in San Cristobal, ranging from grade 6–11..."; and "The tertiary treatment was evaluated separately in an experiment in Suba, where students ranging from grade nine through eleven..."
Jeong et al. 2023: Evaluate the differences in how question modules in a survey are ordered in order to examine the effects of survey fatigue.
De Martino et al. 2015: Conduct a lab-in-the-field experiment on landholders and annual payment offers for environmental services. The experiment conducted was hypothetical.
Number of experiments check (expNumcheck)
CODING INSTRUCTIONS
EXAMPLES
Barrera-Osorio et al. 2011: Report effects from two different experiments in Bogota in different parts of the city (San Cristobal and Suba).
Jeong et al. 2023: Evaluate the differences in how question modules in a survey are ordered in order to examine the effects of survey fatigue.
Country (country)
CODING INSTRUCTIONS
EXAMPLES
Chong et al. 2015: The intervention takes place in Mexico, so the coder would select the ISO code and country name for Mexico. [see: Abstract, Introduction, Experimental Design and Implementation sections]
Lyall et al. 2020: The intervention takes place in Afghanistan, so the coder would choose the ISO code and country name for Afghanistan. [see: Abstract, Introduction]
Muralidharan et al. 2021: The intervention takes place in Telangana, which is a state in India, so the coder would select the ISO code for India. [see: Abstract, Introduction, Setting and Intervention, and Research Methods sections]
Adida et al. 2020: The intervention takes place in Benin, so the coder would select the ISO code for Benin. [See: Abstract]
Sub-national location (subnationalLocation)
CODING INSTRUCTIONS
EXAMPLES
Barrera-Osorio et al. 2011: Report effects from two different experiments in Bogota in different parts of the city (San Cristobal and Suba). If a coder is entering the information for the experiment in San Cristobal, they should only enter "San Cristobal".
Gaikwad and Nellis (2021): Report effects from the same experiment in two cities in India, Delhi and Lucknow. In this case, the coder should enter "Delhi and Lucknow."
Intervention assignment strategy (intAssign)
CODING INSTRUCTIONS
EXAMPLES
In Barrera-Osorio et al, 2022, authors evaluate the performance-based reward program by randomizing primary schools into three distinct groups -- recognition, in-kind performance reward, and control [see: Sample and experimental design].
In Andrew et al, 2018 researchers randomized towns into four groups. The first received the psychosocial stimulation only (PS), the second received multiple micronutrient supplementation only (MN), the third received both interventions (PS and MN), and the fourth received neither (Control). The response to this field is factorial design because one arm receives the combination intervention PS + MN.
In Lopez et al, 2022 authors vary the days on which doctors received a doctor-specific intervention and the days on which patients received a patient-specific intervention. As doctors (and patients) can cross between the control and treatment groups, the response to this field is crossover design [see: Data collection, figure 2].
In Miguel and Kremer, 2004 authors vary the timing at which three groups of randomly selected schools receive school-based deworming. As the control group crosses over into the treatment group by the end of the study, the response to this field is crossover.
Multi-stage randomization of interventions (intMulti)
CODING INSTRUCTIONS
EXAMPLES
Ichino and Schündeln (2012) used "a two-stage randomized design with blocking". In the first stage, constituencies were randomly assigned into treatment and control. In the second stage, approximately 25% of the electoral areas in each constituency were randomly selected to receive the intervention.
There were two units of randomization: (1) constituency and (2) electoral area.
Dolan et al., 2022: The randomization was conducted at the site level. 87 sites were randomly assigned to three arms. Although some outcomes were measured and analyzed at the student level, there was only one unit of randomization. So, it is a clustered RCT but did not use a 2-stage randomization of interventions.
Gupta et al., 2024: The unit of randomization was the household. Households were randomly assigned to receive the cash transfer intervention at different times, in a cross-over design. The randomization of the timing was conducted at one time and at the household level, so it is a single-stage randomization.
Number of interventions (intNum)
CODING INSTRUCTIONS
EXAMPLES
Barrera-Osorio et al., 2022: Have two distinct interventions. "Out of this sample, 140 schools were randomly assigned to each of the two treatment arms – recognition or in-kind performance rewards – and 140 schools were randomly assigned to the control." For this field, the response would be 2 interventions. [see: Intervention, sample and experimental design]
Ozler et al, 2018: Have four separate arms and four unique intervention components. Out of the four interventions, one was common to all arms though not part of the status quo in child care centers outside the study sample. The response for this field would be 4 interventions: 1. Learning materials and supplies, 2. teacher training and mentoring, 3. teacher incentives, and 4. parenting training. [see: Interventions]
Egger et al., 2022: The two interventions are "cash transfer, high saturation" and "cash transfer, low saturation". The cash transfer provided to the household is the same in both interventions, but in one arm a larger share of households receive the transfer, so the intensity of treatment at the village level is different. This means there are two different (village level) interventions. [see: Figure A1]
Ichino and Schündeln (2012): Used "a two-stage randomized design with blocking". In the first stage, constituencies were randomly assigned into treatment and control. In the second stage, approximately 25% of the electoral areas in each constituency were randomly selected to be visited by registration observers.
There are three interventions in this experiment: (1) treatment at the constituency level; (2) control at the constituency level; and (3) visit by registration observers.
Miguel and Kremer (2004): Have a crossover design, in which the deworming treatment is phased in to different schools in different years: "Group 1 schools received free deworming treatment in both 1998 and 1999, Group 2 schools in 1999, while Group 3 schools began receiving treatment in 2001" (page 165). There are three "interventions" in this study. One is the program intervention, "free deworming treatment". The other two are "timing interventions" specific to the crossover design: treatment in 1998 and treatment in 1999.
Note that "treatment in 2001" happened after the study period, so should not be included in the interventions.
Intervention label (intLabel)
CODING INSTRUCTIONS
EXAMPLES
Barrera-Osorio et al., 2022: Section 2.1 (Performance-based reward program, p2) states "Rewards took the form of either goods (in-kind) or recognition, depending on the treatment arm to which the teacher's school was assigned. The value of the reward was determined on an absolute scale, without relative performance comparisons to other teachers."
Ozler et al, 2018: In this study, section 2.3 (Interventions, p451) describes the interventions. The 4 distinct interventions are – (1) "Provision of play and learning materials (intervention common to all arms)", (2) "Training and mentoring of teachers", (3) "Teacher incentives", and (4) "Parenting training".
Ichino and Schündeln (2012): Used "a two-stage randomized design with blocking". In the first stage, constituencies were randomly assigned into treatment and control. In the second stage, approximately 25% of the electoral areas in each constituency were randomly selected to be visited by registration observers.
There are three interventions in this experiment and the labels are: (1) treatment at the constituency level; (2) control at the constituency level; and (3) visit by registration observers.
The total number of study arms including control (armNum)
CODING INSTRUCTIONS
EXAMPLES
Barrera-Osorio et al., 2022: "Out of this sample, 140 schools were randomly assigned to each of the two treatment arms – recognition or in-kind performance rewards – and 140 schools were randomly assigned to the control." There are 3 study arms in this study: 2 treatment arms and 1 control arm.
Ozler et al, 2018: Has four separate arms, and different subsets of 3 interventions are assigned to the treatment arms. The four arms are: T1. Comparison Group: Provision of play and learning materials; T2. T1 + Training and mentoring of teachers; T3. T2 + Teacher incentives; T4. T2 + Parenting training [see: Interventions].
Miguel and Kremer (2004): Has three different arms: one arm that receives the deworming treatment in 1998 and 1999, one arm that receives the deworming treatment in 1999, and one comparison arm that receives the treatment after the data is collected.
Mapping interventions to arms (armMap)
CODING INSTRUCTIONS
EXAMPLES
Barrera-Osorio et al., 2022: "Out of this sample, 140 schools were randomly assigned to each of the two treatment arms – recognition or in-kind performance rewards – and 140 schools were randomly assigned to the control."
Knauer et al. (2020): Feature a factorial design in which each successive arm receives one or two more interventions than the preceding arm.
Miguel and Kremer (2004): Have a crossover design, in which the deworming treatment is phased in to different schools in different years.
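As an illustration of how the intervention count, the arm count, and the mapping between them relate, the sketch below encodes the Ozler et al., 2018 arms described in the examples above. The dictionary layout, the shortened labels, and the variable names are assumptions made for this example; they are not the survey's actual field format.

```python
# Hypothetical arm-to-intervention mapping for Ozler et al., 2018,
# transcribed from the examples above (labels shortened for illustration).
ARM_MAP = {
    "T1 (comparison)": {"learning materials"},
    "T2": {"learning materials", "teacher training and mentoring"},
    "T3": {
        "learning materials",
        "teacher training and mentoring",
        "teacher incentives",
    },
    "T4": {
        "learning materials",
        "teacher training and mentoring",
        "parenting training",
    },
}

# Number of study arms, including the comparison arm.
arm_num = len(ARM_MAP)

# Distinct interventions across all arms.
interventions = set().union(*ARM_MAP.values())
int_num = len(interventions)

# Interventions delivered in every arm (here, the common component).
common = set.intersection(*ARM_MAP.values())
```

Under these assumptions the mapping yields four arms and four interventions, with learning materials as the component common to all arms, which matches the responses given in the examples.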
Unit of randomization (unitRand)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al, 2018: Is a cluster randomized trial, in which Community-Based Childcare Centers (CBCCs) were randomized into control, where the children received a learning kit, or the three treatment arms in which children also received learning kits and a combination of different interventions. The unit of randomization is the CBCC since that is the level at which any of the treatments were allocated [see: Study design and sample selection]. There is only one unit of randomization in this experiment. In the follow-up field prompting the specific answer, the response would be "Childcare Center".
Guiteras et al, 2015: Is a cluster randomized trial, in which communities were first randomized to receive a community motivation and health information campaign, or an information campaign combined with subsidies for the purchase of hygienic latrines, or a supply-side market access intervention linking villagers with suppliers and providing information on latrine quality and availability, or no interventions. Second, within the subsidy communities, eligible households were randomized to receive subsidy vouchers through household-level lotteries. There are two units of randomization in this experiment: community and household.
Mapping units of randomization to interventions (unitRandMap)
CODING INSTRUCTIONS
EXAMPLES
Leaver et al, 2021: There are 5 interventions. The units of randomization in the study are -- district-subject-family and schools. Here, since there is more than one unit of randomization, we need to map each intervention to its unit of randomization.
Block randomization (block)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al. 2018: A "block randomization" was used to assign childcare centers in each district to the four study arms.
Mapping blocks to interventions (blockunitRand)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al. 2018: A "block randomization" was used to assign childcare centers in each district to the four study arms.
Number of stratification variables (strataNum)
CODING INSTRUCTIONS
EXAMPLES
Berman et al. (2019): The authors note that they stratify by province, share of respondents in the baseline survey that report at least occasional access to electricity, and the share of respondents reporting that the district governor carries the most responsibility for keeping elections fair.
Dupas (2011): The author notes that the randomization procedure is stratified by teacher training status.
Stratification variables (strataLabel)
CODING INSTRUCTIONS
EXAMPLES
Andrew et al, 2018: The response to question 10 for Andrew et al, 2018 is Strata. The paper states: "Randomisation was done at the level of the cluster (town), after stratification by region. Within each of the 3 regions, 8 towns were randomly allocated to each of the 3 treatment groups and the control group using computer-generated random numbers" [Randomization and masking].
The response to this field would be "region" since that is the variable that makes up the strata as indicated in the paper.
Freeman et al. (2022): Authors use a stratified randomization design so the response to the previous question asking if the intervention was stratified is "Strata". The authors note "a stratified random design at the woreda-level was used to assign an equal number of study kebeles to either the Andilaye intervention or the control group receiving no intervention".
The response to this question is "Woreda", as it is the local term used consistently throughout the paper.
Stratification for study arms (strataSame)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al, 2018: A "block randomization" was used to assign childcare centers in each district to the four study arms. "Centers were grouped based on mean height-for-age (HAZ) and Peabody Picture Vocabulary Test (PPVT - a measure of receptive vocabulary) z-scores, both of which were collected at baseline. The Ministry held a public lottery at each district capital where a representative from each center was asked to draw a colored dot from an envelope to determine that center's treatment status."
The response to this question would be "Yes", as the same set of variables were used for stratification.
Wolf et al. 2019: There are two stages of randomization. In the first stage, the intervention assignment was stratified by district and public/private status of the school. In the second stage, the interventions were assigned within groups created by treatment assignment in the first stage. The stratification variable is an indicator for teacher training/parental awareness assignment.
The response to this question would be "No", since the stratification variables are different across interventions.
Mapping stratification variables to study arms (strataMap)
CODING INSTRUCTIONS
EXAMPLES
Wolf et al, 2018: Has 4 interventions and 5 study arms:
- Teacher training and coaching program
- Parental awareness meetings
- Text messages for teachers
- Picture-based paper flyers or texts for parents
Based on the information in the section "Randomization", in a first stage of randomization, two of the interventions (teacher training and coaching program; and parental awareness meetings) were stratified by district and public/private status of the school. The text messages for teachers and the texts/flyers for parents were assigned in a second stage of randomization. These interventions were stratified by treatment assignment in the previous stage, so the stratification variables for this assignment of interventions to arms would be indicators for being assigned to two of the arms.
So for this field, the stratification mapping for 5 study arms would be:
- Control - district and public/private status of the school
- Teacher training and coaching program - district and public/private status of the school
- Parental awareness meetings - district and public/private status of the school
- Text messages for teachers - indicator for school being part of teacher training and coaching program
- Picture-based paper flyers for parents - indicator for school being part of program with teacher training & coaching and parental awareness meetings
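The mapping above can be written down as a simple lookup, which also makes the answer to the preceding "Stratification for study arms" field easy to check. This is a minimal sketch: the dictionary, the shortened indicator labels, and the `strata_same` helper are hypothetical and only illustrate the logic.

```python
# Hypothetical encoding of the Wolf et al. stratification mapping listed above.
FIRST_STAGE = ("district", "public/private status of the school")

STRATA_MAP = {
    "Control": FIRST_STAGE,
    "Teacher training and coaching program": FIRST_STAGE,
    "Parental awareness meetings": FIRST_STAGE,
    "Text messages for teachers": (
        "indicator: school in teacher training and coaching program",
    ),
    "Picture-based paper flyers for parents": (
        "indicator: school in program with teacher training & coaching "
        "and parental awareness meetings",
    ),
}


def strata_same(strata_map):
    """Return "Yes" if every arm uses the same stratification variables."""
    distinct_sets = set(strata_map.values())
    return "Yes" if len(distinct_sets) == 1 else "No"


answer = strata_same(STRATA_MAP)
```

Because the second-stage arms are stratified by first-stage assignment rather than by district and school type, the helper returns "No" for this study.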
Other randomization methods (randDescrip)
CODING INSTRUCTIONS
EXAMPLES
None provided
Number of units of analysis in the experiment (unitAnaNum)
CODING INSTRUCTIONS
EXAMPLES
Ashraf et al., 2010: Include treatment effects for the full sample in Tables 2, 3, 4 and 5.
The unit of analysis in Table 2 is "Household" for the outcome "Household purchased Clorin (dummy)".
The unit of analysis in Table 3 is also "Household" for the two outcomes: "Water currently treated with Clorin" and "Drinking water contains free Clorin".
The two outcomes from Table 3 are also in Table 4.
The unit of analysis in Table 5 is "Household" for two outcomes: "Bottle exhausted?" and "Use Clorin for non-drinking water purposes".
Therefore, there is only ONE (1) unit of analysis in this experiment.
Unit of analysis variable (unitAnaLabel)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al. 2018: Include treatment effects estimated using various units of analysis. The treatment effects on child assessments and behavioral problems (Tables 3&4) were estimated at the child level, so the unit of analysis of those outcomes is child.
Table 5 includes impacts on parenting quality, for which the unit of analysis is "primary caregiver" (see section 2.4.2, page 453).
For impacts on CBCC outcomes in Table 6, the unit of analysis is Community-Based Childcare Center (CBCC).
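Counting units of analysis amounts to collecting the unit recorded for each exhibit and removing repeats. The sketch below does this for the exhibits mentioned in the Ashraf et al., 2010 and Ozler et al., 2018 examples; the dictionaries and the `distinct_units` helper are illustrative assumptions, and the Ozler entry covers only the four tables discussed above.

```python
# Units of analysis per exhibit, transcribed from the examples above.
ASHRAF_2010 = {
    "Table 2": "Household",
    "Table 3": "Household",
    "Table 4": "Household",
    "Table 5": "Household",
}

# Only the tables discussed in the example; other exhibits are not covered.
OZLER_2018 = {
    "Table 3": "Child",
    "Table 4": "Child",
    "Table 5": "Primary caregiver",
    "Table 6": "Childcare center",
}


def distinct_units(unit_by_exhibit):
    """Return the distinct units of analysis in order of first appearance."""
    return list(dict.fromkeys(unit_by_exhibit.values()))
```

For Ashraf et al. the helper returns a single unit, "Household", consistent with the response of one unit of analysis given in the example.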
Unit of analysis category (unitAnaCV)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al. 2018: Include treatment effects estimated using various units of analysis. The treatment effects on child assessments and behavioral problems (Tables 3&4) were estimated at the individual child level, so the unit of analysis of those outcomes is child.
From the pre-specified list of options, a coder would first choose "1. Individual" as the broad category and then choose "Child" as the category.
Table 5 includes impacts on parenting quality, for which the unit of analysis is primary caregiver. Similarly, a coder would first choose "1. Individual" as the broad category and then choose "1.11 Parent" as the unit of analysis category.
For impacts on CBCC outcomes in Table 6, the unit of analysis is Community-Based Childcare Center (CBCC). A coder would first choose "2. Organization or legal entity" and then "2.9 Other organization or legal entity" to type "Childcare center".
Number of exhibits with treatment effects (tableNum)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al, 2018: The paper has a total of 13 tables and 2 figures. Of all the exhibits, 11 tables report treatment effects (i.e. Tables 3 through 13). However, Table 7 reports a robustness check and Table 8 reports quasi-experimental results using treatment assignment as an instrumental variable. These two tables should not be included. Thus, this paper has 9 tables with treatment effects for the full sample.
Leaver et al. 2021: Include 4 figures and 6 tables. Figure 1 and Tables 1 and 2 report experimental design and baseline characteristics. Figures 2, 3 and 4, and Tables 3, 4 and 5 include treatment effects for the full evaluation sample. Table 6 reports quasi-experimental results. Therefore, there are 6 tables or figures with full sample treatment effects in the paper.
Riley 2024: Has 6 tables and 2 figures. Figures 1 and 2 present take-up and balance. Table 2 only reports heterogeneous effects, and the remaining 5 tables include at least one set of treatment effects for the full sample.
Ara et al. 2019: The outcome variables, Median duration of EBF and Median duration of any breastfeeding, should be included in Table 2 because formal p-values are reported in the text.
Kondylis et al. 2016: The authors report treatment effects in Tables 6-10. However, they only report heterogeneous treatment effects by the gender of the farmer. In this case, we would include tables 6-10 for this paper.
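The exhibit count for Ozler et al., 2018 follows from simple set arithmetic: start from the tables that report treatment effects and drop those excluded for other reasons. The sketch below reproduces that count; the variable names are hypothetical and the exclusion reasons are those stated in the example above.

```python
# Tables 3 through 13 report treatment effects in Ozler et al., 2018.
reporting = set(range(3, 14))

# Excluded per the example: Table 7 (robustness check) and
# Table 8 (quasi-experimental, instrumental-variable results).
excluded = {7, 8}

included = sorted(reporting - excluded)
table_num = len(included)

# Labels in the form a coder would enter one at a time.
labels = [f"Table {n}" for n in included]
```

The resulting `table_num` is 9, and `labels` runs from "Table 3" to "Table 13" without Tables 7 and 8.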
Exhibit label (tableLabel)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al, 2018: The labels are: Table 3, Table 4, Table 5, Table 6, Table 9, Table 10, Table 11, Table 12, and Table 13. First enter only "Table 3" in this field and answer the questions about Table 3, then repeat the process for each of the remaining tables.
Leaver et al. 2021: Using the order of appearance in the paper, the labels are: Figure 2, Figure 3, Table 3, Figure 4, Table 4, and Table 5.
Presence of heterogeneous analysis
CODING INSTRUCTIONS
EXAMPLES
Examples to be provided
Number of outcome variables in the experiment (outNum)
CODING INSTRUCTIONS
EXAMPLES
Guiteras et al, 2015: The published manuscript does not show any table in the main paper. Figures 1 & 2 report the treatment effects, but precise statistics such as point estimates and standard errors cannot be obtained directly from the figures. From the notes of Figure 1, "Figure displays the sum of the estimated coefficients and the control group means found in columns (2) and (6) of table S2 and column (2) of table S3. (A) Any latrine access; (B) hygienic latrine access; (C) open defecation among adults", we learn that the estimated coefficients can be found in tables S2 and S3 in the supplementary materials. Figure 1 includes three outcomes: "(A) Any latrine access; (B) hygienic latrine access; (C) open defecation among adults". Figure 2 includes three outcomes: "(A) Any latrine ownership; (B) hygienic latrine ownership; (C) open defecation among adults."
Ashraf et al., 2010: Include treatment effects for the full sample in Tables 2, 3, 4 and 5. The outcome in Table 2 is "Household purchased Clorin (dummy)". There is only one outcome. There are two outcomes in Table 3. They are "Water currently treated with Clorin" and "Drinking water contains free Clorin". The two outcomes from Table 3 are also in Table 4. The title of Table 5 includes "Heterogeneity"; however, the table reports the full sample estimates for two outcomes: "Bottle exhausted?" and "Use Clorin for non-drinking water purposes". Note that only full-sample treatment effects and their outcomes should be included.
Outcome name (outLabel)
CODING INSTRUCTIONS
EXAMPLES
Guiteras et al, 2015: Figures 1 & 2 report the treatment effects of five unique outcomes. The outcome names can be found in the notes below Figure 1 and Figure 2.
Ashraf et al. (2010): The outcome names can be found in Tables 2, 3, 4 and 5.
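Outcomes that appear in more than one exhibit are counted once. The sketch below shows how the six outcome labels in Figures 1 and 2 of Guiteras et al, 2015 reduce to five unique outcomes. The dictionary and the `unique_outcomes` helper are hypothetical, and matching outcomes by exact label is an assumption made for this example.

```python
# Outcome labels per exhibit, transcribed from the figure notes quoted above.
OUTCOMES_BY_FIGURE = {
    "Figure 1": [
        "Any latrine access",
        "hygienic latrine access",
        "open defecation among adults",
    ],
    "Figure 2": [
        "Any latrine ownership",
        "hygienic latrine ownership",
        "open defecation among adults",
    ],
}


def unique_outcomes(by_exhibit):
    """Collect outcome names across exhibits, keeping first appearances only."""
    seen = []
    for names in by_exhibit.values():
        for name in names:
            if name not in seen:
                seen.append(name)
    return seen


out_labels = unique_outcomes(OUTCOMES_BY_FIGURE)
out_num = len(out_labels)
```

"Open defecation among adults" appears in both figures, so `out_num` is 5 rather than 6.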
Outcome Result Type (outType)
CODING INSTRUCTIONS
EXAMPLES
Example 1 - Full-sample results only:
In Table 3 of Ozler et al, 2018, all treatment effects are estimated using the full sample of treatment and control units.
Example 2 - Sub-sample results only:
In Table 4 of Ganimian, Muralidharan, and Walters, 2023, treatment effects are reported separately for male and female students (subgroup results).
Example 3 - Both full-sample and sub-sample results:
In Table 5 of Banerjee et al., 2020, the table reports both full-sample treatment effects and heterogeneous effects by gender (interaction terms).
Outcome unit of analysis (outUnit)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al. 2018: Include treatment effects estimated using various units of analysis. The treatment effects on child assessments and behavioral problems (Tables 3&4) were estimated at the individual child level, so the unit of analysis of those outcomes is child. From the pre-specified list of options, a coder would first choose "1. Individual" as the broad category and then choose "1.12 Other" to enter "Child" as the unit of analysis, because "Child" was not an option in the list. Table 5 includes impacts on parenting quality, for which the unit of analysis is primary caregiver. Similarly, a coder would first choose "1. Individual" as the broad category and then choose "1.12 Other" to enter "Primary caregiver" as the unit of analysis, because that was not an option in the list. For impacts on CBCC outcomes in Table 6, the unit of analysis is childcare center. A coder would first choose "2. Organization or legal entity" and then "2.9 Other organization or legal entity" to type "Childcare center".
Outcome Sub-sample Type (outSubHTE)
CODING INSTRUCTIONS
EXAMPLES
Example 1 - Subgroup estimates:
In Table 4 of Ganimian, Muralidharan, and Walters, 2023, treatment effects are reported in separate columns for male and female students.
Example 2 - Interaction terms:
In Table 5 of Banerjee et al., 2020, heterogeneous treatment effects are reported using interaction terms like "Outcome × Gender" in the same column.
Outcome Sub-sample label (outSubLabel)
CODING INSTRUCTIONS
EXAMPLES
Examples to be provided
Outcome Sub-sample specification (outSubSpec)
CODING INSTRUCTIONS
EXAMPLES
Examples to be provided
Number of rounds of data collection in the experiment (roundNum)
CODING INSTRUCTIONS
EXAMPLES
Freeman et al, 2022: Study collects household surveys and observation-based data at baseline, midline, and endline. In each round, both survey and observation data were collected at the same time [see: Data collection]. Additionally, Figure 2 illuminates the points in time when each of the data collection rounds was conducted.
Pande & Field, 2008: Only use online endline data on loans and repayment in the current paper.
Muralidharan et al. 2021: Use administrative data from three sources: 1. Register of landlords, 2. a record of check distribution maintained by the MAOs, and 3. bank records of check encashment [see B. data]. Table 1 suggests that the register data was collected between September and December 2017. Appendix C indicates that the authors received "the up-to-date MAO and bank-based datasets at three points in time: once in July, once in August and once in September 2018." Therefore, there are seven rounds of data collection in the study.
De Hoyos et al. 2021: Include the following data: i. Student assessments: 2013, 2014, 2015; ii. Student survey: 2013, 2015; iii. Teacher survey: 2013, 2014, 2015; iv. Principal survey: 2014, 2015; v. National assessments: 2016; vi. Internal efficiency: 2013, 2014, 2015, 2016, 2017. Based on data source and time, there are 3 rounds of survey data, as they are conducted at the same time of year, 4 rounds of assessment data (school and national), and 5 rounds of administrative data on internal efficiency.
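When a study draws on several administrative sources, each source-by-time combination counts as a round. The sketch below reproduces the count of seven for Muralidharan et al. 2021 from the details in the example above; the dictionary structure and key names are assumptions made for illustration.

```python
# Collection times per data source, transcribed from the example above.
ROUNDS_BY_SOURCE = {
    "Register of landlords": ["September to December 2017"],
    "MAO check distribution records": [
        "July 2018",
        "August 2018",
        "September 2018",
    ],
    "Bank records of check encashment": [
        "July 2018",
        "August 2018",
        "September 2018",
    ],
}

# One round per source and collection time.
round_num = sum(len(times) for times in ROUNDS_BY_SOURCE.values())
```

The register contributes one round and the MAO and bank datasets contribute three each, giving `round_num` of 7.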
Round name (roundLabel)
CODING INSTRUCTIONS
EXAMPLES
Freeman et al, 2022: Study collects household surveys and observation-based data at baseline, midline, and endline [see: Data collection]. Additionally, Figure 2 illuminates the points in time when each of the data collection rounds was conducted.
Ozler et al, 2018: Use three rounds of data collection [see: Data sources]
Muralidharan et al. 2021: Draw on three administrative data sources and have seven total rounds.
Pande & Field, 2008: Only use online endline data on loans and repayment in the current paper.
De Hoyos et al. 2021: Include the following data: i. Student assessments: 2013, 2014, 2015; ii. Student survey: 2013, 2015; iii. Teacher survey: 2013, 2014, 2015; iv. Principal survey: 2014, 2015; v. National assessments: 2016; vi. Internal efficiency: 2013, 2014, 2015, 2016, 2017. Based on data source and time, there are 3 rounds of survey data, as they are conducted at the same time of year, 4 rounds of assessment data (school and national), and 5 rounds of administrative data on internal efficiency.
Data collection start date
CODING INSTRUCTIONS
EXAMPLES
Freeman et al, 2022: Has three rounds of data collection: Baseline, Midline and Endline. Figure 2 provides the timeline for data collection.
Baseline: March – May, 2017
Midline: March – May, 2018
Endline: March – May, 2019
For Baseline, Year = 2017, Month = March, Day = -99
For Midline, Year = 2018, Month = March, Day = -99
For Endline, Year = 2019, Month = March, Day = -99
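The Freeman et al. example records a start date whose day is not reported by entering -99 for the day. A minimal sketch of that convention follows. The `encode_date` function and the dictionary output are hypothetical; the only detail taken from the example is that an unknown day is entered as -99 while the year and month are entered as reported.

```python
# Placeholder used in the example above when the day is not reported.
MISSING = -99


def encode_date(year, month, day=None):
    """Return the Year/Month/Day entries, using -99 for an unreported day."""
    return {
        "Year": year,
        "Month": month,
        "Day": MISSING if day is None else day,
    }


midline_start = encode_date(2018, "March")
endline_start = encode_date(2019, "March")
```

Here `midline_start` corresponds to the Midline entry above: Year = 2018, Month = March, Day = -99.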
Data collection end date
CODING INSTRUCTIONS
EXAMPLES
Freeman et al, 2022: Has three rounds of data collection: Baseline, Midline and Endline. Figure 2 provides the timeline for data collection.
Baseline: March – May, 2017
Midline: March – May, 2018
Endline: March – May, 2019
The answers to this question would be:
Data collection end date calculated from duration
CODING INSTRUCTIONS
EXAMPLES
Freeman et al, 2022: Has three rounds of data collection: Baseline, Midline and Endline. Figure 2 provides the timeline for data collection.
Baseline: March – May, 2017
Midline: March – May, 2018
Endline: March – May, 2019
The paper reports the start and end time for data collection directly, so the end dates were not calculated based on the duration information.
The answers to this question would be:
Stage 1 Table-by-table
For the set of fields below, the coder goes through each table identified.
Number of comparisons (tableCompNum)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al, 2018 (Table 3):
- We see 3 comparisons in the first panel of rows:
- T2 (teacher training) vs. Control
- T3 (T2 + teacher incentives) vs. Control
- T4 (T2 + parenting training) vs. Control
- We see 1 additional comparison in the second panel of rows that pools all treatment groups into 1 group.
- Any Treatment (T2, T3, or T4) vs. Control
- The fourth panel of rows contains the precision but not treatment effects for additional comparisons:
- T2 vs. T3
- T2 vs. T4
- T3 vs. T4
Mbiti et al. 2019 (Table III): We see five comparisons in Panels A, B, and C: 1. Grants (α1) vs. None, 2. Incentives (α2) vs. None, 3. Combination (α3) vs. None, 4. Combination (α3) vs. Grants+Incentives (α2+α1), 5. Combination (α3) vs. Grants (α1).
Evaluation arm (armEval)
CODING INSTRUCTIONS
EXAMPLES
In Table 3 from Ozler et al, 2018: We see 3 comparisons in the first panel of rows: 1) T2 (teacher training) vs. Control, 2) T3 (T2 + teacher incentives) vs. Control, 3) T4 (T2 + parenting training) vs. Control. We see 1 additional comparison in the second panel of rows that pools all treatment groups into 1 group: 4) Any Treatment (T2, T3, or T4) vs. Control. Thus, there are 4 different evaluation arms here.
In Table III from Mbiti et al. 2019, we see five comparisons in Panels A, B, and C: 1) Grants (α1) vs. None, 2) Incentives (α2) vs. None, 3) Combination (α3) vs. None, 4) Combination (α3) vs. Grants+Incentives (α2+α1), 5) Combination (α3) vs. Grants (α1). There are 5 different evaluation arms here.
Reference arm (armNonEval)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al, 2018 Table 3: The reference arm for this table is the control group. Here, coders should select the arm "Learning kits" that was created when they mapped interventions to arms.
Mbiti et al., 2019 Table III: The five reference arms for this table are: 1. None, 2. None, 3. None, 4. Grants + Incentives [select two interventions], 5. Grants.
Estimand (tableestFull)
CODING INSTRUCTIONS
EXAMPLES
Baysan, 2022: In section D. Implementation, the author notes "... Therefore, I estimate only the intent-to-treat (ITT) effect". There are four exhibits in the paper. Figure 1 and Table 2 report full-sample treatment effects using ITT. Table 1 and Figure 2 only present treatment effects by quartile instead of the full sample, so they are not considered in IDEAL data extraction.
Yoshikawa et al., 2015: The ITT/LATE/TOT/ATET terminology is not used. In the Data Analysis Strategy section, the estimation equation indicates that treatment effects are estimated based on the treatment assignment, i.e. being in a classroom in a FULL UBC prekindergarten. Thus, the estimand is ITT by inference. There are two exhibits in the paper including treatment effects. Table 1 reports the treatment effects for the full sample of teachers, and Table 2 presents the treatment effects for the full sample of children, both using ITT estimand.
Sigh et al., 2018: Do not use the ITT/LATE/TOT/ATET terminology and do not use an equation to describe the estimation. In 2.9 Statistical Analysis, they write "An analysis of the effectiveness of the intervention was based on the randomization of the product the patients were originally assigned using all available case data including patients missing follow-ups and dropouts. This analysis method handles missing data by fitting a statistical model over all available case data without introducing bias.", which implies the estimand is ITT as the analysis estimates treatment effects on patients originally assigned to receive the product. Table 4 reports the treatment effects through differences in means for all patients.
Linhares et al., 2022: Performed both ITT and TOT analyses; however, only TOT results are presented in the paper exhibits (Tables 3 and 4). The ITT results are omitted because the effects are not significant. The ITT estimates cannot be extracted from the paper.
Ganimian, Muralidharan, and Walters, 2023: Present both ITT and TOT results using two samples (i.e. HH assessments and AWC assessments). The authors explain in the Introduction, "Moreover, treatment-on-the-treated effects obtained by scaling the household sample estimates by the share of children observed at the center at the endline are close to the AWC and common sample estimates. We therefore interpret the AWC estimates as reflecting treatment effects on children who actively attended the centers, while the household estimates capture intent-to-treat-style impacts on the set of eligible children, many of whom had limited treatment exposure." In Tables 2, 5 and 6, both ITT and TOT effects on assessment scores are reported.
Banerjee et al., 2020: ITT estimates are reported for all outcomes (Tables 1 - 4). In addition, LATE estimates are also presented for learning outcomes (in Table 4).
Outcomes in table (tableOut)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al, 2018 Table 3: Presents 5 outcomes. These are present in the top row of table 3 under the title "Dependent variable". Since these outcome names were collected previously in the survey, here coders should select:
Freeman et al. 2022 Table 2: Reports treatment effects on 10 outcomes. Each outcome is listed as a row in the Indicator column. The full list of outcomes collected for the study would appear as options, and coders should select:
Number of periods in the table (tableRoundNum)
CODING INSTRUCTIONS
EXAMPLES
Hanna et al., 2016 Table 3: Presents 5 periods used to calculate treatment effects. The first row reports a period that includes all four rounds of midline and endline surveys in estimating the treatment effects. The following rows report the treatment effects in each individual period of midline and endline surveys (one per year since the treatment was administered).
Ozler et al, 2018 Table 3: The title of the table specifies the results are for the 18-month follow up. Only one round of data collection was used in this table.
Rounds of data collection in the table (tableRound)
CODING INSTRUCTIONS
EXAMPLES
Ozler et al, 2018 Table 3: The title of the table specifies the results are for the 18-month follow up.
Hanna et al., 2016 Table 3: Presents treatment effects for individual rounds of data collection. For the individual rounds, coders should select the corresponding round names (e.g., 0-12 month survey, 13-24 month survey, etc.). For the pooled round, coders should select all four of the individual rounds.
Number of empirical specifications in the table (tableSpecNum)
CODING INSTRUCTIONS
EXAMPLES
Riley, 2024: The Empirical Strategy section describes the specification in equation (1): strata dummies and the baseline value of the outcome (if measured at baseline, otherwise excluded) are included in the estimation of treatment effects. In Table 1, where full sample results are reported, the notes indicate "All regressions include strata dummies and include the baseline value of the outcomes."
Ashraf et al., 2010: According to the table notes, two specifications are used for each outcome in Table 2: with and without baseline controls (including baseline Clorin usage and water chlorination, general health behaviors and attitudes, household demographics, and locality fixed effects).
Grossman and Baldassarri, 2012: Report treatment effects using four sets of specifications in Table 2: no controls, individual controls, monitor profile (post-treatment control), and both individual controls and monitor profile. There are 4 specifications in total.
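When a table reports every combination of optional control sets, as in the Grossman and Baldassarri example, the specifications can be enumerated mechanically. The sketch below assumes exactly that situation; the function name `enumerate_specifications` is hypothetical, and many tables report only some combinations, in which case the count should come from the table itself.

```python
from itertools import combinations


def enumerate_specifications(control_sets):
    """List every combination of optional control sets, from none to all.

    Each specification is a tuple of the control sets it includes;
    the empty tuple is the specification with no controls.
    """
    specs = []
    for size in range(len(control_sets) + 1):
        specs.extend(combinations(control_sets, size))
    return specs


# Control sets named in the Grossman and Baldassarri, 2012 example.
specs = enumerate_specifications(["individual controls", "monitor profile"])
for spec in specs:
    print(spec if spec else "no controls")
print(len(specs), "specifications")
```

With a single optional control set, as in the Ashraf et al. example, the same function yields two specifications: without and with the baseline controls.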
Stage 2: Set-Up & Fields
Stage 2 Set-Up
Coder name (coder)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
Paper ID (paperID)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
Paper title (X_titleConfirm)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
Stage 2 Fields
This survey goes through each exhibit that you identified in Stage 1 (and that a supervisor verified) as reporting treatment effects. The questions allow us to match and specify the treatment effects that are eligible for data extraction for IDEAL.
Section Notes: Exhibit Information
In Stage 1, you reported that {Exhibit label} includes {Number of comparisons} contrast(s) and {Number of outcome variables in the table} outcome(s). In the next questions you will be asked to: Identify the IDEAL eligible treatment effect for each outcome-contrast pair; Describe the empirical specification and rounds of data collection used to estimate those treatment effects.
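The outcome-contrast pairs for an exhibit are the cross product of its outcomes and its contrasts. A minimal sketch is below; the outcome names are placeholders, the contrast labels are borrowed from the Stage 1 examples only for illustration, and the function name `outcome_contrast_pairs` is hypothetical.

```python
from itertools import product

# Hypothetical Stage 1 answers for one exhibit (labels are placeholders).
contrasts = ["T2 vs. Control", "T3 vs. Control", "T4 vs. Control"]
outcomes = ["outcome_1", "outcome_2"]


def outcome_contrast_pairs(outcomes, contrasts):
    """Return every (outcome, contrast) pair reported in an exhibit."""
    return list(product(outcomes, contrasts))


pairs = outcome_contrast_pairs(outcomes, contrasts)
for outcome, contrast in pairs:
    print(f"{outcome} | {contrast}")
```

Each pair is then a candidate for one eligible treatment effect; whether a given pair actually has an eligible estimate still has to be checked against the exhibit.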
Number of eligible treatment effects (tfx_num)
CODING INSTRUCTIONS
WHAT ESTIMATES TO COUNT
HOW TO COUNT DATA COLLECTION ROUNDS
HOW TO COUNT SPECIFICATIONS
EXAMPLES
see the [section] used in the paper to extract
Outcome – contrast pairs (contrast_col)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
Estimand for outcome – contrast pairs (estimand_col)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
IDEAL preferred empirical specifications for outcome-contrast pairs (specification_col)
CODING INSTRUCTIONS
EXAMPLES
Summaries from Riley (2024), Barrera-Osorio (2011), Grossman & Baldassarri (2012), Li (2022).
Authors prefer the same empirical specifications as IDEAL (preferred_col)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
Rounds of data collection (period_col)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
IDEAL specification for round of data collection (same_period_col)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
Author preferred empirical specifications (author_spec)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
Alternative specification for each period (diff_per_spec)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
Specification for all treatment effects (spec_single)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
Round of data collection used for all treatment effects (spec_single)
CODING INSTRUCTIONS
EXAMPLES
see the [section] used in the paper to extract
Stage 3: Set-Up, Study Details & Estimates
Stage 3 Set-Up
Coder name
CODING INSTRUCTIONS
EXAMPLES
See descriptive examples in paper.
Paper ID
CODING INSTRUCTIONS
EXAMPLES
See descriptive examples in paper.
Paper title
CODING INSTRUCTIONS
EXAMPLES
See descriptive examples in paper.
Request for review: fields
CODING INSTRUCTIONS
EXAMPLES
See descriptive examples in paper.
Request for review: detail
CODING INSTRUCTIONS
EXAMPLES
For example, if a coder selects the "Estimand is full sample ITT and LATE/TOT" field in the request-for-review section, they could add an explanation about their uncertainty in this field.
Stage 3 Module 1: Study Details
Sampling
Section notes to display in the survey:
The questions in this section ask how the unit(s) of randomization were sampled and how the unit(s) of analysis were drawn from the units of randomization. Sampling of units of randomization and units of analysis often involves multiple steps, and we would like you to describe each step of the sampling process.
The best way to approach these questions is to draw the sampling steps as you read the paper and then answer the questions in this survey. Please read the coding instructions carefully and provide the required information in the corresponding table cell. The table below demonstrates how to fill in the table using the coding instructions.
For each unit of randomization and each unit of analysis (that is not a unit of randomization), you will be asked to fill out a table like the one below with information on sampling.
Illustrated sampling questions for unit of randomization in a survey table
| [Unit of randomization] | Any inclusion/exclusion criteria (Yes/No) | Description of inclusion/exclusion criteria | Sampling method (Universe, random, non-random) |
|---|---|---|---|
| Sampling unit 1: | Indicate if there were any inclusion/exclusion criteria applied, when selecting the unit of randomization from sampling unit 1. | If yes, describe the criteria. | After applying the inclusion/exclusion criteria, how were the units of randomization selected from Sampling unit 1? |
| Sampling unit 2: | Fill in the label for the larger unit from which {sampling unit 1} was drawn. | | |
| Sampling unit 3: | Fill in the label for the larger unit from which {sampling unit 2} was drawn. | | |
| Sampling unit 4: | Country or another unit of randomization [You reach the end of this unit of questions if you enter "Country" or "Another unit of randomization" in this cell. Please skip the rest of the rows.] | | |
Sampling units from which the unit of randomization was drawn
Coding Instructions
- Please identify and label each of the larger sampling units from which the unit of randomization (or unit of analysis) was drawn. A sampling unit is defined by a specific unit of inclusion and exclusion sampling criteria, for example, districts with over 1 million households.
- Start from the unit of randomization (or unit of analysis) and move to each successively larger sampling unit until you reach another unit of randomization or the country of the experiment.
- There could be more than one sampling unit within the same unit, for instance, schools that advertised a job and schools that had the job filled are two different sampling units at the school level and should be counted as separate units.
- For sampling units defined by geographic locations or administrative areas, please include the names of the places included/excluded in the label, for example, States (Jalisco, Chiapas, and Hidalgo). If the names are not available, please indicate the total number of the places, e.g. 3 provinces.
- The sampling information might be located in several places in the paper. Please search the experimental design, sampling, and data sections in the paper and appendix carefully to identify the sampling units for units of randomization.
Examples
Example 1 - Alatas et al., 2012:
In Alatas et al., 2012, the unit of randomization is subvillage.
Subvillages were drawn from villages. [See section B: Sample], so the "Sampling unit 1" should be:
Villages were drawn from another larger unit – province, so "Sampling unit 2" should be province. Since province is a geographic unit, the names should also be included in the label, so the answer would be:
The label for the next sampling unit – "Sampling unit 3" - would be the unit where provinces were drawn. The provinces were from the country, so the answer would be:
At country, the sampling units have reached the largest possible unit, and no more sampling units should be entered for this unit of randomization. In total, there are three sampling units for "Subvillages".
In the same paper, there are two units of analysis different from the unit of randomization: household and subvillage head (or using answer from stage 1: individual-political/social leader). Both units were drawn from the subvillages, which were the unit of randomization.
So, for both household and subvillage head, the "Sampling unit 1" would be:
There are no more sampling units to be entered, as the chain has reached a unit of randomization.
Example 2 - Leaver et al., 2021:
Leaver et al, 2021, has two units of randomization: labor market (i.e. district-by-subject-family teaching job market) and school. For this paper, you will see the questions for each unit of randomization separately.
First, "Schools" were drawn from schools with "at least one new post that was filled and assigned to an upper-primary grade" (see Second-Tier Randomization: Experienced Contracts, page 2220), so "Sampling unit 1" would be:
Sampling unit 1 was drawn from "Schools to which REB had allocated the new posts to contracts", which should be "Sampling unit 2":
"Schools to which REB had allocated the new posts to contracts" were sampled from "labor markets", which is another unit of randomization. So, "Sampling unit 3" should be:
The sampling units are complete for "Schools" as it reaches another unit of randomization.
The next group of sampling units are for "Labor markets".
Labor markets were drawn from districts. The paper did not specify the names of the districts. "Sampling unit 1" would be:
The districts were directly sampled from the country, so "Sampling unit 2" would be:
Since country is the largest possible sampling unit, there will be no more sampling units for this unit of randomization.
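The stopping rule in these examples (keep adding larger sampling units until you reach the country or another unit of randomization) can be expressed as a small check. The sketch below is illustrative only: the class `SamplingUnit`, the function `validate_chain`, and the exact label wording are assumptions, with labels paraphrased from the Leaver et al. example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingUnit:
    label: str
    # True when the unit is the country or another unit of randomization,
    # i.e. the point at which no further rows should be filled in.
    terminal: bool = False


def validate_chain(chain):
    """Check that a chain of sampling units ends at a terminal unit.

    Returns the number of sampling units in the chain.
    """
    if not chain:
        raise ValueError("at least one sampling unit is required")
    for position, unit in enumerate(chain, start=1):
        if unit.terminal and position != len(chain):
            raise ValueError(
                f"Sampling unit {position} is terminal; later rows should be skipped"
            )
    if not chain[-1].terminal:
        raise ValueError(
            "chain must end at the country or another unit of randomization"
        )
    return len(chain)


# Chain for "Schools" in the Leaver et al., 2021 example (labels paraphrased).
schools_chain = [
    SamplingUnit("Schools with at least one new post filled and assigned to an upper-primary grade"),
    SamplingUnit("Schools to which REB had allocated the new posts to contracts"),
    SamplingUnit("Labor markets", terminal=True),
]
print(validate_chain(schools_chain))
```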
Any inclusion or exclusion criteria
Coding Instructions
- Yes, if sampling criteria were applied before the unit of randomization or unit of analysis was drawn from a larger sampling unit.
- No, if no sampling criteria were applied before the unit of randomization or unit of analysis was drawn from a larger sampling unit.
Examples
Example - Alatas et al., 2012:
In Alatas et al., 2012, the unit of randomization is subvillage.
Subvillages were drawn from villages. No inclusion or exclusion criteria were mentioned in the sampling process. Therefore, the answer to this field would be:
Villages were drawn from another larger unit – provinces. In Footnote 8, an exclusion criterion was stated: "An additional constraint was applied to the district of Serdang Bedagai because it had particularly large sized subvillages. All villages in this district with average populations above 100 households per subvillage were excluded."
The answer to this field would be:
Description of inclusion/exclusion criteria
Coding Instructions
- Please describe the inclusion or exclusion criteria that were applied when sampling the unit of randomization or unit of analysis from each larger sampling unit.
Examples
Example - Alatas et al., 2012:
In Alatas et al., 2012, the unit of randomization is subvillage.
Subvillages were drawn from villages. No inclusion or exclusion criteria were mentioned in the sampling process.
Villages were drawn from another larger unit – provinces. In Footnote 8, an exclusion criterion was stated: "An additional constraint was applied to the district of Serdang Bedagai because it had particularly large sized subvillages. All villages in this district with average populations above 100 households per subvillage were excluded."
Sampling method
Coding Instructions
- Select the sampling method used to draw the smaller sampling unit from the adjacent larger sampling unit.
- Total universe: All units (individuals, households, organizations, etc.) of a target population are included in the data collection.
- Random: All units (individuals, households, organizations, etc.) of a target population have a non-zero probability of being included in the sample and this probability can be accurately determined.
- Non-random: The selection of units (individuals, households, organizations, etc.) from the target population is not based on random selection. It is not possible to determine the probability of each element being sampled. Some common non-probability sampling methods include convenience sampling (e.g. depending on participants' availability), snowball sampling, random-route sampling, and judgement sampling.
- Unknown: if the sampling method cannot be determined based on the information reported in the paper.
- Other: if none of the above apply.
Examples
Example 1 - Alatas et al., 2012:
In Alatas et al., 2012, the unit of randomization is subvillage.
Subvillages were drawn from villages. As the paper notes "For each village, we obtained a list of the smallest administrative unit within it (a dusun in North Sumatra and Rukun Tetangga (RT) in South Sulawesi and Central Java), and randomly selected one of these subvillages for the experiment" [See Section B: Sample, page 1211].
In the same paper, there are two units of analysis that are different from the unit of randomization: household and subvillage head (Answer from Stage 1: individual-political/social leader). The paper stated that "From this census, we randomly sampled 8 households from each subvillage plus the head of the subvillage".
For household, the response would be:
For subvillage head, the answer would be:
because all the subvillage heads were included in the sample.
Example 2 - Briaux et al. 2020:
In Briaux et al. 2020, eligible households were selected using a random-route sampling method. It is a non-probability sampling method because the probability of selecting a household is unknown, although the "starting points" were selected randomly.
Interventions
Section notes to display in the survey:
In this section, please describe the details of each intervention you specified in Stage 1.
It is possible that information in the paper about the intervention may be spread across different sections of the paper. This information may be located in paper sections such as the introduction or those that discuss research design or experimental design, including the footnotes.
When extracting information verbatim from the paper, please add quotation marks around the words and include the page number of the PDF document (rather than the original journal page number). Use square brackets for any paraphrased text or spelled out acronyms in the quotation.
For example, "The Andilaye intervention focused on three WASH [Water, Sanitation, and Hygiene]-related behavioral themes, informed by formative research: (1) sanitation, (2) personal hygiene, and (3) household environmental sanitation. Within these themes were 11 constituent practices targeted by the intervention; these practices were identified through formative research as ones that could be targeted using demand-side approaches, and were seen as achievable, per stakeholder feedback." (page 6)
Section instructions for data entry mask:
- The number of interventions and their labels collected, and study arms and their labels in Stage 1 need to be preloaded here to create the questions in the section.
- The questions on the details of intervention should be presented on the same page in the survey. [Check the word-limit for open-text field in SurveyCTO].
Intervention description - Detailed
CODING INSTRUCTIONS
EXAMPLES
Example 1 - Technical and vocational education training program:
One intervention in Lyall et al. 2020 is a technical and vocational education training program, and a description of the intervention would be:
Example 2 - Phone-monitoring intervention:
A description of the phone-monitoring intervention in Muralidharan et al. 2021 could be:
Details of intervention: eligibility criteria
CODING INSTRUCTIONS
EXAMPLES
Example 1 - Iron interventions:
There were three sets of eligibility criteria for the two iron interventions in Pasricha et al., 2021:
2. "Children with marked anemia (a hemoglobin level of <8.0 g per deciliter), current febrile illness, severe acute malnutrition, a known inherited red-cell disorder or previous transfusion, or known developmental delay were excluded" (p983).
3. Children in households with iron levels in drinking water exceeding 1 mg per liter were excluded.
Example 2 - Conditional cash transfer:
The eligibility criteria for the conditional cash transfer intervention in Filmer et al. 2023 were described as:
Example 3 - INVEST programs:
In Lyall et al. 2020, both interventions in the INVEST programs targeted "at-risk youth" and "internally displaced persons". The paper noted that the recruitment was done by a consortium of actors but there were no data on "individuals who were deemed ineligible for participation" (p132).
Details of intervention: proprietary name
Coding Instructions
- Please enter any proprietary name of the intervention, such as Head Start, Be a Man, PROGRESA, etc., if applicable.
- If the intervention was part of a larger program with a proprietary name, please note that in the answer using the phrase "Component of [proprietary name]". Write "[proprietary name]" alone only if a treatment arm receives all components of the proprietary intervention.
- Enter "None" if there is no proprietary name.
Examples
Example 1 - Lyall et al., 2020:
The economic assistance intervention in Lyall et al., 2020 was part of a larger program named "INVEST".
Example 2 - Muralidharan and Sundararaman (2015):
The private school voucher intervention in Muralidharan and Sundararaman (2015) was called "The AP Private School Choice project".
Details of intervention: study scale same as the implementation scale
Coding Instructions
- Yes: the scale of the evaluated intervention and the implemented intervention were the same. This is usually the case for researcher implemented programs or pilot programs.
- Select "yes" if the program was only scaled up after the study implementation.
- No: the intervention was implemented at a larger scale than the evaluated treatment group either before the study or during the same time period, for example, when a large-scale intervention took place but only a portion of it was being evaluated for various reasons. This could be the case for government implemented programs at scale, or for existing programs where one roll-out wave of a larger program or a subset of "marginal candidates" are used for randomized evaluation.
Examples
Example 1 - Crost, Felter, and Johnston (2016):
Crost, Felter, and Johnston (2016) evaluated part of a conditional cash transfer program (Pantawid Pamilya) in the Philippines. In 2019, the program was scheduled to begin in 19 municipalities of 8 provinces. Among the 19 municipalities, 8 were randomly selected to be part of the evaluated experiment. The remaining received the intervention as scheduled.
In this case, the scale of the implemented intervention was larger than the treatment group in the experiment.
Example 2 - Mbiti et al. (2019):
In Mbiti et al. (2019), the interventions were implemented at the school level, affecting all students in the focal grades. The study only collected data from a randomly selected group of students and households and included them in the analysis. Specifically, 10 students from each focal grade were sampled and 10 households were selected from each school for data collection. [See III.B.Data]
Therefore, the scale of the implemented intervention was larger than the study sample.
Details of intervention: implementation scale
Coding Instructions
- Describe the scale at which the intervention was implemented, including both the study and non-study participants.
- Please include information on the number of administrative and geographical units that the intervention reached, if any.
Examples
Example 1 - Crost, Felter, and Johnston (2016):
Crost, Felter, and Johnston (2016) evaluated part of a conditional cash transfer program (Pantawid Pamilya) in the Philippines. In 2019, the program was scheduled to begin in 19 municipalities of 8 provinces. Among the 19 municipalities, 8 were randomly selected to be part of the evaluated experiment. The remaining received the intervention as scheduled.
In this case, the scale of the implemented intervention was:
Example 2 - Mbiti et al. (2019):
In Mbiti et al. (2019), the interventions were implemented at the school level, affecting all students in the focal grades. The study only collected data from a randomly selected group of students and households and included them in the analysis. Specifically, 10 students from each focal grade were sampled and 10 households were selected from each school for data collection. [See III.B.Data]
Details of intervention: intensity
Coding Instructions
- Please provide information on the "intensity" of the intervention. This could be the length and frequency of the intervention (e.g. for a training), the amounts given (e.g. for a cash transfer or a subsidy), or the total duration of exposure (e.g. for an ad campaign).
- If there are multiple components in the same intervention that varied in intensity, please describe the intensity of each component.
Examples
Example 1 - Lyall et al., 2020:
Using the same example from Lyall et al. 2020, the TVET intervention has three components: 1) TVET courses, 2) a "soft skills" course and 3) a start-up kit of trade-specific tools.
Example 2 - Cardenas, Evans and Holland (2023):
The "Early Education Program (Programa Educación Inicial or PEI)" in Cardenas, Evans and Holland (2023) was an early childhood education intervention that included 65 group sessions during nine months with each session lasting for about 2 hours.
Details of intervention: reported cost
Coding Instructions
- Please provide information related to the cost of the intervention mentioned in the paper.
- Search for the keywords "cost", "cost-effective*", and "cost-eff*", and review the adjacent context to determine whether any cost information is presented in the paper or in the supplementary materials.
- The information could be a total cost or a cost per beneficiary. Other forms include cost-efficiency metrics (e.g. unit cost, cost per beneficiary) and cost-effectiveness analyses (e.g. benefit-cost ratio, incremental cost-effectiveness ratio).
- If costs are presented in a table or figure, please enter the reported cost for the intervention and include the table or figure number.
- Enter "None" if cost information cannot be found in the paper or in the supplementary materials.
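The keyword search described above can be approximated with a regular expression once the paper text has been extracted. This is a rough sketch, not an IDEAL tool: the pattern, the function name `find_cost_mentions`, the context window, and the sample sentence are all assumptions for illustration.

```python
import re

# Matches "cost", "costs", and hyphenated forms such as "cost-effective"
# or "cost-efficiency"; it deliberately skips words like "costly".
COST_PATTERN = re.compile(r"\bcost(?:s|-eff\w*)?\b", re.IGNORECASE)


def find_cost_mentions(text, window=60):
    """Return (matched term, surrounding context) for each cost keyword hit."""
    mentions = []
    for match in COST_PATTERN.finditer(text):
        start = max(match.start() - window, 0)
        end = min(match.end() + window, len(text))
        mentions.append((match.group(0), text[start:end]))
    return mentions


# Hypothetical sentence standing in for extracted paper text.
sample = (
    "The program was costly to design. Its total cost was modest, "
    "and Section VI discusses cost-effectiveness in detail."
)
hits = [term for term, _ in find_cost_mentions(sample)]
print(hits)
```

The context strings are what a coder would review to decide whether a hit is actual cost information; the search only narrows down where to look.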
Examples
Example 1 - Barrera-Osorio et al. (2022):
Barrera-Osorio et al. (2022) highlight the cost-effectiveness of the program in Part VI: Program Cost-Effectiveness. The cost data for the intervention and details regarding program cost per student are reported in Appendix C.
Additional details of intervention
Coding Instructions
- Please provide any additional information on the intervention that has not been captured in the previous set of questions. This could include for example details on design or development, etc.
- Enter "None" if you think all the relevant information has been recorded.
Examples
Example 1 - Andrew et al., 2018:
In Andrew et al., 2018, the authors mentioned that the intervention was an attempt to implement a Jamaican home-visiting model at scale.
Intervention Start Date
Coding Instructions
- Please enter the day, month and year as they appear in the main text of the paper or its supplementary materials.
- Select "-99" if the information is not reported in the paper. For example, if only year is reported, select "-99" for month and for day.
- This information should be found in the sections on experimental or research design in the main text of the paper. Some papers may also include a timeline of the intervention in a figure or in the supplementary materials.
Examples
Example 1 - Lyall et al., 2020:
For Lyall et al, 2020, the start date is October 2015, which the coder would enter as month and year, selecting "-99" for day. [See: Study Timeline in Supplementary Material].
Example 2 - Chong et al., 2015:
For Chong et al, 2015, the authors write "We randomly assigned voting precincts to a campaign spreading information on corruption and public expenditure conducted one week before the 2009 municipal elections in Mexico." The coder would write "2009" and then select unsure in the follow-up question. [See: Introduction, Experimental Design and Implementation]
Intervention End Date
Coding Instructions
- Please enter the month and year as they appear in the main text of the paper or its supplementary materials.
- Select "-99" if the information is not reported in the paper. For example, if only year is reported, select "-99" for month and for day.
- Please calculate the corresponding end date for the intervention if only the start date and duration are available. For example, if the text reads "the intervention began in June 2013 and went for six months", select the calculated month "December 2013" in this field.
- This information should be found in the sections on experimental or research design in the main text of the paper. Some papers may also include a timeline of the intervention in the main text or the supplementary materials.
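The end-date calculation in the instructions above is plain month arithmetic. The sketch below follows the convention of the example (June 2013 plus six months gives December 2013); the function name `end_month_from_duration` is hypothetical, and a paper that counts the start month as the first month of the duration would imply an end date one month earlier.

```python
def end_month_from_duration(start_year, start_month, duration_months):
    """Add a duration in months to a start month.

    Months are numbered 1-12. Returns a (year, month) tuple.
    """
    if not 1 <= start_month <= 12:
        raise ValueError("start_month must be between 1 and 12")
    if duration_months < 0:
        raise ValueError("duration_months must not be negative")
    index = (start_month - 1) + duration_months  # zero-based month count
    return start_year + index // 12, index % 12 + 1


# "The intervention began in June 2013 and went for six months."
end_year, end_month = end_month_from_duration(2013, 6, 6)
print(end_year, end_month)
```

When an end date is derived this way, the follow-up field on whether the end date was calculated from duration should be answered "Yes".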
Examples
Example 1 - Lyall et al., 2020:
For Lyall et al., 2020, the end date of the intervention is May 2016. [See: Study Timeline in Supplementary Material].
Intervention end date calculated from duration
Coding Instructions
- Yes if the end date is not directly reported in the paper and the date selected in "Intervention End Date" was based on the coder's calculation using the intervention start date and duration reported in the paper.
- No if the end date of the intervention is reported and entered as it is described in the paper.
Examples
Example 1 - Badrinathan 2021:
For Badrinathan 2021, the timeline provided indicates that outcome measures were collected between May 19 and May 23, 2019.
Outcomes
Section notes to display in the survey:
Answer the questions in the section for each outcome variable listed in Stage 1. In IDEAL, an outcome variable is defined by the way it enters the estimation of a treatment effect. For instance, if raw test scores were standardized in the main regression model, the standardized test score would be used to answer the following questions, not the raw scores.
Section instruction for data entry mask:
The number of outcomes and their labels collected in Stage 1 (and verified in the Stage 1 check) need to be preloaded here to create the questions in the section.
Outcome variable definition
Coding Instructions
- Provide a clear definition of the construct that the outcome variable measures. This is the underlying attribute or concept the outcome variable is designed to quantify, for example, mental health. The definition should be understandable for someone who has not read the paper and does not necessarily know what the intervention is.
- The definition should make clear that a higher value of the outcome variable means an increase in the construct being measured. For example, "Behavior" could mean either "Better behavior" or "More behavior problems". The definition entered here should not have this kind of ambiguity and should be explicit about the meaning of an increased outcome value.
- Additionally, if an outcome variable is measured (cumulatively) over some reference period, please include the reference period in the description. Examples are "incidence of diarrhea in the last 24 hours" or "monthly income".
- Much of this information is sometimes omitted in the outcome label presented in the exhibits due to space constraints, but can be found in the text or notes.
- Be as brief as possible. The description does not need to include the statistical properties or the measurement details of the outcome variable.
Examples
Example 1 - Muralidharan et al. 2021:
In Muralidharan et al. 2021, one outcome variable is listed as "Ever encashed" in the tables (from Table 3). The outcome variable measured whether a farmer ever encashed a benefit check during the valid period.
The unit does not need to be specified here because it will be clear from the unit of analysis field.
Example 2 - Barrera-Osorio et al. 2022:
One outcome variable of Table 4 in Barrera-Osorio et al. 2022 is "Total Scores". The test scores were language and math combined for all children aged 5 to 10 measured at the second follow-up.
The target population unit needs to be mentioned in this case as the age restriction may not be obvious from the unit of analysis - child.
Example 3 - Chong et al., 2015:
In Chong et al., 2015, one outcome variable in Table 4 is "Turnout". The table notes indicate that the outcome variable refers to "total number of votes divided by number of registered voters multiplied by 100".
Binary outcome variable
Coding Instructions
- A binary outcome variable is a variable that has only two possible values. Binary variables are often – but not always – categorical variables. Binary variables that describe categories are most often coded as 0 and 1. They may also be called indicator or dummy variables. For example, sex (male/female) or currently attending school (yes/no) are both binary variables and may be coded as, say, 1 for women and 0 for men, or 1 for yes and 0 for no.
- Note this field is about the variable that enters the estimation, not how the input variable may have been originally measured. For example, the average of multiple 0/1 binary variables represents a fraction and is not a binary variable itself, even though the underlying data is binary – take the example of the outcome variable "average school attendance over the school year (in share of school days)", constructed from a series of 0/1 indicators for every school day whether the child was present in the classroom.
- Conversely, if an outcome measure was collected as a non-binary variable but transformed into a binary variable when estimating the treatment effect, it should be considered a binary outcome variable. For example, suppose educational attainment was measured with a multiple choice question (i.e. 1=primary education or less, 2=lower secondary education, 3=upper secondary education, and 4=post-secondary education), but in the estimation, the outcome was transformed into an indicator for "has a post-secondary education". The response to this question should be "No" for the school attendance example, but "Yes" for the education level example.
- Treatment effect estimates for binary outcome variables may use linear probability, probit or logit models.
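The distinction between how a variable was collected and how it enters the estimation can be illustrated with a small sketch. The function `is_binary_01` and all data values are hypothetical, and the check assumes the common 0/1 coding.

```python
def is_binary_01(values):
    """True if every value of the variable entering the estimation is 0 or 1."""
    return set(values) <= {0, 1}

# School attendance: daily 0/1 indicators averaged into a share per child.
daily_presence = {"child_a": [1, 1, 0, 1], "child_b": [1, 0, 0, 1]}
attendance_share = [sum(days) / len(days) for days in daily_presence.values()]
print(attendance_share, is_binary_01(attendance_share))  # [0.75, 0.5] False

# Education level: a 1-4 category recoded into "has a post-secondary education".
education_level = [1, 4, 2, 4, 3]
has_post_secondary = [1 if level == 4 else 0 for level in education_level]
print(has_post_secondary, is_binary_01(has_post_secondary))  # [0, 1, 0, 1, 0] True
```

The attendance share is built from binary inputs but is coded "No", while the recoded education indicator is coded "Yes".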
Examples
Example 1 - Sukhtankar et al., 2022:
In Sukhtankar et al., 2022, the authors measure DIRs, or Domestic Incident Reports, which represent civil complaints of domestic violence. This is the count of DIRs in a given time period at a given police station, so it is not a binary outcome variable.
Example 2 - Cheema et al., 2022:
In Cheema et al., 2022, the measure of women's voter turnout is a variable which is coded as 1 if the respondent voted, and 0 if the respondent did not vote (operationalized by observing the ink from voting day on the respondent's thumb).
Example 3 - Freeman et al., 2022:
In Freeman et al., 2022, "Poor well-being" is a binary variable. Please note that the variable was dichotomized from a continuous well-being score, "with scores below 13 indicating poor well-being".
The answer is "Yes" because the variable entering the estimation is a binary variable.
Binary outcome label
Coding Instructions
- Please give a concise description of what it means when the binary outcome variable takes a value equal to 1.
Examples
Example 1 - Cheema et al., 2022:
In Cheema et al., 2022, the measure of women's voter turnout is a variable which is coded as 1 if the respondent voted, and 0 if the respondent did not vote (operationalized by observing the ink from voting day on the respondent's thumb).
Index outcome variable
Coding Instructions
- Typically, the authors will state if an outcome is an index. Sometimes an index may be called a "score".
- If an outcome is constructed from multiple independently measured variables or indicators that are not in the same domain, assess different dimensions of the same concept, or do not share the same unit, it is an index variable. An index typically does not have a unit.
- For example, household income (say, in Rupees) as the sum of all individual incomes in the household is not an index, but an early childhood development score combining assessments of math and behavioral skills is an index.
- Index aggregation of multiple variables may involve adding values, taking an average, etc.
Examples
Example 1 - Banerjee et al., 2019:
Banerjee et al., 2019, describe the outcome variable "HIV knowledge" explicitly as an index. The indicators that enter the index are also presented. "HIV knowledge measures how aware an individual is of the methods of transmission, the availability of drugs, and the timing of testing for HIV. Higher values of this index correspond to greater awareness."
Example 2 - Barrera-Osorio et al., 2011:
In Barrera-Osorio et al, 2011, one outcome is "monitored school attendance rate" (Table 3). According to the authors, "We collected attendance data during the last quarter of 2005 through direct observation. For this purpose, the team assembled a group of assistants who randomly visited schools and classes. The assistants directly called the roll of all students, and students were marked absent if they were not physically present in the classroom." [C. Data]. This variable is constructed from many separate observations, but it does not combine multiple alternative methods of measuring attendance for the same student, so it is not an index.
Example 3 - Wolf et al., 2019:
In Wolf et al., 2019, "Teacher motivation" is an index outcome although it is not called an index. The "Measures" section mentioned that "Teacher's motivation was measured using five items adapted from Bennell and Akyeampong (2007) as reported in Wolf, Aber et al. (2015)." There are multiple components in the outcome variable, thus it is an index outcome.
Example 4 - Barrera-Osorio et al., 2022:
In Barrera-Osorio et al., 2022, "Total score" is an index outcome consisting of language score and math score. The paper does not explicitly state that the variable is an index. However, the Data section stated that children were tested on language and math, so we can infer that "Total score" is an index.
Description of index outcome
Coding Instructions
- Please provide a description of the components, aggregation method and any other information on how the index was constructed from the underlying set of measures.
Examples
Example 1 - Freeman et al., 2022:
In Freeman et al. 2022 (Table 3), the outcome variable "Water and sanitation insecurity scores: Water – HWISE Scale" is an index outcome. The "Outcome of Interest" section and table notes describe the details of the variable.
Example 2 - Barrera-Osorio et al., 2022:
In Barrera-Osorio et al., 2022, "Total score" is an index outcome consisting of language score and math score. Both components were also outcome variables. The information in the paper on the "Total score" suggests it is simply the sum of the two scores.
Outcome variable measurement tool
Coding Instructions
- In disciplines like education and psychology, outcomes are often measured with standardized tools or measurement methods developed by others, such as the Bayley scale, IDELA, EGRA, Implicit Association Test (IAT), the Big 5 Inventory (BFI), etc. If one is provided, enter the name of the tool used to measure the outcome. Enter the citation for the measure.
- Look for the description in the section describing the data used for the outcomes or results.
- Include any adaptations of the measure or tool, e.g. if a measure used only some items from a longer questionnaire.
- Enter "None" if there is no name or citation for the measure.
Examples
Example 1 - Knauer et al., 2019:
From Knauer et al., 2019: "we assessed caregiver literacy by asking caregivers to read a simple, five-word (second-grade level) sentence in each language adapted from the Early Grade Reading Assessment (EGRA; Gove & Wetterberg, 2011).", "mental health was measured using an adapted version of the Centers for Epidemiological Studies‐Depression scale (CES‐D; Radloff, 1977; scores range 0–60)."
These were obtained from the section "Measures: Caregiver survey". Notice for each tool, there is a citation associated with it.
Mental health: Centers for Epidemiological Studies‐Depression scale CES‐D (Radloff, 1977)
Outcome variable standardization type
Coding Instructions
- Please first determine whether the outcome variable is standardized or not. If standardized, choose the type of standardization.
- An outcome is standardized if it is converted from the original values to a z-score using mean and standard deviation.
- If an outcome variable is standardized using the mean and standard deviation of any group of the study sample, then it is internally standardized, for example, using the control group distribution.
- If an outcome variable is standardized using the distribution of a normative sample outside the study sample, it is externally standardized. For example, anthropometric measures for children under five years of age, such as Weight-for-Height or Arm Circumference, or the Peabody Picture Vocabulary Test (PPVT) are standardized externally using a reference group at "typical" level of development.
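The z-score conversion described above can be sketched as follows. The function `standardize` and all score values are hypothetical. The sketch uses the sample standard deviation, which is an assumption, since papers differ on this detail.

```python
from statistics import mean, stdev

def standardize(values, reference):
    """Convert values to z-scores using the mean and SD of a reference distribution."""
    mu = mean(reference)
    sigma = stdev(reference)  # sample standard deviation of the reference group
    return [(v - mu) / sigma for v in values]

control_scores = [40.0, 50.0, 60.0]  # hypothetical control-group scores
treatment_scores = [55.0, 65.0]      # hypothetical treatment-group scores

# Internal standardization: the reference group is part of the study sample.
z_internal = standardize(treatment_scores, reference=control_scores)
print(z_internal)  # 0.5 and 1.5 SD above the control-group mean
```

External standardization applies the same formula, but the mean and standard deviation come from a normative sample outside the study, such as a published reference distribution.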
Examples
Example 1 - Pickering et al., 2019:
For the outcome variable "Weight for age Z-score" in Table 2 of Pickering et al. (2019), the standardization is external because it uses the WHO child growth reference distribution.
Example 2 - Leaver et al., 2021:
In Leaver et al. (2021), "Student learning" in Table 3 was internally standardized. Table 2 notes suggest that "student learning IRT scores are standardized based on the distribution in the experienced FW [Fixed-wage contract] arm".
Outcome variable unit of measurement
Coding Instructions
- Select the unit of measurement for each outcome variable that entered the estimation of treatment effects.
- Start with the broad category and then choose or specify the actual unit of measurement.
- For the unit of count (quantity), specify the object being measured. For example, for the number of prenatal checks, first select "count (quantity)" and then type "prenatal checks".
- For transformed variables, select unitless or other, and then specify the transformation method (for example, log or inverse hyperbolic sine) and the underlying unit.
Examples
Example 1 - Haushofer and Shapiro (2016):
In Haushofer and Shapiro (2016), the unit of measurement of "Value of nonland assets (US$)" (in Table VI) is currency. According to the table notes, the currency was US$, PPP in 2012.
Example 2 - Pickering et al. (2019):
For the outcome variable "Weight for age Z-score" in Table 2, the unit of measurement is standard deviation because Z-score is measured in standard deviations.
Example 3 - Pickering et al. (2019):
For "Detectable total Cl (proportion)" in Table 3, the unit of measurement is fraction as it was measured in proportion.
Example 4 - Pickering et al. (2019):
For "E coli log (cfu/100 mL)" (Table 3), the unit of measurement is "other".
Additional details of the outcome variable
Coding Instructions
- Include any additional information about the outcome variable that is found in the paper and not recorded elsewhere, including its construction or processing (such as winsorizing or imputation of missing values), validation and quality control (such as double entry or back checks), measurement (such as exact procedures for an educational test conducted), etc.
Examples
Example 1 - Haushofer and Shapiro (2016):
In Haushofer and Shapiro (2016), the "Value of nonland assets (US$)" (in Table VI) variable was "top-coded for the highest 1% of observations". This detail was not covered by any of the previous fields. The coder should include it in this field.
Randomization
Number of randomization units in study arm
Coding Instructions
- For each study arm, provide the number of randomization units assigned to this arm as reported by the paper.
- If there is more than one unit of randomization, please enter the assigned units for each of them and separate them with commas, following the order of the displayed units of randomization in the hint. For example, if the units of randomization are districts and villages (as in the hint), please enter {number of districts in the arm, number of villages in the arm} in the answer.
- This information may not be reported at the study arm level, or may be missing for some of the study arms. If it cannot be found in the paper, enter "-99" in the corresponding field.
- This information is mostly found in the research design sections of the paper, specifically in the description of the random assignment, which is sometimes included in a separate sub-section of the paper. The information may also be found in participant flow diagrams (e.g. the CONSORT flow diagram) or a table that disaggregates information by treatment arms, such as a balance table, or even a treatment effects table, especially if the randomization unit and the unit of analysis are identical.
Examples
Example 1 - Lyall et al., 2020:
In Lyall et al, 2020, the number of units assigned to each study treatment arm is available in the paper. See Figure 2 in Randomization Section. The coder would input the following for each treatment arm, which was identified in Stage 1:
TVET treatment and UCT control: 312
TVET treatment and Non-UCT Group: 673
TVET control and UCT treatment: 273
TVET control and UCT control: 270
TVET control and Non-UCT Group: 756
Example 2 - Badrinathan 2021:
In Badrinathan 2021, the number of units assigned to each study arm is only available for the control arm (n = 406). The author does note that an equal proportion were assigned to each of the three study arms but does not provide an exact number for the two treatment arms.
Treatment arm 2: -99
Control arm: 406
Example 3 - Garbiras-Diaz and Montenegro 2022:
For the "call to action" intervention in Garbiras-Diaz and Montenegro 2022, the number of units assigned to each arm is presented in Figure 1 (Randomization Design) as follows:
Information Message: 158
Call-to-action Message: 156
Information + Call-to-action Message: 159
Number of randomization units in study
Coding Instructions
- Provide the total of randomization units assigned to all study arms as reported in the study.
- If there is more than one unit of randomization, please enter the total number of units in the study for each of them and separate them with commas, following the order of the displayed units of randomization in the hint. For example, if the units of randomization are districts and villages (as in the hint), please enter {number of districts in the study, number of villages in the study} in the answer.
- This information is mostly found in the research design sections of the paper, specifically in the description of the random assignment, which is sometimes included in a separate sub-section of the paper. The information may also be found in participant flow diagrams (e.g. the CONSORT flow diagram) or a table that disaggregates information by treatment arms, such as a balance table, or even a treatment effects table, especially if the randomization unit and the unit of analysis are identical.
Examples
Example 1 - Badrinathan 2021:
In Badrinathan 2021, the total number of randomization units is 1,224 as noted in Abstract, Introduction, Sample and Timeline, etc. The study only reports the number of randomization units for the control group and the total number of randomization units.
Quality and robustness
Compliance
Coding Instructions
- For each treatment arm, compliance refers to any treatment unit that received the treatment as intended. The unit does not need to have taken the full treatment or have taken up the offered treatment to be considered in compliance.
- Sometimes compliance is not separately reported from take up. Non-compliance should only capture cases where a mistake in the randomization led to the treatment not being offered, or offered to the wrong units. In all other cases, record compliance as "not available."
- For the status quo control group, authors may refer to spillover or treatment contamination and report the share who received a treatment (the non-compliance rate), rather than the share who correctly did not receive the treatment (the compliance rate). Please always report the compliance rate.
- Please enter a fraction in this field, for example, 15/16 indicating that 15 out of 16 randomization units complied with the treatment status, or 49/100 for a compliance rate of 49%.
- Enter "-99" if the information is not mentioned in the paper.
- Enter "-88" if compliance rate cannot be entered as numeric values. Please specify the details.
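How a compliance entry might be read downstream can be sketched as below. This is an illustration under stated assumptions: the function `compliance_rate`, the constant names, and the choice to return `None` for the two special codes are all hypothetical and not part of the survey mask.

```python
from fractions import Fraction

NOT_REPORTED = "-99"  # information not mentioned in the paper
NOT_NUMERIC = "-88"   # rate cannot be entered as a numeric value

def compliance_rate(entry):
    """Return the compliance rate as a float, or None for the -99 / -88 codes."""
    entry = entry.strip()
    if entry in (NOT_REPORTED, NOT_NUMERIC):
        return None
    rate = Fraction(entry)  # accepts strings such as "15/16"
    if not 0 <= rate <= 1:
        raise ValueError(f"compliance must lie between 0 and 1, got {entry}")
    return float(rate)

print(compliance_rate("15/16"))   # 0.9375
print(compliance_rate("49/100"))  # 0.49
print(compliance_rate("-99"))     # None
```

If a paper reports only a non-compliance share for the control group, the compliance rate is its complement, for example `1 - Fraction("3/100")`.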
Examples
Example 1 - Bos et al., 2024:
In Bos et al., 2024, the authors discussed the compliance issue of the treatment arms in [4.2. Receipt and use of program materials by households]. "As per the intervention guidelines, households in the treatment group should have received four materials: a child development card, a household picture book, a nature picture book, and a key message booklet. However, Table 6 shows that due to imperfect compliance, the differential likelihood of receipt of the child development card, household picture book, and nature picture book between treatment and control households was approximately 49 percentage points (instead of 100 under perfect compliance). Furthermore, 2%–3% of households in the control group received these materials."
In this example, "imperfect compliance" is used to describe the fact that only 49% of treatment households received the intervention materials, but that was actually a result of low implementation fidelity rather than non-compliance with the assigned treatment status.
However, the receipt of materials by the control households was a non-compliance issue as they were not assigned to get the intervention (i.e. materials).
Control study arm: -88 (Specify details: "2%–3% of households in the control group received these materials")
Take-up
Coding Instructions
- For each treatment arm, take-up measures the share of treatment units that actually participated in or adopted some portion of the assigned interventions. The treatment units do not need to have participated fully to be considered part of the group.
- Please enter the percentage in this field, for example, 82 if 82% of the treatment units took up the treatment.
- Enter "-99" if the information is not mentioned in the paper.
- Enter "-88" if take-up cannot be entered as numeric values. Please specify the details.
Examples
Example 1 - Brudevold-Newman et al., 2024:
'Just over 61 percent of those assigned to the franchise treatment attended at least one day of business training (which was the first component of the franchise treatment), and 44 percent completed the program and launched a business.' (page 8)
In this question, we want the proportion of a treatment arm that participated in 'some portion of the treatment' (i.e. take-up), so the answer for the franchise arm is 61. For the grant arm, Table A2 [Compliance and Attrition] in the appendix [column: Grant] shows that 95% of the grant arm received the grant.
Although both the section in the paper and appendix Table A2 were titled “Compliance and Attrition”, the “compliance” rates for the treatment groups were technically “take-up” rates.
Take-up for grant arm: 95
Balance test
Coding Instructions
- Please indicate whether there is a balance test table in the main paper or appendix.
- A balance test table often includes a set of balance tests to examine differences in observable characteristics between study arms. Balance can be tested individually by covariate or jointly, using an omnibus test for overall balance.
- Note that the balance test table may not be presented as a separate table but presented as part of a descriptive statistics table.
Examples
Example 1 - Abimpaye et al., 2020:
Table 2 is a balance table reporting characteristics by study arm.
Example 2 - Carneiro et al., 2024:
Two balance tables are presented: P1128 Table 1 (Baseline Balance, Household and Child Characteristics) and P1130 Table 2 (Balance of Household and Child Characteristics at Follow-Up).
Partners and Funders
Implementers of the experiment
Coding Instructions
- Enter the names of the entities that implemented the experiment as they appear in the paper and separate each of them by a comma.
- The implementers could be agencies, institutions, or individuals (for example, researchers).
- If there is no information on the implementers, please enter "Not reported".
- Please note data collection agencies should not be included as implementers.
Examples
Example 1 - Chong et al., 2015:
For Chong et al. 2015, Innovations for Poverty Action implemented the intervention, so the coder would enter "Innovations for Poverty Action".
Example 2 - Gaikwad and Nellis 2021:
For Gaikwad and Nellis 2021, the experiment was implemented by "an NGO" without specifying the name.
Example 3 - Carneiro et al., 2024:
In Carneiro et al., 2024, the implementer was not specified in the paper.
Implementer type
Coding Instructions
- Select the types of all the entities that implemented the experiment.
- Please select all that apply. If there are both government and an NGO involved, choose both options.
- NGOs include both non-profit and for-profit non-governmental organizations that are self-managed.
- If a government contracts a private firm within the public sector management system, the implementer should still be considered as "government".
Examples
Example 1 - Chong et al., 2015:
The implementer for the experiment was "Innovations for Poverty Action".
Example 2 - Özler et al., 2018:
"Under PECD, the Government implemented the following interventions – in partnership with Save the Children and UNICEF" (p.4).
Acknowledgements
Coding Instructions
- Please copy and paste the acknowledgement section of the paper in this field. The information should include the funders and other entities that supported the study in pecuniary and non-pecuniary terms.
- In some papers, there are sections dedicated to acknowledging support for the study including funders, referees etc. In other papers, those could be in a footnote at the beginning or the end of the paper.
- Sometimes, the information can also be found in the “Conflict of interest” statement.
Examples
Example 1 - Özler et al., 2018:
There is an “Acknowledgements” section in the paper (page 19). The text in the section should be copied in this field.
Example 2 - Chong et al., 2015:
The acknowledgment is stated in footnote 1. Therefore, the text of footnote 1 should be copied here.
Resources
Registry Name
Coding Instructions
- Please select the name of the registry or registries in which the study is registered only if it is mentioned in the main text of the paper or its supplementary materials/appendices.
- Do not search for this information beyond what is included in the paper.
- Sometimes information on trial registration is mentioned in the footnotes.
- Searching for the exact terms throughout the text such as "registry", "pre-registration" or "pre-analysis plan" can be a good approach to double-check whether the trial registry is mentioned anywhere in the text or supplementary appendix.
- If the name of the trial registry is not mentioned in the paper or its supplementary materials/appendices, please select "Not stated".
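The term search suggested above can be sketched as a simple text scan. The function name `find_registry_mentions` is hypothetical, and the ID pattern is an assumption based only on the AEA RCT Registry format quoted in this protocol; other registries use different formats. The sample text is the footnote from Brudevold-Newman et al. quoted in the example.

```python
import re

REGISTRY_TERMS = ["registry", "pre-registration", "pre-analysis plan"]

def find_registry_mentions(text):
    """Return the search terms that occur in the text (case-insensitive)."""
    lowered = text.lower()
    return [term for term in REGISTRY_TERMS if term in lowered]

footnote = ("The study was registered at the AEA RCT registry "
            "under ID number AEARCTR-0000459.")

print(find_registry_mentions(footnote))        # ['registry']
print(re.findall(r"AEARCTR-\d+", footnote))    # ['AEARCTR-0000459']
```

A scan like this only helps locate candidate passages; the coder still needs to read the passage to confirm that it refers to the study being coded.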
Examples
In Brudevold-Newman et al., 2023, the name of the registry where the trial is registered is stated in the footnote on page 1: "The study was registered at the AEA RCT registry under ID number AEARCTR-0000459."
Registration ID
Coding Instructions
- Please enter the registry ID or IDs of the study only if it is mentioned in the main text of the paper or its supplementary materials/appendices.
- Record full ID with prefixes if included (e.g. RIDIE-STUDY-ID-64be2e6e750).
- There could be different terminologies: the AEA RCT Registry gives each entry an "RCT ID", while ClinicalTrials.gov gives each entry a "ClinicalTrials.gov Identifier".
- This information is usually presented in the acknowledgements or ethics statement sections of a paper, or in the supplementary materials/appendices.
- If no registration ID is provided, please write "Not stated".
Examples
In Brudevold-Newman et al., 2023, the registration ID is stated in the footnote on page 1: 'The study was registered at the AEA RCT registry under ID number AEARCTR-0000459.'
Number of IRBs reported
Coding Instructions
- Enter the number of ethics reviews or IRBs mentioned in the main text of the paper or in the supplementary materials/appendices.
- Studies can have multiple IRB or ethics board approvals.
- This information is usually present in the acknowledgements or ethics statement sections of a paper. If it is not present there, it may be present in the supplementary materials/appendices.
- If not mentioned in the paper or its supplementary materials/appendices, please write 0.
Examples
Abimpaye et al., 2020 reports one review with "Rwanda National Ethics Committee" as the ethics review body.
Athey et al., 2023 obtained ethical reviews from three committees: Cameroon’s National Ethics Committee (CNERSH; decision no. 2019/08/1183/CE/CNERSH/SP), administrative authorization from the Ministry of Health’s DROS (decision no. D30-760/L/MIN-SANTE/SG/DROS), and the authors’ institutional review board (decision no. 780/CIERSH/DM/2018).
Review board name
Coding Instructions
- For each ethics review board, include the complete name of the review board as it appears in the main text of the paper or in the supplementary materials/appendices.
- This information is usually in the acknowledgements or ethics statement sections of a paper or in the supplementary materials/appendices.
- If not mentioned in the paper or its supplementary materials/appendices, please write "Not stated".
Examples
Abimpaye et al., 2020 explicitly states that "This study was reviewed and approved by the Rwanda National Ethics Committee."
Barrera-Osorio et al., 2022 reports an IRB approval number issued by Columbia University.
Review number
Coding Instructions
- For each IRB approval, enter the approval number/ID as it appears in the main text of the paper or its appendices.
- Record full ID including prefixes. Copy the number or ID exactly as it appears in the paper.
- This may appear next to the review board name with "#". This information is usually present in the acknowledgements or ethics statement sections of a paper or in the supplementary materials/appendices.
- If not mentioned in the paper or its supplementary materials/appendices, please enter "Not stated".
Examples
Abimpaye et al., 2020 notes that "This study was reviewed and approved by the Rwanda National Ethics Committee", but no reference or case number was reported.
In Barrera-Osorio et al., 2022, the IRB approval number is reported as AAAF4126.
Stage 3 Module 2: Estimates
Section notes to display in the survey
This section will go through each of the treatment effects confirmed in Stage 2 to collect the estimates of treatment effects. A treatment effect is defined by the outcome variable (that enters estimation), the comparison between the evaluation arm and the reference arm, the estimand, the empirical specification, and the periods (the data rounds used for the estimation and, if relevant, how they are pooled). Please review the pre-loaded information on the treatment effect carefully before answering the estimate questions.
Instructions for data entry mask
This set of questions on estimates should loop through each treatment effect confirmed in Stage 2. The preloaded prompts for coders include:
- [Exhibit number]
- [Outcome name]
- [Unit of analysis]
- [Eval arm]
- [Reference arm]
- [Estimand]
- [Empirical specifications]
- [Period (including rounds)]
This information needs to be presented for every treatment effect at the beginning of the estimate questions.
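One way to picture the preloaded information is as a record with one field per prompt. The sketch below is illustrative only: the class name `TreatmentEffectKey`, the field names, and all values are hypothetical placeholders, not the actual structure of the data entry mask.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class TreatmentEffectKey:
    """The elements that together define one treatment effect confirmed in Stage 2."""
    exhibit_number: str
    outcome_name: str
    unit_of_analysis: str
    eval_arm: str
    reference_arm: str
    estimand: str
    empirical_specification: str
    period: str

    def prompt_lines(self):
        """Format every field as a line to show before the estimate questions."""
        return [f"[{f.name}] {getattr(self, f.name)}" for f in fields(self)]

# Placeholder values, not taken from any paper.
key = TreatmentEffectKey(
    exhibit_number="Table X",
    outcome_name="Outcome A",
    unit_of_analysis="child",
    eval_arm="Treatment arm 1",
    reference_arm="Control arm",
    estimand="ITT",
    empirical_specification="Specification 1",
    period="Round 2",
)
print("\n".join(key.prompt_lines()))
```

Making the record immutable reflects the idea that these elements are fixed in Stage 2 and are only displayed, not edited, in this module.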
Estimation parameter
Coding Instructions
- Please select the estimation parameter of the treatment effect.
Examples
See descriptive examples in paper.
Estimation model
Coding Instructions
- Please select the statistical model used to estimate the treatment effect.
- The information is usually found in sections that focus on methods, analytical strategy, or results or in the table notes.
- If you are not sure which model was used, select "Other, specify" and enter the information found. Flag in the request-for-review if needed.
Examples
In Wolf et al., 2019, the Impact analysis suggests multi-level modeling was used; select "Multi-level or hierarchical model/regression" unless otherwise specified.
For "teacher turnover" in Table 4, multinomial logistic regressions were used; select "Logistic regression" for those effects.
In Ara et al., 2019 Table 3, select "T-test (mean-comparison test)".
Null hypothesis
Coding Instructions
- For each estimate, select the null hypothesis that was tested.
- In most cases, the null is 0 for non-binary outcomes and 1 for odds/risk/hazard ratios unless otherwise stated.
Examples
See descriptive examples in paper.
Linear combination of coefficients for treatment effect
Coding Instructions
- Enter 1 when the estimate is a single coefficient; select "More than one" when it is a linear combination and specify.
- Read the regression tables and results closely to identify the relevant coefficients.
Examples
In Leaver et al., 2021 Table 3, the pooled-period learning effect for one comparison is a single coefficient (answer: 1). Another comparison requires a linear combination of two coefficients under Model B (answer: More than one, specify: 2).
Estimate of the treatment effect
Coding Instructions
- Enter the numerical value exactly as reported (sign and decimals preserved).
- If multiple parameters are used (e.g., interactions), enter each coefficient separately.
- Enter "-99" if not available in the main paper or appendix and flag for review.
Examples
In Leaver et al., 2021 Table 3 (Pooled, Model A), the point estimate is 0.01. For a comparison requiring two coefficients in Model B, enter 0.12 and -0.03 separately.
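One way to honor "exactly as reported" downstream is to keep each entry as a decimal string rather than a binary float, so that trailing zeros and the sign survive. The sketch below is an illustration only; the function names are hypothetical, and treating "-99" as the missing code follows the instruction above.

```python
from decimal import Decimal, InvalidOperation

MISSING_CODE = "-99"  # sentinel from the coding instructions


def parse_reported_value(raw):
    """Return a Decimal that keeps the reported sign and decimals.

    Returns None when the entry is the missing code. A genuine estimate of
    exactly -99 would be indistinguishable from the sentinel, so such a case
    would need to be flagged for review instead.
    """
    text = raw.strip()
    if text == MISSING_CODE:
        return None
    try:
        return Decimal(text)
    except InvalidOperation:
        raise ValueError(f"not a numeric entry: {raw!r}")


def parse_coefficients(raw_entries):
    """Parse each coefficient of a multi-coefficient effect separately."""
    return [parse_reported_value(r) for r in raw_entries]
```

For the two-coefficient example above, `parse_coefficients(["0.12", "-0.03"])` keeps the two values as separate entries; how they are weighted in the linear combination is left to the paper's specification and is not assumed here.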
Section notes to display in the survey
We need several precision statistics to standardize effect sizes using reported treatment effects. The survey will guide you through extracting a sequence of precision values and will stop once the minimum set of values is captured. The type and value of each precision statistic are often found in the notes of the tables that report the treatment effects, in the main paper or in the appendix. Different precision values for the same treatment effect may be reported in different parts of the paper. Please check both the main paper and the appendix carefully.
Guidance
For continuous variables:
- We always collect the SE and t-stat. If both are present, we stop.
- If either one is missing, we collect the p-value (not adjusted for multiple inference), which can be used with other fields to back out the SE and t-stat.
- If none are present, collect the CI and its significance level, and the F-ratio for one-way ANCOVA.
For binary variables:
- We always collect the Z-stat. If the Z-stat is not present, collect the t-stat or p-value (not adjusted for multiple inference).
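The stopping rule and the "back out" step can be sketched as follows. This is a minimal illustration, not the survey logic itself: the string keys are hypothetical, "none are present" is read as "no SE, t-stat, or p-value is reported", "t-stat or p-value" is read as "the first of the two that is reported", and the p-value conversion assumes a two-sided test with a normal approximation (small samples would need the t distribution and its degrees of freedom).

```python
from statistics import NormalDist


def precision_fields_to_collect(outcome_type, available):
    """List the precision fields to record, in order, given what is reported.

    `available` holds hypothetical keys from
    {"se", "t", "z", "p", "ci", "ci_level", "f_ratio"}.
    """
    available = set(available)
    if outcome_type == "binary":
        if "z" in available:
            return ["z"]
        # Assumed reading: take the first of t-stat / p-value that is reported.
        return [s for s in ("t", "p") if s in available][:1]
    picked = [s for s in ("se", "t") if s in available]
    if len(picked) == 2:
        return picked              # SE and t-stat both present: stop
    if "p" in available:
        return picked + ["p"]      # one is missing: add the unadjusted p-value
    if picked:
        return picked
    # Assumed reading: nothing above is reported, so fall back to CI and F-ratio.
    return [s for s in ("ci", "ci_level", "f_ratio") if s in available]


def se_from_t(estimate, t_stat, null=0.0):
    """SE implied by t = (estimate - null) / SE."""
    return (estimate - null) / t_stat


def se_from_p(estimate, p_value, null=0.0):
    """SE implied by a two-sided p-value under a normal approximation."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p-value must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    return abs(estimate - null) / z
```

A p-value reported only as a threshold (for example, stars for p < 0.05) gives a bound rather than a point value, so it cannot be used this way.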
Standard error of treatment effect estimate
Coding Instructions
- Enter the SE exactly as reported; leave blank if not reported.
- If specified, record whether the value is unadjusted or adjusted and the adjustment method (e.g., robust, clustered, bootstrap). Use "Unknown" if not specified.
Examples
See descriptive examples in paper.
T-statistic of treatment effect estimate
Coding Instructions
- Enter the t-statistic exactly as reported; leave blank if not reported.
- Capture whether unadjusted or adjusted and note adjustment method if provided (robust, clustered, bootstrap). Use "Unknown" if not specified.
Examples
See descriptive examples in paper.
Z-statistic of treatment effect estimate
Coding Instructions
- Enter the Z-statistic exactly as reported; leave blank if not reported.
- Record unadjusted/adjusted status and adjustment method if provided. Use "Unknown" if not specified.
Examples
See descriptive examples in paper.
P-value of treatment effect estimate
Coding Instructions
- Enter the p-value exactly as reported; leave blank if not reported.
- When multiple adjusted p-values are provided, capture each (e.g., unadjusted, covariate-adjusted, multiple-hypothesis corrections, bootstrap, small-sample, permutation, unknown, other).
Examples
See descriptive examples in paper.
Confidence interval
Coding Instructions
- Enter the lower and upper bounds exactly as reported.
- If CI not reported, enter "-99" for both bounds.
Examples
See descriptive examples in paper.
Confidence interval significance level
Coding Instructions
- Select the CI level indicated; choose "Not reported" if not specified.
Examples
See descriptive examples in paper.
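The CI bounds and the CI level are collected together because the pair is what allows a standard error to be recovered. A minimal sketch, assuming a symmetric interval built from a normal critical value (ratio measures are usually symmetric only on the log scale, which this does not handle); the function name is hypothetical and -99 is the missing code from the instructions above.

```python
from statistics import NormalDist

MISSING = -99  # code entered for both bounds when no CI is reported


def se_from_ci(lower, upper, level):
    """Back out the SE from a symmetric CI, or return None if it cannot be done.

    `level` is the confidence level as a fraction (0.95 for a 95% CI),
    or None when the paper does not report it.
    """
    if lower == MISSING and upper == MISSING:
        return None
    if level is None:
        return None  # width alone is not enough without the level
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    if upper < lower:
        raise ValueError("upper bound is below lower bound")
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return (upper - lower) / (2.0 * z_crit)
```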
F-Ratio
Coding Instructions
- Enter the F-statistic exactly as reported; leave blank if not reported.
Examples
See descriptive examples in paper.
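When the F-ratio tests a single contrast between two arms (one numerator degree of freedom), it equals the square of the corresponding t-statistic, which is why it can stand in when no SE, t-stat, or p-value is available. The sketch below assumes that one-degree-of-freedom case and takes the sign from the point estimate; the function name is hypothetical.

```python
import math


def t_from_f(f_ratio, estimate):
    """Recover t from F when the numerator has one degree of freedom.

    Not valid for omnibus F-tests that compare more than two arms at once.
    """
    if f_ratio < 0:
        raise ValueError("an F-ratio cannot be negative")
    return math.copysign(math.sqrt(f_ratio), estimate)
```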
Additional precision information
Subsection notes: The following three questions let you record any additional precision statistics reported by the authors that the previous questions did not capture. An open-ended question will appear for entering this information, if there is any to report.
Additional precision value
Coding Instructions
- Enter each non-sampling-based precision value if reported (often in appendix).
Examples
See descriptive examples in paper.
Additional precision value type
Coding Instructions
- Select the type corresponding to the value reported.
Examples
See descriptive examples in paper.
Additional precision value inference method
Coding Instructions
- Select the method reported for the additional precision value.
Examples
See descriptive examples in paper.
Subsection notes
The following questions ask for the mean, standard deviation, and sample size of the outcome of the treatment effect. They loop through the study arms in the comparison, at baseline and at the period over which the treatment effect was estimated. Please read the prompts carefully before entering your answers.
SurveyCTO instructions: For the evaluation arm, the reference arm, and the two arms combined, display the mean, standard deviation, standard error, and sample size questions, at baseline and at the period used to estimate the treatment effect. The tables below show how the questions would ideally be presented without the limitations of SurveyCTO. [Please check if we can use this table grid plug-in https://github.com/surveycto/table-grid].
Table 1: Baseline
| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| | Mean | Standard deviation | Standard error | Sample size (N) |
| Eval arm | | | | |
| Reference arm | | | | |
| Eval+Reference combined | | | | |
Table 2: {Period} (pulled from Stage 2 that is associated with the treatment effect)
| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| | Mean | Standard deviation | Standard error | Sample size (N) |
| Eval arm | | | | |
| Reference arm | | | | |
| Eval+Reference combined | | | | |
A different format of Baseline outcome is reported [Yes/No] - If yes, specify the format of baseline outcome
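The columns of the grids above are arithmetically related, which can help when cross-checking coded entries. The sketch below assumes the reported standard error is the standard error of the mean under simple random sampling (SE = SD / sqrt(N)); papers that cluster or weight will not satisfy this, so a mismatch is a prompt to re-read the table notes, not proof of a coding error. The function names are hypothetical.

```python
import math


def se_from_sd(sd, n):
    """Standard error of the mean implied by an SD and a sample size."""
    return sd / math.sqrt(n)


def sd_from_se(se, n):
    """Standard deviation implied by a standard error of the mean."""
    return se * math.sqrt(n)


def combined_mean(mean_eval, n_eval, mean_ref, n_ref):
    """Sample-size-weighted mean of the evaluation and reference arms."""
    return (mean_eval * n_eval + mean_ref * n_ref) / (n_eval + n_ref)
```

For instance, if a paper reports arm-level means and sample sizes plus a combined mean, `combined_mean` should land close to the reported combined value, up to rounding.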
Baseline outcome variable format
Coding Instructions
- Indicate whether the unit of measurement at baseline is the same as the one used in estimation.
- If authors report baseline statistics in a different unit (e.g., original raw scale) than used in estimation (e.g., standardized), select "No" and specify the baseline unit.
- Select "Yes" if the same unit is used at baseline and in estimation.
Examples
See descriptive examples in paper.
Outcome variable mean
CODING INSTRUCTIONS
DESCRIPTIVE EXAMPLES FOR CODING
The outcome "Intimidation during voting" has three means at baseline in Asunka et al., 2019 (Table 1): Full sample, Treatment, and Control. This question will be asked three times, once for each mean.
Outcome variable standard deviation
CODING INSTRUCTIONS
DESCRIPTIVE EXAMPLES FOR CODING
The outcome "Intimidation during voting" has three means at baseline in Asunka et al., 2019 (Table 1): Full sample, Treatment, and Control. The corresponding standard deviations for the three means are:
Outcome sample size
CODING INSTRUCTIONS
DESCRIPTIVE EXAMPLES FOR CODING
In Ganimian, Muralidharan, and Walters, 2023, only one sample size is reported for the outcome "Math" at baseline (Table 1) for all arms combined: