The Survey

To operationalize data extraction for the minimum set of fields in the meta-data schema, the IDEAL team developed, through a series of working group meetings, a set of survey fields that capture the relevant information from each individual paper.

Data extraction is currently conducted by human coders on a survey mask developed using Open Data Kit (ODK) tools in SurveyCTO. The initial dataset coded and checked by humans will serve as the ground truth data for future automated data extraction tools that would be integrated into IDEAL. To request a demo for the survey, please contact us.

Staged Data Extraction Workflow. IDEAL employs a three-stage data extraction workflow designed to manage complexity, ensure quality, and create logical dependencies between different types of information:

  • Stage 1 extracts the structural characteristics of experiments.
  • Stage 2 systematically locates treatment effects by matching outcomes, arm comparisons, and specifications identified in Stage 1 with the actual results reported in each exhibit.
  • Stage 3 collects comprehensive details about experiments, interventions, outcomes, samples, and the identified treatment effects.

Quality checkpoints between stages ensure that only validated information flows forward, with supervisor review after Stages 1 and 2, and double-coding employed in Stage 3 where the bulk of detailed information is extracted.
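
The gating described above can be pictured with a minimal Python sketch. It is an illustration only, not IDEAL's actual tooling: the `PaperStatus` class, the `advance` function and the `paper-001` identifier are hypothetical names, and treating double-coding as a pass/fail checkpoint is a simplifying assumption.

```python
from dataclasses import dataclass

# Quality checkpoint attached to each stage, taken from the text above.
CHECKPOINTS = {1: "supervisor review", 2: "supervisor review", 3: "double-coding"}


@dataclass(frozen=True)
class PaperStatus:
    paper_id: str           # hypothetical identifier, for illustration only
    stage: int = 1          # stage currently being coded (1, 2 or 3)
    complete: bool = False  # True once Stage 3 has passed its checkpoint


def advance(status, checkpoint_passed):
    """Move a paper forward only when its current stage passed its checkpoint."""
    if status.complete or not checkpoint_passed:
        return status  # unvalidated information does not flow forward
    if status.stage == 3:
        return PaperStatus(status.paper_id, 3, True)
    return PaperStatus(status.paper_id, status.stage + 1, False)


paper = PaperStatus("paper-001")
paper = advance(paper, checkpoint_passed=False)  # stays in Stage 1
paper = advance(paper, checkpoint_passed=True)   # moves to Stage 2
print(paper.stage, CHECKPOINTS[paper.stage])
```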

For more information about IDEAL, please refer to IDEAL's data extraction guide.

Stage 1: Set-Up, Fields & Table-by-table

Stage 1 Set-Up
Coder name

CODING INSTRUCTIONS

Please select your name in SurveyCTO.
If your name does not appear, reach out to your supervisor.

EXAMPLES

See the [section] used in the paper to extract

Implementer type

CODING INSTRUCTIONS

  • Select the types of all the entities that implemented the experiment.
  • Please select all that apply. If there are both government and an NGO involved, choose both options.
  • NGOs include both non-profit and for-profit non-governmental organizations that are self-managed.
  • If a government contracts a private firm within the public sector management system, the implementer should still be considered as "government".

EXAMPLES

Example 1 - Chong et al., 2015:

The implementer for the experiment was "Innovations for Poverty Action".

Answer: NGO

Example 2 - Özler et al., 2018:

"Under PECD, the Government implemented the following interventions – in partnership with Save the Children and UNICEF" (p.4).

Answer: Government; NGO; Multilateral or bilateral international organizations
Paper ID

CODING INSTRUCTIONS

Select the IDEAL paper ID for which you will start extraction.
If the paper ID does not appear on the list, reach out to your supervisor.

EXAMPLES

See the [section] used in the paper to extract

Paper title

CODING INSTRUCTIONS

Confirm the name of the paper that appears on SurveyCTO matches the name of the paper on the list assigned to you.
If you have selected no, a message will be sent to your supervisor. Please await a new corrected paper assignment.

EXAMPLES

See the [section] used in the paper to extract

Paper correction

CODING INSTRUCTIONS

If you have received a resubmission request for your previous entry and are ready to start your new entry, please check this box.

EXAMPLES

See the [section] used in the paper to extract

Multi-site study entry

CODING INSTRUCTIONS

If you are coding a multi-site or multi-experiment paper, please enter each experiment separately. Check this box if you are entering another experiment for a previously submitted paper.
Please do not check this box if this is the first response for the paper.

EXAMPLES

Barrera-Osorio et al. 2011: Report effects from two different experiments in Bogota in different parts of the city (San Cristobal and Suba). We would like you to code the experiments in San Cristobal and Suba separately. Check this box only if you have already submitted the entry for San Cristobal and are now going to code the Suba experiment.

Request for review: fields

CODING INSTRUCTIONS

Please select all questions you were unsure about.

EXAMPLES

Request for review: detail

CODING INSTRUCTIONS

For each field, explain what you were uncertain about and (if applicable) which options you were considering.

EXAMPLES

Example: If a coder selects the "Estimand is full sample ITT and LATE/TOT" field in the request-for-review section, they could add an explanation about their uncertainty in this field.

Example Response: "Unsure regarding the empirical specification of the estimand. The author mentions that 'we view the evaluation design as quasi-experimental and use difference-in-differences to estimate program impact.' Are the estimates indirectly ITT/LATE/TOT or are they only quasi-experimental estimates?"
Stage 1 Fields
Number of experiments in the study (expNum)

CODING INSTRUCTIONS

Please indicate the number of experiments being evaluated in the paper.
An experiment is principally defined by the study population and unit of randomization, the intervention, and the randomization used to create comparable treatment arms.
If results are reported from multiple countries, these are likely coming from different experiments.
Normally, there is only one experiment being evaluated in a paper, but there are exceptions. Please see the examples below. The experimental design section often provides information on how many experiments are being tested in the paper.
Note that in the pilot, we are not coding studies that are lab-in-the-field experiments. If a study includes a field experiment and a lab-in-the-field experiment, we will only code the field experiment. In that case, please enter 1 for this field.
We are also not coding studies in which there is a design intervention rather than a policy intervention. A design intervention includes studies that randomize the order or wording of survey questions.

EXAMPLES

Barrera-Osorio et al. 2011: Report effects from two different experiments in Bogota in different parts of the city (San Cristobal and Suba). We know that there are two different experiments because the paper declares "As required by the SED, the assessment of the treatments was divided into two separate experiments located in two very similar localities in Bogota, San Cristobal, and Suba." The paper also reports that eligible populations for the tested interventions are different across the two sites: "Eligible registrants in San Cristobal, ranging from grade 6–11..."; and "The tertiary treatment was evaluated separately in an experiment in Suba, where students ranging from grade nine through eleven..."

Answer: 2

Jeong et al. 2023: Vary the order of question modules in a survey to examine the effects of survey fatigue. This is a design intervention rather than a policy intervention, so it is not coded.

Answer: 0

De Martino et al. 2015: Conduct a lab-in-the-field experiment on landholders and annual payment offers for environmental services. Because the experiment conducted was hypothetical, it is not coded as a field experiment.

Answer: 0
Number of experiments check (expNumcheck)

CODING INSTRUCTIONS

You are seeing this question because you indicated that the number of experiments reported in the current paper was not 1. Please double check your answer.
Go to the previous question to revise your answer if you think the number of experiments should be 1.
Select "Confirm that there is no eligible experiment reported in the paper" if you believe the experiment reported in the paper is not an RCT or not a field RCT. This could include cases where a study only reports lab-in-the-field experiments or a design intervention that randomizes the order or wording of survey questions. Note that this will be a rare case since all assigned papers have been pre-screened. The survey will stop after you select this option. Please notify your supervisor that the paper is not eligible for coding.
Select "Confirm that there is more than one experiment reported in the paper" when the paper evaluates more than one experiment. In that case, please enter the information for each experiment separately into different survey entries.
Please reach out to your supervisors if you are not sure about your answer to this question.
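
The branching behind this check can be summarized in a short Python sketch. It only restates the instructions above; the function name `exp_num_check` and the returned strings are hypothetical and do not describe the actual SurveyCTO form logic.

```python
def exp_num_check(exp_num):
    """Return what happens after the coder enters expNum (illustrative only)."""
    if exp_num < 0:
        raise ValueError("expNum cannot be negative")
    if exp_num == 1:
        # The check question is only shown when expNum is not 1.
        return "check skipped"
    if exp_num == 0:
        # No eligible experiment: the survey stops and the supervisor is notified.
        return "confirm no eligible experiment; survey stops; notify supervisor"
    # More than one experiment: each one gets its own survey entry.
    return f"confirm more than one experiment; submit {exp_num} separate entries"


print(exp_num_check(2))  # e.g. Barrera-Osorio et al. 2011 (San Cristobal and Suba)
```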

EXAMPLES

Barrera-Osorio et al. 2011: Report effects from two different experiments in Bogota in different parts of the city (San Cristobal and Suba).

Answer: Confirm that there is more than one experiment reported in the paper

Jeong et al. 2023: Vary the order of question modules in a survey to examine the effects of survey fatigue.

Answer: Confirm that there is no eligible experiment reported in the paper
Country (country)

CODING INSTRUCTIONS

Select the country in which the intervention took place, even if the study did not cover the entire country or only mentions a region or city in the country.
If the country is not found in the controlled vocabulary (CV), select "other" and then write the name of the country.
If you cannot find the name of the country in the main text of the paper or its supplementary materials/appendices, enter "not reported" and specify that you are "unsure" about this field in the request-for-review section of the survey.
This information is usually found in the abstract, introduction, context, or research design sections of the main text of the paper.

EXAMPLES

Chong et al. 2015: The intervention takes place in Mexico, so the coder would select the ISO code and country name for Mexico. [see: Abstract, Introduction, Experimental Design and Implementation sections]

Answer: Mexico

Lyall et al. 2020: The intervention takes place in Afghanistan, so the coder would choose the ISO code and country name for Afghanistan. [see: Abstract, Introduction]

Answer: Afghanistan

Muralidharan et al. 2021: The intervention takes place in Telangana, which is a state in India, so the coder would select the ISO code for India. [see: Abstract, Introduction, Setting and Intervention, and Research Methods sections]

Answer: India

Adida et al. 2020: The intervention takes place in Benin, so the coder would select the ISO code for Benin. [See: Abstract]

Answer: Benin
Sub-national location (subnationalLocation)

CODING INSTRUCTIONS

Enter the largest geographic location within a country where the experiment took place.
The location is often not randomly drawn from the country. For example, six states in Mexico may have been included in a study based on specific criteria or for unspecified reasons.
For multi-site studies, the location(s) should differentiate the current experimental site from other experiments reported in the same paper. Please only enter the location for the experiment you are entering information for in this entry.

EXAMPLES

Barrera-Osorio et al. 2011: Report effects from two different experiments in Bogota in different parts of the city (San Cristobal and Suba). If a coder is entering the information for the experiment in San Cristobal, they should only enter "San Cristobal".

Answer: San Cristobal

Gaikwad and Nellis (2021): Report effects from the same experiment in two cities in India, Delhi and Lucknow. In this case, the coder should enter "Delhi and Lucknow."

Answer: Delhi and Lucknow
Intervention assignment strategy (intAssign)

CODING INSTRUCTIONS

Parallel: This is the most common strategy used in randomized control trials. Each intervention is assigned to only one arm.
Factorial: This design is used when evaluating the impact of two or more interventions alone and in combination with each other. At least one intervention is assigned to more than one study arm.
Crossover: This is used when each study arm receives different interventions (including no intervention) in different phases of the study. Also select this option for phase-in or stepped-wedge designs, where the roll-out of the intervention is randomized and every unit ultimately receives the program. If the study endline occurs prior to units receiving interventions beyond what they were initially assigned to, do not select this option.
Adaptive: In adaptive designs, the rule by which interventions are randomly assigned can change in the course of the trial, based on the experimental data. For example, in a trial conducted in multiple waves, the number of units assigned to each treatment arm may change across waves based on results in prior waves. Alternatively, in a longitudinal cross-over design, the next intervention to which an experimental unit is "switched" (or the time of switching) may depend on the outcome under the current intervention.
Other: If the assignment strategy does not fit in any of the above categories, select this option.
Information on the assignment strategy is found in the experimental/study design or methods sections of a paper. When available, the coder may also consult how an intervention is described in a study participant flow diagram.
Note that authors may use the word phased-in while describing rollout of a program, even if the intervention assignment strategy is parallel or factorial.
Note also that some studies may feature a phase-in design or analyze a pilot program, in which a broader population ultimately receives the intervention(s). However, if the study endline is conducted prior to the phase-in of the rest of the units or before the program is expanded, the study will still be included if it has a parallel or factorial intervention assignment strategy.
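
As a rough aid, the definitions above can be read as a decision rule. The Python sketch below applies them literally to a mapping from arms to interventions. It is an assumption-laden simplification, not the survey's logic: the function name and the two boolean flags are hypothetical, and real papers still require the coder's judgment.

```python
from collections import Counter


def assignment_strategy(arms, adaptive=False, switches_before_endline=False):
    """Classify a design from a dict of arm name -> list of interventions.

    adaptive: the assignment rule changes during the trial based on the data.
    switches_before_endline: arms receive different interventions in different
    phases (including phase-in) before the study endline.
    """
    if adaptive:
        return "Adaptive"
    if switches_before_endline:
        return "Crossover"
    # Count in how many arms each intervention appears.
    counts = Counter(i for assigned in arms.values() for i in set(assigned))
    if any(n > 1 for n in counts.values()):
        return "Factorial"  # at least one intervention sits in more than one arm
    return "Parallel"       # each intervention is assigned to only one arm


# Four-arm layout from Andrew et al, 2018: PS only, MN only, both, neither.
andrew = {"PS": ["PS"], "MN": ["MN"], "PS and MN": ["PS", "MN"], "Control": []}
print(assignment_strategy(andrew))
```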

EXAMPLES

In Barrera-Osorio et al, 2022, authors evaluate the performance-based reward program by randomizing primary schools into three distinct groups -- recognition, in-kind performance reward, and control [see: Sample and experimental design].

Answer: Parallel

In Andrew et al, 2018 researchers randomized towns into four groups. The first received the psychosocial stimulation only (PS), the second received multiple micronutrient supplementation only (MN), the third received both interventions (PS and MN), and the fourth received neither (Control). The response to this field is factorial design because one arm receives the combination intervention PS + MN.

Answer: Factorial

In Lopez et al, 2022, the authors vary the days on which doctors received a doctor-specific intervention and the days on which patients received a patient-specific intervention. As doctors (and patients) can cross between the control and treatment groups, the response to this field is crossover design [see: Data collection, figure 2].

In Miguel and Kremer, 2004, the authors vary the timing at which three groups of randomly selected schools receive school-based deworming. As the control group crosses over into the treatment group by the end of the study, the response to this field is crossover.

Answer: Crossover
Multi-stage randomization of interventions (intMulti)

CODING INSTRUCTIONS

Yes – if participants of the experiment are randomly assigned to different interventions at more than one unit of randomization or at more than one point in time.
A multi-stage random assignment design has at least two units of randomization, so it is also a cluster-randomized design. However, not all cluster-randomized controlled trials adopt multi-stage assignment of interventions. Multi-stage assignment can also be combined with any intervention assignment strategy.
A multi-stage random assignment of interventions can be done either simultaneously or sequentially, depending on the nature of the intervention. For example, in a cash transfer program, both districts and their villages can be randomized at the same time in a two-stage randomization design. In contrast, a school voucher program may require randomization at two separate time points: first at the village level, and later at the individual level. This is because individuals need to first sign up for the program before they can be assigned to receive it.
When coding a paper, please look for terms like "2-stage" or "multi-stage," as authors often use these explicitly to describe the randomization of interventions. However, be cautious—these terms can also refer to sampling procedures or data collection methods. Since this field is about the randomization of interventions, it's important to make sure the correct information is captured.
No – if participants of the experiment are randomized to receive interventions only at one unit of randomization at the beginning.
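
The Yes/No rule above can be written as a one-line check. This Python sketch is an illustration under the assumption that the coder has already identified the units of randomization and the number of assignment time points; the function name `is_multi_stage` is hypothetical.

```python
def is_multi_stage(units_of_randomization, assignment_time_points=1):
    """Yes (True) if interventions are randomized at more than one unit
    of randomization or at more than one point in time."""
    return len(set(units_of_randomization)) > 1 or assignment_time_points > 1


# Two units of randomization, as in Ichino and Schündeln (2012).
print(is_multi_stage(["constituency", "electoral area"]))
# A single unit randomized once, as in a clustered RCT with site-level assignment.
print(is_multi_stage(["site"]))
```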

EXAMPLES

Ichino and Schündeln (2012) used "a two-stage randomized design with blocking". In the first stage, constituencies were randomly assigned into treatment and control. In the second stage, approximately 25% of the electoral areas in each constituency were randomly selected to receive the intervention.

There were two units of randomization: (1) constituency and (2) electoral area.

Answer: Yes

In Dolan et al., 2022, the randomization was conducted at the site level: 87 sites were randomly assigned to three arms. Although some outcomes were measured and analyzed at the student level, there was only one unit of randomization. It is therefore a clustered RCT but did not use a 2-stage randomization of interventions.

Answer: No

In Gupta et al., 2024, the unit of randomization was the household. Households were randomly assigned to receive the cash transfer intervention at different times, in a crossover design. The randomization of the timing was conducted at one time and at the household level, so it is a single-stage randomization.

Answer: No
Number of interventions (intNum)

CODING INSTRUCTIONS

Count the distinct interventions in the paper under evaluation.
Include in the count any intervention beyond the status quo administered to a group designated as a main control or comparison group.
Some arms receive a package of interventions. Do not split a package into separate interventions unless an intervention in the package is also evaluated separately.
If a common type of intervention is assigned in varying intensity to different study arms, count each intensity level as a unique intervention.
In an experiment using multi-stage design, any intervention and treatment status, regardless of its intensity (including 100% or 0%), that occurs before the lowest level of random assignment should be identified and recorded as a distinct intervention. For example, in a two-stage design where districts are first randomized to treatment or control, and then villages within treatment districts are randomized to receive an additional intervention, both the district-level treatment and district-level control should be extracted and counted as separate interventions, along with the intervention randomized at the village level.
In a crossover design, count each roll-out timing as a different intervention.
This information is mostly found in the experimental/study design or methods sections of a paper. If there are study participant flow diagrams, these may illuminate the distinct interventions administered to treatment groups.
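
One way to keep the count consistent is to list the labels first and count them afterwards. The Python sketch below assumes the coder has already sorted the labels into three groups following the rules above (and has already left out anything that happens after the study period); the function and argument names are hypothetical.

```python
def count_interventions(program_interventions,
                        higher_level_statuses=(),
                        rollout_timings=()):
    """Count distinct interventions (intNum) from three groups of labels.

    program_interventions: interventions or packages evaluated in the paper.
    higher_level_statuses: treatment and control statuses assigned above the
        lowest level of randomization in a multi-stage design.
    rollout_timings: roll-out timings within the study period (crossover).
    """
    labels = (set(program_interventions)
              | set(higher_level_statuses)
              | set(rollout_timings))
    return len(labels)


# Two-stage design: constituency-level treatment and control, plus observer visits.
two_stage = count_interventions(
    ["Visit by registration observers"],
    higher_level_statuses=["Treatment at the constituency level",
                           "Control at the constituency level"],
)
# Crossover design: one program intervention plus two timings within the study.
crossover = count_interventions(
    ["Free deworming treatment"],
    rollout_timings=["treatment in 1998", "treatment in 1999"],
)
print(two_stage, crossover)
```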

EXAMPLES

Barrera-Osorio et al., 2022: Have two distinct interventions. "Out of this sample, 140 schools were randomly assigned to each of the two treatment arms – recognition or in-kind performance rewards – and 140 schools were randomly assigned to the control." For this field, the response would be 2 interventions. [see: Intervention, sample and experimental design]

Answer: 2

Ozler et al, 2018: Have four separate arms and four unique intervention components. Out of the four interventions, one was common to all arms though not part of the status quo in child care centers outside the study sample. The response for this field would be 4 interventions: 1. Learning materials and supplies, 2. teacher training and mentoring, 3. teacher incentives, and 4. parenting training. [see: Interventions]

Answer: 4

Egger et al., 2022: The two interventions are "cash transfer, high saturation" and "cash transfer, low saturation". The cash transfer provided to the household is the same in both interventions, but in one arm a larger share of households receive the transfer, so the intensity of treatment at the village level is different. This means there are two different (village level) interventions. [see: Figure A1]

Answer: 2

Ichino and Schündeln (2012): Used "a two-stage randomized design with blocking". In the first stage, constituencies were randomly assigned into treatment and control. In the second stage, approximately 25% of the electoral areas in each constituency were randomly selected to be visited by registration observers.

There are three interventions in this experiment: (1) treatment at the constituency level; (2) control at the constituency level; and (3) visit by registration observers.

Answer: 3

Miguel and Kremer (2004): Have a crossover design, in which the deworming treatment is phased in to different schools in different years: "Group 1 schools received free deworming treatment in both 1998 and 1999, Group 2 schools in 1999, while Group 3 schools began receiving treatment in 2001" (page 165). There are three "interventions" in this study. One is the program intervention, "free deworming treatment". The other two are "timing interventions" specific to the crossover design: treatment in 1998 and treatment in 1999.

Note that "treatment in 2001" happened after the study period, so should not be included in the interventions.

Answer: 3
Intervention label (intLabel)

CODING INSTRUCTIONS

Use the short name of the intervention verbatim as described in the paper and used by the author in tables and figures.
The intervention label will be used later to form the names of the study arms and map the unit of randomization and stratification variables to each intervention.
Information to derive the name is mostly found in the experimental/study design or methods sections of a paper. The coder may also consult how an intervention is described in tables (for example, if a single intervention has been assigned to a treatment group, then authors may use a brief name to describe this intervention in tables presenting treatment effects). Labels for treatment groups in participant flow diagrams may also provide starting points for brief names for interventions.

EXAMPLES

Barrera-Osorio et al., 2022: Section 2.1 (Performance-based reward program, p2) states "Rewards took the form of either goods (in-kind) or recognition, depending on the treatment arm to which the teacher's school was assigned. The value of the reward was determined on an absolute scale, without relative performance comparisons to other teachers."

Answer: "Public recognition of high-performing teachers", "In-kind reward for high-performing teachers"

Ozler et al, 2018: In this study, section 2.3 (Interventions, p451) describes the interventions. The 4 distinct interventions are – (1) "Provision of play and learning materials (intervention common to all arms)", (2) "Training and mentoring of teachers", (3) "Teacher incentives", and (4) "Parenting training".

Answer: "Provision of play and learning materials", "Training and mentoring of teachers", "Teacher incentives", "Parenting training"

Ichino and Schündeln (2012): Used "a two-stage randomized design with blocking". In the first stage, constituencies were randomly assigned into treatment and control. In the second stage, approximately 25% of the electoral areas in each constituency were randomly selected to be visited by registration observers.

There are three interventions in this experiment and the labels are: (1) treatment at the constituency level; (2) control at the constituency level; and (3) visit by registration observers.

Answer: "Treatment at the constituency level", "Control at the constituency level", "Visit by registration observers"
The total number of study arms including control (armNum)

CODING INSTRUCTIONS

Enter the total number of study arms created by the randomized assignment of interventions. A study arm is a subgroup of experimental units that receive the same (set of) interventions. Include the control group(s) in the total number of study arms.
In a factorial design, include groups of experimental units that received more than one intervention as separate arms. That is, if there are two interventions (A & B) and participants are assigned to either A alone, B alone, the combination of A & B, or a control group, then this would count as 4 study arms.
In a crossover design, each arm should include at least one intervention and a timing indicated in the intervention field. For example, intervention A implemented at timing X should be counted as one arm while intervention A implemented at timing Y is another arm.
This information is mostly found in the intervention details, randomization or methods section of the paper. If participant flow diagrams are available, please consult them to see how many arms are in the study. Tables that present the treatment effects can illuminate the different arms as well.
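
The factorial rule above (two interventions A and B give four arms) can be checked by enumeration. The Python sketch below assumes a full factorial design, in which every combination of interventions forms an arm; many studies use only a subset of these combinations, so the result is an upper bound rather than the answer for every paper. The function name is hypothetical.

```python
from itertools import combinations


def factorial_arms(interventions):
    """List every arm of a full factorial design, control arm first."""
    arms = []
    for size in range(len(interventions) + 1):
        for combo in combinations(interventions, size):
            arms.append(combo)  # the empty tuple is the control arm
    return arms


arms = factorial_arms(["A", "B"])
print(len(arms), arms)
```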

EXAMPLES

Barrera-Osorio et al., 2022: "Out of this sample, 140 schools were randomly assigned to each of the two treatment arms – recognition or in-kind performance rewards – and 140 schools were randomly assigned to the control." There are 3 arms in this study: 2 treatment arms and 1 control arm.

Answer: 3

Ozler et al, 2018: Has four separate arms, with different subsets of 3 interventions assigned to the treatment arms. The four arms are: T1. Comparison group: Provision of play and learning materials; T2. T1 + Training and mentoring of teachers; T3. T2 + Teacher incentives; T4. T2 + Parenting training [see: Interventions].

Answer: 4

Miguel and Kremer (2004): Has three different arms: one arm that receives the deworming treatment in 1998 and 1999, one arm that receives the deworming treatment in 1999, and one comparison arm that receives the treatment after the data is collected.

Answer: 3
Mapping interventions to arms (armMap)

CODING INSTRUCTIONS

For each arm, select the name of the intervention that is assigned from the drop-down list, starting with the control arm(s).
If the control arm received no intervention (status quo), choose "None".
For arms that receive a combination of two or more interventions (including timing), select all of the interventions in the combination.
This information is mostly found in the intervention details, randomization or methods section of the paper. If participant flow diagrams are available, please consult them to see all the arms in the study.

EXAMPLES

Barrera-Osorio et al., 2022: "Out of this sample, 140 schools were randomly assigned to each of the two treatment arms – recognition or in-kind performance rewards – and 140 schools were randomly assigned to the control."

Answer: First, select "None". Next, select "Public recognition of teacher high performance". Finally, select "In-kind rewards for teacher high performance".

Knauer et al. (2020): Feature a factorial design in which each successive arm receives one or two interventions in addition to those received by the previous arm.

Answer: First, for arm 1, the coder would select "None". For arm 2, the coder would select "Storybooks". For arm 3, the coder would select "Storybooks + DRT + SMS". For arm 4, the coder would select "Storybooks + DRT + SMS + Booster Training". Finally, for arm 5, the coder would select "Storybooks + DRT + SMS + Booster Training + Home Visits".

Miguel and Kremer (2004): Have a crossover design, in which the deworming treatment is phased in to different schools in different years.

Answer: For arm 1, the coder would select "Free deworming treatment" + "treatment in 1998" + "treatment in 1999". For arm 2, the coder would select "Free deworming treatment" + "treatment in 1999". For arm 3, the coder would select "None".
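
The Miguel and Kremer mapping can also be written as a small data structure, which makes it easy to check that every arm uses only labels entered under intLabel and that no label is left unassigned. This Python sketch is illustrative only; the names `INT_LABELS`, `ARM_MAP` and `validate_arm_map` are hypothetical and do not describe how SurveyCTO stores the answers.

```python
# Intervention labels entered earlier for this experiment (from the example above).
INT_LABELS = ["Free deworming treatment", "treatment in 1998", "treatment in 1999"]

# Arm number -> interventions selected for that arm; an empty list means "None".
ARM_MAP = {
    1: ["Free deworming treatment", "treatment in 1998", "treatment in 1999"],
    2: ["Free deworming treatment", "treatment in 1999"],
    3: [],
}


def validate_arm_map(arm_map, labels):
    """Return a list of problems; an empty list means the mapping is consistent."""
    problems = []
    known = set(labels)
    for arm, interventions in arm_map.items():
        for name in interventions:
            if name not in known:
                problems.append(f"arm {arm}: unknown intervention {name!r}")
    used = {name for interventions in arm_map.values() for name in interventions}
    for name in sorted(known - used):
        problems.append(f"intervention {name!r} is not assigned to any arm")
    return problems


print(validate_arm_map(ARM_MAP, INT_LABELS))
```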
Unit of randomization (unitRand)

CODING INSTRUCTIONS

Select the unit(s) of randomization for the study. If the unit of randomization differs across interventions, select all units of randomization used.
Unit of randomization indicates the level at which the assignment of the intervention to study arms was done and at which the intervention is delivered.
If the selected units are "individual", "organization", "family" or "geographic unit", please also choose the subtype of unit in the follow-up field.
Geographic units include all spatial units or physical locations.
Select 'other' and fill out the textbox if the unit of randomization does not exactly match the pre-specified units in the CV.
This information is mostly found in the randomization or methods sections of a paper. The text should indicate that the unit was used to assign the treatment. If there are study participant flow diagrams, these may illuminate the units to which treatment was allocated.

EXAMPLES

Ozler et al, 2018: Is a cluster randomized trial, in which Community-Based Childcare Centers (CBCCs) were randomized into control, where the children received a learning kit, or the three treatment arms in which children also received learning kits and a combination of different interventions. The unit of randomization is the CBCC since that is the level at which any of the treatments were allocated [see: Study design and sample selection]. There is only one unit of randomization in this experiment. In the follow-up field prompting the specific answer, the response would be "Childcare Center".

Answer: Other: "Childcare Center"

Guiteras et al, 2014: Is a cluster randomized trial, in which communities were first randomized to receive a community motivation and health information campaign, or an information campaign combined with subsidies for the purchase of hygienic latrines, or a supply-side market access intervention linking villagers with suppliers and providing information on latrine quality and availability, or no interventions. Second, within the subsidy communities, eligible households were randomized to receive subsidy vouchers through household-level lotteries. There are two units of randomization in this experiment: community and household.

Answer: (selecting both) "Geographic unit" and "Household", then select "Community" for the follow-up field containing the sub-type for "Geographic unit". There is no follow-up field for the choice of "Household".
Mapping units of randomization to interventions (unitRandMap)

CODING INSTRUCTIONS

If there is only one unit of randomization used for treatment assignment, this question will be skipped. When there is more than one unit of randomization (e.g. schools are assigned to a teacher training program and then families within schools are assigned to a parental support program), each unit will have to be mapped to an intervention.
This information is mostly found in the randomization or methods sections of a paper or in a study participant flow diagram.

EXAMPLES

Leaver et al, 2021: There are 5 interventions. The units of randomization in the study are district-by-subject-family pairs and schools. Since there is more than one unit of randomization, each intervention must be mapped to its unit of randomization.

Answer:

  • Advertisement of fixed-wage contract - unit of randomization: District-by-subject-family pairs
  • Advertisement of pay for performance contract - unit of randomization: District-by-subject-family pairs
  • Advertisement of both fixed-wage and pay for performance contracts - unit of randomization: District-by-subject-family pairs
  • Implementation of pay for performance contract - unit of randomization: School
  • Implementation of fixed-wage contract - unit of randomization: School
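
The same answer can be expressed as a lookup from intervention to unit of randomization, then grouped by unit to confirm that every intervention has exactly one unit. The Python sketch below is an illustration; the variable and function names are hypothetical.

```python
from collections import defaultdict

# Intervention -> unit of randomization, copied from the Leaver et al. answer.
UNIT_RAND_MAP = {
    "Advertisement of fixed-wage contract": "District-by-subject-family pairs",
    "Advertisement of pay for performance contract": "District-by-subject-family pairs",
    "Advertisement of both fixed-wage and pay for performance contracts":
        "District-by-subject-family pairs",
    "Implementation of pay for performance contract": "School",
    "Implementation of fixed-wage contract": "School",
}


def interventions_by_unit(mapping):
    """Group interventions under the unit at which they were randomized."""
    grouped = defaultdict(list)
    for intervention, unit in mapping.items():
        grouped[unit].append(intervention)
    return dict(grouped)


for unit, interventions in interventions_by_unit(UNIT_RAND_MAP).items():
    print(unit, len(interventions))
```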
Block randomization (block)

CODING INSTRUCTIONS

Please indicate if block randomization is used when assigning any units of randomization.
In most papers, block assignment will appear in the description of the randomization procedure and will use phrases such as "blocked assignment" or "treatment within blocks." Block randomization is intended to equalize the number of units per treatment arm.
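
To make the idea concrete, the Python sketch below shows one simple way block randomization can be carried out: units are shuffled within each block and then dealt to arms in turn, so arm sizes within a block differ by at most one. This is a generic illustration, not the procedure of any paper cited here; the function name, unit names, district names and seed are made up.

```python
import random
from collections import defaultdict


def block_randomize(units, block_of, arms, seed=0):
    """Assign units to arms separately within each block.

    units: list of unit identifiers.
    block_of: dict mapping each unit to its block (for example, a district).
    arms: list of arm labels.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of[unit]].append(unit)
    assignment = {}
    for block in sorted(blocks):
        members = blocks[block][:]
        rng.shuffle(members)
        # Deal shuffled units to arms in turn to balance arm sizes in the block.
        for position, unit in enumerate(members):
            assignment[unit] = arms[position % len(arms)]
    return assignment


# Hypothetical data: 8 centers in 2 districts, assigned to four arms.
ARMS = ["T1", "T2", "T3", "T4"]
units = [f"center{i}" for i in range(8)]
block_of = {u: ("District A" if i < 4 else "District B") for i, u in enumerate(units)}
assignment = block_randomize(units, block_of, ARMS, seed=1)
print(assignment)
```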

EXAMPLES

Ozler et al. 2018: A "block randomization" was used to assign childcare centers in each district to the four study arms.

Answer: Yes
Mapping blocks to interventions (blockunitRand)

CODING INSTRUCTIONS

Select all the units of randomization for which blocks were used in assignment.

EXAMPLES

Ozler et al. 2018: A "block randomization" was used to assign childcare centers in each district to the four study arms.

Answer: Childcare center
Number of stratification variables (strataNum)

CODING INSTRUCTIONS

Please enter the number of stratification variables used in the assignment of interventions. If there is no stratification in the assignment, please enter 0.

EXAMPLES

Berman et al. (2019): The authors note that they stratify by province, share of respondents in the baseline survey that report at least occasional access to electricity, and the share of respondents reporting that the district governor carries the most responsibility for keeping elections fair.

Answer: 3

Dupas (2011): The author notes that the randomization procedure is stratified by teacher training status.

Answer: 1
Stratification variables (strataLabel)

CODING INSTRUCTIONS

List each stratification variable in the order in which it appears in the paper.
Use indicators for being assigned to an intervention (e.g. indicator for being assigned to provider incentive) if randomization at a lower level is stratified by groups generated by higher-level randomization of one or more other interventions.
If local terms are used to define the strata - for example, woreda, oblast, union, or taluka - please retain the local term rather than using the author's translation of them into English.
Note that strata can be used in both sampling (how units came to be in the study) and randomization (how units came to be in the study arms). Stratified sampling is used to ensure that each stratum has a fixed representation in the sample. Be careful to mark stratification in this field only if the paper does stratified randomization.
Note that group fixed effects included in the estimation or in tables do not mean that these are stratified fixed effects. Only include these if the authors discuss stratifying their treatment assignment.
This information can be found in the randomization and methods sections of the paper. In general, discussions of stratified randomization should come after discussions of stratified sampling, if applicable.

EXAMPLES

Andrew et al, 2018: The response to 10 for this paper is "Strata". The paper states: "Randomisation was done at the level of the cluster (town), after stratification by region. Within each of the 3 regions, 8 towns were randomly allocated to each of the 3 treatment groups and the control group using computer-generated random numbers" [Randomization and masking].

The response to this field would be "region" since that is the variable that makes up the strata as indicated in the paper.

Answer: region

Freeman et al. (2022): Authors use a stratified randomization design so the response to the previous question asking if the intervention was stratified is "Strata". The authors note "a stratified random design at the woreda-level was used to assign an equal number of study kebeles to either the Andilaye intervention or the control group receiving no intervention".

The response to this question is "Woreda", as it is the local term used consistently throughout the paper.

Answer: Woreda
Stratification for study arms (strataSame)

CODING INSTRUCTIONS

Indicate whether the same (set of) stratification variables are used for assigning interventions to all study arms.
Note that for two-stage randomizations, the treatment arm assignment in the first stage can be the basis for stratification in the second stage.
This information is found in the randomization or methods sections of the paper. A flow diagram depicting the experiment may also contain helpful information for this field.

EXAMPLES

Ozler et al, 2018: A "block randomization" was used to assign childcare centers in each district to the four study arms. "Centers were grouped based on mean height-for-age (HAZ) and Peabody Picture Vocabulary Test (PPVT - a measure of receptive vocabulary) z-scores, both of which were collected at baseline. The Ministry held a public lottery at each district capital where a representative from each center was asked to draw a colored dot from an envelope to determine that center's treatment status."

The response to this question would be "Yes", as the same set of variables was used for stratification.

Answer: Yes

Wolf et al. 2019: There are two stages of randomization. In the first stage, the intervention assignment was stratified by district and public/private status of the school. In the second stage, the interventions were assigned within groups created by treatment assignment in the first stage. The stratification variable is an indicator for teacher training/parental awareness assignment.

The response to this question would be "No", since the stratification variables are different across interventions.

Answer: No
Mapping stratification variables to study arms (strataMap)

CODING INSTRUCTIONS

Select the stratification variables used in creating each study arm.
If the same (set of) stratification variables is used for all treatment assignments, this question will be skipped.
This information is mostly found in the randomization and methods sections of the paper. For most papers, this information will appear in the same paragraph as the one describing the stratification procedure.

EXAMPLES

Wolf et al, 2018: Has 4 interventions and 5 study arms:

  • Teacher training and coaching program
  • Parental awareness meetings
  • Text messages for teachers
  • Picture-based paper flyers or texts for parents

Based on the information in the section "Randomization", in a first stage of randomization, two of the interventions (teacher training and coaching program; and parental awareness meetings) were stratified by district and public/private status of the school. The text messages for teachers and the texts/flyers for parents were assigned in a second stage of randomization. These interventions were stratified by treatment assignment in the previous stage, so the stratification variables for this assignment of interventions to arms would be indicators for being assigned to two of the arms.

So for this field, the stratification mapping for 5 study arms would be:

  • Control - district and public/private status of the school
  • Teacher training and coaching program - district and public/private status of the school
  • Parental awareness meetings - district and public/private status of the school
  • Text messages for teachers - indicator for school being part of teacher training and coaching program
  • Picture-based paper flyers for parents - indicator for school being part of program with teacher training & coaching and parental awareness meetings
Answer: Control - district and public/private status of the school; Teacher training and coaching program - district and public/private status of the school; Parental awareness meetings - district and public/private status of the school; Text messages for teachers - indicator for school being part of teacher training and coaching program; Picture-based paper flyers for parents - indicator for school being part of program with teacher training & coaching and parental awareness meetings
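The relationship between this field and the preceding "Stratification for study arms" field can be sketched in code: if every study arm shares the same set of stratification variables, the earlier answer is "Yes" and this mapping is skipped. The names strata_map and strata_same below are hypothetical helpers for illustration, not part of the survey instrument.

```python
# Hypothetical encoding of the mapping above: study arm -> stratification variables.
FIRST_STAGE = ("district", "public/private status of the school")

strata_map = {
    "Control": FIRST_STAGE,
    "Teacher training and coaching program": FIRST_STAGE,
    "Parental awareness meetings": FIRST_STAGE,
    "Text messages for teachers": (
        "indicator for school being part of teacher training and coaching program",
    ),
    "Picture-based paper flyers for parents": (
        "indicator for school being part of program with teacher training "
        "& coaching and parental awareness meetings",
    ),
}


def strata_same(mapping):
    """Return "Yes" if all arms use the same set of stratification variables."""
    distinct_sets = {frozenset(variables) for variables in mapping.values()}
    return "Yes" if len(distinct_sets) == 1 else "No"


print(strata_same(strata_map))
```

Here the second-stage arms use different variables from the first-stage arms, so the sketch returns "No", and the arm-by-arm mapping is required.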
Other randomization methods (randDescrip)

CODING INSTRUCTIONS

Please provide a description of the randomization method and process used in the experiment. Please extract the information verbatim from the paper.

EXAMPLES

None provided

Number of units of analysis in the experiment (unitAnaNum)

CODING INSTRUCTIONS

Count the number of unique units of analysis at which treatment effects are estimated in the experiment.
Multiple outcomes and treatment effects can be estimated at the same unit of analysis. The same unit of analysis should only be counted once.
If treatment effects for the unit of analysis are only reported in an appendix or supplementary materials, please do not count them for the purposes of this field.
For heterogeneous treatment effects (which meet the criteria to be included in IDEAL), please include the corresponding units of analysis in the count.

EXAMPLES

Ashraf et al., 2010: Include treatment effects for the full sample in Tables 2, 3, 4 and 5.

The unit of analysis in Table 2 is "Household" for the outcome "Household purchased Clorin (dummy)".

The unit of analysis in Table 3 is also "Household" for the two outcomes: "Water currently treated with Clorin" and "Drinking water contains free Clorin".

The two outcomes from Table 3 are also in Table 4.

The unit of analysis in Table 5 is "Household" for two outcomes: "Bottle exhausted?" and "Use Clorin for non-drinking water purposes".

Therefore, there is only ONE (1) unit of analysis in this experiment.

Answer: 1
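The counting rule (each unit of analysis is counted once, however many tables or outcomes use it) amounts to counting distinct values. A minimal sketch, assuming the unit for each table has already been read off the paper; table_units and unit_ana_num are hypothetical names.

```python
# Units of analysis read off the Ashraf et al. (2010) tables in the example above.
table_units = {
    "Table 2": "Household",
    "Table 3": "Household",
    "Table 4": "Household",  # repeats the two outcomes from Table 3
    "Table 5": "Household",
}

# The same unit is counted only once, regardless of how many tables use it.
unit_ana_num = len(set(table_units.values()))
print(unit_ana_num)
```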
Unit of analysis variable (unitAnaLabel)

CODING INSTRUCTIONS

Enter the unit of analysis variable as it appears in the paper. For example, for pregnant women visiting health facilities, the unit of analysis might be referred to as "woman" or "patient" in different papers. Please write down the exact unit used in the paper.
This information can be found in the results, data, and table notes in the paper, or in the supplementary materials.

EXAMPLES

Ozler et al. 2018: Include treatment effects estimated using various units of analysis. The treatment effects on child assessments and behavioral problems (Tables 3&4) were estimated at the child level, so the unit of analysis of those outcomes is child.

Table 5 includes impacts on parenting quality, for which the unit of analysis is "primary caregiver" (see section 2.4.2, page 453).

For impacts on CBCC outcomes in Table 6, the unit of analysis is Community-Based Childcare Center (CBCC).

Answer: child; primary caregiver; Community-Based Childcare Center (CBCC)
Unit of analysis category (unitAnaCV)

CODING INSTRUCTIONS

Select the unit of analysis for the treatment effect using the controlled vocabulary (CV).
This information can be found in the results, data, and table notes in the paper, or in the supplementary materials.

EXAMPLES

Ozler et al. 2018: Include treatment effects estimated using various units of analysis. The treatment effects on child assessments and behavioral problems (Tables 3&4) were estimated at the individual child level, so the unit of analysis of those outcomes is child.

From the pre-specified list of options, a coder would first choose "1. Individual" as the broad category and then choose "Child" as the category.

Table 5 includes impacts on parenting quality, for which the unit of analysis is primary caregiver. Similarly, a coder would first choose "1. Individual" as the broad category and then choose "1.11 Parent" as the unit of analysis category.

For impacts on CBCC outcomes in Table 6, the unit of analysis is Community-Based Childcare Center (CBCC). A coder would first choose "2. Organization or legal entity" and then "2.9 Other organization or legal entity" to type "Childcare center".

Answer: Child; 1.11 Parent; 2.9 Other organization or legal entity
Number of exhibits with treatment effects (tableNum)

CODING INSTRUCTIONS

Count the tables and figures in the main text that report treatment effects for the full evaluation sample, using ITT or any authors' preferred estimand.
In the IDEAL project, the main quantity of interest (or estimand) is the intention-to-treat (ITT) effect using the entire experimental sample. IDEAL also collects treatment effects estimated using authors' preferred estimand other than ITT.
Full sample refers to the full sample relevant for the outcomes included in the estimation of treatment effects; it is meant to contrast with subsamples created to estimate heterogeneous treatment effects. Note that if treatment effect estimates are available only for children because the outcome variable was only measured for a sample of children (e.g. stunting or child development), this counts as a full sample estimate. Likewise, if a table only includes estimates for healthcare providers because the outcome variable was measured only for this group (e.g. quality of care), this still counts as a full sample estimate even if other tables are concerned with different populations (e.g. patients). On the other hand, if an estimation is restricted to a certain gender or wealth quintile or any other subsample to demonstrate heterogeneity of treatment effects, this would not be included in IDEAL unless there were no full-sample treatment effects reported in the paper.
Count figures even if the main text figure does not report exact estimates of treatment effects or their precision but rather only includes this information in an appendix or supplementary materials. However, do not include the exhibit if there are no estimates and precision values that accompany it in the paper or its appendix.
Count tables that present treatment effects as group means and standard deviations if there is a point estimate and a formal test of treatment effects (e.g., t-test) and precision statistics reported. The treatment effect can be estimated later.
Count treatment effects only reported in the text but not in any of the exhibits as a pseudo table and label it as "Text only". When reporting, please group all those treatment effects in one table even though they may appear in different parts of the paper.
Do not include tables that report quasi-experimental estimates (e.g. using treatment assignment as an instrumental variable) unless this is the preferred specification of the authors.
Do not include tables that report heterogeneous treatment effects only for a subgroup or a subsample, except when the paper only reports heterogeneous treatment effects or when heterogeneous treatment effects are the primary research question.
Do not include tables reporting only robustness checks.

EXAMPLES

Ozler et al, 2018: The paper has a total of 13 tables and 2 figures. Of all the exhibits, 11 tables report treatment effects (i.e. Tables 3 through 13). However, Table 7 reports a robustness check and Table 8 reports quasi-experimental results using treatment assignment as an instrumental variable. These two tables should not be included. Thus, this paper has 9 tables with treatment effects for the full sample.

Answer: 9

Leaver et al. 2021: Include 4 figures and 6 tables. Figure 1 and Tables 1 and 2 report experimental design and baseline characteristics. Figures 2, 3 and 4, and Tables 3, 4 and 5 include treatment effects for the full evaluation sample. Table 6 reports quasi-experimental results. Therefore, there are 6 tables or figures with full sample treatment effects in the paper.

Answer: 6

Riley 2024: Has 6 tables and 2 figures. Figures 1 and 2 present take-up and balance. Table 2 only reports heterogeneous effects, and the remaining 5 tables include at least one set of treatment effects for the full sample.

Answer: 5

Ara et al. 2019: The outcome variables, Median duration of EBF and Median duration of any breastfeeding should be included in Table 2 because formal p-values are reported in the text.

Kondylis et al. 2016: The authors report treatment effects in Tables 6-10. However, they only report heterogeneous treatment effects by the gender of the farmer. In this case, we would include tables 6-10 for this paper.
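The inclusion rules for this field can be read as a filter over a paper's exhibits. The sketch below applies a simplified subset of those rules to the Ozler et al. (2018) example. The Exhibit class and its flags are hypothetical, the heterogeneous-effects and "Text only" rules are deliberately left out, and the assumption that neither figure reports treatment effects is inferred from the example's wording rather than stated in it.

```python
from dataclasses import dataclass


@dataclass
class Exhibit:
    label: str
    reports_treatment_effects: bool
    robustness_only: bool = False
    quasi_experimental: bool = False
    authors_preferred: bool = False  # only matters for quasi-experimental exhibits


def counts_toward_table_num(exhibit):
    """Apply a simplified subset of the inclusion rules for tableNum."""
    if not exhibit.reports_treatment_effects:
        return False
    if exhibit.robustness_only:
        return False
    if exhibit.quasi_experimental and not exhibit.authors_preferred:
        return False
    return True


# Ozler et al. (2018): Tables 3 through 13 report treatment effects;
# Table 7 is a robustness check and Table 8 is quasi-experimental.
exhibits = [
    Exhibit(
        label=f"Table {n}",
        reports_treatment_effects=(n >= 3),
        robustness_only=(n == 7),
        quasi_experimental=(n == 8),
    )
    for n in range(1, 14)
]
# Assumption: the two figures do not report treatment effects.
exhibits += [Exhibit(label=f"Figure {n}", reports_treatment_effects=False) for n in (1, 2)]

included = [e.label for e in exhibits if counts_toward_table_num(e)]
table_num = len(included)
print(table_num, included)
```

The labels in included are the same ones that would then be entered one at a time in the exhibit label field.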

Exhibit label (tableLabel)

CODING INSTRUCTIONS

List the table numbers as they appear in the main text that include any full sample treatment effects.
This is the first of a set of repeated fields; enter only one label at a time.
If the tables include letters or words such as "TABLE 1" or "TABLE 1A", please retain the exact label.
For treatment effects only reported in the text but not in any exhibit, please use the label "Text only". All those treatment effects should be grouped in this "Text only" exhibit, no matter where they appear in the paper.
Do not include the caption of the table, such as "18-month follow up impact" or "Impact on secondary outcomes".

EXAMPLES

Ozler et al, 2018: The labels are: Table 3, Table 4, Table 5, Table 6, Table 9, Table 10, Table 11, Table 12, and Table 13. First enter only "Table 3" in this field and answer the questions about Table 3, then repeat the process for each of the remaining tables.

Answer: Table 3

Leaver et al. 2021: Using the order of appearance in the paper, the labels are: Figure 2, Figure 3, Table 3, Figure 4, Table 4, and Table 5.

Answer: Figure 2
Presence of heterogeneous analysis

CODING INSTRUCTIONS

If the authors include heterogeneous analyses in the main paper, select the variable(s) that characterize the subgroups in the heterogeneous analyses.
For subgroups not included in the controlled vocabulary, follow exactly the wording of the authors as they describe the subgroup(s).
Note that not all papers will include heterogeneous analyses, and some of those that do will only include them in the supplementary materials. Only collect subgroups included in the main paper.

EXAMPLES

Examples to be provided

Number of outcome variables in the experiment (outNum)

CODING INSTRUCTIONS

Count all the outcome variables reported in the study for which treatment effects are estimated.
If the outcome variable is an index and the treatment effects of the components are reported in the same exhibit, include both the index and all the components as separate outcomes. If treatment effects for the index components are only reported in an appendix or supplementary materials, do not count these as outcomes for the purposes of this field.
Include outcome variables that are reported in an exhibit, even if the exact point estimates can only be found in supplementary materials.
Do not include auxiliary outcomes that do not appear in the exhibit (but note that this does not include primary outcomes that were only moved to an appendix because they do not show an effect).
Outcomes that are only measured as marginal effects should also be included.

EXAMPLES

Guiteras et al, 2015: The published manuscript does not show any table in the main paper. Figures 1 & 2 report the treatment effects; however, we cannot obtain precise statistics such as point estimates and standard errors directly from the figures. From the notes of Figure 1, "Figure displays the sum of the estimated coefficients and the control group means found in columns (2) and (6) of table S2 and column (2) of table S3. (A) Any latrine access; (B) hygienic latrine access; (C) open defecation among adults", we learn that the estimated coefficients can be found in tables S2 and S3 in the supplementary materials. Figure 1 includes three outcomes: "(A) Any latrine access; (B) hygienic latrine access; (C) open defecation among adults". Figure 2 includes three outcomes: "(A) Any latrine ownership; (B) hygienic latrine ownership; (C) open defecation among adults."

Answer: 6

Ashraf et al., 2010: Include treatment effects for the full sample in Tables 2, 3, 4 and 5. The outcome in Table 2 is "Household purchased Clorin (dummy)". There is only one outcome. There are two outcomes in Table 3. They are "Water currently treated with Clorin" and "Drinking water contains free Clorin". The two outcomes from Table 3 are also in Table 4. The title of Table 5 includes Heterogeneity, however, the table reports the full sample estimates for two outcomes: "Bottle exhausted?" and "Use Clorin for non-drinking water purposes". Note that only full-sample treatment effects and their outcomes should be included.

Answer: 5
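The Ashraf et al. (2010) count can be reproduced by listing the outcomes per table and counting each distinct outcome once, since the two Table 3 outcomes reappear in Table 4. This is a sketch of that one example only; outcomes_by_table, unique_outcomes and out_num are hypothetical names.

```python
# Full-sample outcomes per table in Ashraf et al. (2010), as listed above.
outcomes_by_table = {
    "Table 2": ["Household purchased Clorin (dummy)"],
    "Table 3": [
        "Water currently treated with Clorin",
        "Drinking water contains free Clorin",
    ],
    "Table 4": [  # same two outcomes as Table 3
        "Water currently treated with Clorin",
        "Drinking water contains free Clorin",
    ],
    "Table 5": [
        "Bottle exhausted?",
        "Use Clorin for non-drinking water purposes",
    ],
}

# dict.fromkeys removes repeats while keeping the order of first appearance.
unique_outcomes = list(
    dict.fromkeys(
        outcome
        for outcomes in outcomes_by_table.values()
        for outcome in outcomes
    )
)
out_num = len(unique_outcomes)
print(out_num)
```

The ordered list in unique_outcomes corresponds to the names that would be entered in the outcome name field for this paper.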
Outcome name (outLabel)

CODING INSTRUCTIONS

Enter the complete name of every outcome as it appears in the table.

EXAMPLES

Guiteras et al, 2015: Figures 1 & 2 report the treatment effects of five unique outcomes. The outcome names can be found in the notes below Figure 1 and Figure 2.

Answer: Any latrine access, hygienic latrine access, open defecation among adults, any latrine ownership, hygienic latrine ownership

Ashraf et al. (2010): The outcome names can be found in Tables 2, 3, 4 and 5.

Answer: Household purchased Clorin (dummy), Water currently treated with Clorin, Drinking water contains free Clorin, Bottle exhausted?, Use Clorin for non-drinking water purposes
Outcome Result Type (outType)

CODING INSTRUCTIONS

Select "Full-sample Results" for all outcomes reported in the paper that are estimated using the full sample of treatment and control units.
Only select "Sub-sample Results" if subgroup or heterogeneous treatment effects are reported for the outcome. If you select this option, you will need to specify the number of subgroups or the number of interaction terms.
Select both options if exhibits report separate effects, at least one for full sample results, and at least one for sub-sample results.

EXAMPLES

Example 1 - Full-sample results only:

In Table 3 of Ozler et al, 2018, all treatment effects are estimated using the full sample of treatment and control units.

Answer: Full-sample Results

Example 2 - Sub-sample results only:

In Table 4 of Ganimian, Muralidharan, and Walters, 2023, treatment effects are reported separately for male and female students (subgroup results).

Answer: Sub-sample Results (specify: 2 subgroups)

Example 3 - Both full-sample and sub-sample results:

In Table 5 of Banerjee et al., 2020, the table reports both full-sample treatment effects and heterogeneous effects by gender (interaction terms).

Answer: Full-sample Results; Sub-sample Results (specify: 1 interaction term)
Outcome unit of analysis (outUnit)

CODING INSTRUCTIONS

Select the unit of analysis for the treatment effect. Note that the unit of analysis could be different from the unit of randomization.
Select 'other' and fill out the textbox if the unit of analysis does not exactly match the pre-specified units in the CV.
This information can be found in the results, data, empirical strategy sections, and table notes in the paper, or in the supplementary materials.

EXAMPLES

Ozler et al. 2018: Include treatment effects estimated using various units of analysis. The treatment effects on child assessments and behavioral problems (Tables 3&4) were estimated at the individual child level, so the unit of analysis of those outcomes is child. From the pre-specified list of options, a coder would first choose "1. Individual" as the broad category and then choose "1.12 Other" to enter "Child" as the unit of analysis, because "Child" was not an option in the list.

Table 5 includes impacts on parenting quality, for which the unit of analysis is primary caregiver. Similarly, a coder would first choose "1. Individual" as the broad category and then choose "1.12 Other" to enter "Primary caregiver" as the unit of analysis, because that was not an option in the list.

For impacts on CBCC outcomes in Table 6, the unit of analysis is childcare center. A coder would first choose "2. Organization or legal entity" and then "2.9 Other organization or legal entity" to type "Childcare center".

Answer: 1. Individual → 1.12 Other (Child); 1. Individual → 1.12 Other (Primary caregiver); 2. Organization or legal entity → 2.9 Other organization or legal entity (Childcare center)
Outcome Sub-sample Type (outSubHTE)

CODING INSTRUCTIONS

Select "Subgroup estimates" if sub-sample results for this outcome are estimated using a separate sample.
Select "Interaction terms" if heterogeneous treatment effects are reported for this outcome using multiple coefficient estimates related to the same outcome.

EXAMPLES

Example 1 - Subgroup estimates:

In Table 4 of Ganimian, Muralidharan, and Walters, 2023, treatment effects are reported in separate columns for male and female students.

Answer: Subgroup estimates

Example 2 - Interaction terms:

In Table 5 of Banerjee et al., 2020, heterogeneous treatment effects are reported using interaction terms like "Outcome × Gender" in the same column.

Answer: Interaction terms
Outcome Sub-sample label (outSubLabel)

CODING INSTRUCTIONS

For subgroups, use a dash to separate the outcome and the group label—for example: "Outcome A - Female" and "Outcome A - Male".
For interaction terms, use an "x" to indicate the interaction—for example: "Outcome x Gender".

EXAMPLES

Examples to be provided
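The two labelling conventions in the coding instructions can be sketched as small helper functions. The function names are hypothetical, and the outcome and group strings are the placeholders used in the instructions, not values from a real paper.

```python
def subgroup_label(outcome, group):
    """Subgroup convention: outcome and group label separated by a dash."""
    return f"{outcome} - {group}"


def interaction_label(outcome, variable):
    """Interaction convention: outcome and interacting variable joined by 'x'."""
    return f"{outcome} x {variable}"


print(subgroup_label("Outcome A", "Female"))
print(subgroup_label("Outcome A", "Male"))
print(interaction_label("Outcome", "Gender"))
```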

Outcome Sub-sample specification (outSubSpec)

CODING INSTRUCTIONS

Select "Outcome + OutcomexSubgroup" when coefficients for both outcome variable alone and the interaction term appear. For example, Outcome and Outcome x Female.
Select "Outcome + Subgroup + OutcomexSubgroup" when coefficients for all three terms appear: the outcome variable alone, the subgroup variable alone, and the interaction term. This is known as a fully interacted model. For example, Outcome + Female + Outcome x Female.
Select "OutcomexSubgroup=A + OutcomexSubgroup=B +...+OutcomexSubgroup=Z" when the coefficient for the outcome alone is not reported, and instead only coefficients for interaction terms with the subgroup are reported. For example: Outcome x Male and Outcome x Female.

EXAMPLES

Examples to be provided

Number of rounds of data collection in the experiment (roundNum)

CODING INSTRUCTIONS

Enter the total number of rounds of data collection reported in the paper, including baseline. A round collects data from the same data source at a given time. In other words, a round signals when different outcomes are measured.
For studies in which data is only collected by surveys of subjects in the sample, the number of rounds typically aligns with the number of surveys (e.g., baseline, midline, endline = 3 rounds).
If more than one survey sample collected by the authors is used to measure outcomes but they are collected around the same time (e.g. a survey of mothers and a survey of teachers), this should count as the same round.
However, any administrative data (election results, census data, etc.) that authors use should be considered a separate data source and thus necessitate a separate round. For example, administrative data collected at the endline should be separate from survey data collected at the endline. If a paper includes a baseline survey, midline survey, endline survey, administrative data collected at midline, and administrative data collected at endline, the number of rounds would be 5 rounds.
For administrative data, sometimes authors do not indicate when they collected the data. In this case, the number of rounds should go by data source (e.g., census data and election data = 2 rounds).
Do not include the rounds collected for the same study but not used or reported in the paper.
Do not include any focus group discussions or qualitative surveys here.
This can be found in the "data collection" or "data sources" section for papers. Sometimes, papers have explanatory diagrams on timeline and implementation which coders can consult to get the different points of time in which data was collected.

EXAMPLES

Freeman et al, 2022: Study collects household surveys and observation-based data at baseline, midline, and endline. In each round, both survey and observation data were collected at the same time [see: Data collection]. Additionally, Figure 2 illuminates the points in time when each of the data collection rounds was conducted.

Answer: 3

Pande & Field, 2008: Only use online endline data on loans and repayment in the current paper.

Answer: 1

Muralidharan et al. 2021: Use administrative data from three sources: 1. Register of landlords, 2. a record of check distribution maintained by the MAOs, and 3. bank records of check encashment [see B. data]. Table 1 suggests that the register data was collected between September and December 2017. Appendix C indicates that the authors received "the up-to-date MAO and bank-based datasets at three points in time: once in July, once in August and once in September 2018". Therefore, there are seven rounds of data collection in the study.

Answer: 7

De Hoyos et al. 2021: Include the following data: i. Student assessments: 2013, 2014, 2015; ii. Student survey: 2013, 2015; iii. Teacher survey: 2013, 2014, 2015; iv. Principal survey: 2014, 2015; v. National assessments: 2016; vi. Internal efficiency: 2013, 2014, 2015, 2016, 2017. Based on data source and time, there are 3 rounds of survey data, as they are conducted at the same time of year, 4 rounds of assessment data (school and national), and 5 rounds of administrative data on internal efficiency.

Answer: 12
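The arithmetic behind the last two examples can be sketched as counting distinct (source, time) pairs, with one exception taken from the coding instructions: surveys collected around the same time share a round. The function count_rounds and the "kind" tags below are hypothetical simplifications, and the time labels are shorthand for the dates in the examples.

```python
def count_rounds(collections):
    """Count rounds from (source, kind, times) records.

    Surveys fielded at the same time share one round; every other data
    source gets its own round for each point in time.
    """
    rounds = set()
    for source, kind, times in collections:
        for time in times:
            key = ("survey", time) if kind == "survey" else (source, time)
            rounds.add(key)
    return len(rounds)


de_hoyos = [
    ("Student assessments", "assessment", [2013, 2014, 2015]),
    ("Student survey", "survey", [2013, 2015]),
    ("Teacher survey", "survey", [2013, 2014, 2015]),
    ("Principal survey", "survey", [2014, 2015]),
    ("National assessments", "assessment", [2016]),
    ("Internal efficiency", "administrative", [2013, 2014, 2015, 2016, 2017]),
]

muralidharan = [
    ("Register of landlords", "administrative", ["Sep-Dec 2017"]),
    ("MAO record dataset", "administrative", ["Jul 2018", "Aug 2018", "Sep 2018"]),
    ("Bank-based dataset", "administrative", ["Jul 2018", "Aug 2018", "Sep 2018"]),
]

print(count_rounds(de_hoyos))
print(count_rounds(muralidharan))
```

For De Hoyos et al. this gives 3 survey rounds, 4 assessment rounds and 5 administrative rounds, matching the breakdown in the example.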
Round name (roundLabel)

CODING INSTRUCTIONS

Label the round of data collection with descriptive names, such as "baseline", "midline", "midline phone survey", "endline", "one-year follow-up", "three-year follow up" etc.
Retain any descriptive labels used by the study authors, such as "18-month follow up" or "23-month follow up."
Use the data source to differentiate datasets collected at the same time, for example, phone survey at follow-up and census data at follow-up.
This can be found in the "data collection" or "data sources" section for papers. Sometimes, papers have explanatory diagrams on timeline and implementation which coders can consult to get the different points of time in which data was collected. Tables with treatment effects can also illuminate the different rounds of data collection.

EXAMPLES

Freeman et al, 2022: Study collects household surveys and observation-based data at baseline, midline, and endline [see: Data collection]. Additionally, Figure 2 illuminates the points in time when each of the data collection rounds was conducted.

Answer: Baseline, Midline, Endline

Ozler et al, 2018: Use three rounds of data collection [see: Data sources]

Answer: Baseline, 18-month follow-up, 36-month follow-up

Muralidharan et al. 2021: Draw on three administrative data sources and have seven total rounds.

Answer: Register of landlords September – December 2017, Bank-based dataset July 2018, MAO record dataset July 2018, Bank-based dataset August 2018, MAO record dataset August 2018, Bank-based dataset September 2018, MAO record dataset September 2018

Pande & Field, 2008: Only use online endline data on loans and repayment in the current paper.

Answer: Endline

De Hoyos et al. 2021: Include the following data: i. Student assessments: 2013, 2014, 2015; ii. Student survey: 2013, 2015; iii. Teacher survey: 2013, 2014, 2015; iv. Principal survey: 2014, 2015; v. National assessments: 2016; vi. Internal efficiency: 2013, 2014, 2015, 2016, 2017. Based on data source and time, there are 3 rounds of survey data, as they are conducted at the same time of year, 4 rounds of assessment data (school and national), and 5 rounds of administrative data on internal efficiency.

Answer: Round 1: Student and teacher surveys in 2013, Round 2: Teacher and principal surveys in 2014, Round 3: Student, teacher, and principal surveys in 2015, Round 4: Student assessment 2013, Round 5: Student assessment 2014, Round 6: Student assessment 2015, Round 7: National assessment 2016, Round 8: Internal efficiency 2013, Round 9: Internal efficiency 2014, Round 10: Internal efficiency 2015, Round 11: Internal efficiency 2016, Round 12: Internal efficiency 2017
Data collection start date

CODING INSTRUCTIONS

Please enter the day, month and year as they appear in the main text of the paper or its supplementary materials.
Select "-99" if the information is not reported in the paper. For example, if only year is reported, select "-99" for month and for day.
This information should be found in the sections on data or experimental design in the main text of the paper. Some papers may also include a timeline of data collection in a figure or in the supplementary materials.

EXAMPLES

Freeman et al, 2022: Has three rounds of data collection: Baseline, Midline and Endline. Figure 2 provides the timeline for data collection.

Baseline: March – May, 2017

Midline: March – May, 2018

Endline: March – May, 2019

Answer: For Baseline, Year = 2017, Month = March, Day = -99
For Midline, Year = 2018, Month = March, Day = -99
For Endline, Year = 2019, Month = March, Day = -99
Data collection end date

CODING INSTRUCTIONS

Please enter the day, month and year as they appear in the main text of the paper or its supplementary materials.
Select "-99" if the information is not reported in the paper. For example, if only year is reported, select "-99" for month and for day.
Please calculate the corresponding end date for the round of data collection if only the start date and duration are available. For example, if the text reads "the data collection began in March 2011 and lasted for 2 months", select the calculated end date "May 2011" in this field.
This information should be found in the sections on data or experimental design in the main text of the paper. Some papers may also include a timeline of data collection in a figure or in the supplementary materials.
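
The end-date calculation described above can be sketched in Python as follows. This is only an illustration: the function name end_date_from_duration and the constant NOT_REPORTED are hypothetical and are not part of the SurveyCTO form. The arithmetic follows the guide's own example (March 2011 plus 2 months gives May 2011), and the day stays at -99 because it cannot be derived from a month-level start date.

```python
from typing import Tuple

NOT_REPORTED = -99  # survey code for a date part that the paper does not report


def end_date_from_duration(start_year: int, start_month: int,
                           duration_months: int) -> Tuple[int, int, int]:
    """Return (year, month, day) for the end of a data collection round.

    Assumes month-level precision, so the day is always coded as -99.
    """
    zero_based = (start_month - 1) + duration_months
    year = start_year + zero_based // 12
    month = zero_based % 12 + 1
    return year, month, NOT_REPORTED


# "the data collection began in March 2011 and lasted for 2 months"
print(end_date_from_duration(2011, 3, 2))  # (2011, 5, -99)
```

When an end date is obtained this way, the companion field "Data collection end date calculated from duration" would be answered Yes.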

EXAMPLES

Freeman et al, 2022: Has three rounds of data collection: Baseline, Midline and Endline. Figure 2 provides the timeline for data collection.

Baseline: March – May, 2017

Midline: March – May, 2018

Endline: March – May, 2019

The answers to this question would be:

Answer: For Baseline, Year = 2017, Month = May, Day = -99; For Midline, Year = 2018, Month = May, Day = -99; For Endline, Year = 2019, Month = May, Day = -99
Data collection end date calculated from duration

CODING INSTRUCTIONS

Yes if the end date is not directly reported in the paper and the date selected in "Data collection end date" was based on the coder's calculation using the data collection start date and duration reported in the paper.
No if the end date of data collection is reported and entered as it is described in the paper. If the end date is missing, please also select No for this field.

EXAMPLES

Freeman et al, 2022: Has three rounds of data collection: Baseline, Midline and Endline. Figure 2 provides the timeline for data collection.

Baseline: March – May, 2017

Midline: March – May, 2018

Endline: March – May, 2019

The paper reports the start and end time for data collection directly, so the end dates were not calculated based on the duration information.

The answers to this question would be:

Answer: For Baseline, No; For Midline, No; For Endline, No
Stage 1 Table-by-table

For the set of fields below, the coder goes through each table identified.

Number of comparisons (tableCompNum)

CODING INSTRUCTIONS

List the number of unique comparisons presented in the table. Any reported treatment effect is the result of a comparison of one group against another on an outcome, where a group can be a single study arm or a group of arms.
In a case where there are 4 experimental groups (Treatment A, Treatment B, Treatment A+B, and a Control), many comparisons may be reported. We may see a treatment effect for Treatment A relative to the Control, or we may see an estimate for the difference between Treatment A and the combination Treatment A+B. Understanding which comparison is being reported in a table or figure will be important if the row and column labels and table/figure notes do not contain this information.
This information can be present either in the top row or column of a table or it may be reported in the rows as the variables for which treatment effects are being reported. The base arm for comparison is often the omitted category in a regression and/or mentioned in the footnotes of a table. Descriptions of each treatment effect in the results section can also illuminate the contrast under study for each estimate.

EXAMPLES

Ozler et al, 2018 (Table 3):

  • We see 3 comparisons in the first panel of rows:
    1. T2 (teacher training) vs. Control
    2. T3 (T2 + teacher incentives) vs. Control
    3. T4 (T2 + parenting training) vs. Control
  • We see 1 additional comparison in the second panel of rows that pools all treatment groups into 1 group.
    1. Any Treatment (T2, T3, or T4) vs. Control
  • The fourth panel of rows contains the precision but not treatment effects for additional comparisons:
    1. T2 vs. T3
    2. T2 vs. T4
    3. T3 vs. T4
Answer: 4

Mbiti et al. 2019 (Table III): We see five comparisons in Panels A, B, and C: 1. Grants (α1) vs. None, 2. Incentives (α2) vs. None, 3. Combination (α3) vs. None, 4. Combination (α3) vs. Grants+Incentives (α2+α1), 5. Combination (α3) vs. Grants (α1).

Answer: 5
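
A minimal sketch of the counting rule, assuming each comparison is recorded as a pair of arm sets. The Comparison type, its field names, and the reports_effect flag are hypothetical helpers introduced for illustration. Excluding rows that show only precision is our reading of the Ozler et al. example above, not a stated rule of the survey form.

```python
from typing import FrozenSet, Iterable, NamedTuple


class Comparison(NamedTuple):
    evaluation_arms: FrozenSet[str]  # arm(s) being evaluated (pooled arms allowed)
    reference_arms: FrozenSet[str]   # arm(s) they are compared against
    reports_effect: bool             # False when only precision is shown


def count_comparisons(rows: Iterable[Comparison]) -> int:
    """Count unique evaluation/reference pairs that report a treatment effect."""
    unique_pairs = {
        (row.evaluation_arms, row.reference_arms)
        for row in rows
        if row.reports_effect
    }
    return len(unique_pairs)


control = frozenset({"Control"})
ozler_table_3 = [
    Comparison(frozenset({"T2"}), control, True),
    Comparison(frozenset({"T3"}), control, True),
    Comparison(frozenset({"T4"}), control, True),
    Comparison(frozenset({"T2", "T3", "T4"}), control, True),  # pooled "Any Treatment"
    Comparison(frozenset({"T2"}), frozenset({"T3"}), False),   # precision only
    Comparison(frozenset({"T2"}), frozenset({"T4"}), False),   # precision only
]

print(count_comparisons(ozler_table_3))  # 4
```

Storing arms as sets means a pooled group counts as one evaluation arm and the same contrast listed twice is counted once.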
Evaluation arm (armEval)

CODING INSTRUCTIONS

For each comparison, select the arm(s) being evaluated from the list of arms in the study
This information can be present either in the top row or column of a table or it may be reported in the rows as the variables for which treatment effects are being reported. Descriptions of each treatment effect in the results section can also illuminate the contrast under study for each estimate.
If authors pool separate study arms into a single group to compare against another, please select multiple study arms to describe the evaluation arm.
For factorial designs, if an interaction is being evaluated, select ALL the relevant arms in the interaction.

EXAMPLES

In table 3 from Ozler et al, 2018: We see 3 comparisons in the first panel of rows: 1) T2 (teacher training) vs. Control, 2) T3 (T2 + teacher incentives) vs. Control, 3) T4 (T2 + parenting training) vs. Control. We see 1 additional comparison in the second panel of rows that pools all treatment groups into 1 group: 4) Any Treatment (T2, T3, or T4) vs. Control. Thus, there are 4 different evaluation arms here.

Answer: For evaluation arm 1: Select Teacher training only; For evaluation arm 2: Select Teacher training & teacher incentives only; For evaluation arm 3: Select Teacher training & parenting training only; For evaluation arm 4: Select teacher training, teacher training & teacher incentives, and teacher training & parenting training

In Table III from Mbiti et al. 2019, we see five comparisons in Panels A, B, and C: 1) Grants (α1) vs. None, 2) Incentives (α2) vs. None, 3) Combination (α3) vs. None, 4) Combination (α3) vs. Grants+Incentives (α2+α1), 5) Combination (α3) vs. Grants (α1). There are 5 different evaluation arms here.

Answer: For evaluation arm 1: Grants; For evaluation arm 2: Incentives; For evaluation arm 3: Combination; For evaluation arm 4: Combination; For evaluation arm 5: Combination
Reference arm (armNonEval)

CODING INSTRUCTIONS

For each comparison, select the reference arm for the comparison.
In most cases for a parallel design, the reference arm is the control arm.
This information may appear in the footnotes of the table, for example, "Control is the base arm".

EXAMPLES

Ozler et al, 2018 Table 3: The reference arm for this table is the control group. Here, coders should select the arm "Learning kits" that was created when they mapped intervention to arms.

Answer: Learning kits (control group)

Mbiti et al., 2019 Table III: The five reference arms for this table are: 1. None, 2. None, 3. None, 4. Grants + Incentives [select two interventions], 5. Grants.

Answer: None; None; None; Grants + Incentives; Grants
Estimand (tableestFull)

CODING INSTRUCTIONS

In the IDEAL project, the main quantity of interest (or estimand) is the intention-to-treat (ITT) effect using the entire experimental sample. At the same time, IDEAL also collects treatment effects estimated using authors' preferred estimand other than ITT, such as LATE/TOT.
In this question, please indicate all the estimands reported in the table, whether they are preferred by IDEAL or by the authors.
An effect is intent-to-treat (ITT) if the authors are interested in estimating the effect on everyone assigned to receive treatment, regardless of whether or not they actually received the treatment. When there is perfect compliance (for example, in a population of 200, 100 people are randomly assigned to treatment and all 100 are actually treated), the ITT is the same as the Average Treatment Effect (ATE).
An effect is a local average treatment effect (LATE) if the authors are estimating the effect among those who comply with treatment assignment.
An effect is treatment on the treated (TOT)/average treatment on the treated (ATET) if the authors are estimating the effect on those who actually take up the treatment and non-compliance is one-sided. That is, the control group cannot or does not get the treatment.
Usage of the ITT/LATE/TOT/ATET terminology is mostly seen in the discipline of Economics. For studies from other disciplines, ascertaining whether or not an effect is meant to be an ITT/LATE/TOT/ATET effect will require inference from what is described in a section on Methods.
Read through the methods sections, tables and table notes to find information on the estimand used for listed outcomes. The estimands may vary across outcomes and rounds of data collection in a paper.
Select "Only ITT estimates" if only ITT is reported for all of the listed outcomes
Select "Only LATE/TOT estimates" if only LATE/TOT is reported for all of the listed outcomes
Select "Both ITT and LATE/TOT estimates" if both ITT and LATE/TOT estimates are reported for any of the listed outcomes. If ITT is reported for one set of outcomes and LATE/TOT is reported for another set of outcomes, select this option.
Select "Neither" if neither of the estimands is reported for any of the listed outcomes.
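
The selection rules above can be sketched as a small function. This is an illustration only: the function name, the constants, and the input format (one set of estimand labels per listed outcome) are assumptions, and the returned strings follow the option labels used in the examples of this guide.

```python
from typing import Iterable, Set

ITT = "ITT"
LATE_TOT = "LATE/TOT"


def table_estimand_option(estimands_by_outcome: Iterable[Set[str]]) -> str:
    """Map the estimands reported for each listed outcome to a survey option."""
    reported: Set[str] = set()
    for estimands in estimands_by_outcome:
        reported |= estimands
    has_itt = ITT in reported
    has_late_tot = LATE_TOT in reported
    if has_itt and has_late_tot:
        # Also covers ITT for one set of outcomes and LATE/TOT for another.
        return "Both ITT and LATE/TOT estimates"
    if has_itt:
        return "Only ITT estimates"
    if has_late_tot:
        return "Only LATE/TOT estimates"
    return "Neither"


# ITT for every outcome, plus LATE for one of them
print(table_estimand_option([{ITT}, {ITT, LATE_TOT}]))
```

Because the rule is applied to the union of estimands across outcomes, a table is coded as "Both" as soon as any outcome has an ITT estimate and any outcome has a LATE/TOT estimate.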

EXAMPLES

Baysan, 2022: In the section D. Implementation, the author notes "... Therefore, I estimate only the intent-to-treat (ITT) effect". There are four exhibits in the paper. Figure 1 and Table 2 report full-sample treatment effects using ITT. Table 1 and Figure 2 only present treatment effects by quartile instead of for the full sample, so they are not considered in IDEAL data extraction.

Answer: Only ITT estimates

Yoshikawa et al., 2015: The ITT/LATE/TOT/ATET terminology is not used. In the Data Analysis Strategy section, the estimation equation indicates that treatment effects are estimated based on the treatment assignment, i.e. being in a classroom in a FULL UBC prekindergarten. Thus, the estimand is ITT by inference. There are two exhibits in the paper including treatment effects. Table 1 reports the treatment effects for the full sample of teachers, and Table 2 presents the treatment effects for the full sample of children, both using ITT estimand.

Answer: Only ITT estimates

Sigh et al., 2018: Do not use the ITT/LATE/TOT/ATET terminology and do not use an equation to describe the estimation. In 2.9 Statistical Analysis, they write "An analysis of the effectiveness of the intervention was based on the randomization of the product the patients were originally assigned using all available case data including patients missing follow-ups and dropouts. This analysis method handles missing data by fitting a statistical model over all available case data without introducing bias.", which implies the estimand is ITT as the analysis estimates treatment effects on patients originally assigned to receive the product. Table 4 reports the treatment effects through differences in means for all patients.

Answer: Only ITT estimates

Linhares et al., 2022: Performed both ITT and TOT analyses; however, only TOT results are presented in the paper exhibits (Tables 3 and 4). The ITT results are omitted because the effects are not significant, so the ITT estimates cannot be extracted from the paper.

Answer: Only LATE/TOT estimates

Ganimian, Muralidharan, and Walters, 2023: Present both ITT and TOT results using two samples (i.e. HH assessments and AWC assessments). The authors explain in the Introduction, "Moreover, treatment-on-the-treated effects obtained by scaling the household sample estimates by the share of children observed at the center at the endline are close to the AWC and common sample estimates. We therefore interpret the AWC estimates as reflecting treatment effects on children who actively attended the centers, while the household estimates capture intent-to-treat-style impacts on the set of eligible children, many of whom had limited treatment exposure." In Tables 2, 5 and 6, both ITT and TOT effects on assessment scores are reported.

Answer: Both ITT and LATE/TOT estimates

Banerjee et al., 2020: ITT estimates are reported for all outcomes (Tables 1 - 4). In addition, LATE estimates are also presented for learning outcomes (in Table 4).

Answer: Tables 1-3: Only ITT estimates; Table 4: Both ITT and LATE/TOT estimates
Outcomes in table (tableOut)

CODING INSTRUCTIONS

Display all the outcomes for which full sample treatment effects are reported in the table
These may appear on the heading row or the first column of the table.

EXAMPLES

Ozler et al, 2018 Table 3: Presents 5 outcomes. These are present in the top row of table 3 under the title "Dependent variable". Since these outcome names were collected previously in the survey, here coders should select:

Answer: Attending CBCC: 2012-13; Enrolled in Primary: 2012-13; Malawi Developmental Assessment Tool: Total Score; Malawi Developmental Assessment Tool: Language Skills; Malawi Developmental Assessment Tool: Fine Motor/Perception Skills

Freeman et al. 2022 Table 2: Reports treatment effects on 10 outcomes. Each outcome is listed as a row in the Indicator column. The full list of outcomes collected for the study would appear as options, and coders should select:

Answer: Anxiety score; Depression score; Emotional distress score; Well-being score; High Anxiety; High Depression; High Emotional distress; Poor well-being
Number of periods in the table (tableRoundNum)

CODING INSTRUCTIONS

List the number of distinct periods over which the treatment effects in the table were estimated. Periods refer to the units of time for which the paper reports treatment effects.
In most cases, each period involves only one round of data collection. But authors may pool several rounds of data collection into one extended period for some analyses.
When authors combine rounds of data collection, the number of periods in any given table may exceed the total number of rounds of data collection.
In general, the baseline is not a period. This is because studies do not estimate the treatment effect at the time of the baseline survey. However, if the authors use a difference-in-differences design to estimate the treatment effect, then the baseline is included in the period.

EXAMPLES

Hanna et al., 2016 Table 3: Presents 5 periods used to calculate treatment effects. The first row reports a period that includes all four rounds of midline and endline surveys in estimating the treatment effects. The following rows report the treatment effects in each individual period of midline and endline surveys (one per year since the treatment was administered).

Answer: 5

Ozler et al, 2018 Table 3: The title of the table specifies the results are for the 18-month follow up. Only one round of data collection was used in this table.

Answer: 1
Rounds of data collection in the table (tableRound)

CODING INSTRUCTIONS

Select the rounds of data collection for each period reported in the table.
For treatment effects estimated within a period with only one round of data collection, select solely that round. For treatment effects estimated in a period with multiple rounds of data collection, select all rounds that were pooled in that estimation during the period.

EXAMPLES

Ozler et al, 2018 Table 3: The title of the table specifies the results are for the 18-month follow up.

Answer: 18-month follow up

Hanna et al., 2016 Table 3: Presents treatment effects for individual rounds of data collection. For the individual rounds, coders should select the corresponding round names (e.g., 0-12 month survey, 13-24 month survey, etc.). For the pooled round, coders should select all four of the individual rounds.

Answer: 0-12 month survey; 13-24 month survey; 25-36 month survey; 37-48 month survey; All four rounds pooled
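
One way to think about the relationship between periods and rounds is to treat each period as the set of rounds pooled for one estimate. The sketch below is hypothetical (count_periods is not a SurveyCTO function) and uses the round names from the Hanna et al. and Ozler et al. examples above.

```python
from typing import Iterable


def count_periods(periods: Iterable[Iterable[str]]) -> int:
    """Count distinct periods, where a period is the set of rounds it pools."""
    return len({frozenset(period) for period in periods})


hanna_rounds = [
    "0-12 month survey",
    "13-24 month survey",
    "25-36 month survey",
    "37-48 month survey",
]
# One pooled period covering all four rounds, plus one period per round.
hanna_periods = [hanna_rounds] + [[name] for name in hanna_rounds]
ozler_periods = [["18-month follow up"]]

print(count_periods(hanna_periods))  # 5 periods from 4 rounds
print(count_periods(ozler_periods))  # 1
```

This also shows why the number of periods in a table can exceed the number of rounds: a pooled period is counted in addition to the individual rounds it combines.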
Number of empirical specifications in the table (tableSpecNum)

CODING INSTRUCTIONS

Please identify and indicate the number of unique empirical specifications used to estimate treatment effects in the exhibit.
A unique empirical specification is defined by the inclusion (or not) of strata correction, baseline outcome and other controls. This table illustrates all 8 types of specifications, and a coder should count how many of them were used to estimate treatment effect in the exhibit.
Baseline value of outcome refers to the baseline value of the outcome for which the treatment effect is being estimated. Controls for baseline values of other variables are not included in baseline value of outcome.
Controls that are observed before treatment assignment refer to variables measured at baseline, as well as those that are measured after baseline but are considered to be static over the study period, for example, parental education in the context of a program targeting children, gender in the context of a program that is not assumed to affect gender identity, or residence in a flood-prone zone in the context of a program that is not assumed to affect choice of residence.
A simple difference in means with no additional information on the specification used or the presence of any adjustment should be considered as "No other controls". Likewise, a regression with just the outcome of interest as the dependent variable and just an indicator for treatment status as the sole independent variable should be coded as "No other controls."
Information on specifications can be found in both data and statistical analysis sections and table notes. In some papers, authors report the regression specification or empirical model used to estimate treatment effects and/or indicate their specification in column labels in tables. Sometimes, authors add additional details about specifications to table notes only. In other studies, this information must be inferred from the text and/or notes that accompany tables or figures.

EXAMPLES

Riley, 2024: The Empirical Strategy section describes the specification in equation (1): strata dummies and the baseline value of the outcome (if measured at baseline, otherwise excluded) are included in the estimation of treatment effects. In Table 1, where full sample results are reported, the notes indicate "All regressions include strata dummies and include the baseline value of the outcomes."

Answer: 1

Ashraf et al., 2010: According to the table notes, in Table 2, two specifications are used for each outcome: with and without baseline controls (including baseline Clorin usage and water chlorination, general health behaviors and attitudes, household demographics, and locality fixed effects).

Answer: 2

Grossman and Baldassarri, 2012: Report treatment effects using four sets of specifications in Table 2, including no controls, individual controls, monitor profile (post-treatment control), and both individual controls and monitor profile. There are in total 4 specifications.

Answer: 4
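
The 8 specification types follow from three yes/no components (strata correction, baseline value of the outcome, other controls). The sketch below enumerates them and counts the unique specifications in an exhibit. The Specification type and its field names are hypothetical. The Riley (2024) columns are coded from the table note quoted above, with "other controls" assumed absent for illustration.

```python
from itertools import product
from typing import Iterable, NamedTuple


class Specification(NamedTuple):
    strata_correction: bool
    baseline_outcome: bool
    other_controls: bool


# Every on/off combination of the three components: 2 ** 3 = 8 types.
ALL_SPEC_TYPES = [Specification(*flags) for flags in product([False, True], repeat=3)]


def count_specifications(columns: Iterable[Specification]) -> int:
    """Count the unique specifications used across the columns of an exhibit."""
    return len(set(columns))


# Riley (2024), Table 1: every regression includes strata dummies and the
# baseline value of the outcome (three columns assumed for illustration).
riley_table_1 = [Specification(True, True, False)] * 3

print(len(ALL_SPEC_TYPES))                  # 8
print(count_specifications(riley_table_1))  # 1
```

Specifications that add post-treatment controls, as in the Grossman and Baldassarri example, do not fit this three-flag sketch and would need an extra field.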

Stage 2: Set-Up & Fields

Stage 2 Set-Up
Coder name (coder)

CODING INSTRUCTIONS

Please select your name from the list below
If your name does not appear, reach out to your supervisor

EXAMPLES

see the [section] used in the paper to extract

Paper ID (paperID)

CODING INSTRUCTIONS

Select the IDEAL paper ID for which you will start extraction.
If the paper ID does not appear on the list, reach out to your supervisor.

EXAMPLES

see the [section] used in the paper to extract

Paper title (X_titleConfirm)

CODING INSTRUCTIONS

Confirm the name of the paper that appears on SurveyCTO matches the name of the paper on the list assigned to you.
If you have selected no, a message will be sent to your supervisor. Please await a new corrected paper assignment.

EXAMPLES

see the [section] used in the paper to extract

Stage 2 Fields

This survey goes through each exhibit reporting treatment effects that you specified in Stage 1 and that a supervisor has verified. The questions will allow us to match and specify the treatment effects that are eligible for data extraction for IDEAL.

Section Notes: Exhibit Information

In Stage 1, you reported that {Exhibit label} includes {Number of comparisons} contrast(s) and {Number of outcome variables in the table} outcome(s). In the next questions you will be asked to: Identify the IDEAL eligible treatment effect for each outcome-contrast pair; Describe the empirical specification and rounds of data collection used to estimate those treatment effects.

Number of eligible treatment effects (tfx_num)

CODING INSTRUCTIONS

Count the number of IDEAL treatment effect estimates you plan to report for this exhibit.

WHAT ESTIMATES TO COUNT

For each possible outcome-contrast pair, the total number of IDEAL-eligible treatment effects depends on the specification and the rounds of data collection.
It is possible that no treatment effect is reported in the table for a given contrast-outcome pair. However, each outcome and contrast in the exhibit should be related to at least one treatment effect.
Only count full sample results.
Do not count estimates from interaction terms.
Count ITT estimates. Only count LATE / TOT estimates if they are preferred by the author or they are the only estimates reported.

HOW TO COUNT DATA COLLECTION ROUNDS

If treatment effects are estimated for multiple rounds of data collection, count each round as a separate treatment effect.
For example, if the same treatment effect is estimated separately for the "first follow-up survey" and the "second follow-up survey," each estimate should be counted as a distinct treatment effect, regardless of the specification used.

HOW TO COUNT SPECIFICATIONS

When multiple treatment effects are reported for a contrast-outcome pair, only two of them will be eligible for IDEAL:
IDEAL-preferred specification: Count the treatment effects estimated using the highest-ranked IDEAL-preferred specification.
Author-preferred specification: Only count the author-preferred specification as a separate treatment effect if it differs from the IDEAL-preferred specification.
The ranking of IDEAL-preferred specifications can be found here (click to open new page): https://ideal-consortium.github.io/Schema/IDEAL_Ranking_Index.html
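
The counting rules above can be sketched as follows. Everything in this block is an assumption made for illustration: the Estimate record, its field names, and the integer ideal_rank (lower means higher-ranked) stand in for the actual IDEAL ranking, which is defined at the link above and is not reproduced here. The input is assumed to contain only full-sample, non-interaction estimates for eligible estimands.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Estimate:
    outcome: str
    contrast: str
    round_name: str
    specification: str
    ideal_rank: int            # hypothetical: 1 = most preferred by IDEAL
    author_preferred: bool = False


def count_eligible_effects(estimates: Iterable[Estimate]) -> int:
    """Count IDEAL-eligible effects per outcome-contrast pair and round."""
    groups: Dict[Tuple[str, str, str], List[Estimate]] = defaultdict(list)
    for est in estimates:
        groups[(est.outcome, est.contrast, est.round_name)].append(est)

    total = 0
    for group in groups.values():
        ideal = min(group, key=lambda est: est.ideal_rank)
        total += 1  # the highest-ranked IDEAL-preferred specification
        if any(est.author_preferred and est.specification != ideal.specification
               for est in group):
            total += 1  # author-preferred differs, so it counts separately
    return total


example = [
    Estimate("score", "T vs C", "first follow-up", "all controls", 1),
    Estimate("score", "T vs C", "first follow-up", "no controls", 3, author_preferred=True),
    Estimate("score", "T vs C", "second follow-up", "all controls", 1, author_preferred=True),
    Estimate("score", "T vs C", "second follow-up", "no controls", 3),
]
print(count_eligible_effects(example))  # 3
```

In the example, the first follow-up contributes two effects (the author-preferred specification differs from the IDEAL-preferred one) and the second follow-up contributes one.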

EXAMPLES

see the [section] used in the paper to extract

Outcome – contrast pairs (contrast_col)

CODING INSTRUCTIONS

For each outcome, please select all the contrasts for which treatment effects were estimated and reported in the specified table.

EXAMPLES

see the [section] used in the paper to extract

Estimand for outcome – contrast pairs (estimand_col)

CODING INSTRUCTIONS

For each outcome-contrast pair, please select the estimand for which treatment effects were reported in the specified table.
This question only applies to cases where an exhibit includes both ITT and TOT/LATE estimates.
If only one estimand is reported for this table in Stage 1, this question will be replaced by a note displaying the corresponding estimand, either "ITT" or "TOT/LATE".

EXAMPLES

see the [section] used in the paper to extract

IDEAL preferred empirical specifications for outcome-contrast pairs (specification_col)

CODING INSTRUCTIONS

Select the single IDEAL-preferred specification per pair (use ranking).
If preferences differ by estimand, report ITT.
"No controls" cannot be combined with other options.
If only one specification exists in Stage 1, skip; a single follow-up describes the one spec for all effects.

EXAMPLES

Summaries from Riley (2024), Barrera-Osorio (2011), Grossman & Baldassarri (2012), Li (2022).

Authors prefer the same empirical specifications as IDEAL (preferred_col)

CODING INSTRUCTIONS

Mark "Yes" if IDEAL-preferred equals Author-preferred; leave unchecked for No.
Skipped if only one specification reported in Stage 1 (implied same).

EXAMPLES

see the [section] used in the paper to extract

Rounds of data collection (period_col)

CODING INSTRUCTIONS

Mark all rounds used for estimating the treatment effects.
Each round should map to a distinct treatment effect estimate.
If only one round appears in Stage 1, this is skipped; a confirmation follows next page.

EXAMPLES

see the [section] used in the paper to extract

IDEAL specification for round of data collection (same_period_col)

CODING INSTRUCTIONS

If unchecked, a follow-up will ask for the specification used in each selected round.
If only one specification is reported in Stage 1, skip; IDEAL-preferred assumed for all rounds.

EXAMPLES

see the [section] used in the paper to extract

Author preferred empirical specifications (author_spec)

CODING INSTRUCTIONS

Select alternatives describing the authors' preferred specification for each effect.
Appears only when Author-preferred differs from IDEAL-preferred.

EXAMPLES

see the [section] used in the paper to extract

Alternative specification for each period (diff_per_spec)

CODING INSTRUCTIONS

Select the specification used for each round of data collection.

EXAMPLES

see the [section] used in the paper to extract

Specification for all treatment effects (spec_single)

CODING INSTRUCTIONS

Select the specification used to estimate all treatment effects in the exhibit.

EXAMPLES

see the [section] used in the paper to extract

Round of data collection used for all treatment effects (spec_single)

CODING INSTRUCTIONS

Select "Yes" to confirm the displayed round is used for all treatment effects in the exhibit.
It is unlikely you will need to select "No". The survey continues regardless.

EXAMPLES

see the [section] used in the paper to extract

Stage 3: Set-Up, Study Details & Estimates

Stage 3 Set-Up
Coder name

CODING INSTRUCTIONS

Please select your name from the drop-down.
If your name does not appear, reach out to your supervisor.

EXAMPLES

See descriptive examples in paper.

Paper ID

CODING INSTRUCTIONS

Select the IDEAL paper ID for which you are starting data extraction.

EXAMPLES

See descriptive examples in paper.

Paper title

CODING INSTRUCTIONS

Please confirm that this title matches the paper you are currently coding.
If the title does not match your assignment but the ID is correct, please alert your supervisor.

EXAMPLES

See descriptive examples in paper.

Request for review: fields

CODING INSTRUCTIONS

Please select all questions you were unsure about.

EXAMPLES

See descriptive examples in paper.

Request for review: detail

CODING INSTRUCTIONS

For each field, explain what you were uncertain about and (if applicable) which options you were considering.

EXAMPLES

For example, if a coder selects the "Estimand is full sample ITT and LATE/TOT" field in the request-for-review section, they could add an explanation about their uncertainty in this field.

Answer: "Unsure regarding the empirical specification of the estimand. The author mentions that 'we view the evaluation design as quasi-experimental and use difference-in-differences to estimate program impact.' Are the estimates indirectly ITT/LATE/TOT or are they only quasi-experimental estimates?"
Stage 3 Module 1: Study Details
Sampling

Section notes to display in the survey:

The questions in this section will ask how the unit(s) of randomization were sampled and how unit(s) of analysis were drawn from the units of randomization. Sampling of units of randomization and units of analysis often involves multiple steps, and we would like you to describe each step of the sampling process.

The best way to approach these questions is to draw the sampling steps as you read the paper and then answer the questions in this survey. Please read the coding instructions carefully and provide the required information in the corresponding table cell. The table below demonstrates how to fill in the table using the coding instructions.

For each unit of randomization and each unit of analysis (that is not a unit of randomization), you will be asked to fill out a table like the one below with information on sampling.

Illustrated sampling questions for unit of randomization in a survey table

[Unit of randomization] | Any inclusion/exclusion criteria (Yes/No) | Description of inclusion/exclusion criteria | Sampling method (Universe, random, non-random)
Sampling unit 1 | Indicate if any inclusion/exclusion criteria were applied when selecting the unit of randomization from sampling unit 1. | If yes, describe the criteria. | After applying the inclusion/exclusion criteria, how were the units of randomization selected from sampling unit 1?
Sampling unit 2: Fill in the label for the larger unit from which {sampling unit 1} was drawn.
Sampling unit 3: Fill in the label for the larger unit from which {sampling unit 2} was drawn.
Sampling unit 4: Country or another unit of randomization

[You reach the end of this unit of questions if you enter "Country" or "Another unit of randomization" in this cell. Please skip the rest of the rows]
Sampling units from which the unit of randomization was drawn

Coding Instructions

  • Please identify and label each of the larger sampling units from which the unit of randomization (or unit of analysis) was drawn. A sampling unit is defined by a specific unit together with its inclusion and exclusion sampling criteria, for example, districts with over 1 million households.
  • Start from the unit of randomization (or unit of analysis) and work up through each larger sampling unit until you reach either another unit of randomization or the country of the experiment.
  • There could be more than one sampling unit within the same unit, for instance, schools that advertised a job and schools that had the job filled are two different sampling units at the school level and should be counted as separate units.
  • For sampling units defined by geographic locations or administrative areas, please include the names of the places included/excluded in the label, for example, States (Jalisco, Chiapas, and Hidalgo). If the names are not available, please indicate the total number of the places, e.g. 3 provinces.
  • The sampling information might be located in several places in the paper. Please search the experimental design, sampling, and data sections in the paper and appendix carefully to identify the sampling units for units of randomization.

Examples

Example 1 - Alatas et al., 2012:

In Alatas et al., 2012, the unit of randomization is subvillage.

Subvillages were drawn from villages [see Section B: Sample], so "Sampling unit 1" should be:

Answer: Villages

Villages were drawn from another larger unit – province, so "Sampling unit 2" should be province. Since province is a geographic unit, the names should also be included in the label, so the answer would be:

Answer: Provinces (North Sumatra, South Sulawesi, and Central Java)

The label for the next sampling unit, "Sampling unit 3", would be the unit from which the provinces were drawn. The provinces were drawn from the country, so the answer would be:

Answer: Country

With country, the sampling units have reached the largest possible unit, and no more sampling units should be entered for this unit of randomization. In total, there are three sampling units for "Subvillages".

In the same paper, there are two units of analysis different from the unit of randomization: household and subvillage head (or using answer from stage 1: individual-political/social leader). Both units were drawn from the subvillages, which were the unit of randomization.

So, for both household and subvillage head, the "Sampling unit 1" would be:

Answer: Subvillages (unit of randomization)

There are no more sampling units to be entered because a unit of randomization has been reached.

Example 2 - Leaver et al., 2021:

Leaver et al, 2021, has two units of randomization: labor market (i.e. district-by-subject-family teaching job market) and school. For this paper, you will see the questions for each unit of randomization separately.

First, "Schools" were drawn from schools with "at least one new post that was filled and assigned to an upper-primary grade" (see Second-Tier Randomization: Experienced Contracts, page 2220), so "Sampling unit 1" would be:

Answer: Schools with one new post filled in an upper primary grade

Sampling unit 1 was drawn from "Schools to which REB had allocated the new posts to contracts", which should be "Sampling unit 2":

Answer: Schools to which REB had allocated the new posts to contracts

"Schools to which REB had allocated the new posts to contracts" were sampled from "labor markets", which is another unit of randomization. So, "Sampling unit 3" should be:

Answer: Labor markets (unit of randomization)

The sampling units are complete for "Schools" as it reaches another unit of randomization.

The next group of sampling units are for "Labor markets".

Labor markets were drawn from districts. The paper did not specify the names of the districts. "Sampling unit 1" would be:

Answer: Six districts

The districts were directly sampled from the country, so "Sampling unit 2" would be:

Answer: Country

Since country is the largest possible sampling unit, there will be no more sampling units for this unit of randomization.
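
The stopping rule used in the two examples above can be sketched in a few lines of Python. This is only an illustration: the function name `chain_is_complete` and the list-of-labels layout are assumptions made here and are not part of the SurveyCTO form. It treats a chain as finished when the last label is the country or is flagged as another unit of randomization.

```python
# Illustrative sketch only: names and data layout are assumptions,
# not the IDEAL survey schema.

def chain_is_complete(sampling_units):
    """Return True if the last sampling unit ends the chain.

    A chain ends when it reaches the country (the largest possible unit)
    or another unit of randomization (flagged in the label).
    """
    if not sampling_units:
        return False
    last = sampling_units[-1].lower()
    return last == "country" or last.endswith("(unit of randomization)")


# Chains from the examples above, ordered from "Sampling unit 1" outward.
alatas_subvillage = [
    "Villages",
    "Provinces (North Sumatra, South Sulawesi, and Central Java)",
    "Country",
]
leaver_school = [
    "Schools with one new post filled in an upper primary grade",
    "Schools to which REB had allocated the new posts to contracts",
    "Labor markets (unit of randomization)",
]
leaver_labor_market = ["Six districts", "Country"]

for chain in (alatas_subvillage, leaver_school, leaver_labor_market):
    print(len(chain), chain_is_complete(chain))
```

A chain that stops at "Villages" alone would be reported as incomplete, which signals that a further sampling unit still needs to be entered.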

Any inclusion or exclusion criteria

Coding Instructions

  • Yes, if sampling criteria were applied before the unit of randomization or unit of analysis was drawn from a larger sampling unit.
  • No, if no sampling criteria were applied before the unit of randomization or unit of analysis was drawn from a larger sampling unit.

Examples

Example - Alatas et al., 2012:

In Alatas et al., 2012, the unit of randomization is subvillage.

Subvillages were drawn from villages. No inclusion or exclusion criteria were mentioned in the sampling process. Therefore, the answer to this field would be:

Answer: No

Villages were drawn from another larger unit – provinces. In Footnote 8, an exclusion criterion was stated: "An additional constraint was applied to the district of Serdang Bedagai because it had particularly large sized subvillages. All villages in this district with average populations above 100 households per subvillage were excluded."

The answer to this field would be:

Answer: Yes
Description of inclusion/exclusion criteria

Coding Instructions

  • Please describe the inclusion or exclusion criteria that were applied when sampling the unit of randomization or unit of analysis from each larger sampling unit.

Examples

Example - Alatas et al., 2012:

In Alatas et al., 2012, the unit of randomization is subvillage.

Subvillages were drawn from villages. No inclusion or exclusion criteria were mentioned in the sampling process.

Villages were drawn from another larger unit – provinces. In Footnote 8, an exclusion criterion was stated: "An additional constraint was applied to the district of Serdang Bedagai because it had particularly large sized subvillages. All villages in this district with average populations above 100 households per subvillage were excluded."

Answer: "All villages in the district of Serdang Bedagai with average populations above 100 households per subvillage were excluded."
Sampling method

Coding Instructions

  • Select the sampling method used to draw the smaller sampling unit from the adjacent larger sampling unit.
  • Total universe: All units (individuals, households, organizations, etc.) of a target population are included in the data collection.
  • Random: All units (individuals, households, organizations, etc.) of a target population have a non-zero probability of being included in the sample and this probability can be accurately determined.
  • Non-random: The selection of units (individuals, households, organizations, etc.) from the target population is not based on random selection. It is not possible to determine the probability of each element to be sampled. Some common non-probability sampling methods include snowball sampling, random route sampling, judgement sampling, and convenience sampling (e.g. depending on participants' availability).
  • Unknown: if the sampling method cannot be determined based on the information reported in the paper.
  • Other: if none of the above apply.

Examples

Example 1 - Alatas et al., 2012:

In Alatas et al., 2012, the unit of randomization is subvillage.

Subvillages were drawn from villages. As the paper notes "For each village, we obtained a list of the smallest administrative unit within it (a dusun in North Sumatra and Rukun Tetangga (RT) in South Sulawesi and Central Java), and randomly selected one of these subvillages for the experiment" [See Section B: Sample, page 1211].

Answer: Random

In the same paper, there are two units of analysis that are different from the unit of randomization: household and subvillage head (Answer from Stage 1: individual-political/social leader). The paper stated that "From this census, we randomly sampled 8 households from each subvillage plus the head of the subvillage".

For household, the response would be:

Answer: Random

For subvillage head, the answer would be:

Answer: Total universe

because all the subvillage heads were included in the sample.

Example 2 - Briaux et al. 2020:

In Briaux et al. 2020, eligible households were selected using a random-route sampling method. It is a non-probability sampling method because the probability of selecting a household is unknown, although the "starting points" were selected randomly.

Answer: Non-random
Interventions

Section notes to display in the survey:

In this section, please describe the details of each intervention you specified in Stage 1.

It is possible that information in the paper about the intervention may be spread across different sections of the paper. This information may be located in paper sections such as the introduction or those that discuss research design or experimental design, including the footnotes.

When extracting information verbatim from the paper, please add quotation marks around the words and include the page number of the PDF document (rather than the original journal page number). Use square brackets for any paraphrased text or spelled out acronyms in the quotation.

For example, "The Andilaye intervention focused on three WASH [Water, Sanitation, and Hygiene]-related behavioral themes, informed by formative research: (1) sanitation, (2) personal hygiene, and (3) household environmental sanitation. Within these themes were 11 constituent practices targeted by the intervention; these practices were identified through formative research as ones that could be targeted using demand-side approaches, and were seen as achievable, per stakeholder feedback." (page 6)

Section instructions for data entry mask:

  • The number of interventions and their labels collected, and study arms and their labels in Stage 1 need to be preloaded here to create the questions in the section.
  • The questions on the details of intervention should be presented on the same page in the survey. [Check the word-limit for open-text field in SurveyCTO].
Intervention description - Detailed

CODING INSTRUCTIONS

Please provide a description of the intervention to distinguish it from other interventions in the same study.
If available, the description should include: (1) what the intervention is, (2) who the target group is, and (3) what the intended purpose is.
Please spell out any acronyms that would make it hard to understand the intervention description when read as stand-alone text, and avoid proprietary terms. If you add text to what is otherwise a cited section copied from the paper, please put the added text in square brackets.

EXAMPLES

Example 1 - Technical and vocational education training program:

One intervention in Lyall et al. 2020 is a technical and vocational education training program, and a description of the intervention would be:

Answer: The intervention consisted of three livelihood training components bundled together. "First, participants were enrolled in either three- or six-month TVET [Technical and Vocational Education and Training] courses at one of four VTCs. These courses ranged from motorcycle and mobile phone repair to metal works and computer services to tailoring and English-language tutoring. While content was trade-specific, each course aimed to build practical marketable skills and to improve prospects for full-time employment in the local economy. Second, students were concurrently enrolled in a "soft skills" course designed to bolster business skills and employment opportunities by networking with key local market actors. As part of this course, participants received instruction in time management, decision-making, leadership, and negotiation. Third, participants who successfully completed technical TVET courses were provided with a small start-up kit of trade-specific tools upon graduation.(p130)"

Example 2 - Phone-monitoring intervention:

A description of the phone-monitoring intervention in Muralidharan et al. 2021 could be:

Answer: "The call center placed calls to the mobile phone numbers of sampled farmers. If a call did not connect, the call center would attempt to reach that number up to five more times over the following two days before giving up. If connected, the call center operator verified the respondent's identity and identified themselves as conducting a survey on behalf of the Government of Telangana to understand the respondent's experience with the Rythu Bandhu Scheme. Calls collected information on whether, where, and when the farmer received their check; whether and when they encashed it; any problems receiving or encashing the check (including time costs and bribes); how they used the funds; suggestions for future rounds of RBS; and overall satisfaction with RBS [Rythu Bandhu Scheme]. (p60)"
Details of intervention: eligibility criteria

CODING INSTRUCTIONS

Please describe any criteria applied to determine if an individual or unit is eligible (or ineligible) for the intervention such as age limits, qualification criteria, etc.
Note that eligibility for the intervention can overlap with but is generally distinct from the inclusion/exclusion criteria applied when sampling units or observations for data collection.
Eligibility for the intervention may be broader than the sampling inclusion/exclusion criteria (e.g. sampling may be restricted to a certain set of districts while the intervention was implemented in more districts), or narrower (e.g. while income of all household members is measured, the intervention is a cash transfer for women 15-45 years of age).
Nonetheless, often, recruitment into the study sample uses the eligibility criteria for the intervention. In this case, please repeat the information here. For example, in an early childhood health intervention the study may sample households or women with at least one child under 5, but the intervention targets children under 5.

EXAMPLES

Example 1 - Iron interventions:

There were three sets of eligibility criteria for the two iron interventions in Pasricha et al., 2021:

Answer: 1. "Children 7.5 to 8.5 months of age" (p983) were eligible.
2. "Children with marked anemia (a hemoglobin level of <8.0 g per deciliter), current febrile illness, severe acute malnutrition, a known inherited red-cell disorder or previous transfusion, or known developmental delay were excluded" (p983).
3. Children in households with iron levels in drinking water exceeding 1 mg per liter were excluded.

Example 2 - Conditional cash transfer:

The eligibility criteria for the conditional cash transfer intervention in Filmer et al. 2023 were described as:

Answer: "Households are eligible if they have a proxy means test score below the provincial poverty line and contain children ages 0 to 14 years or a pregnant woman" (p329).

Example 3 - INVEST programs:

In Lyall et al. 2020, both interventions in the INVEST programs targeted "at-risk youth" and "internally displaced persons". The paper noted that the recruitment was done by a consortium of actors but there were no data on "individuals who were deemed ineligible for participation" (p132).

Answer: At-risk youth and internally displaced persons
Details of intervention: proprietary name

Coding Instructions

  • Please enter any proprietary name of the intervention, such as Head Start, Be a Man, PROGRESA, etc., if applicable.
  • If the intervention was part of a larger program with a proprietary name, please note that in the answer using the phrase "Component of [proprietary name]". Write "[proprietary name]" alone only if a treatment arm receives all components of the proprietary intervention.
  • Enter "None" if there is no proprietary name.

Examples

Example 1 - Lyall et al., 2020:

The economic assistance intervention in Lyall et al. 2020 was part of a larger program named "INVEST".

Answer: Component of Introducing New Vocational Education and Skills Training (INVEST) program

Example 2 - Muralidharan and Sundararaman (2015):

The private school voucher intervention in Muralidharan and Sundararaman (2015) was called "The AP Private School Choice project".

Answer: The AP Private School Choice project
Details of intervention: study scale same as the implementation scale

Coding Instructions

  • Yes: the scale of the evaluated intervention and the implemented intervention were the same. This is usually the case for researcher implemented programs or pilot programs.
  • Select "yes" if the program was only scaled up after the study implementation.
  • No: the intervention was implemented at a larger scale than the evaluated treatment group either before the study or during the same time period, for example, when a large-scale intervention took place but only a portion of it was being evaluated for various reasons. This could be the case for government implemented programs at scale, or for existing programs where one roll-out wave of a larger program or a subset of "marginal candidates" are used for randomized evaluation.

Examples

Example 1 - Crost, Felter, and Johnston (2016):

Crost, Felter, and Johnston (2016) evaluated part of a conditional cash transfer program (Pantawid Pamilya) in the Philippines. In 2009, the program was scheduled to begin in 19 municipalities of 8 provinces. Among the 19 municipalities, 8 were randomly selected to be part of the evaluated experiment. The remaining received the intervention as scheduled.

In this case, the scale of the implemented intervention was larger than the treatment group in the experiment.

Answer: No, intervention scale larger than study scale

Example 2 - Mbiti et al. (2019):

In Mbiti et al. (2019), the interventions were implemented at the school level, affecting all students in the focal grades. The study only collected data from a randomly selected group of students and households and included them in the analysis. Specifically, 10 students from each focal grade were sampled and 10 households were selected from each school for data collection. [See III.B.Data]

Therefore, the scale of the implemented intervention was larger than the study sample.

Answer: No, intervention scale larger than study scale
Details of intervention: implementation scale

Coding Instructions

  • Describe the scale at which the intervention was implemented, including both the study and non-study participants.
  • Please include information on the number of administrative and geographical units that the intervention reached, if any.

Examples

Example 1 - Crost, Felter, and Johnston (2016):

Crost, Felter, and Johnston (2016) evaluated part of a conditional cash transfer program (Pantawid Pamilya) in the Philippines. In 2009, the program was scheduled to begin in 19 municipalities of 8 provinces. Among the 19 municipalities, 8 were randomly selected to be part of the evaluated experiment. The remaining received the intervention as scheduled.

In this case, the scale of the implemented intervention was:

Answer: 19 municipalities of 8 provinces in the Philippines

Example 2 - Mbiti et al. (2019):

In Mbiti et al. (2019), the interventions were implemented at the school level, affecting all students in the focal grades. The study only collected data from a randomly selected group of students and households and included them in the analysis. Specifically, 10 students from each focal grade were sampled and 10 households were selected from each school for data collection. [See III.B.Data]

Answer: All students and households in treatment schools
Details of intervention: intensity

Coding Instructions

  • Please provide information on the "intensity" of the intervention. This could be the length and frequency of the intervention (e.g. for a training), the amounts given (e.g. for a cash transfer or a subsidy), or the total duration of exposure (e.g. for an ad campaign).
  • If there are multiple components in the same intervention that varied in intensity, please describe the intensity of each component.

Examples

Example 1 - Lyall et al., 2020:

Using the same example from Lyall et al. 2020, the TVET intervention has three components: 1) TVET courses, 2) a "soft skills" course and 3) a start-up kit of trade-specific tools.

Answer: There are three components in the intervention. The TVET courses were either three months or six months long. The duration of the "soft skills" course was not stated. The third component was a start-up kit of trade-specific tools.

Example 2 - Cardenas, Evans and Holland (2023):

The "Early Education Program (Programa Educación Inicial or PEI)" in Cardenas, Evans and Holland (2023) was an early childhood education intervention that included 65 group sessions during nine months with each session lasting for about 2 hours.

Answer: The intervention included 65 group sessions during a nine-month term. "These sessions included (1) up to 26 sessions for caregivers and parents (men and women), (2) up to 18 sessions for caregivers and parents (men and women) focused on children, (3) up to 5 sessions for parents who are men, and (4) up to 8 sessions for pregnant women. In addition, promotoras [facilitators who receive two weeks of annual training, educational materials, and a small stipend] could organize up to 8 additional sessions for diagnosis, planning, and evaluation." "Sessions were generally held for two hours, and the frequency of these sessions is defined through an initial agreement between the promotoras and participants, often one or more times per week between the sessions of type (a) and (b)." (page 5131)
Details of intervention: reported cost

Coding Instructions

  • Please provide information related to the cost of the intervention mentioned in the paper.
  • Search the paper and the supplementary materials for the keywords 'cost', 'cost-effective*', and 'cost-eff*', and review the surrounding text to determine whether any cost information is presented.
  • The information could be a total cost or a cost per beneficiary. Other forms include cost-efficiency metrics (e.g. unit cost, cost per beneficiary) and cost-effectiveness analyses (e.g. benefit-cost ratio, incremental cost-effectiveness ratio).
  • If costs are presented in a table or figure, please enter the reported cost for the intervention and include the table or figure number.
  • Enter "None" if no cost information can be found in the paper or in the supplementary materials.
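
For coders who have the paper as plain text, the keyword search described above could be approximated with a short script. The sketch below is an assumption-laden illustration, not a tool used by IDEAL: the function name `find_cost_mentions`, the 60-character context window, and the sample sentence are all invented here. It only surfaces candidate passages; the coder still has to read them.

```python
import re

# Matches "cost", "costs", and hyphenated forms starting with "cost-eff"
# (e.g. "cost-effective", "cost-effectiveness", "cost-efficiency").
COST_PATTERN = re.compile(r"\bcost(?:s|-eff\w*)?\b", re.IGNORECASE)


def find_cost_mentions(text, window=60):
    """Return (matched keyword, surrounding context) pairs for manual review."""
    hits = []
    for match in COST_PATTERN.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        hits.append((match.group(0), text[start:end]))
    return hits


# Invented sample text, used only to show the output shape.
sample = (
    "The program cost per beneficiary was low. "
    "We also report a cost-effectiveness analysis."
)
for keyword, context in find_cost_mentions(sample):
    print(keyword, "|", context)
```

The pattern deliberately skips words such as "costly", which rarely point to reported cost figures.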

Examples

Example 1 - Barrera-Osorio et al. (2022):

Barrera-Osorio et al. (2022) highlights the cost-effectiveness of the program in Part VI: Program Cost-Effectiveness. The cost data for the intervention and details regarding program cost per student are reported in Appendix C.

Answer: Depending on the year type (fiscal, school) and child (enrolled, attending), the annual program cost per student ranges from a low of $77 to a high of $184. (Appendix C)
Additional details of intervention

Coding Instructions

  • Please provide any additional information on the intervention that has not been captured in the previous set of questions. This could include for example details on design or development, etc.
  • Enter "None" if you think all the relevant information has been recorded.

Examples

Example 1 - Andrew et al., 2018:

In Andrew et al., 2018, the authors mentioned that the intervention was an attempt to implement a Jamaican home-visiting model at scale.

Answer: A coder could include this information in this field.
Intervention Start Date

Coding Instructions

  • Please enter the day, month and year as they appear in the main text of the paper or its supplementary materials.
  • Select "-99" if the information is not reported in the paper. For example, if only year is reported, select "-99" for month and for day.
  • This information should be found in the sections on experimental or research design in the main text of the paper. Some papers may also include a timeline of the intervention in a figure or in the supplementary materials.

Examples

Example 1 - Lyall et al., 2020:

For Lyall et al., 2020, the start date is October 2015, which the coder would enter as a month and a year. [See: Study Timeline in Supplementary Material].

Answer: "2015" for "Year", October for "Month" and "-99" for "Day"

Example 2 - Chong et al., 2015:

For Chong et al., 2015, the authors write "We randomly assigned voting precincts to a campaign spreading information on corruption and public expenditure conducted one week before the 2009 municipal elections in Mexico." The coder would write "2009" and then select unsure in the follow-up question. [See: Introduction, Experimental Design and Implementation]

Answer: "2009" for "Year" and "-99" for "Month" and for "Day"
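
The "-99" convention in the two examples above can be summarized with a small helper. This is a sketch under assumptions: the function name `encode_date` and the dictionary keys mirror the field labels in the examples but are not taken from the SurveyCTO form definition.

```python
NOT_REPORTED = -99  # code selected when a date part is not reported in the paper


def encode_date(year=None, month=None, day=None):
    """Map the reported date parts to survey entries, filling gaps with -99."""
    return {
        "Year": year if year is not None else NOT_REPORTED,
        "Month": month if month is not None else NOT_REPORTED,
        "Day": day if day is not None else NOT_REPORTED,
    }


# Lyall et al., 2020: only month and year are reported.
print(encode_date(year=2015, month="October"))
# {'Year': 2015, 'Month': 'October', 'Day': -99}

# Chong et al., 2015: only the year is reported.
print(encode_date(year=2009))
# {'Year': 2009, 'Month': -99, 'Day': -99}
```
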
Intervention End Date

Coding Instructions

  • Please enter the day, month and year as they appear in the main text of the paper or its supplementary materials.
  • Select "-99" if the information is not reported in the paper. For example, if only year is reported, select "-99" for month and for day.
  • Please calculate the corresponding end date for the intervention if only the start date and duration are available. For example, if the text says "the intervention began in June 2013 and went for six months", select the calculated end date "December 2013" in this field.
  • This information should be found in the sections on experimental or research design in the main text of the paper. Some papers may also include a timeline of the intervention in the main text or the supplementary materials.
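
The end-date calculation in the instructions above amounts to adding the duration in months to the start month. The sketch below follows the convention of the guide's own example (June 2013 plus six months gives December 2013); the function name `calculated_end_month` is an assumption made for illustration.

```python
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def calculated_end_month(start_year, start_month, duration_months):
    """Add a duration in months to a start month and return (year, month)."""
    position = MONTHS.index(start_month) + duration_months
    return start_year + position // 12, MONTHS[position % 12]


print(calculated_end_month(2013, "June", 6))  # (2013, 'December')
```

Durations that cross a year boundary roll over into the next year, so a six-month intervention starting in October 2013 would be entered as April 2014.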

Examples

Example 1 - Lyall et al., 2020:

For Lyall et al., 2020, the end date of the intervention is May 2016. [See: Study Timeline in Supplementary Material].

Answer: "2016" for "Year", "May" for Month, and "-99" for "Day"
Intervention end date calculated from duration

Coding Instructions

  • Yes if the end date is not directly reported in the paper and the date selected in "Intervention End Date" was based on the coder's calculation using the intervention start date and duration reported in the paper.
  • No if the end date of the intervention is reported and entered as it is described in the paper.

Examples

Example 1 - Badrinathan 2021:

For Badrinathan 2021, the timeline provided indicates that outcome measures were collected between May 19 and May 23, 2019.

Answer: No
Outcomes

Section notes to display in the survey:

Answer the questions in the section for each outcome variable listed in Stage 1. In IDEAL, an outcome variable is defined by the way it enters the estimation of a treatment effect. For instance, if raw test scores were standardized in the main regression model, the standardized test score would be used to answer the following questions, not the raw scores.

Section instruction for data entry mask:

The number of outcomes and their labels collected in Stage 1 (and verified in the Stage 1 check) need to be preloaded here to create the questions in the section.

Outcome variable definition

Coding Instructions

  • Provide a clear definition of the construct that the outcome variable measures. This is the underlying attribute or concept the outcome variable is designed to quantify, for example, mental health. The definition should be understandable for someone who has not read the paper and does not necessarily know what the intervention is.
  • The definition should make clear that a higher value of the outcome variable means an increase in the construct being measured. For example, "Behavior" could mean either "Better behavior" or "More behavior problems". The definition entered here should not have this kind of ambiguity and should be explicit about the meaning of an increased outcome value.
  • Additionally, if an outcome variable is measured (cumulatively) over some reference period, please include the reference period in the description. Examples are "incidence of diarrhea in the last 24 hours" or "monthly income".
  • Much of this information is sometimes omitted in the outcome label presented in the exhibits due to space constraints, but can be found in the text or notes.
  • Be as brief as possible. The description does not need to include the statistical properties or the measurement details of the outcome variable.

Examples

Example 1 - Muralidharan et al. 2021:

In Muralidharan et al. 2021, one outcome variable is listed as "Ever encashed" in the tables (from Table 3). The outcome variable measured whether a farmer ever encashed a benefit check during the valid period.

Answer: Encashed the program benefit check

The unit does not need to be specified here because it will be clear from the unit of analysis field.

Example 2 - Barrera-Osorio et al. 2022:

One outcome variable of Table 4 in Barrera-Osorio et al. 2022 is "Total Scores". The test scores were language and math combined for all children aged 5 to 10 measured at the second follow-up.

Answer: Language and math combined test score for children aged 5 to 10, measured 1.5 years after school in operation

The target population unit needs to be mentioned in this case as the age restriction may not be obvious from the unit of analysis - child.

Example 3 - Chong et al., 2015:

In Chong et al., 2015, one outcome variable in Table 4 is "Turnout". The table notes indicate that the outcome variable refers to "total number of votes divided by number of registered voters multiplied by 100".

Answer: Registered voter turnout (percentage)
Binary outcome variable

Coding Instructions

  • A binary outcome variable is a variable that has only two possible values. Binary variables are often – but not always – categorical variables. Binary variables that describe categories are most often coded as 0 and 1. They may also be called indicator or dummy variables. For example, sex (male/female) or currently attending school (yes/no) are both binary variables and may be coded as, say, 1 for women and 0 for men, or 1 for yes and 0 for no.
  • Note this field is about the variable that enters the estimation, not how the input variable may have been originally measured. For example, the average of multiple 0/1 binary variables represents a fraction and is not a binary variable itself, even though the underlying data is binary – take the example of the outcome variable "average school attendance over the school year (in share of school days)", constructed from a series of 0/1 indicators for every school day whether the child was present in the classroom.
  • Conversely, if an outcome measure was collected as a non-binary variable but transformed into a binary variable when estimating the treatment effect, it should be considered a binary outcome variable. For example, suppose educational attainment was measured with a multiple choice question (i.e. 1=primary education or less, 2=lower secondary education, 3=upper secondary education, and 4=post-secondary education), but in the estimation, the outcome was transformed into an indicator for "has a post-secondary education". The response to this question should be "No" for the school attendance example, but "Yes" for the education level example.
  • Treatment effect estimates for binary outcome variables may use linear probability, probit or logit models.
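
The two contrasting cases in the instructions above (an average of 0/1 indicators versus a dichotomized categorical variable) can be illustrated with made-up data. The helper `is_zero_one_indicator` is an assumption introduced here; it only covers binary variables coded 0/1, whereas the guide notes that binary variables are not always coded that way.

```python
def is_zero_one_indicator(values):
    """True if every value of the variable entering the estimation is 0 or 1."""
    return set(values) <= {0, 1}


# Made-up data: 0/1 presence indicators per school day for three children.
daily_presence = [[1, 1, 0, 1], [1, 1, 1, 1], [0, 1, 0, 1]]
# The outcome entering the estimation is the share of days attended.
attendance_share = [sum(days) / len(days) for days in daily_presence]

# Made-up data: education level coded 1-4, then dichotomized for estimation.
education_level = [1, 4, 2, 4, 3]
has_post_secondary = [1 if level == 4 else 0 for level in education_level]

print(is_zero_one_indicator(attendance_share))    # False: answer "No"
print(is_zero_one_indicator(has_post_secondary))  # True: answer "Yes"
```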

Examples

Example 1 - Sukhtankar et al., 2022:

In Sukhtankar et al., 2022, the authors measure DIRs, or Domestic Incident Reports, which represent civil complaints of domestic violence. This is the count of DIRs in a given time period at a given police station.

Answer: No

Example 2 - Cheema et al., 2022:

In Cheema et al., 2022, the measure of women's voter turnout is a variable which is coded as 1 if the respondent voted, and 0 if the respondent did not vote (operationalized by observing the ink from voting day on the respondent's thumb).

Answer: Yes

Example 3 - Freeman et al., 2022:

In Freeman et al., 2022, "Poor well-being" is a binary variable. Please note that the variable was dichotomized from a continuous well-being score, "with scores below 13 indicating poor well-being".

Answer: Yes

Because the variable entering the estimation is a binary variable.

Binary outcome label

Coding Instructions

  • Please give a concise description of what it means when the binary outcome variable takes a value equal to 1.

Examples

Example 1 - Cheema et al., 2022:

In Cheema et al., 2022, the measure of women's voter turnout is a variable which is coded as 1 if the respondent voted, and 0 if the respondent did not vote (operationalized by observing the ink from voting day on the respondent's thumb).

Answer: Respondent voted on voting day
Index outcome variable

Coding Instructions

  • Typically, the authors will state if an outcome is an index. Sometimes an index may be called a "score".
  • If an outcome is constructed from multiple independently measured variables or indicators that are not in the same domain, assess different dimensions of the same concept, or do not share the same unit, it is an index variable. An index typically does not have a unit.
  • For example, household income (say, in Rupees) as the sum of all individual incomes in the household is not an index, but an early childhood development score combining assessments of math and behavioral skills is an index.
  • Index aggregation of multiple variables may involve adding values, taking an average, etc.

Examples

Example 1 - Banerjee et al., 2019:

Banerjee et al., 2019, describe the outcome variable "HIV knowledge" explicitly as an index. The indicators that enter the index are also presented. "HIV knowledge measures how aware an individual is of the methods of transmission, the availability of drugs, and the timing of testing for HIV. Higher values of this index correspond to greater awareness."

Answer: Yes

Example 2 - Barrera-Osorio et al., 2011:

In Barrera-Osorio et al., 2011, one outcome is "monitored school attendance rate" (Table 3). According to the authors, "We collected attendance data during the last quarter of 2005 through direct observation. For this purpose, the team assembled a group of assistants who randomly visited schools and classes. The assistants directly called the roll of all students, and students were marked absent if they were not physically present in the classroom." [C. Data]. This variable is constructed from many separate observations, but it does not combine multiple alternative methods of measuring attendance for the same student, so it is not an index.

Answer: No

Example 3 - Wolf et al., 2019:

In Wolf et al., 2019, "Teacher motivation" is an index outcome although it is not called an index. The "Measures" section mentioned that "Teacher's motivation was measured using five items adapted from Bennell and Akyeampong (2007) as reported in Wolf, Aber et al. (2015)." There are multiple components in the outcome variable, thus it is an index outcome.

Answer: Yes

Example 4 - Barrera-Osorio et al., 2022:

In Barrera-Osorio et al., 2022, "Total score" is an index outcome consisting of language score and math score. The paper does not explicitly state that the variable is an index. However, the Data section stated that children were tested on language and math, so we can infer that "Total score" is an index.

Answer: Yes
Description of index outcome

Coding Instructions

  • Please provide a description of the components, aggregation method and any other information on how the index was constructed from the underlying set of measures.

Examples

Example 1 - Freeman et al., 2022:

In Freeman et al. 2022 (Table 3), the outcome variable "Water and sanitation insecurity scores: Water – HWISE Scale" is an index outcome. The "Outcome of Interest" section and table notes describe the details of the variable.

Answer: Water insecurity was measured through the Household Water Insecurity Experiences (HWISE) scale. HWISE includes 12 items with four response categories (never, rarely, sometimes, often/always). The score is the sum of responses, ranging from 0–36. A higher score indicates greater household water insecurity.

Example 2 - Barrera-Osorio et al., 2022:

In Barrera-Osorio et al., 2022, "Total score" is an index outcome consisting of language score and math score. Both components were also outcome variables. The information in the paper on the "Total score" suggests it is simply the sum of the two scores.

Answer: The total score is the sum of the language and math scores.
Outcome variable measurement tool

Coding Instructions

  • In disciplines like education and psychology, outcomes are often measured with standardized tools or measurement methods developed by others, such as the Bayley scale, IDELA, EGRA, Implicit Association Test (IAT), the Big 5 Inventory (BFI), etc. If one is provided, enter the name of the tool used to measure the outcome. Enter the citation for the measure.
  • Look for the description in the section describing the data used for the outcomes or results.
  • Include any adaptations of the measure or tool, e.g. if a measure used only some items from a longer questionnaire.
  • Enter "None" if there is no name or citation for the measure.

Examples

Example 1 - Knauer et al., 2019:

Knauer et al., 2019 state: "we assessed caregiver literacy by asking caregivers to read a simple, five-word (second-grade level) sentence in each language adapted from the Early Grade Reading Assessment (EGRA; Gove & Wetterberg, 2011)" and "mental health was measured using an adapted version of the Centers for Epidemiological Studies‐Depression scale (CES‐D; Radloff, 1977; scores range 0–60)."

These quotes come from the section "Measures: Caregiver survey". Note that each tool has an associated citation.

Answer: Literacy: Early Grade Reading Assessment (EGRA) (Gove & Wetterberg, 2011)
Mental health: Centers for Epidemiological Studies‐Depression scale (CES‐D) (Radloff, 1977)
Outcome variable standardization type

Coding Instructions

  • Please first determine whether the outcome variable is standardized or not. If standardized, choose the type of standardization.
  • An outcome is standardized if it is converted from the original values to a z-score using mean and standard deviation.
  • If an outcome variable is standardized using the mean and standard deviation of any group within the study sample (for example, the control group distribution), it is internally standardized.
  • If an outcome variable is standardized using the distribution of a normative sample outside the study sample, it is externally standardized. For example, anthropometric measures for children under five years of age, such as Weight-for-Height or Arm Circumference, or the Peabody Picture Vocabulary Test (PPVT) are standardized externally using a reference group at a "typical" level of development.
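
The difference between internal and external standardization can be illustrated with a minimal Python sketch. All scores, the control-group values, and the normative mean and standard deviation below are hypothetical numbers chosen for illustration; they are not taken from any paper.

```python
from statistics import mean, stdev


def z_scores(values, ref_mean, ref_sd):
    """Convert raw values to z-scores using a given reference mean and SD."""
    return [(v - ref_mean) / ref_sd for v in values]


# Hypothetical raw test scores for two arms of a study.
control = [40.0, 50.0, 60.0]
treatment = [55.0, 65.0, 75.0]

# Internal standardization: the reference distribution comes from a group
# inside the study sample (here, the control group).
internal = z_scores(treatment, mean(control), stdev(control))

# External standardization: the reference distribution comes from a
# normative sample outside the study (hypothetical published norms).
NORM_MEAN, NORM_SD = 45.0, 20.0
external = z_scores(treatment, NORM_MEAN, NORM_SD)

print("internally standardized:", internal)
print("externally standardized:", external)
```

The same raw scores yield different z-scores depending on the reference distribution, which is why the type of standardization is recorded.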

Examples

Example 1 - Pickering et al., 2019:

For the outcome variable "Weight for age Z-score" in Table 2 of Pickering et al. (2019), the standardization is external because it uses the WHO child growth reference distribution.

Answer: Externally standardized

Example 2 - Leaver et al., 2021:

In Leaver et al. (2021), "Student learning" in Table 3 was internally standardized. Table 2 notes suggest that "student learning IRT scores are standardized based on the distribution in the experienced FW [Fixed-wage contract] arm".

Answer: Internally standardized
Outcome variable unit of measurement

Coding Instructions

  • Select the unit of measurement for each outcome variable that entered the estimation of treatment effects.
  • Start with the broad category and then choose or specify the actual unit of measurement.
  • For the unit of count (quantity), specify the object being measured. For example, for the number of prenatal checks, first select "count (quantity)" and then type "prenatal checks".
  • For transformed variables, select unitless or other, and then specify the transformation method, for example, log, sine, and inverse hyperbolic function, and the underlying unit.
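
As a rough illustration of why transformed variables are recorded as "unitless or other", the sketch below applies two common transformations to a value with an underlying unit. It assumes that "inverse hyperbolic function" refers to the inverse hyperbolic sine; the input values are hypothetical.

```python
import math


def log_transform(x):
    """Natural log; only defined for strictly positive values."""
    return math.log(x)


def ihs_transform(x):
    """Inverse hyperbolic sine, ln(x + sqrt(x^2 + 1)); defined for zero
    and negative values as well (assumed meaning of the transformation)."""
    return math.asinh(x)


# Hypothetical raw values in an underlying unit (e.g., cfu/100 mL).
raw_values = [0.0, 1.0, 100.0]

for x in raw_values:
    ihs = ihs_transform(x)
    log_value = log_transform(x) if x > 0 else None  # log(0) is undefined
    print(x, log_value, ihs)
```

The transformed values no longer carry the original unit, so both the transformation method and the underlying unit need to be specified.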

Examples

Example 1 - Haushofer and Shapiro (2016):

In Haushofer and Shapiro (2016), the unit of measurement of "Value of nonland assets (US$)" (in Table VI) is currency. According to the table notes, the currency was US$, PPP in 2012.

Answer: Currency-US$, PPP-2012-real

Example 2 - Pickering et al. (2019):

For the outcome variable "Weight for age Z-score" in Table 2, the unit of measurement is standard deviation because Z-score is measured in standard deviations.

Answer: Standard deviation

Example 3 - Pickering et al. (2019):

For "Detectable total Cl (proportion)" in Table 3, the unit of measurement is fraction as it was measured in proportion.

Answer: Percent (0-100)

Example 4 - Pickering et al. (2019):

For "E coli log (cfu/100 mL)" (Table 3), the unit of measurement is "other".

Answer: Unitless or other, specify: Log of cfu/100ml
Additional details of the outcome variable

Coding Instructions

  • Include any additional information about the outcome variable that is found in the paper and not recorded elsewhere, including its construction or processing (such as winsorizing or imputation of missing values), validation and quality control (such as double entry or back checks), measurement (such as exact procedures for an educational test conducted), etc.

Examples

Example 1 - Haushofer and Shapiro (2016):

In Haushofer and Shapiro (2016), the "Value of nonland assets (US$)" (in Table VI) variable was "top-coded for the highest 1% of observations". This detail was not covered by any of the previous fields. The coder should include it in this field.

Answer: Top-coded for the highest 1% of observations
Randomization
Number of randomization units in study arm

Coding Instructions

  • For each study arm, provide the number of randomization units assigned to this arm as reported by the paper.
  • If there is more than one unit of randomization, please enter the assigned units for each of them and separate them with commas, following the order of the units of randomization displayed in the hint. For example, if the units of randomization are districts and villages (as in the hint), please enter {number of districts in the arm, number of villages in the arm} in the answer.
  • It is possible that this information may not exist at the study arm level or for some of the study arms. If this is the case, enter "-99" in the corresponding field if the information cannot be found in the paper.
  • This information is mostly found in the research design sections of the paper, specifically in the description of the random assignment, which is sometimes included in a separate sub-section of the paper. The information may also be found in participant flow diagrams (e.g. the CONSORT flow diagram) or a table that disaggregates information by treatment arms, such as a balance table, or even a treatment effects table, especially if the randomization unit and the unit of analysis are identical.
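
The entry format described above (comma-separated counts in the order of the displayed units, with "-99" for missing information) could be read downstream along the following lines. This is a minimal sketch: the function name, the unit names, and the example entries are hypothetical, and it assumes counts are typed without thousands separators.

```python
MISSING = -99  # code used when the information cannot be found in the paper


def parse_units_entry(entry, unit_names):
    """Split a comma-separated entry into one count per randomization unit.

    Counts are matched to unit_names by position, mirroring the order of
    the units shown in the hint. The missing code -99 is mapped to None.
    """
    counts = [int(part.strip()) for part in entry.split(",")]
    if len(counts) != len(unit_names):
        raise ValueError(
            f"expected {len(unit_names)} values, got {len(counts)}"
        )
    return {
        name: (None if count == MISSING else count)
        for name, count in zip(unit_names, counts)
    }


# Hypothetical arm randomized at both the district and village level.
print(parse_units_entry("12, 240", ["districts", "villages"]))

# Hypothetical arm where the paper does not report the count.
print(parse_units_entry("-99", ["individuals"]))
```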

Examples

Example 1 - Lyall et al., 2020:

In Lyall et al., 2020, the number of units assigned to each study arm is available in the paper (see Figure 2 in the Randomization section). The coder would input the following for each study arm identified in Stage 1:

Answer: TVET treatment and UCT treatment: 313
TVET treatment and UCT control: 312
TVET treatment and Non-UCT Group: 673
TVET control and UCT treatment: 273
TVET control and UCT control: 270
TVET control and Non-UCT Group: 756

Example 2 - Badrinathan 2021:

In Badrinathan 2021, the number of units assigned is only available for the control arm (n = 406). The author notes that an equal proportion was assigned to each of the three study arms but does not provide an exact number for the two treatment arms.

Answer: Treatment arm 1: -99
Treatment arm 2: -99
Control arm: 406

Example 3 - Garbiras-Diaz and Montenegro 2022:

For the "call to action" intervention in Garbiras-Diaz and Montenegro 2022, the number of units assigned to each arm is presented in Figure 1 (Randomization Design) as follows:

Answer: Placebo Control Group: 225
Information Message: 158
Call-to-action Message: 156
Information + Call-to-action Message: 159
Number of randomization units in study

Coding Instructions

  • Provide the total of randomization units assigned to all study arms as reported in the study.
  • If there is more than one unit of randomization, please enter the total number of units in the study for each of them and separate them with commas, following the order of the units of randomization displayed in the hint. For example, if the units of randomization are districts and villages (as in the hint), please enter {number of districts in the study, number of villages in the study} in the answer.
  • This information is mostly found in the research design sections of the paper, specifically in the description of the random assignment, which is sometimes included in a separate sub-section of the paper. The information may also be found in participant flow diagrams (e.g. the CONSORT flow diagram) or a table that disaggregates information by treatment arms, such as a balance table, or even a treatment effects table, especially if the randomization unit and the unit of analysis are identical.

Examples

Example 1 - Badrinathan 2021:

In Badrinathan 2021, the total number of randomization units is 1,224, as noted in the Abstract, the Introduction, the Sample and Timeline section, and elsewhere. The study only reports the number of randomization units for the control group and the total number of randomization units.

Answer: 1,224
Quality and robustness
Compliance

Coding Instructions

  • For each treatment arm, compliance refers to any treatment unit that received the treatment as intended. The unit does not need to have taken the full treatment or have taken up the offered treatment to be considered in compliance.
  • Sometimes compliance is not separately reported from take up. Non-compliance should only capture cases where a mistake in the randomization led to the treatment not being offered, or offered to the wrong units. In all other cases, record compliance as "not available."
  • For the status quo control group, authors may refer to spillover or treatment contamination and report the share who received a treatment (the non-compliance rate), rather than the share who correctly did not receive the treatment (the compliance rate). Please always report the compliance rate.
  • Please enter a fraction in this field, for example, 15/16 to indicate that 15 out of 16 randomization units complied with the treatment status, or 49/100 for a compliance rate of 49%.
  • Enter "-99" if the information is not mentioned in the paper.
  • Enter "-88" if compliance rate cannot be entered as numeric values. Please specify the details.
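
The entry rules above can be sketched in Python as follows. The function names and the counts in the usage lines are hypothetical; the special codes "-99" and "-88" follow the instructions above. The second helper shows the conversion from a reported non-compliance count to the compliance fraction that should always be entered.

```python
from fractions import Fraction


def parse_compliance(entry):
    """Interpret a compliance entry: a fraction such as '15/16', or a code."""
    entry = entry.strip()
    if entry == "-99":
        return ("not_reported", None)   # not mentioned in the paper
    if entry == "-88":
        return ("non_numeric", None)    # details are specified separately
    numerator, denominator = entry.split("/")
    rate = Fraction(int(numerator), int(denominator))
    if not 0 <= rate <= 1:
        raise ValueError(f"compliance rate out of range: {entry}")
    return ("rate", float(rate))


def compliance_from_noncompliance(non_compliers, total):
    """Turn a reported non-compliance count into a compliance fraction."""
    return f"{total - non_compliers}/{total}"


print(parse_compliance("15/16"))
print(parse_compliance("-99"))
# Hypothetical control arm in which 3 of 100 units received the treatment.
print(compliance_from_noncompliance(3, 100))
```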

Examples

Example 1 - Bos et al., 2024:

In Bos et al., 2024, the authors discussed the compliance issue of the treatment arms in [4.2. Receipt and use of program materials by households]. "As per the intervention guidelines, households in the treatment group should have received four materials: a child development card, a household picture book, a nature picture book, and a key message booklet. However, Table 6 shows that due to imperfect compliance, the differential likelihood of receipt of the child development card, household picture book, and nature picture book between treatment and control households was approximately 49 percentage points (instead of 100 under perfect compliance). Furthermore, 2%–3% of households in the control group received these materials."

In this example, "imperfect compliance" is used to describe the fact that only 49% of treatment households received the intervention materials. However, that was actually a result of low implementation fidelity rather than non-compliance with the assigned treatment status.

However, the receipt of materials by the control households was a non-compliance issue as they were not assigned to get the intervention (i.e. materials).

Answer: Treatment study arm: -99
Control study arm: -88 (Specify details: "2%–3% of households in the control group received these materials")
Take-up

Coding Instructions

  • For each treatment arm, take-up measures the share of treatment units that actually participated in or adopted some portion of the assigned interventions. The treatment units do not need to have participated fully to be considered part of the group.
  • Please enter the take-up rate as a percentage in this field, for example, 82 if 82% of the treatment units took up the treatment.
  • Enter "-99" if the information is not mentioned in the paper.
  • Enter "-88" if take-up cannot be entered as numeric values. Please specify the details.

Examples

Example 1 - Brudevold-Newman et al., 2024:

'Just over 61 percent of those assigned to the franchise treatment attended at least one day of business training (which was the first component of the franchise treatment), and 44 percent completed the program and launched a business.' (page 8)

This question asks for the share of a treatment arm that participated in some portion of the treatment (i.e. take-up), so the answer for the franchise arm is 61. For the grant arm, Table A2 [Compliance and Attrition] in the appendix [column: Grant] shows that 95% of the grant arm received the grant.

Although both the section in the paper and appendix Table A2 are titled "Compliance and Attrition", the "compliance" rates for the treatment groups are technically "take-up" rates.

Answer: Take-up for franchise arm: 61
Take-up for grant arm: 95
Balance test

Coding Instructions

  • Please indicate whether there is a balance test table in the main paper or appendix.
  • A balance test table often includes a set of balance tests to examine differences in observable characteristics between study arms. Balance can be tested individually by covariate or jointly, using an omnibus test for overall balance.
  • Note that the balance test table may not be presented as a separate table but presented as part of a descriptive statistics table.

Examples

Example 1 - Abimpaye et al., 2020:

Table 2 is a balance table reporting characteristics by study arm.

Answer: Yes — Table 2

Example 2 - Carneiro et al., 2024:

Two balance tables are presented: Table 1 on p. 1128 (Baseline Balance, Household and Child Characteristics) and Table 2 on p. 1130 (Balance of Household and Child Characteristics at Follow-Up).

Answer: Yes — Table 1 and Table 2
Partners and Funders
Implementers of the experiment

Coding Instructions

  • Enter the names of the entities that implemented the experiment as they appear in the paper and separate each of them by a comma.
  • The implementers could be agencies, institutions, or individuals (for example, researchers).
  • If there is no information on the implementers, please enter "Not reported".
  • Please note data collection agencies should not be included as implementers.

Examples

Example 1 - Chong et al., 2015:

In Chong et al., 2015, Innovations for Poverty Action implemented the intervention.

Answer: Innovations for Poverty Action

Example 2 - Gaikwad and Nellis 2021:

For Gaikwad and Nellis 2021, the experiment was implemented by "an NGO" without specifying the name.

Answer: An anonymous NGO

Example 3 - Carneiro et al., 2024:

In Carneiro et al., 2024, the implementer was not specified in the paper.

Answer: Not reported
Implementer type

Coding Instructions

  • Select the types of all the entities that implemented the experiment.
  • Please select all that apply. If there are both government and an NGO involved, choose both options.
  • NGOs include both non-profit and for-profit non-governmental organizations that are self-managed.
  • If a government contracts a private firm within the public sector management system, the implementer should still be considered as "government".

Examples

Example 1 - Chong et al., 2015:

The implementer for the experiment was "Innovations for Poverty Action".

Answer: NGO

Example 2 - Özler et al., 2018:

"Under PECD, the Government implemented the following interventions – in partnership with Save the Children and UNICEF" (p.4).

Answer: Government; NGO; Multilateral or bilateral international organizations
Acknowledgements

Coding Instructions

  • Please copy and paste the acknowledgement section of the paper in this field. The information should include the funders and any other entities that supported the study in pecuniary or non-pecuniary terms.
  • In some papers, there are sections dedicated to acknowledging support for the study including funders, referees etc. In other papers, those could be in a footnote at the beginning or the end of the paper.
  • Sometimes, the information can also be found in the “Conflict of interest” statement.

Examples

Example 1 - Özler et al., 2018:

There is an “Acknowledgements” section in the paper (page 19). The text in the section should be copied in this field.

Answer: We acknowledge funding from three World Bank trust funds - Rapid Social Response Multi-Donor Trust Fund (TF098514), Strategic Impact Evaluation Fund (TF013561), and Impact Evaluation to Development Impact Trust Fund (TF018796).

Example 2 - Chong et al., 2015:

The acknowledgment is stated in footnote 1. Therefore, the text of footnote 1 should be copied here.

Answer: This article circulated previously with the title: “Looking Beyond the Incumbent: Exposing Corruption and the Effect on Electoral Outcomes.” We acknowledge partial funding from the Inter-American Development Bank. Supplementary material for this article is available at the “Supplements” link in the online edition. Data and supporting material necessary to reproduce the numerical results for this article are available at http://anadelao.commons.yale.edu.
Resources
Registry Name

Coding Instructions

  • Please select the name of the registry or registries in which the study is registered only if it is mentioned in the main text of the paper or its supplementary materials/appendices.
  • Do not search for this information beyond what is included in the paper.
  • Sometimes information on trial registration is mentioned in the footnotes.
  • Searching for the exact terms throughout the text such as "registry", "pre-registration" or "pre-analysis plan" can be a good approach to double-check whether the trial registry is mentioned anywhere in the text or supplementary appendix.
  • If the name of the trial registry is not mentioned in the paper or its supplementary materials/appendices, please select "Not stated".
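
The keyword search suggested above could be done on extracted paper text with a short script like the one below. This is only a sketch: the function name and the context window size are assumptions, and the sample text is the footnote quoted in the example for this field.

```python
import re

# Search terms taken from the coding instructions above.
SEARCH_TERMS = ["registry", "pre-registration", "pre-analysis plan"]


def find_registry_mentions(text, window=60):
    """Return (term, surrounding text) for every case-insensitive match."""
    hits = []
    for term in SEARCH_TERMS:
        for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
            start = max(0, match.start() - window)
            end = min(len(text), match.end() + window)
            hits.append((term, text[start:end]))
    return hits


footnote = (
    "The study was registered at the AEA RCT registry "
    "under ID number AEARCTR-0000459."
)

for term, context in find_registry_mentions(footnote):
    print(f"{term!r}: ...{context}...")
```

A search like this only helps locate candidate passages; the coder still needs to read the passage to confirm that a registry is named.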

Examples

In Brudevold-Newman et al., 2023, the name of the registry where the trial is registered is stated in the footnote on page 1: "The study was registered at the AEA RCT registry under ID number AEARCTR-0000459."

Answer: AEA RCT registry
Registration ID

Coding Instructions

  • Please enter the registry ID or IDs of the study only if it is mentioned in the main text of the paper or its supplementary materials/appendices.
  • Record full ID with prefixes if included (e.g. RIDIE-STUDY-ID-64be2e6e750).
  • There could be different terminologies: the AEA RCT Registry gives each entry an "RCT ID", while ClinicalTrials.gov gives each entry a "ClinicalTrials.gov Identifier".
  • This information is usually presented in the acknowledgements or ethics statement sections of a paper, or in the supplementary materials/appendices.
  • If no registration ID is provided, please write "Not stated".

Examples

In Brudevold-Newman et al., 2023, the registration ID is stated in the footnote on page 1: "The study was registered at the AEA RCT registry under ID number AEARCTR-0000459."

Answer: AEARCTR-0000459
Number of IRBs reported

Coding Instructions

  • Enter the number of ethics reviews or IRBs mentioned in the main text of the paper or in the supplementary materials/appendices.
  • Studies can have multiple IRB or ethics board approvals.
  • This information is usually present in the acknowledgements or ethics statement sections of a paper. If it is not present there, it may be present in the supplementary materials/appendices.
  • If not mentioned in the paper or its supplementary materials/appendices, please write 0.

Examples

Abimpaye et al., 2020 reports one review with "Rwanda National Ethics Committee" as the ethics review body.

Answer: 1

Athey et al., 2023 obtained ethical reviews from three committees: Cameroon’s National Ethics Committee (CNERSH; decision no. 2019/08/1183/CE/CNERSH/SP), administrative authorization from the Ministry of Health’s DROS (decision no. D30-760/L/MIN-SANTE/SG/DROS), and the authors’ institutional review board (decision no. 780/CIERSH/DM/2018).

Answer: 3
Review board name

Coding Instructions

  • For each ethics review board, include the complete name of the review board as it appears in the main text of the paper or in the supplementary materials/appendices.
  • This information is usually in the acknowledgements or ethics statement sections of a paper or in the supplementary materials/appendices.
  • If not mentioned in the paper or its supplementary materials/appendices, please write "Not stated".

Examples

Abimpaye et al., 2020 explicitly states that "This study was reviewed and approved by the Rwanda National Ethics Committee."

Answer: Rwanda National Ethics Committee

Barrera-Osorio et al., 2022 reports an IRB approval number issued by Columbia University.

Answer: Columbia University
Review number

Coding Instructions

  • For each IRB approval, enter the approval number/ID as it appears in the main text of the paper or its appendices.
  • Record full ID including prefixes. Copy the number or ID exactly as it appears in the paper.
  • This may appear next to the review board name with "#". This information is usually present in the acknowledgements or ethics statement sections of a paper or in the supplementary materials/appendices.
  • If not mentioned in the paper or its supplementary materials/appendices, please enter "Not stated".

Examples

Abimpaye et al., 2020 notes that "This study was reviewed and approved by the Rwanda National Ethics Committee", but no reference or case number was reported.

Answer: Not stated

In Barrera-Osorio et al., 2022, the IRB approval number is reported as AAAF4126.

Answer: AAAF4126
Stage 3 Module 2: Estimates

Section notes to display in the survey

This section will go through each of the treatment effects confirmed in Stage 2 to collect the estimates of treatment effects. A treatment effect is defined by the outcome variable (that enters estimation), the comparison between the evaluation arm and the reference arm, the estimand, the empirical specification, and the periods (the data rounds used for the estimation and, if relevant, how they are pooled). Please review the pre-loaded information on the treatment effect carefully before answering the estimate questions.

Instructions for data entry mask

This set of questions on estimates should loop through each treatment effect confirmed in Stage 2. The preloaded prompts for coders include:

- [Exhibit number]

- [Outcome name]

- [Unit of analysis]

- [Eval arm]

- [Reference arm]

- [Estimand]

- [Empirical specifications]

- [Period (including rounds)]

This information needs to be presented for every treatment effect at the beginning of the estimate questions.

Estimation parameter

Coding Instructions

  • Please select the estimation parameter of the treatment effect.

Examples

See descriptive examples in paper.

Estimation model

Coding Instructions

  • Please select the statistical model used to estimate the treatment effect.
  • The information is usually found in sections that focus on methods, analytical strategy, or results or in the table notes.
  • If you are not sure which model was used, select "Other, specify" and enter the information found. Flag in the request-for-review if needed.

Examples

In Wolf et al., 2019, the Impact analysis suggests multi-level modeling was used; select "Multi-level or hierarchical model/regression" unless otherwise specified.

For "teacher turnover" in Table 4, multinomial logistic regressions were used; select "Logistic regression" for those effects.

In Ara et al., 2019 Table 3, select "T-test (mean-comparison test)".

Null hypothesis

Coding Instructions

  • For each estimate, select the null hypothesis that was tested.
  • In most cases, the null is 0 for non-binary outcomes and 1 for odds/risk/hazard ratios unless otherwise stated.

Examples

See descriptive examples in paper.

Linear combination of coefficients for treatment effect

Coding Instructions

  • Enter 1 when the estimate is a single coefficient; select "More than one" when it is a linear combination and specify.
  • Read the regression tables and results closely to identify the relevant coefficients.

Examples

In Leaver et al., 2021 Table 3, the pooled-period learning effect for one comparison is a single coefficient (answer: 1). Another comparison requires a linear combination of two coefficients under Model B (answer: More than one, specify: 2).

Estimate of the treatment effect

Coding Instructions

  • Enter the numerical value exactly as reported (sign and decimals preserved).
  • If multiple parameters are used (e.g., interactions), enter each coefficient separately.
  • Enter "-99" if not available in the main paper or appendix and flag for review.

Examples

In Leaver et al., 2021 Table 3 (Pooled, Model A), the point estimate is 0.01. For a comparison requiring two coefficients in Model B, enter 0.12 and -0.03 separately.
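
To show why the coefficients are entered separately, the sketch below combines them after the fact. It assumes, purely for illustration, that the treatment effect is the unweighted sum of the two coefficients; the weights actually implied by the paper's specification may differ, and the function name is hypothetical.

```python
def linear_combination(coefficients, weights=None):
    """Weighted sum of regression coefficients (weights default to 1)."""
    if weights is None:
        weights = [1.0] * len(coefficients)
    if len(weights) != len(coefficients):
        raise ValueError("one weight is needed per coefficient")
    return sum(w * c for w, c in zip(weights, coefficients))


# The two coefficients from the example above, entered separately.
coefficients = [0.12, -0.03]

# Assumption for illustration: the effect is the simple sum of the two.
combined = linear_combination(coefficients)
print(round(combined, 2))
```

Note that the standard error of a linear combination also depends on the covariance between the coefficients, so it cannot be recovered from the separate standard errors alone.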

Section notes to display in the survey

We need several precision statistics to standardize effect sizes using reported treatment effects. The survey will guide you through extracting a sequence of precision values, and it will stop once the minimum set of values is captured. The type and value of each precision statistic are often found in the notes of the tables that report the treatment effects, in the main paper or in the appendix. Different precision values for the same treatment effect may be reported in various parts of the paper. Please check both the main paper and the appendix carefully.

Guidance

For continuous variables:

- We always get the SE and t-stat. If both present, we stop.

- If either one is missing, we pick up the p-value (not adjusted for multiple inference) which can be used with other fields to back out SE and t-stat.

- If none are present, pick up the CI and significance level, and F-ratio for one-way ANCOVA.

For binary variables:

- We always get the Z-stat. If Z-stat is not present, pick up the t-stat or p-value (not adjusted for multiple-inference).
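
The guidance above relies on the fact that one precision statistic can often be backed out from another. The sketch below shows one way this might be done; it is not IDEAL's actual procedure. It assumes a two-sided test of a zero null and a normal approximation (with small samples or few clusters, a t-distribution with the appropriate degrees of freedom would be needed instead). The function names and input values are hypothetical.

```python
from statistics import NormalDist


def t_from_estimate_and_se(estimate, se):
    """Test statistic for a null of zero: estimate divided by its SE."""
    return estimate / se


def se_from_p_value(estimate, p_value):
    """Back out the SE from a two-sided p-value (normal approximation).

    Requires 0 < p_value < 1; a p-value reported as exactly 0 or 1
    cannot be inverted.
    """
    z = NormalDist().inv_cdf(1 - p_value / 2)
    return abs(estimate) / z


def se_from_ci(lower, upper, level=0.95):
    """Back out the SE from a symmetric confidence interval."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return (upper - lower) / (2 * z)


# Hypothetical reported values.
estimate = 0.2
print(t_from_estimate_and_se(estimate, 0.1))
print(se_from_p_value(estimate, 0.05))
print(se_from_ci(0.0, 0.4, level=0.95))
```

This is why the confidence level must be recorded alongside the interval bounds: without it, the interval cannot be converted into a standard error.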

Standard error of treatment effect estimate

Coding Instructions

  • Enter the SE exactly as reported; leave blank if not reported.
  • If specified, record whether the value is unadjusted or adjusted and the adjustment method (e.g., robust, clustered, bootstrap). Use "Unknown" if not specified.

Examples

See descriptive examples in paper.

T-statistic of treatment effect estimate

Coding Instructions

  • Enter the t-statistic exactly as reported; leave blank if not reported.
  • Capture whether unadjusted or adjusted and note adjustment method if provided (robust, clustered, bootstrap). Use "Unknown" if not specified.

Examples

See descriptive examples in paper.

Z-statistic of treatment effect estimate

Coding Instructions

  • Enter the Z-statistic exactly as reported; leave blank if not reported.
  • Record unadjusted/adjusted status and adjustment method if provided. Use "Unknown" if not specified.

Examples

See descriptive examples in paper.

P-value of treatment effect estimate

Coding Instructions

  • Enter the p-value exactly as reported; leave blank if not reported.
  • When multiple adjusted p-values are provided, capture each (e.g., unadjusted, covariate-adjusted, multiple-hypothesis corrections, bootstrap, small-sample, permutation, unknown, other).

Examples

See descriptive examples in paper.

Confidence interval

Coding Instructions

  • Enter the lower and upper bounds exactly as reported.
  • If CI not reported, enter "-99" for both bounds.

Examples

See descriptive examples in paper.

Confidence interval significance level

Coding Instructions

  • Select the CI level indicated; choose "Not reported" if not specified.

Examples

See descriptive examples in paper.

F-Ratio

Coding Instructions

  • Enter the F-statistic exactly as reported; leave blank if not reported.

Examples

See descriptive examples in paper.

Additional precision information

Subsection notes: The following three questions are meant for you to include any information about any additional precision statistics reported by the authors that have not been captured in the previous questions. An open-ended question will appear to encode the information, if any, you may want to report.

Additional precision value

Coding Instructions

  • Enter each non-sampling-based precision value if reported (often in appendix).

Examples

See descriptive examples in paper.

Additional precision value type

Coding Instructions

  • Select the type corresponding to the value reported.

Examples

See descriptive examples in paper.

Additional precision value inference method

Coding Instructions

  • Select the method reported for the additional precision value.

Examples

See descriptive examples in paper.

Subsection notes

The following questions ask for the mean, standard deviation, and sample size of the outcome of the treatment effect. They loop through the study arms in the comparison, at baseline and at the period over which the treatment effect was estimated. Please read the prompts carefully before entering your answers.

SurveyCTO instructions: For the evaluation arm, the reference arm, and the two arms combined, display the mean, standard deviation, standard error, and sample size questions, at baseline and at the period used to estimate the treatment effect. The tables below show how the questions would ideally be presented without the limitations of SurveyCTO. [Please check if we can use this table grid plug-in https://github.com/surveycto/table-grid].

Table 1: Baseline

                           (1)     (2)                  (3)              (4)
                           Mean    Standard deviation   Standard error   Sample size (N)
Eval arm
Reference arm
Eval+Reference combined

Table 2: {Period} (pulled from Stage 2 that is associated with the treatment effect)

                           (1)     (2)                  (3)              (4)
                           Mean    Standard deviation   Standard error   Sample size (N)
Eval arm
Reference arm
Eval+Reference combined

A different format of Baseline outcome is reported [Yes/No] - If yes, specify the format of baseline outcome

Baseline outcome variable format

Coding Instructions

  • Indicate whether the unit of measurement at baseline is the same as the one used in estimation.
  • If authors report baseline statistics in a different unit (e.g., original raw scale) than used in estimation (e.g., standardized), select "No" and specify the baseline unit.
  • Select "Yes" if the same unit is used at baseline and in estimation.

Examples

See descriptive examples in paper.

Outcome variable mean

CODING INSTRUCTIONS

Enter the mean of the outcome exactly as reported in the paper, with the same number of decimal places.

DESCRIPTIVE EXAMPLES FOR CODING

The outcome "Intimidation during voting" has three means at baseline in Asunka et al., 2019 (Table 1): Full sample, Treatment, and Control. This question will be asked three times, once for each of the means.

Answer: Control: 0.12; Treatment: 0.05; Full sample (control + treatment): 0.07
Outcome variable standard deviation

CODING INSTRUCTIONS

Enter the standard deviation of the outcome variable exactly as reported in the paper, with the same number of decimal places.
This information may be found in the descriptive statistics tables, at the bottom of results tables, and sometimes in the notes to results tables or in the text. It is also often reported in supplementary materials, so check there if it is not found in the main text of the paper.
Enter "-99" if neither the main text of the paper nor its supplementary materials report the standard deviation of the outcome variable.

DESCRIPTIVE EXAMPLES FOR CODING

The outcome "Intimidation during voting" has three means at baseline in Asunka et al., 2019 (Table 1): Full sample, Treatment, and Control. The corresponding standard deviations for the three means are:

Answer: Control: 0.33; Treatment: 0.22; Full sample (control + treatment): 0.26
Outcome sample size

CODING INSTRUCTIONS

Enter the sample size of the outcome variable for the specified study arms and period.

DESCRIPTIVE EXAMPLES FOR CODING

In Ganimian, Muralidharan, and Walters 2023, only one sample size is reported for the outcome "Math" at baseline (Table 1) for all arms combined:

Answer: Sample size: 4,675