KiVa Antibullying Program

An anti-bullying program for grades 2-6 with 20 hours of student lessons. Primarily implemented in Europe, the program aims to improve mental health outcomes in addition to reducing bullying.

Program Outcomes

Anxiety
Bullying
Violent Victimization

Program Type

Bullying Prevention
School - Environmental Strategies
School - Individual Strategies

Program Setting

School

Continuum of Intervention

Indicated Prevention
Universal Prevention

Age

Late Childhood (5-11) - K/Elementary

Gender

Both

Race/Ethnicity

Endorsements

Blueprints: Promising
Crime Solutions: Promising
OJJDP Model Programs: Promising

Program Information Contact

Christina Salmivalli
Department of Psychology
University of Turku
Assistentinkatu 7
20014 Turun yliopisto, Finland
Email: eijasal@utu.fi
Website: https://www.kivaprogram.net/

Program Developer/Owner

Christina Salmivalli
University of Turku

Brief Description of the Program

KiVa includes both universal actions to prevent the occurrence of bullying and indicated actions to intervene in individual bullying cases. The program has three different developmentally appropriate versions for Grades 1-3 (Unit 1), 4-6 (Unit 2), and 7-9 (Unit 3). Blueprints has certified the evaluation evidence for grades 2-6 only.

Indicated actions. In each school, a team of three teachers (or other school personnel), along with the classroom teacher, addresses each case of bullying that is witnessed or revealed. Cases are handled through a set of individual and small group discussions with the victims and with the bullies, and systematic follow-up meetings. In addition, the classroom teacher meets with two to four prosocial and high-status classmates, encouraging them to support the victimized child.

Universal actions. The KiVa program includes 20 hours of student lessons (10 double lessons) given by classroom teachers during a school year. The central aims of the lessons are to: (a) raise awareness of the role that the group plays in maintaining bullying, (b) increase empathy toward victims, and (c) promote children's strategies of supporting the victim and thus their self-efficacy to do so. The lessons involve discussion, group work, role-play exercises, and short films about bullying. As the lessons proceed, class rules based on the central themes of the lessons are successively adopted one at a time.


KiVa includes both universal and indicated actions to prevent the occurrence of bullying as well as to intervene in individual bullying cases. The program has three different developmentally appropriate versions for Grades 1-3, 4-6, and 7-9 (i.e., for 7-9, 10-12, and 13-15 years of age). Blueprints has certified the evaluation evidence for grades 2-6 only.

Universal actions. The KiVa program for Grades 4-6 includes 20 hours of student lessons (10 double lessons) given by classroom teachers during a school year. The central aims of the lessons are to: (a) raise awareness of the role that the group plays in maintaining bullying, (b) increase empathy toward victims, and (c) promote children's strategies of supporting the victim and thus their self-efficacy to do so. The lessons involve discussion, group work, role-play exercises, and short films about bullying. As the lessons proceed, class rules based on the central themes of the lessons are successively adopted one at a time.

A unique feature of KiVa is an antibullying computer game included in the primary school versions of the program. Students play the game during and between the lessons described earlier. Students acquire new information and test their existing knowledge about bullying, learn new skills to act in appropriate ways in bullying situations, and are encouraged to make use of their knowledge and skills in real-life situations.

KiVa provides prominent symbols such as bright vests for the recess supervisors to enhance their visibility and signal that bullying is taken seriously in the school and posters to remind students and school personnel about the KiVa program. Parents also receive a guide that includes information about bullying and advice about what parents can do to prevent and reduce the problem.

Training days and school network meetings. Support to implement the program is given to teachers and schools in several ways. In addition to two full days of face-to-face training, networks of school teams are created, consisting of three school teams each. The network members meet three times during the school year with one person from the KiVa project guiding the network.

KiVa naturally shares some features with existing antibullying programs, such as the Olweus Bullying Prevention Program. Both Olweus and KiVa include actions at the level of individual students, classrooms, and schools; both tackle acute bullying cases through discussions with the students involved; and both suggest developing class rules against bullying. KiVa, however, has at least three features that, taken together, differentiate it from Olweus and other antibullying programs. First, KiVa includes a broad and encompassing array of concrete and professionally prepared materials for students, teachers, and parents. Second, KiVa harnesses the powerful learning media provided by the Internet and virtual learning environments. Third, in focusing on bystanders, or witnesses of bullying, KiVa goes beyond the "emphasizing the role of bystanders" mentioned in the context of several intervention programs; it also provides ways to enhance empathy, self-efficacy, and efforts to support victimized peers. Although other programs share some of these features, none has assembled them into the coordinated whole-school, multilayered intervention that is the hallmark of the KiVa program.

Outcomes

Study 1

Karna et al. (2011a), Williford et al. (2012a), Salmivalli et al. (2011), and Williford et al. (2013) found significant differences between the groups such that, compared to control students, intervention students:

At Wave 2 (seven months after pretest):

Had a lower level of peer-reported victimization.
Defended victims more often and had stronger antibullying attitudes and greater empathy toward victims.

At Wave 3 (one year after pretest, nine months of intervention):

Showed significantly greater improvement on 7 of 11 criterion variables, including self- and peer-reported victimization and self-reported bullying.
Assisted and reinforced the bully less often and had higher self-efficacy for defending and higher well-being at school.
Had more positive perceptions of peers and reduced anxiety (Williford et al., 2012a).
Were less likely to be bullied (Salmivalli et al., 2011).
Were less likely to have experienced cybervictimization (Williford et al., 2013).
Were less likely to have engaged in cyberbullying (Williford et al., 2013).

Study 3

Karna et al. (2012) found that for grades 2-3, compared to the control group, the KiVa program significantly reduced:

Self-reported bullying.

Brief Evaluation Methodology

Primary Evidence Base for Certification

Of the seven studies Blueprints has reviewed, two studies (Studies 1 and 3) meet Blueprints evidentiary standards (specificity, evaluation quality, impact, dissemination readiness). Both studies were conducted by the developer.

Study 1

Karna et al. (2011a), Salmivalli et al. (2011), Williford et al. (2012a), and Juvonen et al. (2016) randomly assigned 78 Finnish schools to intervention (39 schools, 4,207 students) and control conditions (39 schools, 4,030 students). Data collection took place three times: (1) in May 2007, (2) December 2007 or January 2008, and (3) May 2008 (the end of the first year of the intervention). Assessments included measures such as self-reported bullying and victimization, participation in bullying situations, antibullying attitudes, perceptions of peers, anxiety and depression. Salmivalli et al. (2011) investigated the success of the KiVa program in reducing nine different forms of being bullied but used only the pretest (May 2007) and posttest (May 2008) data.

Study 3

Karna et al. (2012) examined 147 Finnish schools and surveyed students in grades 2-3 and 8-9. The sampled schools were either randomized to intervention and control conditions or added to the intervention after having been in the control condition of an earlier study (Karna et al., 2011). Data collection took place three times: (1) in May 2008 (pretest), (2) December 2008-February 2009 (midway through the program), and (3) May 2009 (the end of the first year of the intervention). Assessments included measures of self-reported bullying and victimization and reports on bullying and victimization of peers.

Blueprints Certified Studies

Study 1

Karna, A., Voeten, M., Little, T. D., Poskiparta, E., Kaljonen, A., & Salmivalli, C. (2011a). A large-scale evaluation of the KiVa antibullying program: Grades 4-6. Child Development, 82(1), 311-330.

Williford, A., Boulton, A., Noland, B., Little, T. D., Karna, A., & Salmivalli, C. (2012a). Effects of the KiVa anti-bullying program on adolescents' depression, anxiety and perception of peers. Journal of Abnormal Child Psychology, 40, 289-300.

Study 3

Karna, A., Voeten, M., Little, T. D., Alanen, E., Poskiparta, E., & Salmivalli, C. (2012). Effectiveness of the KiVa antibullying program: Grades 1-3 and 7-9. Journal of Educational Psychology. Advance online publication. doi: 10.1037/a0030417

Risk and Protective Factors

Risk Factors

Individual: Bullies others*, Favorable attitudes towards antisocial behavior*

School: Low school commitment and attachment*

Protective Factors

Individual: Clear standards for behavior, Problem solving skills, Prosocial behavior, Refusal skills, Skills for social interaction

School: Opportunities for prosocial involvement in education, Rewards for prosocial involvement in school

* Risk/Protective Factor was significantly impacted by the program

Subgroup Analysis Details

Subgroup differences in program effects by race, ethnicity, or gender (coded in binary terms as male/female) or program effects for a sample of a specific racial, ethnic, or gender group:

Study 1 (Karna et al., 2011a; Williford et al., 2013; Juvonen et al., 2016) tested for subgroup effects by gender and found equal benefits for males and females.

Sample demographics including race, ethnicity, and gender for Blueprints-certified studies:

For Study 1 (Karna et al., 2011a), the sample included students who were mostly native Finns (i.e., Caucasian) and was evenly split between males and females.

For Study 3 (Karna et al., 2012), the authors provided no sociodemographic information on the schools or students, saying only that the sample schools were diverse and located throughout Finland.

Training and Technical Assistance

KiVa is a European program and has not been assessed by Blueprints for dissemination readiness in the United States.

Certified KiVa trainers provide pre-implementation training for end users (school staff) over two days (6-7 hours per day). The first day primarily covers the nature and mechanisms of bullying (especially the view of bullying as a group process) and the universal actions included in the KiVa program. The second day covers the indicated actions of the KiVa program, i.e., tackling the cases of bullying that come to the attention of school staff. The training includes lectures, pair and group discussions, learning-by-doing exercises, and demonstrations of how cases of bullying are tackled.

Training Certification Process

To become a certified KiVa trainer, a person attends a four-day training for trainers organized in Finland. Certified trainers update their training every second year in a one-day workshop. The training of trainers includes information and exercises about 1) bullying, 2) the universal and indicated actions included in the KiVa program, and 3) managing and supporting the implementation of KiVa, as well as 4) a visit to a school that has been implementing KiVa for several years.

Benefits and Costs

Source: Washington State Institute for Public Policy
All benefit-cost ratios are the most recent estimates published by The Washington State Institute for Public Policy for Blueprint programs implemented in Washington State. These ratios are based on a) meta-analysis estimates of effect size and b) monetized benefits and calculated costs for programs as delivered in the State of Washington. Caution is recommended in applying these estimates of the benefit-cost ratio to any other state or local area. They are provided as an illustration of the benefit-cost ratio found in one specific state. When feasible, local costs and monetized benefits should be used to calculate expected local benefit-cost ratios. The formula for this calculation can be found on the WSIPP website.

No information is available

Program Developer/Owner

Christina Salmivalli
University of Turku
Department of Psychology
Assistentinkatu 7
Finland
eijasal@utu.fi

Program Outcomes

Anxiety
Bullying
Violent Victimization

Program Specifics

Program Type

Bullying Prevention
School - Environmental Strategies
School - Individual Strategies

Program Setting

School

Continuum of Intervention

Indicated Prevention
Universal Prevention

Program Goals

An anti-bullying program for grades 2-6 with 20 hours of student lessons. Primarily implemented in Europe, the program aims to improve mental health outcomes in addition to reducing bullying.

Population Demographics

KiVa is aimed at elementary and middle school students. Blueprints certifies the program for grades 2-6, as evaluation results in grades 8-9 show no consistent patterns of effects.

Target Population

Age

Late Childhood (5-11) - K/Elementary

Gender

Both

Race/Ethnicity

Subgroup Analysis Details

Subgroup differences in program effects by race, ethnicity, or gender (coded in binary terms as male/female) or program effects for a sample of a specific racial, ethnic, or gender group:

Study 1 (Karna et al., 2011a; Williford et al., 2013; Juvonen et al., 2016) tested for subgroup effects by gender and found equal benefits for males and females.

Sample demographics including race, ethnicity, and gender for Blueprints-certified studies:

For Study 1 (Karna et al., 2011a), the sample included students who were mostly native Finns (i.e., Caucasian) and was evenly split between males and females.

For Study 3 (Karna et al., 2012), the authors provided no sociodemographic information on the schools or students, saying only that the sample schools were diverse and located throughout Finland.

Risk/Protective Factor Domain

Individual
School

Risk/Protective Factors

Risk Factors

Individual: Bullies others*, Favorable attitudes towards antisocial behavior*

School: Low school commitment and attachment*

Protective Factors

Individual: Clear standards for behavior, Problem solving skills, Prosocial behavior, Refusal skills, Skills for social interaction

School: Opportunities for prosocial involvement in education, Rewards for prosocial involvement in school

*Risk/Protective Factor was significantly impacted by the program

Description of the Program

Theoretical Rationale

KiVa draws on a multifaceted theoretical background. Social-cognitive theory is used as a framework for understanding the processes of social behavior. Recent research suggests that bullying behavior is at least partly motivated by a pursuit of high status and a powerful position in the peer group. Bullying is also a group phenomenon, in which bystanders can contribute to the maintenance of bullying by assisting and reinforcing the bully and by giving bullies the position of power they seek. KiVa is predicated on the idea that a positive change in the behaviors of classmates can reduce the rewards gained by bullies and consequently their motivation to bully in the first place. KiVa places concerted emphasis on enhancing the empathy, self-efficacy, and antibullying attitudes of onlookers, who are neither bullies nor victims. The aim is for bystanders to show that they are against bullying and to support the victim instead of encouraging the bully. As another equally important component, the KiVa program includes procedures for handling the acute bullying cases that come to the attention of school personnel.

Theoretical Orientation

Cognitive Behavioral
Normative Education
Social Learning

Outcomes (Brief, over all studies)

Primary Evidence Base for Certification

Study 1

Karna et al. (2011a), Salmivalli et al. (2011), Williford et al. (2012a), Williford et al. (2013), and Juvonen et al. (2016) found that the program was successful in reducing many components of bullying and victimization. Although changes by Wave 2 (seven months after pretest) were modest, more consistent changes occurred by Wave 3 (one year after pretest). At the later time point, 7 of 11 criterion variables showed significantly greater improvement in the intervention than the control schools. The improvements occurred for self-reported and peer-reported measures, for victimization and bullying measures, and for measures of bystander actions. When dichotomizing self-reported victimization and bullying, the odds of being a victim were about 1.5-1.8 times higher for a control school student than for a student in an intervention school, and the odds of being a bully were 1.2-1.3 times higher for a control school student than for a student in an intervention school.

Salmivalli et al. (2011) found that, after 9 months of intervention, control school students were 1.32 to 1.94 times as likely to be bullied as students in the intervention schools.

Williford et al. (2013) found that control school students were 1.29 times as likely to have experienced cybervictimization as students in the intervention schools. Control school students were also 1.34 times as likely to have engaged in cyberbullying as students in the intervention schools.
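The "times higher" comparisons above are ratios of odds. As a minimal sketch of that arithmetic, the Python below converts two prevalences to odds and divides them. The function names and the prevalence values are hypothetical and chosen only for illustration; the published estimates may be model-based rather than computed from raw proportions.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) to odds."""
    return p / (1 - p)


def odds_ratio(p_control, p_intervention):
    """Odds of the outcome in control schools relative to intervention schools."""
    return odds(p_control) / odds(p_intervention)


# Hypothetical prevalences, chosen only to illustrate the arithmetic.
ratio = odds_ratio(p_control=0.20, p_intervention=0.15)
```

A ratio above 1 means the outcome is more common, in odds terms, among control school students.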

Study 3

Karna et al. (2012) found that for grades 2-3, the intervention significantly reduced self-reported bullying.

Outcomes

Study 1

Compared to control students, intervention students:

At Wave 2 (seven months after pretest):

Had a lower level of peer-reported victimization.
Defended victims more often and had stronger antibullying attitudes and greater empathy toward victims.

At Wave 3 (one year after pretest, nine months of intervention):

Showed significantly greater improvement on 7 of 11 criterion variables, including self- and peer-reported victimization and self-reported bullying.
Assisted and reinforced the bully less often and had higher self-efficacy for defending and higher well-being at school.
Had more positive perceptions of peers and reduced anxiety (Williford et al., 2012a).
Were less likely to be bullied (Salmivalli et al., 2011).
Were less likely to have experienced cybervictimization (Williford et al., 2013).
Were less likely to have engaged in cyberbullying (Williford et al., 2013).

Study 3

Karna et al. (2012) found that for grades 2-3, compared to the control group, the KiVa program significantly reduced:

Self-reported bullying.

Mediating Effects

Study 1 (Williford et al., 2012a) showed that peer-reported victimization was both influenced by the intervention and influenced by perception of peers, depression and anxiety.

Effect Size

In Study 1 (Karna et al., 2011a; Salmivalli et al., 2011; Williford et al., 2013), effect sizes were generally small (mostly Cohen's d < .20).

Generalizability

Two studies meet Blueprints standards for high quality methods with strong evidence of program impact (i.e., "certified" by Blueprints): Study 1 (Karna et al., 2011a) and Study 3 (Karna et al., 2012). Both studies took place in basic education schools throughout Finland and compared the intervention group with a business-as-usual control group.

Potential Limitations

Study 2 (Karna et al., 2011b)

QED with non-random assignment and limited matching

Karna, A., Voeten, M., Little, T. D., Poskiparta, E., Alanen, E., & Salmivalli, C. (2011b). Going to scale: A nonrandomized nationwide trial of the KiVa antibullying program for grades 1-9. Journal of Consulting and Clinical Psychology, 79(6), 796-805.

Study 4 (Yang & Salmivalli, 2015)

Incorrect level for data analysis
No tests for differential attrition

Yang, A., & Salmivalli, C. (2015). Effectiveness of the KiVa antibullying programme on bully-victims, bullies and victims. Educational Research, 57(1), 80-90.

Study 5 (Nocentini & Menesini, 2016)

Several group differences at baseline

Nocentini, A., & Menesini, E. (2016). KiVa anti-bullying program in Italy: Evidence of effectiveness in a randomized control trial. Prevention Science, 17, 1012-1023.

Study 6 (Huitsing et al., 2020)

Attrition and analysis sample sizes are missing details
Unclear if intent-to-treat analyses were done and which participants were included in or excluded from analyses
No information on reliability of measures
Incomplete tests for baseline equivalence
Incomplete tests for differential attrition

Huitsing, G., Lodder, G. M. A., Browne, W. J., Oldenburg, B., & Van der Ploeg, R. (2020). A large-scale replication of the effectiveness of the KiVa Antibullying Program: A randomized controlled trial. Prevention Science, 21, 627-638. https://doi.org/10.1007/s11121-020-01116-4

Study 7 (Axford et al., 2020)

Cluster RCT but different consent rates across conditions
Teacher measures may not be independent
No reliability or validity information for study sample
Adjusted for clustering but the level-2 sample of 21 may be too small
No significance tests or effect sizes for baseline equivalence
Some evidence of differential attrition
No effects on behavioral outcomes

Axford, N., Bjornstad, G., Clarkson, S., Ukoumunne, O. C., Wrigley, Z., Matthews, J., Berry, V., & Hutchings, J. (2020). The effectiveness of the KiVa bullying prevention program in Wales, UK: Results from a pragmatic cluster randomized controlled trial. Prevention Science, 21, 615-626. https://doi.org/10.1007/s11121-020-01103-9

Endorsements

Blueprints: Promising
Crime Solutions: Promising
OJJDP Model Programs: Promising

Program Information Contact

Christina Salmivalli
Department of Psychology
University of Turku
Assistentinkatu 7
20014 Turun yliopisto, Finland
Email: eijasal@utu.fi
Website: https://www.kivaprogram.net/

References

Study 1

Juvonen, J., Schacter, H., Sainio, M., & Salmivalli, C. (2016). Can a school-wide bullying prevention program improve the plight of victims? Evidence for risk x intervention effects. Journal of Consulting and Clinical Psychology, 84, 334-344.

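Certified Karna, A., Voeten, M., Little, T. D., Poskiparta, E., Kaljonen, A., & Salmivalli, C. (2011a). A large-scale evaluation of the KiVa antibullying program: Grades 4-6. Child Development, 82(1), 311-330.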

Salmivalli, C., Karna, A., & Poskiparta, E. (2011). Counteracting bullying in Finland: The KiVa program and its effects on different forms of being bullied. International Journal of Behavioral Development, 35(5), 405-411.

Williford, A., Boulton, A., Noland, B., Little, T. D., Karna, A., & Salmivalli, C. (2012b). Erratum to: Effects of the KiVa anti-bullying program on adolescents' depression, anxiety and perception of peers. Journal of Abnormal Child Psychology, 40, 301-302.

Certified Williford, A., Boulton, A., Noland, B., Little, T. D., Karna, A., & Salmivalli, C. (2012a). Effects of the KiVa anti-bullying program on adolescents' depression, anxiety and perception of peers. Journal of Abnormal Child Psychology, 40, 289-300.

Williford, A., Elledge, L. C., Boulton, A. J., DePaolis, K. J., Little, T. D., & Salmivalli, C. (2013). Effects of the KiVa antibullying program on cyberbullying and cybervictimization frequency among Finnish youth. Journal of Clinical Child & Adolescent Psychology, 42(6), 820-833.

Study 2

Study 3

Certified Karna, A., Voeten, M., Little, T. D., Alanen, E., Poskiparta, E., & Salmivalli, C. (2012). Effectiveness of the KiVa antibullying program: Grades 1-3 and 7-9. Journal of Educational Psychology. Advance online publication. doi: 10.1037/a0030417

Study 4

Yang, A., & Salmivalli, C. (2015). Effectiveness of the KiVa antibullying programme on bully-victims, bullies and victims. Educational Research, 57(1), 80-90.

Study 5

Nocentini, A., & Menesini, E. (2016). KiVa anti-bullying program in Italy: Evidence of effectiveness in a randomized control trial. Prevention Science, 17, 1012-1023.

Study 6

Study 7

Study 1

Summary

Compared to control students, intervention students:

At Wave 2 (seven months after pretest):

Had a lower level of peer-reported victimization.
Defended victims more often and had stronger antibullying attitudes and greater empathy toward victims.

At Wave 3 (one year after pretest, nine months of intervention):

Showed significantly greater improvement on 7 of 11 criterion variables, including self- and peer-reported victimization and self-reported bullying.
Assisted and reinforced the bully less often and had higher self-efficacy for defending and higher well-being at school.
Had more positive perceptions of peers and reduced anxiety (Williford et al., 2012a).
Were less likely to be bullied (Salmivalli et al., 2011).
Were less likely to have experienced cybervictimization (Williford et al., 2013).
Were less likely to have engaged in cyberbullying (Williford et al., 2013).

Evaluation Methodology

Design: To recruit schools, letters describing the KiVa project were sent in the fall of 2006 to all 3,418 schools providing basic education in mainland Finland. In this first phase of program evaluation (Grades 4-6), the 275 volunteering schools were stratified by province and language and 78 of them were randomly assigned to intervention or control conditions (special-education-only schools were excluded). The participating schools were located throughout the country and resembled other comprehensive schools in such characteristics as class size and proportion of immigrant students. As such, they can be considered representative of Finnish comprehensive schools.

Data collection took place three times: in May 2007, December 2007 or January 2008, and May 2008. Students filled out Internet-based questionnaires in the schools' computer labs during regular school hours. The process was administered by the teachers, who received detailed instructions about two weeks prior to data collection. At the beginning of the session, the term bullying was defined for the students in the way formulated in the Olweus Bully/Victim Questionnaire, which emphasizes the repetitive nature of bullying and the power imbalance between the bully and the victim.

The target sample at Wave 1 included 78 schools with 429 classrooms and a total of 8,237 students in Grades 3-5 (mean ages = 9-11 years). A total of 7,564 students (91.7% of the target sample) received consent from a parent to participate in the study. One whole school dropped out before the data collection because of problems related to their school facilities. By Waves 2 and 3 some changes in the student composition had taken place, with 251 students leaving the schools and 463 entering them.

Between Waves 1 and 2, two control schools (51 students) dropped out, and five more (640 students) dropped out between Waves 2 and 3. There were no missing values in predictor variables, and for outcome variables the percentages of missing values were not high, except for control schools at Wave 3. Students were excluded from the analyses if (a) they were denied permission to participate in the study but had somehow answered the questionnaire or (b) they left school after Wave 1. With missing data imputed, the analysis included 77 schools and 8,166 children. Although not stated explicitly, it appears that missing data were imputed for all students in the seven schools that dropped out between Waves 1 and 3 as well as for students missing data on particular measures.

As the evaluation is about the school year in which the intervention took place, researchers assigned all students to the classrooms they belonged to during that school year. Classroom changes were not taken into account in the models, as the data indicated that about 82% of the classrooms remained the same at Wave 2 as they had been at Wave 1.

Sample: The final sample used for the analyses had 77 schools and 8,166 students (4,201 in the intervention and 3,965 in the control condition). Altogether, 50.1% of the respondents were girls and 49.9% boys. Most students were native Finns (i.e., Caucasian), with the proportion of immigrants being 2.4%.

Measures: The study includes both self-reported and peer-reported measures of victimization and bullying.

Self-reported bullying and self-reported victimization: The questionnaire started with demographic questions (e.g., gender and age) followed by questions about bullying and victimization. To measure bullying and victimization, the global items from the revised Olweus Bully/Victim Questionnaire were utilized: "How often have you been bullied at school in the last couple of months?" and "How often have you bullied others at school in the last couple of months?" Students answered on a 5-point scale (0 = not at all, 4 = several times a week).

Participant roles in bullying situations and peer-reported victimization: When answering the Participant Role Questionnaire, students were instructed to think of situations in which someone was bullied. They were presented with items describing different ways to behave in such situations, and they were asked to nominate, from a list of classmates presented on the computer screen, an unlimited number of classmates that usually behave in the way described in each item. They were allowed also to choose "no one." The 12 items used in this study form four scales reflecting different participant roles: bullying ("Starts bullying," "Makes the others join in the bullying," "Always finds new ways of harassing the victim"), assisting the bully ("Joins in the bullying, when someone else has started it," "Assists the bully," "Helps the bully, maybe by catching the victim"), reinforcing the bully ("Comes around to watch the situation," "Laughs," "Incites the bully by shouting or saying: Show him/her!"), and defending the victim ("Comforts the victim or encourages him/her to tell the teacher about the bullying," "Tells the others to stop bullying," "Tries to make the others stop bullying"). To measure peer-reported victimization, students nominated classmates treated in the following ways: "He/She is being pushed around and hit," "He/She is called names and mocked," "Nasty rumors are spread about him/her." They were allowed to make an unlimited number of nominations, or to answer "no one." Peer nominations received were totaled and divided by the number of classmates responding, resulting in a score ranging from 0.00 to 1.00 for each student on each item.
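As a minimal sketch of the scoring rule just described (nominations received divided by the number of classmates responding), the Python function below computes proportion scores for one item in one classroom. The function name, data layout, and student identifiers are hypothetical. Excluding the scored student from the denominator is an assumption; the study summary does not spell out that detail.

```python
def peer_nomination_scores(nominations, roster):
    """Return each student's proportion score for one item.

    nominations: dict mapping each responding student to the set of
        classmates they nominated (an empty set means "no one").
    roster: list of all students in the classroom.
    """
    scores = {}
    for student in roster:
        # Responding classmates, excluding the student being scored (assumption).
        raters = [r for r in nominations if r != student]
        if not raters:
            scores[student] = None  # no responding classmates, score undefined
            continue
        received = sum(1 for r in raters if student in nominations[r])
        scores[student] = received / len(raters)
    return scores


# Hypothetical classroom of four students, one of whom (D) did not respond.
roster = ["A", "B", "C", "D"]
nominations = {
    "A": {"B"},        # A nominates B
    "B": set(),        # B answers "no one"
    "C": {"B", "D"},   # C nominates B and D
}
scores = peer_nomination_scores(nominations, roster)
```

Dividing by the number of responders keeps scores comparable across classrooms of different sizes, which is why each score falls between 0.00 and 1.00.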

Antibullying attitudes: The original 20-item Provictim scale was modified into a 10-item version to better fit the study context. Students responded on a 5-point scale (0 = I disagree completely, 4 = I agree completely) to items such as: "It's okay to call some kids nasty names." All 10 items loaded highly on one factor in an exploratory factor analysis. After six negatively keyed items were reversely coded, scores on all 10 items were averaged.
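The reverse-coding and averaging step can be sketched as follows. This is an illustration only: the function name and item names are hypothetical, and the example uses three items rather than the 10-item scale (with six reversed items) described in the study.

```python
def scale_score(responses, reverse_keyed, max_value=4):
    """Average item responses after reverse-coding the negatively keyed items.

    responses: dict mapping item name to a response on a 0..max_value scale.
    reverse_keyed: set of item names to reverse (0 becomes max_value, etc.).
    """
    recoded = [
        (max_value - value) if item in reverse_keyed else value
        for item, value in responses.items()
    ]
    return sum(recoded) / len(recoded)


# Hypothetical three-item example with one negatively keyed item.
responses = {"item1": 4, "item2": 0, "item3": 3}
score = scale_score(responses, reverse_keyed={"item2"})
```

On a 0 to 4 scale, reversing an item maps a response of 0 to 4, 1 to 3, and so on, so that all items point in the same direction before they are averaged.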

Empathy toward victims: A seven-item empathy scale consisting of items such as "When a bullied child is sad I feel sad as well" was utilized. Students evaluated how often the statements were true for them, responding on a 5-point scale (0 = never, 4 = always). An exploratory factor analysis supported a single factor. The items were averaged, creating a single empathy score (ranging from 0 to 4), with higher numbers indicating greater empathy toward victims.

Self-efficacy for defending behavior: Students evaluated how easy or difficult it would be for them to defend and support the victim of bullying. The three items used in the scale were derived from the participant role questionnaire items for defending behavior, for instance "Trying to make the others stop the bullying would be …" The answers were given on a 4-point scale (0 = very difficult for me, 3 = very easy for me). Scores were averaged across the three items to create a single self-efficacy score.

Well-being at school: Students' well-being at school was measured with items that were initially developed by the Finnish National Board of Education, including general liking of school (e.g., "My school days are generally nice"), academic self-concept (e.g., "Learning brings me joy"), classroom climate (e.g., "There is a good climate in our class"), and school climate (e.g., "I feel safe at school"). Students responded to 14 items on a 5-point scale (0 = I disagree completely, 4 = I agree completely). All items loaded highly on one factor and thus were combined into one scale by averaging the item scores.

Analysis: Based on imputed missing data, multilevel modeling was used with MLwiN 2.11 to estimate the intervention effects in the presence of the nested data structures. Four-level models were fitted, with the first level representing change over time, the second level representing individual student differences, the third level representing differences between classrooms, and the fourth level representing between-school differences. With the randomization of schools, the measurement of the intervention at the school level means the analysis was done at the proper level. Also, the differences between KiVa schools and control schools were examined after controlling for baseline levels of the outcome variable as well as for gender, age, and language of instruction at school (Finnish or Swedish).

The statistical significance (.05 two-tailed) of the intervention effects was tested with model deviance values. The intervention effect at Wave 2 was tested by deleting the Intervention × T2 interaction term from the model and conducting a deviance test. Next, the Intervention × T2 interaction was entered back into the model, and the significance of the intervention effect at T3 was examined in a similar way, after which the Intervention × T3 term was reentered into the model. Last, the significance of second-order interaction terms involving gender and age was tested with deviance values. Nonsignificant second-order interaction terms were dropped from the equation. Significance tests for other variables were done with the usual Wald tests based on the coefficients and standard errors.
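A deviance test of this kind compares the deviance of the model without the term to the deviance of the model with it, and refers the difference to a chi-square distribution. The sketch below covers the single-parameter case (1 degree of freedom). The function name and the deviance values are hypothetical, and the study's actual MLwiN computations are not reproduced here.

```python
import math


def deviance_test_1df(deviance_reduced: float, deviance_full: float):
    """Likelihood-ratio (deviance) test for dropping a single parameter.

    Returns the deviance difference and its p-value from a chi-square
    distribution with 1 degree of freedom.
    """
    difference = deviance_reduced - deviance_full
    if difference < 0:
        raise ValueError("the reduced model cannot fit better than the full model")
    # For 1 df, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(difference / 2.0))
    return difference, p_value


# Hypothetical deviances for models without and with one interaction term.
difference, p_value = deviance_test_1df(10234.6, 10229.1)
print(f"deviance difference = {difference:.1f}, p = {p_value:.3f}")
```

A difference above roughly 3.84 corresponds to p < .05 for a single deleted term.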

The deviance tests were necessary with the method used to impute missing data. Rather than average the model estimates from 100 imputed data sets, as is usual, the analysis first aggregated the means across the 100 imputed data sets and then estimated a single model. This procedure underestimates standard errors and requires use of the deviance tests.
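For contrast, the conventional pooling approach (Rubin's rules) fits the model to each imputed data set and combines the results, adding between-imputation variance to the standard error. The sketch below shows that conventional calculation, which the study did not use. The estimates and squared standard errors are made-up numbers.

```python
from statistics import mean, variance


def pool_rubin(estimates, squared_ses):
    """Pool one coefficient across m imputed data sets using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(squared_ses):
        raise ValueError("need matching estimates and variances from at least 2 imputations")
    pooled_estimate = mean(estimates)
    within = mean(squared_ses)      # average sampling variance
    between = variance(estimates)   # variance of estimates across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total ** 0.5


# Hypothetical coefficient estimates and squared standard errors from 3 imputations.
estimate, pooled_se = pool_rubin([0.10, 0.14, 0.12], [0.0009, 0.0010, 0.0011])
print(f"pooled estimate = {estimate:.3f}, pooled SE = {pooled_se:.4f}")
```

Fitting a single model to data averaged across imputations leaves out the between-imputation term, which is why its standard errors are too small.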

Eleven criterion variables were used: self-reported and peer-reported bullying and victimization, three bystanders' behaviors in bullying situations, antibullying attitudes, empathy toward victims, self-efficacy for defending, and well-being at school. On the basis of the distributions of the variables, skew corrections were used, except for empathy toward victims and self-efficacy for defending. Variables with skewed distributions were transformed into normal scores.
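The write-up does not state which normal-score formula was applied. One common rank-based version is sketched below: each value is replaced by the standard-normal quantile of its rank, with tied values sharing an average rank. The plotting position (rank - 0.5) / n and the sample values are assumptions.

```python
from statistics import NormalDist


def normal_scores(values):
    """Replace each value by the standard-normal quantile of its average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # 1-based rank shared by tied values
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    standard_normal = NormalDist()
    # (rank - 0.5) / n always lies strictly between 0 and 1.
    return [standard_normal.inv_cdf((rank - 0.5) / n) for rank in ranks]


# Hypothetical, right-skewed proportion scores with a tie at zero.
print(normal_scores([0.0, 0.0, 0.1, 0.5, 0.9]))
```

The transformation preserves the ordering of students while removing the skew typical of nomination-based scores.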

Outcomes

Baseline equivalence: Intervention and control schools did not differ statistically on the criterion variables.

Differential attrition: The paper does not describe the differential attrition analysis but directs readers to a website. However, the discussion argues that, with the multiple imputation of missing data, the analysis was able to mitigate any impact of selective attrition that is related to other variables in the data set.

Posttest effects: Intervention effects were examined with interaction terms for intervention × T2 and intervention × T3. Gender and age were used as control variables in estimating intervention effects at Wave 2 and Wave 3, even when not statistically significant at baseline. The control variables did not have any consistent pattern of effects on the change in the dependent variables.

Compared with the control school students at Time 2, students in KiVa schools showed significant improvements on four of the 11 criterion variables. The intervention significantly reduced peer-reported victimization, and it increased peer-reported defending, anti-bullying attitudes, and empathy toward victims. By Time 3, the intervention had more consistent effects. Compared with students in control schools, students in KiVa schools showed significant improvement on seven of the 11 criterion variables. Positive intervention effects emerged for self-reported victimization, self-reported bullying, peer-reported victimization, peer-reported assisting, peer-reported reinforcing, self-efficacy for defending, and well-being at school. The intervention at Time 3 thus benefited victims and bullies and also had some positive effects on bystanders' behaviors. Still further, by Time 3, the intervention increased self-efficacy for defending and well-being at school. In general, the intervention had equal effects on boys and girls and on students of different ages, with only one exception: the intervention effects were larger for older students at both Waves 2 and 3.

The intervention was effective in reducing victimization according to both self- and peer reports, but the effect size was almost twice as large for peer reports of victimization compared to self-reports. Compared to victimization, the intervention effects on bullying were smaller for both self-reports and peer reports. Overall, however, effect sizes were small: of 11 criterion variables at Time 2 and at Time 3 (or 22 effect sizes in total), only one had a Cohen's d above .2 (.33 for peer-reported victimization). More than half of the effect sizes were below .10.

When self-reported victimization and bullying were dichotomized, the odds of being a victim were about 1.5-1.8 times higher for a control school student than for a student in an intervention school, and the odds of being a bully were 1.2-1.3 times higher for a control school student than for a student in an intervention school. In terms of percentages at Time 3, 12.7% of students in the control schools were self-reported victims, compared to 8.9% in the intervention schools. And 3.8% of the students were self-reported bullies in the control schools compared to 3.1% in the intervention schools.
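As a rough check, crude (unadjusted) odds ratios can be computed directly from the Time 3 percentages reported above. The study's odds ratios come from multilevel models with covariates, so the crude values below are only an approximation. The function names are illustrative.

```python
def odds(proportion: float) -> float:
    """Convert a prevalence (between 0 and 1) into odds."""
    if not 0 < proportion < 1:
        raise ValueError("proportion must be strictly between 0 and 1")
    return proportion / (1 - proportion)


def odds_ratio(p_control: float, p_intervention: float) -> float:
    """Crude odds ratio of control relative to intervention."""
    return odds(p_control) / odds(p_intervention)


# Time 3 prevalences reported in the text.
victim_or = odds_ratio(0.127, 0.089)
bully_or = odds_ratio(0.038, 0.031)
print(f"victimization: {victim_or:.2f}, bullying: {bully_or:.2f}")
# prints victimization: 1.49, bullying: 1.23
```

Both crude values fall at the lower end of the model-based ranges reported above (about 1.5-1.8 and 1.2-1.3).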

Long-term effects: Because the program is designed to be operated continuously in all grades, there were no tests for sustained effects after the end of the intervention.

Salmivalli et al. (2011) & Williford et al. (2013)

This paper extends the Karna et al. (2011) study by investigating the success of the KiVa program in reducing nine different forms of being bullied (i.e., victimization) rather than global measures of bullying.

Using the same sample of schools as Karna et al. (2011), this paper analyzed 7,303 students who responded at pretest for the correlation analysis regarding the relations between different forms of bullying. For the pretest-posttest comparisons, 5,651 students were involved (3,347 intervention and 2,304 control). The sample size is smaller than in Karna et al. (2011a) because missing data were not imputed.

As in Karna et al. (2011), the pretest took place in May 2007, and the posttest took place in May 2008. However, the article misreported the pretest as May 2006 and the posttest as May 2007. In correspondence, the author corrected the error in dates, stating that the pretest and posttest were the same as in the Karna et al. (2011) study.

Measures: Nine different types of bullying were assessed from self-reported measures. The nine types were derived from the Revised Olweus Bully/Victim Questionnaire. These include:

Verbal - 'I was called mean names, was made fun of or teased in a hurtful way'

Exclusion - 'Other students ignored me completely or excluded me from things or from their group of friends'

Physical - 'I was hit, kicked or shoved'

Manipulative - 'Other students tried to make others dislike me by spreading lies about me'

Material - 'Somebody took money or other things from me or damaged my things'

Threat - 'I was threatened or forced to do things I would not have wanted to do'

Racist - 'I was bullied by calling me names, making remarks or gestures about my ethnicity or skin color'

Sexual - 'I was bullied by sexual name calling, sexual actions or gestures'

Cyber - 'I was bullied by cell phone or through the internet'

Analysis: Bullying victims were defined as those who self-reported being bullied 'two or three times a month,' and analyses compared this group to others. The percentage changes in self-reports of being bullied were reported in a simple pretest-posttest analysis. Here, changes in the prevalence of students categorized as victims of bullying at posttest were reported, relative to their baseline frequency. Odds ratios for being bullied by each of the types of bullying were also reported for the control group relative to the intervention group. The standard errors for the odds ratios are adjusted for clustering at the school level. The study reports 80% confidence intervals because of the rare occurrence of some types of bullying.

Outcomes: Overall, there were substantial differences between intervention and control schools. Odds ratios reveal that students in the control schools were 1.32 to 1.94 times as likely to be bullied as students in the intervention schools. The most substantial changes were seen in the reduction of material bullying, physical bullying and cyber bullying. However, there were several instances where both intervention and control schools saw reductions in bullying. All forms of being bullied correlated positively with each other and with the global question, indicating that when a child is bullied, they are often targeted by several forms of bullying.

Students in the control schools were 1.29 times as likely to experience cybervictimization as students in intervention schools. Students in control schools were 1.34 times as likely to engage in cyberbullying as students in intervention schools. These effects are statistically significant but small (Cohen's d = .14 and .16, respectively).

Williford et al. (2012a)

This paper examined the effects of the KiVa program on students' anxiety, depression and perceptions of peers in grades 4-6. This study also examined whether reductions in peer-reported victimization predicted changes in the outcome variables. A total of 7,741 students (3,685 in the control condition and 4,056 in the intervention) were included in the analysis. The analysis used the same three waves of data as Karna et al. (2011).

Measures: Measures for this analysis included peer-reported victimization (all 3 waves) as well as perception of peers, depression and anxiety (Waves 1 and 3). Covariates included gender, age, language of classroom instruction (some classrooms spoke Swedish, and others Finnish), and immigration status.

Peer-reported victimization: Victimization was measured using a peer-nominated process through which each student was nominated by peers as either a victim or non-victim. Students were allowed to nominate as many of their classmates as they felt appropriate. The number of peer nominations for each student was totaled and a proportion was calculated by dividing the number of raw nominations received for each student by the number of students providing nominations within each classroom, resulting in a score ranging from 0.0 to 1.0 (alpha = .84).

Perception of peers: Students were asked to rate their beliefs about their peers in general. Student beliefs were measured using the Generalized Perception-of-Peers Questionnaire. This scale assesses the extent to which peers are considered supportive, kind and trustworthy as opposed to unsupportive, hostile and untrustworthy (alpha = .89).

Depression: Levels of depression were measured using seven items adapted from the Beck Depression Inventory. Participants were asked to describe their feelings in the last two weeks about their mood and how they feel about themselves (alpha = .89).

Anxiety: Items from the Fear of Negative Evaluation and the Social Avoidance and Distress scales were used to measure students' level of anxiety. These measures included items to assess stress, worry and avoidance of social interaction (alpha = .88).

Analysis: The program effects were tested using structural equation modeling. This approach examined the relationships between hypothetical constructs. Two structural equation models were used. The first examined the mean differences on the outcome variables between the study conditions. The second model was a cross-lagged panel model and was evaluated to determine if changes in victimization predicted changes in other outcome variables.

Outcomes: Adjusted means comparison revealed that the KiVa program was effective for reducing students' internalizing problems and improving their peer-group perceptions. Additionally, a cross-lagged panel model demonstrated that changes in anxiety, depression and positive peer perceptions were found to be predicted by reductions in victimization.

Differential attrition: Issues of differential attrition are explained above (Karna et al. 2011).

Baseline equivalence: At Wave 1, group means did not differ significantly, although the intervention group scored higher on peer-reported victimization (effect size = .13).

Mean comparisons

Students in the intervention condition had significantly less peer-reported victimization at Wave 2 and Wave 3, when compared to students in the control condition. Additionally, there were decreases in anxiety in both conditions over time that did not differ significantly across groups.

Students' positive perceptions of their peers decreased over the course of the study, but the decrease was significantly smaller in the intervention condition (effect size = .20). Additionally, mean depression levels increased for both intervention and control conditions, but changes did not differ significantly across conditions.

Structural relations

A structural analysis was designed to investigate whether reductions in victimization positively influenced other important areas of students' well-being, and whether such effects differed between the study conditions.

A multiple-group, cross-lagged panel model revealed that reductions in victimization over time resulted in increases in students' positive peer evaluations and lower levels of depression. Reductions in victimization over time predicted subsequent reductions in anxiety for both conditions, but more strongly for the intervention group.

Juvonen et al. (2016)

This paper examined additional outcomes of perceptions of a caring school climate, attitudes toward school, depression, and self-esteem, and it tested for program moderation by baseline experience of victimization and grade level.

Recruitment: Recruiting for this study is described above in Karna et al. (2011).

Assignment: A total of 78 schools were randomly assigned to conditions. Of the 7,312 students completing the baseline assessment, the analysis used the 7,010 participants with baseline victimization data and posttest data. For the analytic sample, the intervention group included 3,775 students and the control group included 3,235 students.

Attrition: Attrition for the full sample is described above in Karna et al. (2011). For this sample, the analysis of 7,010 of the 7,312 students with baseline data suggests attrition of only 4.1%. The study used assessments at baseline and 12 months from baseline (9 months after completion of the intervention).
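The 4.1% figure follows directly from the two sample counts, as this short check shows. The function name is illustrative.

```python
def attrition_rate(baseline_n: int, analyzed_n: int) -> float:
    """Share of the baseline sample that is missing from the analytic sample."""
    if baseline_n <= 0 or not 0 <= analyzed_n <= baseline_n:
        raise ValueError("expected 0 <= analyzed_n <= baseline_n and baseline_n > 0")
    return (baseline_n - analyzed_n) / baseline_n


rate = attrition_rate(7312, 7010)
print(f"{rate:.1%}")  # prints 4.1%
```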

Sample: Half of participants were girls (50.6%) and the average age of the sample at baseline was 11.2 years. The majority of students were Finns and 2.1% of the sample subjects were unspecified immigrants.

Measures: The study examined four self-reported outcomes, each with good reliability. First, perceptions of caring school climate used the Finnish National Board of Education questionnaire, in which students rated their feelings of security, comfort, and acceptance at school on a 5-point scale. Second, attitudes toward school were measured with an adapted version of the Health Behavior of School Age Children Survey, which has students rate their feelings about going to school on a 5-point scale. Third, depression was measured with the Beck Depression Inventory, which asks about depression symptoms over the preceding 2 weeks. Finally, the study used the Rosenberg Self-Esteem Scale to ask students to rate their sense of themselves among peers.

Analysis: The study fit two-level regression models (Level 1=students, Level 2=school) with level of victimization interacted with intervention condition. The study also tested intraclass correlations between schools and found the majority of variation to exist between individuals.

Intent-to-Treat: The models used all available data with full information maximum likelihood (FIML) estimation for missing data.

Outcomes

Implementation Fidelity: The study said that overall fidelity was good but did not present quantitative measures.

Baseline Equivalence: The study reported no differences in mean levels of victimization or mental health of students in intervention versus control schools but said nothing more about other baseline measures.

Differential Attrition: Attrition appears to be low.

Posttest: The study found a significant overall intervention effect on perception of caring school climate and attitude toward school. In addition, the study found that students with higher baseline victimization scores benefited most from the program with regard to perception of caring school climate.

The program failed to reduce depression or increase self-esteem in 4th and 5th grade students. It had positive effects on both outcomes but only for 6th grade students with high baseline victimization.

Long-Term: No long-term follow-up reported for this study.

Study 2

Summary

Karna et al. (2011b) used a quasi-experimental, cohort-longitudinal design in which posttest data from Finnish students in each grade cohort were compared to pretest data from same-age students within the same school (the previous cohort) who had not yet been exposed to the intervention. For example, data from first graders in May 2010 (after they had been exposed to KiVa for 1 year) were compared with data from students who were first graders in May 2009 and who were not yet exposed to the intervention program.

Karna et al. (2011b) found that, compared to the control group, the KiVa intervention group showed significant reductions in:

Rates of bullying by 14%.
Rates of victimization by 15%.

Evaluation Methodology

Design: This evaluation was based on a quasi-experimental, cohort-longitudinal design. Because all participating schools were implementing the KiVa program, a true experimental design was not possible. For this evaluation, posttest data from students in each grade cohort were compared to pretest data from same-age students within the same school (the previous cohort) who had not yet been exposed to the intervention. For example, data from first graders in May 2010 (after they had been exposed to KiVa for 1 year) were compared with data from students who were first graders in May 2009 and who were not yet exposed to the intervention program.

Sample Attrition: Only schools that participated in both pretest and posttest measures were included in the final sample. Letters were sent to 3,218 schools in Finland, and 1,827 were willing to adopt the KiVa program. Because of limited resources, only 1,450 schools implemented the program. Of the 1,450 schools that adopted the program, 1,189 participated in a web-based, pretest survey. A total of 301 schools were excluded from the final analysis due to lack of posttest measurement, resulting in a final sample of 888 schools.

In addition, a total of 403 individual respondents were excluded from the analysis because of contradictory responding. Final control and intervention samples were 156,634 and 141,103 for victimization and 156,629 and 141,099 for bullying. Response rates for Waves 1 and 2 were 78% and 70%, respectively. The data did not reveal a systematic difference between the dropouts and the study sample with regard to program implementation fidelity.

Sample: The final sample included 888 schools with approximately 150,000 students in 11,200 classrooms in grades 1-9. Students were 8-16 years of age; 51% were boys and 49% girls. Data on socioeconomic status and ethnic background of the students were not collected, but given the large sample size it should be considered fairly representative of Finnish schools in general.

Measures: Bullies and victims were identified with global questions from the Revised Olweus Bully/Victim Questionnaire (described above). This questionnaire asks students about bullying others, being bullied, telling about bullying, attitudes related to bullying and classroom/school atmosphere. The investigators dichotomized the bullying scales for purposes of analysis.

Analysis: Program effects were examined by calculating odds ratios based on a cohort-longitudinal design, correcting the standard errors for clustering. Bullying data were dichotomized; details of the dichotomization process are provided in Karna et al. (2011a).

Outcomes:

Baseline Equivalence: A small number of schools (n = 29) had prior involvement in the KiVa program. In these schools, the prevalence of victimization was slightly lower (-1.5%), but the prevalence of bullying was equal to the rest of the sample. Of the 301 schools that did not respond at posttest (and were therefore excluded from the study), victimization and bullying were slightly more prevalent (victimization = +1.4% and bullying = +1.1%) than the rest of the sample. No other efforts were made to establish baseline equivalence between the control and intervention cohorts.

Differential Attrition: The study notes that bullies and/or victims may drop out more easily than others, thereby inflating the intervention effects. A simulation based on a worst-case scenario assumes that dropouts have higher bullying and victimization rates than completers. Using all cases, including assumed values for the dropouts, the KiVa program still produced statistically significant intervention effects.

Outcomes at Posttest and Follow-up

The KiVa program significantly reduced both victimization and bullying, with a control/intervention group odds ratio of 1.22 for victimization and 1.18 for bullying. The odds ratios correspond to reductions of 15% in the prevalence of victimization and 14% in the prevalence of bullying. In general, the intervention effects increased from grade 1 to grade 4 (where intervention effects were largest) and became statistically insignificant in grades 7 and 8 for victimization and grades 7, 8 and 9 for bullying. Odds ratios did not reveal any gender differences in the overall effectiveness of the program, and the program was found to be effective in both mainstream and special education schools.

Dose Response: A limited dose response analysis was conducted. The analysis revealed that teachers had used less time for implementing the lessons and themes than was recommended by the KiVa team. In support of the program, a significant, positive correlation was found between program dosage and reductions in bullying and victimization.

Study 3

Summary

Karna et al. (2012) found that, for grades 2-3, the KiVa program, compared to the control group, significantly reduced:

Self-reported bullying.

Evaluation Methodology

Design:

Of 3,418 schools in Finland that were sent letters on the program, 275 volunteered to participate and 125 were randomly selected for the study. In addition, 31 schools that had been randomized into the control condition for Study 1 were assigned to the intervention group in this study. The sample of 156 schools comprised 79 for Grades 1-3 and 78 for Grades 7-9, with only one school participating in both the lower and higher grades. Although the selected schools may differ from those contacted, the authors stated that the schools are diverse, are located throughout the country, and can be considered representative of Finnish schools with an active interest in implementing the KiVa program.

The first group of 125 schools was stratified by province and language and then randomly assigned to intervention and control conditions. All 31 schools from the previous study were placed in the intervention group rather than randomized, possibly compromising the randomization and creating a quasi-experimental design. The procedure resulted in a sample of 156 schools: 79 control and 78 intervention schools, split equally between the lower and higher grades.

The study collected three waves of data: May 2008 for the pretest, December 2008-February 2009 for the mid-program assessment, and May 2009 for the posttest. At the school level, 10 schools (6%) dropped out, either before providing any data (9) or after providing pretest data (1). The analysis did not use these schools, reducing the sample to 147 schools.

Among the 147 schools participating in all waves, 6,927 consented students, 397 classrooms, and 74 schools in grades 1-3 were available for all assessments and were included in the analysis. The analysis excluded 304 students who did not return to the sample schools after the pretest. In grades 7-9, 16,503 consented students, 1,000 classrooms, and 73 schools were available for all assessments and were included in the analysis. The analysis excluded 261 students who did not return to the sample schools after the pretest. Missing data from attrition and incomplete answers ranged from 8.2% to 18.4% on self-report measures and from 3.2% to 7.7% on the peer report measures.

However, the central analysis included only grades 2-3 and 8-9 because students in grades 1 and 7, despite participating in the program, had not been enrolled the previous spring at the time of the pretest. The authors reported on a separate posttest-only analysis of grades 1 and 7 in supplementary material but not in the published article. The sample size fell to 4,704 students, 273 classes, and 74 schools in grades 2-3 and to 11,070 students, 686 classes, and 73 schools in grades 8-9.

Sample:

The study provided no sociodemographic information on the schools or students, saying only that the sample schools were diverse and located throughout Finland.

Measures:

Students completed internet-based questionnaires in school computer labs during regular school hours and under the supervision of teachers (teachers read the questions aloud for the lower grades but not the higher grades). Assessment sessions defined bullying and kept student responses under password protection.

The measures of self-reported bullying, self-reported victimization, participant roles in bullying situations, and peer-reported victimization appear identical to those used in Study 1 (described above). High correlations among the measures of bullying and victimization indicate construct validity and high alpha values for the scales indicate reliability.

For grades 2-3, the study gathered only the measures of self-reported victimization and self-reported bullying. For grades 8-9, the study gathered these two self-reported measures plus five peer-reported measures of victimization, bullying, assisting, reinforcing, and defending.

Analysis

The four-level multilevel models nested time (level 1) within students (level 2), which allowed for use of subjects having incomplete data. Level 3 represented classrooms and level 4 represented schools. Estimation of regression and logistic regression models with full information maximum likelihood adjusted for loss of data from differential attrition. The models controlled for baseline outcomes as well as gender, age, and language of instruction. The models appropriately treated the intervention as a school-level variable, and included intervention-by-time interactions to test for program effectiveness.

Classroom membership was based on posttest location and did not take account of changes from one classroom to another. The intraclass correlation values for the combined classroom and school levels ranged from .07 to .25 and for schools alone ranged from .02 to .05.

The analysis dropped 565 students who were randomized in the spring but did not return to a sample school and undergo the program in the fall. The randomized students dropped from the analysis may violate the intent-to-treat requirement but make up only about 2% of the full sample.

Outcomes

Implementation fidelity: Teachers completed questionnaires on program activities. Teachers reported, on average, that they completed nine of the ten lessons in the higher grades and four of five components in the lower grades.

Baseline equivalence: The text noted that differences in pretest outcome measures "were small (ranging from 0.00 to 0.01)." The tables included tests for the effects of the intervention on the intercept, which are equivalent to tests for baseline equivalence. None of the intervention main effects reached statistical significance. Otherwise, the authors did not report results for sociodemographic differences across conditions.

Differential attrition: The study did not compare the 10 schools that dropped out to the 147 schools that remained for all waves. At the student level, attrition was higher for the intervention group in grades 2-3, but higher for the control group in grades 8-9. Posttest non-responders had higher levels than completers on some peer-reported behaviors: victimization (Cohen's d = .11), defending (d = .08), bullying (d = .07), and assisting in bullying (d = .06). Non-responders also had higher self-reported bullying in grades 2-3 (d = .10) and in grades 8-9 (d = .05). On page 5, the study noted that further comparisons of differences between non-responders and completers within each condition suggested some potential to inflate the intervention effects in self-reported victimization in grades 2-3 and in self-reported bullying and peer-reported defending in grades 8-9.

The models adjusted for differential attrition with imputation of missing data.

Posttest

Although results for the mid-program assessment are included in the tables, the text discussed only the posttest results. For the posttest in grades 2-3, the intervention significantly reduced self-reported bullying. It significantly reduced self-reported victimization among girls, but only when the proportion of boys in the classroom was high. It did not significantly reduce victimization among boys, but the benefit grew when the proportion of boys in the classroom was high.

For the posttest in grades 8-9, the intervention failed to significantly affect self-reported victimization or bullying. It did have effects on the five peer-reported measures. The intervention

reduced peer-reported victimization for younger students.
reduced peer-reported bullying among boys when the proportion of boys in the classroom was high.
reduced peer-reported assisting of bullies for girls and boys, but more strongly for boys.
reduced peer-reported reinforcing among boys.
reduced peer-reported defending of victims - an iatrogenic effect.

Effect sizes were small. Significant odds ratios for self-reported victimization ranged from 1.10 to 1.63, and effect sizes ranged from .01 to .19 for peer-reported measures.

Long-term

Not tested.

Study 4

Summary

Yang and Salmivalli (2015) used a cluster randomized controlled trial to examine 23,520 students between the ages of 8 and 15 years in 195 Finnish schools, drawn from 738 intervention classrooms and 647 control classrooms. Randomization to intervention and control groups occurred at the school level. Data measuring bullying and victimization were collected at baseline and posttest, approximately 12 months later.

Yang and Salmivalli (2015) found that at posttest, compared to control schools, there were greater reductions among students in intervention schools in the:

Risk of being bully-victims, bullies or victims.

Evaluation Methodology

Design:

Recruitment/Sample size: A total of 23,520 students between the ages of 8 and 15 years from 738 intervention classrooms and 647 control classrooms participated. The study reported that the data used in this study came from a different study. This study referenced Salmivalli et al. (2010a), which is in an edited handbook and could not be obtained for this write-up. The sample size reported in this study (n=195 schools) differs from the previous studies, and it is unclear where the additional schools in this study came from. No detailed information was provided in this article on recruitment or consent procedures.

Study type/Randomization/intervention: Randomization was conducted at the school level. The study notes on page 85 that, because the sample covered two years, some schools that were assigned to the control condition in year one were assigned to the intervention in year two. No other explanation was provided.

Assessment/Attrition: Data were gathered at baseline in the spring of the previous school year and at posttest, approximately 12 months later and after 9 months of program participation. No information was provided on attrition rates for assessments at posttest. However, the proportion of missing data was up to 18.6% for the self-reported bullying measures and up to 14.7% for the peer-report measures.

Sample Characteristics: No information was provided on characteristics of the sample schools or children.

Measures: A total of six measures of self-reported and peer-reported bullying and victimization status were gathered using the same measures described in Study 1. No information was provided on reliability or validity.

Analysis: Two-level multinomial logistic regressions with random intercepts for classroom were used to examine the program impact on bully/victim status. Despite randomization at the school level, the study treated the program as a classroom-level measure and adjusted for clustering only within classrooms. Pretest status and gender were used as covariates.

Missingness was accounted for using Full Information Maximum Likelihood estimation. It appears that the study complied with the intent-to-treat principle by including all students in the analysis regardless of dose received.

Outcomes

Implementation Fidelity: No information was provided.

Baseline Equivalence: No formal tests were provided, though the pretest means for the outcomes in Table 1 look similar.

Differential Attrition: No information was provided.

Posttest: When controlling for pretest status and gender, the program significantly reduced the risk of being bully-victims, bullies or victims as per both self-report and peer-report. Odds ratios were small and ranged between 1.20 and 1.63.
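To put the reported odds ratios on a more familiar scale, the sketch below converts them to an approximate Cohen's d using the logit conversion d = ln(OR) × √3 / π. This conversion is an assumption added here for illustration only; the study itself reports odds ratios, and the function name is hypothetical.

```python
import math


def odds_ratio_to_d(odds_ratio: float) -> float:
    """Approximate Cohen's d from an odds ratio.

    Uses the logit conversion d = ln(OR) * sqrt(3) / pi, which is an
    assumption for illustration and not a method taken from the study.
    """
    return math.log(odds_ratio) * math.sqrt(3) / math.pi


# Lower and upper bounds of the odds ratios reported at posttest.
reported_range = (1.20, 1.63)
approx_d = [round(odds_ratio_to_d(o), 2) for o in reported_range]
print(approx_d)
```

Under this assumption, the reported range corresponds to standardized effects of roughly 0.1 to 0.3, in line with the write-up's description of the effects as small.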

Study 5

Summary

Nocentini and Menesini (2016) conducted a cluster randomized controlled trial with 2,042 students enrolled in grades 4 and 6 in 13 Italian schools. Schools were randomly assigned to an intervention group or a business-as-usual control group. Data measuring bullying and victimization were collected at baseline and posttest, approximately 9 months later.

Nocentini and Menesini (2016) found that at posttest, compared to control schools, participants in intervention schools showed significant improvements in:

Bullying.
Victimization.
Attitudes toward bullying, victimization, and empathy for victims.

Evaluation Methodology

Design:

Recruitment: Schools were recruited for participation via letters sent by the Regional School Board of Tuscany; letters were sent to 35 schools in three provinces (Florence, Siena, and Lucca). To be recruited, schools had to meet the following criteria: 1) they were comprehensive institutes comprising both elementary and middle schools; and 2) they had an average level of academic performance and socio-economic background. A total of 13 schools, comprising 97 4th through 6th grade classrooms, agreed to participate. Of the 2,184 students enrolled in these classrooms, 2,050 students consented and 2,042 completed at least some measures.

Assignment: The 13 schools were randomly assigned to either intervention (n=7 schools; 1,039 students) or control groups (n=6; 1,003).

Attrition: Between baseline and posttest there was an overall attrition rate of 6.5% (n=132) due to absence, refusal to participate, or changing schools.

Sample:

The sample was split between grade 4 (48%) and grade 6 (52%) and was evenly split on gender (49% male). The mean age at baseline was 8.8 for grade 4 students and 10.9 for grade 6 students. The majority of students (92%) were of Italian background.

Measures:

Data were collected at baseline (the start of the school year before program implementation) and at posttest at the end of the school year, approximately 5 months after program completion.

Self-reported bullying and self-reported victimization: To measure bullying and victimization, the global items from the revised Olweus Bully/Victim Questionnaire were utilized: "How often have you been bullied at school in the last couple of months?" and "How often have you bullied others at school in the last couple of months?" Students answered on a 5-point scale (0 = not at all, 4 = several times a week), with those admitting to any bullying or victimization over the past few months classified as bullies or victims. Further information on frequency and type of perpetrated or experienced bullying was gathered using the Florence Bullying and Victimization Scales, which comprise three subscales measuring physical, verbal, and indirect bullying-victimization, α=.82-.86.

Anti-bullying attitudes: The Questionnaire on Attitudes toward Bullying scale was employed to evaluate student attitudes, with 6 items focused on bullying and 6 examining victimization. Students responded on a 5-point scale (0 = I disagree completely, 4 = I agree completely) to items such as: "It's okay to call some kids nasty names." Items for each subscale were averaged, creating a pro-bullying attitude score and a pro-victim attitude score, α=.62-.71.

Empathy toward victims: A seven-item empathy scale consisting of items such as "When a bullied child is sad I feel sad as well" was utilized. Students evaluated how often the statements were true for them, responding on a 5-point scale (0 = never, 4 = always). The items were averaged, creating a single empathy score, with higher numbers indicating greater empathy toward victims, α=.82-.83.

Analysis:

The effect of the intervention was evaluated using linear mixed-effects models that accounted for nesting within individuals (though only two time points were included) and schools (the unit of randomization). These models implicitly adjust for baseline outcomes.

Intent-to-Treat: All available information across time was used. Missing data were treated as missing at random.

Outcomes

Implementation Fidelity:

Implementation fidelity was not measured in the Italian schools, though the authors state there were notable differences in implementation across schools.

Baseline Equivalence:

There were some significant differences between groups; at the primary school level, the control group had higher levels of both pro-bullying attitudes and empathy towards the victim. At the middle school level, the experimental group reported significantly higher levels of victimization, pro-bullying attitudes, and bullying. Otherwise, the groups were similar on attitudes and demographics.

Differential Attrition:

There was a total attrition rate of 6.5% between baseline and posttest, with attrition analyses revealing no significant differences in groups, demographics, or in baseline-by-condition tests.

Posttest:

At both the primary and middle school levels there were significant improvements in behavioral measures of self-reported bullying and victimization as well as benefits for risk and protective factors such as pro-bullying attitudes, pro-victim attitudes, and empathy towards the victim.

Long-Term:

Posttest occurred approximately 5 months after program completion.

Study 6

KiVa is a Blueprints promising program designated as European (i.e., not dissemination ready for U.S.). Study 6 (Huitsing et al., 2020) includes three program implementation adaptations made for a Dutch sample: only offering schools the non-confronting support group approach as indicated action, introducing materials that were originally developed for grades 4-6 in grades 3 and 4, and testing a new version of the program, KiVa+, which included an additional intervention component of network feedback to teachers. The study combined KiVa and KiVa+ intervention groups vs. the control group in primary analyses due to finding no differences between the intervention conditions.

Summary

Huitsing et al. (2020) conducted a cluster randomized controlled trial in the Netherlands with 100 Dutch primary schools (n=4,724 students in grades 2-3, Dutch grades 4-5) assigned to intervention or control groups. Measures of bullying and victimization came from students at baseline and four follow-up assessments over a 2-year study period.

Huitsing et al. (2020) found that, compared to control school students, intervention school students showed significantly greater reductions in:

Victimization.
Bullying at interim and posttest assessments.

Evaluation Methodology

Design:

Recruitment: Primary schools in the Netherlands (N=6,966) were recruited for participation via letters describing the KiVa project in the fall of 2011. Of the 132 schools that volunteered for the study, 100 agreed to participate in the May 2012 baseline assessment. Page 630 reports a sample of 3,309 in 65 intervention schools and 1,415 in 33 control schools. The sum equals 4,724 children in grades 2-3 (ages 7-9, Dutch grades 4-5) in 98 schools.

Assignment: Using a cluster randomized trial design and a stratified randomization procedure, 100 schools were assigned (after baseline) to KiVa (n=34 schools) or KiVa+ (n=33 schools) intervention conditions or a wait list, care as usual control group (n=33 schools). KiVa+ included an additional intervention component of network feedback to teachers. After random assignment, one school from the KiVa group and one from the KiVa+ group dropped out because they did not want to implement the program. The baseline sample thus had 98 schools (n=4,383 students across 245 classrooms), including 3,054 total students in the intervention groups and 1,329 students in the control group.
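The school and student counts in the recruitment and assignment paragraphs can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative.

```python
# Schools per condition at random assignment.
assigned_schools = {"KiVa": 34, "KiVa+": 33, "control": 33}
# Schools that dropped out right after assignment.
dropped_after_assignment = {"KiVa": 1, "KiVa+": 1, "control": 0}

total_assigned = sum(assigned_schools.values())
remaining_schools = total_assigned - sum(dropped_after_assignment.values())

# Students reported on page 630 (intervention + control).
randomized_students = 3309 + 1415
# Students in the baseline sample after the two schools dropped out.
baseline_students = 3054 + 1329

print(total_assigned, remaining_schools)
print(randomized_students, baseline_students)
print(randomized_students - baseline_students)
```

The totals reproduce the 100 assigned schools, the 98 remaining schools, and the two student samples (4,724 and 4,383) cited in the text. The two samples differ by 341 students.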

Attrition: Five assessments were conducted over two years: T1 baseline (May 2012), T2 (October 2012, fall of the first school year), T3 (May 2013, spring of the first school year), T4 (October 2013, fall of the second school year), and T5 (May 2014, spring of the second school year). The program was ongoing during the study so that the assessments represent interim and posttest program evaluations rather than long-term program evaluations. Along with the two schools (2%) lost immediately after randomization, two additional schools (2%) were lost to follow up after year 2 (T4-T5) of the study. The authors presented student non-response rates based on the analysis samples, which appeared to include new students entering the schools after the study start and did not include the students lost from dropout schools. According to the CONSORT diagram, student attrition rates based on the T1 baseline randomized sample (n=4,724) were 3.6% at T2, 4% at T3, 7.1% at T4, and 8.4% at T5.

Sample:

The baseline randomized sample (n=4,724 students) consisted of about 50% boys and had an average child age of 8.7 years. Student ethnic compositions were 80.2% Dutch, 2.6% Moroccan, 2.5% Surinamese, 2% Turkish, and 1.1% Dutch Antilleans. The remaining 12% of children reported another Western (6.2%) or non-Western (5.6%) ethnicity.

Measures:

The two outcome measures (victimization and bullying) came from student self-report surveys across five timepoints. The authors used maximum scores on victimization/bullying as well as dichotomized versions of the scales for a total of eight separate outcomes (i.e., ordinal and logistic forms of global bullying, global victimization, maximum bullying, and maximum victimization). The measures came from well-known sources, but no scale reliability coefficients were reported.

Analysis:

Intervention effects were tested using cross-classified ordered multinomial models and binomial logistic regression models, with Markov Chain Monte Carlo estimation. The cross-classified multilevel models account for children moving to different classrooms and schools over the 2-year study period, and nesting of measurement wave within students and nesting students within the combination of classroom and schools. Models estimated the main effects for measurement wave (with T1 as the reference category), intervention condition, and their interaction (wave by intervention condition). The models adjusted for baseline outcomes, gender, and grade, and included gender by intervention condition and grade by intervention condition interaction terms. Primary analyses combined KiVa and KiVa+ intervention groups vs. the control group.

Missing Data Strategy: There was no mention of missing data strategies used in the current study, but the multilevel estimation used in the analysis typically includes both participants with complete and incomplete data.

Intent-to-Treat: Based on the CONSORT diagram, it is unclear whether all participants with complete data were included in analyses, and if only those lost to follow-up were excluded.

Outcomes

Implementation Fidelity: Not reported.

Baseline Equivalence:

The authors presented baseline demographic characteristic rates (p. 630) and baseline outcome rates (Tables 1 and 2) for intervention group students and control group students that excluded the dropout schools. Although rates appeared similar and the authors stated that rates were comparable across conditions, no significance tests were conducted. However, the multilevel models for the analysis sample rather than the randomized sample showed no condition difference at baseline (indicated by the KiVa coefficient) on four outcome measures (see Table 3).

Differential Attrition:

Differential attrition analyses used a sample that included new students entering the schools and therefore did not isolate attrition among the original randomized students. For this sample, the overall student attrition rate of 8.4% at T5 and a difference in condition attrition rates of 1.5% met both the WWC cautious and optimistic standards. The same calculations for school attrition (3 of 67 in the intervention group and 1 of 33 in the control group) also met both WWC standards. In addition, as noted above, baseline equivalence of conditions for the analysis sample of students showed no differences for the outcomes (but did not examine sociodemographics). Still further, the use of FIML estimation may moderate potential bias from attrition.
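As a rough illustration of the school-level figures cited above, the sketch below computes overall and differential attrition from the counts of 3 of 67 intervention schools and 1 of 33 control schools. It does not reproduce the WWC thresholds themselves, and the function name is hypothetical.

```python
def attrition_summary(lost_a: int, n_a: int, lost_b: int, n_b: int):
    """Return (overall attrition, absolute difference in attrition rates)."""
    rate_a = lost_a / n_a
    rate_b = lost_b / n_b
    overall = (lost_a + lost_b) / (n_a + n_b)
    differential = abs(rate_a - rate_b)
    return overall, differential


# School counts from the text: intervention 3 of 67, control 1 of 33.
overall, differential = attrition_summary(3, 67, 1, 33)
print(f"overall school attrition: {overall:.1%}")
print(f"differential school attrition: {differential:.1%}")
```

Both quantities are low (about 4% overall and under 2 percentage points between conditions), which is consistent with the conclusion that school attrition met both WWC standards.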

Posttest:

Compared to control schools, intervention school students showed significantly greater reductions in global victimization, maximum victimization, global bullying, and maximum bullying scores from baseline to T3 and T4. Further, there were significant intervention effects on maximum victimization scores at T2 and maximum bullying frequency at T3. From baseline to T5 there were significant intervention effects on the maximum bullying score. Further, logistic regressions showed that at T5, intervention schools vs. control schools had significantly lower scores for maximum and global bullying frequency, maximum bullying occurrences, and maximum victimization frequency and occurrences.

Additional analyses showed no significant differences in outcomes by gender or grade, or between the KiVa and KiVa+ intervention conditions.

Long-Term:

Not examined.

Study 7

Summary

Axford et al. (2020) conducted a cluster randomized controlled trial with 21 elementary schools in Wales and 3,214 students. After assignment of schools to the intervention group or a waitlist control group, the study assessed bullying and behavioral and emotional problems at baseline and 12 months from baseline.

Axford et al. (2020) found no significant effects on any of the behavioral outcomes or risk and protective factors.

Evaluation Methodology

Design:

Recruitment: The study recruited 22 state-maintained primary schools in Wales. Recruitment occurred in the middle of the 2012/13 academic year. Participation was offered on a first-come-first-served basis to schools that attended a conference on the program and confirmed, in writing, their commitment to deliver the program and participate in the evaluation. Eligible students attending the 22 schools were in years 2-5 (U.S. grades 1-4) at baseline. A total of 3,480 students were recruited and 3,214 (92.4%) consented and completed the baseline assessment.

Assignment: Schools (clusters) were randomly allocated on a 1:1 basis to intervention and control conditions. Randomization was carried out by an independent registered trials unit after stratifying by size of school and proportion of children eligible for free school meals. The 11 intervention schools had 1,588 eligible students, and the 11 control schools had 1,892 eligible students. Of the eligible intervention students, 1,578 (99.4%) consented and completed the baseline assessment. Of the eligible control students, 1,636 (86.5%) consented and completed the baseline assessment - the low rate due in large part to one school dropping out before baseline. Control schools were asked to continue with their usual practices in line with their bullying policy.

Assessments/Attrition: Baseline assessments occurred in June/July 2013 for students in Years 2 to 5 (i.e., about to enter Years 3 to 6) and 12 months post-baseline (June/July 2014) for students coming to the end of Years 3 to 6. Two control schools withdrew during the first year - one before and one after baseline data collection. Retention rates were 82.5% for the consented students who completed the baseline assessment and 76.2% for all eligible students.
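The consent and retention percentages above follow from the reported counts. The sketch below recomputes them. The implied number of students retained at follow-up is derived here from the reported rates and is not a figure stated in the study; the variable names are illustrative.

```python
# Eligible and consenting students per condition, from the text.
eligible = {"intervention": 1588, "control": 1892}
consented = {"intervention": 1578, "control": 1636}

# Consent rate per condition.
consent_rates = {arm: consented[arm] / eligible[arm] for arm in eligible}
for arm, rate in consent_rates.items():
    print(arm, f"{rate:.1%}")

total_eligible = sum(eligible.values())
total_consented = sum(consented.values())
print("overall", f"{total_consented / total_eligible:.1%}")

# Retained counts implied by each reported retention rate (derived, not reported).
retained_from_consented = round(0.825 * total_consented)
retained_from_eligible = round(0.762 * total_eligible)
print(retained_from_consented, retained_from_eligible)
```

Both retention rates imply roughly the same number of students at follow-up, so the 82.5% and 76.2% figures are consistent with each other and differ only in the denominator used.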

Sample:

The sample was 45% male and averaged 8.9 years of age. The predominately white sample (68%) also included 3% Asian, 1% black, and 3% mixed. About 26% of the students were missing race information.

Measures:

The primary outcome was student self-reported victimization measured with the Bully/Victim Questionnaire, which is part of the KiVa online survey that the students complete. No monitoring of survey implementation was undertaken by the research team. Secondary outcomes included self-reported bullying perpetration, also measured with the Bully/Victim Questionnaire, six measures of teacher-reported behavior and emotional well-being using the Strengths and Difficulties Questionnaire, and a measure of half-day absences using school records. The authors noted that the "follow-up SDQs were completed by different teachers as students had moved to a different class." However, the teachers doing the follow-up rating helped deliver the program.

The study used well-validated measures and cited other studies but did not report reliability or validity information for the sample. The authors only noted (p. 623) that "there appeared to be variation in how the student survey was implemented, although its impact on results is unclear."

Analysis:

The analyses used Generalized Estimating Equations (GEEs) with information sandwich ("robust") estimates of standard errors assuming an exchangeable correlation structure. Linear, logistic, and Poisson GEE models were used depending on the measurement scale of the outcome. Analyses were adjusted for the following covariates: the baseline score for the outcome; the school-level variables of school size and free school meals eligibility at baseline; and child gender, age, special education needs status, and free school meals status.

All methods allowed for the correlation of outcomes within schools (clusters). However, the level-2 sample size of 21 may not be large enough to accurately estimate the standard errors, and the result may be to overstate the significance of the tests.

Missing Data Method: The analyses used multiple imputation to impute missing data for participants who consented and completed the baseline assessment.

Intent-to-Treat: The analysis excluded only the one control school and students without baseline data. All others were analyzed according to their randomized condition, irrespective of the level of intervention received.

Outcomes

Implementation Fidelity:

Self-completed teacher lesson records were missing for 58% of the sample but suggested that adherence was good where reported. For the available lesson records, teachers reported delivering 90% of lesson components on average. Regarding dosage, average lesson delivery times (mean of 79 minutes) were substantially less than the recommended 90 minutes. The mean total score for the school observation measure was 8.0 out of 14.

Baseline Equivalence:

Table 1 presents condition means for five sociodemographic measures and eight outcome measures but without significance tests or effect sizes. Some of the percentages for sociodemographic measures differed widely across conditions, due largely to more missing data in the control schools (e.g., 14.8% of intervention students versus 31.9% of control students were missing the race measure).

Differential Attrition:

The study did not test for differential attrition but used multiple imputation for missing data. The overall attrition rate and the condition differences in attrition rates showed mixed evidence of differential attrition bias. Based on the standards of the What Works Clearinghouse and using attrition from the consented sample with baseline data, attrition meets the optimistic but not the cautious standard.

Posttest:

The study found no significant effects of the intervention on the primary outcome measure of child-reported victimization or any of the secondary outcomes. Also, there was little evidence that the effect of the intervention on victimization differed by gender or between children who were and were not victimized at baseline.

Long-Term:

Not examined.
