About the Project
Bullying is a pervasive and serious problem in K-12 schools across the United States and around the world. Research consistently shows that students involved in bullying, whether as targets, perpetrators, or bystanders, suffer significant negative consequences for their mental health, academic performance, and long-term well-being. Early identification of students who are vulnerable to bullying is a critical step toward intervention and prevention.
Deerhold was engaged to design and develop a software solution that uses machine learning to assess vulnerability to bullying among K-12 students. The goal of the project was to create a tool that could help school administrators, counselors, and educators identify at-risk students before bullying incidents escalate, enabling timely and targeted interventions.
The Challenge
Identifying students who are vulnerable to bullying is inherently complex. Bullying behavior is influenced by a wide range of individual, social, and environmental factors, including a student's social network, their academic performance, demographic characteristics, and prior history of bullying involvement. Traditional approaches — such as surveys and self-reporting — are limited by their reliance on students' willingness and ability to accurately report their experiences.
The challenge for Deerhold was to design a machine learning-based system that could:
- Integrate and analyze multiple data sources — including academic records, attendance data, disciplinary records, and social network data — to build a comprehensive picture of each student's risk profile.
- Develop predictive models capable of identifying students at elevated risk of bullying victimization or perpetration with sufficient accuracy to be actionable.
- Present findings in a way that was interpretable and useful to school administrators and counselors — not just a black-box risk score, but actionable insights that could guide interventions.
- Protect student privacy and ensure that the system complied with applicable regulations, including FERPA and COPPA.
Our Software Development Approach
Deerhold's approach to this project combined deep expertise in machine learning and data science with a rigorous, human-centered design process. We began with an extensive literature review and consultation with education researchers and school psychologists to identify the key risk factors associated with bullying vulnerability.
Our data science team then designed a multi-stage machine learning pipeline with four stages:
- Data Ingestion and Preprocessing: Ingested data from multiple school information systems, normalized it to a consistent schema, and applied privacy-preserving techniques — including de-identification and differential privacy — to protect student information.
- Feature Engineering: Developed a rich set of features capturing individual student characteristics, social network topology (derived from classroom seating assignments and group project participation records), and temporal patterns in academic and behavioral data.
- Model Development: Trained and evaluated multiple machine learning models, including gradient boosting, random forests, and neural networks, on labeled training data derived from historical bullying incident reports, then selected the best-performing model based on cross-validated AUC-ROC and precision-recall metrics.
- Explainability: Applied SHAP (SHapley Additive exPlanations) to provide interpretable explanations for individual risk predictions, enabling counselors and administrators to understand which factors were driving a particular student's risk score.
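To make the explainability stage concrete, the sketch below computes exact Shapley values for a single prediction by enumerating every feature coalition. It is an illustration only: the `toy_risk` model, its coefficients, the feature names (`absences`, `prior_incident`, `peer_ties`), and the baseline student are all hypothetical and are not taken from the project. The SHAP library relies on much faster algorithms than this brute-force enumeration, which grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attributions for one prediction.

    A feature inside a coalition takes the student's actual value;
    a feature outside it takes the baseline value.
    """
    names = list(instance)
    n = len(names)

    def coalition_value(coalition):
        mixed = {k: instance[k] if k in coalition else baseline[k] for k in names}
        return predict(mixed)

    attributions = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight shared by all coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                gain = coalition_value(members | {name}) - coalition_value(members)
                total += weight * gain
        attributions[name] = total
    return attributions


def toy_risk(features):
    """Hypothetical risk model with one interaction term (illustration only)."""
    score = 0.05 + 0.03 * features["absences"] + 0.25 * features["prior_incident"]
    if features["peer_ties"] == 0:
        score += 0.15
        if features["absences"] > 5:
            score += 0.10  # isolation combined with frequent absences
    return score


# Made-up student and made-up "typical" baseline student.
student = {"absences": 8, "prior_incident": 1, "peer_ties": 0}
baseline = {"absences": 2, "prior_incident": 0, "peer_ties": 3}

phi = shapley_values(toy_risk, student, baseline)
for feature, contribution in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{feature}: {contribution:+.2f}")
```

The attributions always sum to the difference between the student's score and the baseline score, which is what lets a counselor read them as a breakdown of why one student's score sits above a typical one.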
The Software Solution
The result of this work was a web-based application designed for use by school administrators and counselors. The application provided:
- A risk dashboard displaying a list of students ranked by estimated bullying vulnerability, with color-coded risk tiers (high, medium, low) to enable efficient triage.
- Individual student profiles showing detailed risk factor breakdowns, historical trends, and SHAP-based explanations of the factors driving each student's risk score.
- A network visualization tool allowing counselors to explore the social graph of a classroom or grade level, identifying socially isolated students and potential clique dynamics that may contribute to bullying risk.
- An intervention recommendation engine that suggested evidence-based intervention strategies tailored to each student's specific risk profile.
- Robust access controls and audit logging to ensure that sensitive student data was accessible only to authorized personnel and that all access was logged for compliance purposes.
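As a rough illustration of how the network view could surface socially isolated students, the sketch below builds a co-participation graph from group project records and flags students with no recorded peer ties. The record layout, the student IDs, and the `max_peers` threshold are assumptions made for this example; the source does not describe the actual data model or isolation criteria.

```python
from collections import defaultdict
from itertools import combinations


def build_tie_counts(groups):
    """Count how many times each pair of students shared a project group."""
    ties = defaultdict(lambda: defaultdict(int))
    for members in groups:
        # De-duplicate and sort so each pair is counted once per group.
        for a, b in combinations(sorted(set(members)), 2):
            ties[a][b] += 1
            ties[b][a] += 1
    return ties


def peer_counts(roster, ties):
    """Number of distinct peers each student on the roster has worked with."""
    return {student: len(ties.get(student, {})) for student in roster}


def isolated_students(roster, ties, max_peers=0):
    """Students whose number of distinct peers is at or below the threshold."""
    counts = peer_counts(roster, ties)
    return sorted(s for s in roster if counts[s] <= max_peers)


# Hypothetical classroom roster and group project records.
roster = ["S01", "S02", "S03", "S04", "S05"]
groups = [
    ["S01", "S02", "S03"],
    ["S02", "S03"],
    ["S04"],  # worked alone
]

ties = build_tie_counts(groups)
degrees = peer_counts(roster, ties)
flagged = isolated_students(roster, ties)
print(degrees)
print(flagged)
```

A student who never appears in the records (here `S05`) is treated the same as one who only worked alone, which is why the roster, rather than the records, drives the loop.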
Software Development Outcomes
The bullying vulnerability assessment software developed by Deerhold delivered measurable outcomes for the schools and districts that piloted the system:
- Early Identification: The predictive model achieved an AUC-ROC of 0.82 on held-out test data, significantly outperforming baseline approaches and enabling counselors to identify at-risk students weeks before bullying incidents were reported.
- Actionable Insights: Counselors and administrators reported that the SHAP-based explanations and intervention recommendations made the tool genuinely useful in practice — not just a risk score, but a starting point for targeted conversations and interventions.
- Reduced Incident Rates: In pilot schools, reported bullying incidents declined by approximately 18% in the semester following deployment of the tool, compared to the same period in the prior year.
- Privacy Compliance: The solution passed legal and privacy reviews conducted by the school districts' legal counsel, confirming compliance with FERPA and applicable state student privacy laws.
- Scalability: The cloud-native architecture of the solution enabled it to scale to support multiple school districts simultaneously, with no significant degradation in performance.
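For readers unfamiliar with the metric, AUC-ROC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, so a value of 0.82 means the model ranks such pairs correctly about 82% of the time. The sketch below computes it directly from that definition; the labels and scores are made up for illustration and are not project data.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = incident later reported, 0 = no incident reported.
example_labels = [1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.2]
example_auc = auc_roc(example_labels, example_scores)
print(round(example_auc, 3))
```

The pairwise loop is quadratic in the number of students, so it suits small illustrations; production tooling typically uses a rank-based computation that gives the same value.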