Transcription

Using Predictive Modeling to Improve Outcomes for Children in Allegheny County

Key Partners

Research Team
- Rhema Vaithianathan, Auckland University of Technology
- Emily Putnam-Hornstein, USC
- Irene de Haan, University of Auckland
- Marianne Bitler, UC Irvine
- Tim Maloney, Auckland University of Technology
- Nan Jiang, Auckland University of Technology

Evaluators
- Process: Hornby-Zellar Associates
- Impact: Stanford University

Ethics
- Tim Dare, University of Auckland
- Eileen Gambrill, UC Berkeley

Technology Implementation
- Deloitte

Today: Using Integrated Data to Inform Decision-Making
In Allegheny County, rich data are available to case workers to help inform initial maltreatment screening decisions at the child protection hotline, but there is:
- No standardized protocol for using these data to make referral screening decisions
- No method for systematically weighting this information in an equitable manner across all referrals
- No understanding of what information is correlated with, or predicts, future adverse outcomes for children

Developing a Screening Score
- The screening score is from 1 to 20
- The higher the score, the higher the chance of the future event (e.g., abuse, placement, re-referral) according to the data

Researchers built a screening model based on information that we already collect.
They identified more than 100 factors that predict future referral or placement.
To test if the model might improve the accuracy of screening decisions, we scored thousands of historical maltreatment calls and then followed the children in subsequent referrals to see how often the model was correct.
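The validation idea above can be sketched in a few lines. This is a minimal illustration, not the researchers' method: the deck does not say how predicted risk is turned into a 1 to 20 score, so the sketch assumes equal-sized bins (ventiles) of predicted probability. The function names and the input layout are hypothetical.

```python
from bisect import bisect_right


def ventile_cutoffs(probabilities):
    """Return 19 cut points splitting the probabilities into 20 roughly equal groups.

    Assumption: scores are ventiles of predicted probability. The deck only
    states that the score runs from 1 to 20.
    """
    ordered = sorted(probabilities)
    n = len(ordered)
    return [ordered[(k * n) // 20] for k in range(1, 20)]


def to_score(probability, cutoffs):
    """Map one predicted probability to a score from 1 (lowest) to 20 (highest)."""
    return bisect_right(cutoffs, probability) + 1


def outcome_rate_by_score(scores, outcomes):
    """Share of historical calls at each score where the outcome later occurred.

    `outcomes` holds 1 if the child was later re-referred or placed, else 0.
    """
    totals, events = {}, {}
    for score, outcome in zip(scores, outcomes):
        totals[score] = totals.get(score, 0) + 1
        events[score] = events.get(score, 0) + outcome
    return {score: events[score] / totals[score] for score in sorted(totals)}
```

If the model is informative, the observed rate returned by `outcome_rate_by_score` should rise with the score, which is the pattern the results slides report.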

The Results: Re-Referrals

The Results: Out-of-Home Placements

The Results: Out-of-Home Placements

Under current practice:
- 27% of highest risk cases were screened out; of these, 1 in 3 are re-referred and placed within 2 years of the initial screened-out call
- 48% of lowest risk cases were screened in, and yet only 1.4% of those are placed within 2 years
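Figures of this kind come from cross-tabulating the historical screening decision against the later outcome within a score band. The sketch below shows one way to compute them. The field names (`score`, `screened_in`, `placed_within_2y`) are hypothetical, and because the deck does not say which scores count as "highest" or "lowest" risk, the band is passed in as a parameter.

```python
def screening_summary(calls, score_band):
    """Summarise decisions and later placement for calls scored within `score_band`.

    `calls` is a list of dicts with hypothetical keys:
    "score" (int), "screened_in" (bool), "placed_within_2y" (bool).
    """
    band = [c for c in calls if c["score"] in score_band]
    screened_out = [c for c in band if not c["screened_in"]]
    screened_in = [c for c in band if c["screened_in"]]

    def share(part, whole):
        # None signals "no calls to divide by" rather than a zero rate.
        return len(part) / len(whole) if whole else None

    return {
        "share_screened_out": share(screened_out, band),
        "share_screened_in": share(screened_in, band),
        "placed_given_screened_out": share(
            [c for c in screened_out if c["placed_within_2y"]], screened_out
        ),
        "placed_given_screened_in": share(
            [c for c in screened_in if c["placed_within_2y"]], screened_in
        ),
    }
```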

Children’s Hospital Validation
- Allegheny County entered into a research agreement with Children’s Hospital of Pittsburgh of UPMC in order to study relationships between the child welfare risk modeling and injury data.
- Child welfare referrals were matched with hospital event data (including emergency department visits and in-patient admissions) from February 3, 2002 to December 31, 2015.

Children’s Hospital Validation
Over a broad range of injury types there is a positive correlation between the scores [1] at call referral and the rate of hospital events.
[Figure 5: Self-inflicted Injury. x-axis: risk score category (0 to 20); y-axis: proportion of incidence (0 to .008)]
Note: age of children is restricted to between 7 and 17 for self-inflicted injuries.
[1] Maximum placement risk score ever received for each child in the referral data.
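The quantity plotted in the figure can be sketched as follows, assuming the referral and hospital data have already been matched on a shared child identifier. The identifiers and input layout are hypothetical. The only details taken from the slide are that each child is represented by the maximum score they ever received and that the y-axis is a proportion of children with a hospital event.

```python
from collections import defaultdict


def max_score_per_child(referrals):
    """Keep each child's highest score across all of their referrals.

    `referrals` is an iterable of (child_id, score) pairs.
    """
    best = {}
    for child_id, score in referrals:
        if child_id not in best or score > best[child_id]:
            best[child_id] = score
    return best


def incidence_by_score(best_scores, children_with_event):
    """Proportion of children at each score who had a matched hospital event.

    `children_with_event` is a set of child ids found in the hospital data.
    """
    totals = defaultdict(int)
    events = defaultdict(int)
    for child_id, score in best_scores.items():
        totals[score] += 1
        if child_id in children_with_event:
            events[score] += 1
    return {score: events[score] / totals[score] for score in sorted(totals)}
```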

Preparing for Implementation

Ethics Assessment
- Tool independently reviewed by ethicists from University of Auckland and UC Berkeley
- Concluded that there would be “significant ethical issues in not using the most accurate risk prediction measure.”
Among key opinions:
- The tool does not access any data that workers were not already able to utilize in decision-making
- It is likely more accurate and more transparent than existing decision-making processes
- The tool may reduce burdens of stigmatization by allowing for more effective targeting of services

Ethics Assessment
Among key opinions (cont.):
- Racial disparities are already present in data at many decision points, and continued vigilance will be required to avoid reinforcement of past biases. However, the writers note that:
  - The predicted designation of risk is designed to prompt further in-depth investigation into the family’s actual risk status; and
  - The resulting potential interventions are designed to assist families.
- Training and ongoing monitoring will be key to ensuring and maintaining effectiveness
- While identifying at-risk families more effectively, it is further ethically required that the eventual services offered are effective

Implementing and Evaluating Predictive Modeling
The Allegheny Family Screening Tool

Family Screening Tool Appearance

Monitoring Performance
- 7 months of data through end of February
- Frequent internal monitoring and support activities:
  - Bi-monthly leadership meetings with updated data analyses
  - Tool modifications, functionality fixes as needed
  - Auto-generated weekly support reports regarding “high scores” screened-out
  - Informal interviews with screeners, supervisors
  - Ongoing support activities for contracted process and impact evaluations
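A weekly report on high scores that were screened out amounts to a filter over the week's referrals. The sketch below is an assumption about what such a report could select, not the County's actual report. The field names are hypothetical, and since the deck does not state the cutoff for a “High” score, the threshold is a required parameter.

```python
from datetime import date, timedelta


def high_scores_screened_out(referrals, week_start, high_threshold):
    """List the week's referrals that scored high but were screened out.

    `referrals` is a list of dicts with hypothetical keys:
    "received" (datetime.date), "score" (int or None), "screened_in" (bool).
    `high_threshold` is an assumed cutoff supplied by the caller.
    """
    week_end = week_start + timedelta(days=7)
    return [
        r
        for r in referrals
        if week_start <= r["received"] < week_end
        and r["score"] is not None  # referrals with no score cannot be "high"
        and r["score"] >= high_threshold
        and not r["screened_in"]
    ]
```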

Early Scores Differed from Expectations
As data accrued and trends materialized, the first months of the tool yielded:
- More “No Scores” than expected, including a disproportionate impact on referrals involving newborns or other very young children
- Fewer “High” scores than expected
In response, the tool was altered to:
- Relax the tool’s requirement for a child to have a prior MCI (instead allowing for a score if any individual is known)
- Implement client-matching functionality to gather data from duplicate IDs
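The effect of relaxing the eligibility rule can be illustrated with a small sketch. This is one reading of the slide, not the tool's actual logic: it assumes the original rule required at least one child on the referral to be already known to the system, and that the relaxed rule accepts any known individual. The data layout and function names are hypothetical.

```python
def scoreable_before(referral, known_ids):
    """Assumed original rule: at least one child on the referral has a prior record."""
    return any(
        person["id"] in known_ids
        for person in referral["people"]
        if person["role"] == "child"
    )


def scoreable_after(referral, known_ids):
    """Assumed relaxed rule: any individual on the referral has a prior record."""
    return any(person["id"] in known_ids for person in referral["people"])


# A newborn has no history of their own, but a parent may already be known.
newborn_referral = {
    "people": [
        {"id": "child-1", "role": "child"},
        {"id": "parent-1", "role": "parent"},
    ]
}
known = {"parent-1"}
```

Under these assumptions `newborn_referral` gets no score under the first rule and a score under the second, which is consistent with the slide's note that “No Scores” fell disproportionately on newborns and very young children.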

November 29th Improvements
This tool modification went live on November 29th, and changed the relative prevalence of GPS scores in intended ways:
- The rate of “Mandatory” referrals roughly doubled, from 4% to 9%
- Referrals generating no scores dropped by roughly half
- “High” scores have become the most common score range, supplanting “Medium”

November 29th Improvements, cont.

Use of the Tool
- Since implementation, overall screening rates have remained stable relative to the same period in the prior year
- Generally, referrals with higher scores are being screened in more frequently
Referral data from 8/1/2016 through 3/6/2017

Score Demographics
- Racial disparities have been a monitoring priority at all stages of research and implementation.
- Race was not explicitly invoked in the algorithms, but the outputs of the tool nevertheless showed a tendency for black children to receive higher scores than white children. To date this has been borne out in practice as well.
- The impact evaluation will be assessing racial disparity in greater detail to see if the introduction of the tool made any positive or negative changes to biases at call screening.
Referral data from 8/1/2016 through 1/20/2017
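Monitoring of this kind usually starts from a per-group summary of the score distribution. The sketch below shows one such summary. It is an illustration only: the deck does not describe the County's monitoring calculations, the field names are hypothetical, and the cutoff for a “High” score is an assumed parameter.

```python
def score_summary_by_group(referrals, high_threshold):
    """Summarise scores per demographic group.

    `referrals` is a list of dicts with hypothetical keys "group" and "score".
    Referrals with no score are skipped. `high_threshold` is an assumed cutoff.
    """
    scores_by_group = {}
    for r in referrals:
        if r["score"] is None:
            continue
        scores_by_group.setdefault(r["group"], []).append(r["score"])

    return {
        group: {
            "n": len(scores),
            "mean_score": sum(scores) / len(scores),
            "share_high": sum(s >= high_threshold for s in scores) / len(scores),
        }
        for group, scores in scores_by_group.items()
    }
```

A gap in `mean_score` or `share_high` between groups describes the scores only. Whether screening decisions themselves became more or less disparate is the separate question the slide assigns to the impact evaluation.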

Impact Evaluation
The impact evaluation is underway, and will be focusing on:
- Accuracy of decisions
- Reduction in unwarranted variation in decision-making
- Reduction in disparities
- Overall referral rates and workload
Outcomes assessed will include:
- Overall rate of screen-ins
- Likelihood of screen-outs leading to re-referrals or other adverse outcomes
- Likelihood of screen-ins not being accepted for services
- Unwarranted variation in screening decisions
- Disparity in screening decisions

Process Evaluation Findings
- 82% felt “somewhat” or “very well” prepared to use the tool following the training.
- In the early weeks of the tool, 69% reported “occasionally,” “almost always,” or “always” consciously using the tool to inform recommendations.
- Some voiced objections to the tool illustrate the tension between immediate allegation and longer-term risk propensity:
  - “the Tool does not take the human element of judgment” into account;
  - “the score frequently has nothing to do with what is actually going on with the situation at hand”

Publication Releases
- Methodology Report (Spring 2017)
- Independent Ethics Review and County Ethics Response (Spring 2017)
- Frequently Asked Questions (Summer 2017)
- Process and Impact Evaluation Reports (TBD)

OPPORTUNITY #1: Improving Child Welfare Decision Making
OPPORTUNITY #2: Rethinking Prevention of Child Abuse & Neglect

How well do our child-serving systems choose the right child at the right time?

Not very well: 4 in 5 children in this county who died (or nearly died) as a result of abuse were never referred to child welfare before the incident.

Generating a “Needs” Score at Birth
- As soon as the birth is registered, we could assign a needs score between 1 and 20
- The score would predict a child protection case opening by age 3
- Vision would be to prioritize high-needs births for upstream early intervention support in the hopes of preventing the need for later child protection involvement

Generating a Score at Birth
Of the children who received a risk score of 20, 40% had an open case by age 3.

Opportunities for Prevention
- Offer voluntary services at the time of birth
- Use needs score to prioritize home visiting services through coordinated intake
- Use needs score to provide extra support to families who engage at a family support center
- Proactively reach out to high-risk families who live in a catchment area for family support centers
- Build needs score into screening at Children’s Hospital


Aug 01, 2017 · Using Predictive Modeling to Improve Outcomes For Children in Allegheny County