EPITalk: Behind the Paper
This stimulating podcast series from the Annals of Epidemiology takes you behind the scenes of groundbreaking articles recently published in the journal. Join Editor-in-Chief, Patrick Sullivan, and journal authors for thought-provoking conversations on the latest findings and developments in epidemiologic and methodologic research.
Predicting Opioid Overdoses Post-Prison Release
Dr. Prasad Patil joins us to discuss his article, “Using decision tree models and comprehensive statewide data to predict opioid overdoses following prison release,” published in the June 2024 (Vol. 94) issue of Annals of Epidemiology. In this study, the researchers employed decision tree models to predict opioid overdose within 90 days of release from Massachusetts state prisons and to identify the factors that were most influential in predicting opioid overdose.
Read the full article here:
https://www.sciencedirect.com/science/article/abs/pii/S1047279724000620
Episode Credits:
- Executive Producer: Sabrina Debas
- Technical Producer: Paula Burrows
- Annals of Epidemiology is published by Elsevier.
Hello, you're listening to EPITalk: Behind the Paper, a monthly podcast from the Annals of Epidemiology. I'm Patrick Sullivan, editor-in-chief of the journal, and in this series we take you behind the scenes of some of the latest epidemiologic research featured in our journal. Today we're talking with Dr. Prasad Patil about his article "Using Decision Tree Models and Comprehensive Statewide Data to Predict Opioid Overdoses Following Prison Release." You can find the full article online in the June 2024 issue of the journal at www.annalsofepidemiology.org. So let me introduce our guest today. Dr. Prasad Patil is an assistant professor of biostatistics at the Boston University School of Public Health. His research interests include machine learning applications in public health, reproducibility and replicability, and training prediction models under multi-study heterogeneity. Areas of application include tuberculosis genomics, air pollution source apportionment, opioid overdose prediction, which is what we'll be talking about today, and analyses of indices of well-being at various spatial resolutions. Dr. Patil, thank you so much for joining us today.
Prasad Patil:Thank you so much for having me and for highlighting this work. It's really an awesome opportunity.
Patrick Sullivan:Well, it's work that we were really excited about when we saw it, because the topic of opioid overdoses, I think, is such an important one in our time and in epidemiology, and then the methods that you used were also of great interest. So can you just start out by giving us a little background about the problem that you describe? Why is this issue important?
Prasad Patil:Sure. So our paper studies the risk of opioid overdose in the 90 days after release from incarceration. So we're specifically looking at a subpopulation of incarcerated individuals in the state of Massachusetts. This data was collected from 2015 to 2020.
Prasad Patil:And well, why the issue is important: I think, to us at least, I find this to be an intersection of two extremely vulnerable and stigmatized populations. So you have those suffering from opioid use and potential opioid overdose, and those who are in touch with the incarceration system, who have been previously incarcerated. Because of this sort of intersection, this is a subpopulation that is often not studied all that well, I would say, and the risk factors that affect individuals who are being released from incarceration can sometimes be different than those that are important to look at for a general population, even among those who are opioid users. So things like reduced tolerance, increased stress, and especially increased instability can greatly increase the risk of overdose post-release, and so we were specifically interested in trying to assess what these risk factors look like and what sorts of methods we can apply to what I'll describe later as a very unique data setting and data opportunity. But in general, it's very difficult to study this question due to a lack of really research-quality data for this group.
Patrick Sullivan:Great. So you mentioned a little bit about the purpose of the study, but can you walk us through your study design and, given these challenges with accessing data, you know how did you address that and develop these data sources to be able to carry out this research with this methodology?
Prasad Patil:Sure, yeah. I mean, I would say, and you know we can touch upon this later, but I think the thing we learned the most from this work is that it was extremely hard to apply the methods we were trying to apply here and really glean any actionable insights. That being said, the data set that we worked on is a fairly unique linkage data set that the state of Massachusetts has put together. It's called the Public Health Database, and they have kind of painstakingly and almost amazingly been able to link various administrative databases together. So we get identifiers for individuals who have been seen by various state services, and so our methodology had two phases here.
Prasad Patil:Although these databases are linked, they're not necessarily in a ready-to-analyze form. So the first phase was really trying to build what we called incarceration-overdose pairs. So we wanted to pair up, effectively, lengths of incarceration, and really the 90 days post-release from incarceration, with potential overdose events. And so in this PHD, the Public Health Database, there are many different data resources. What we looked at specifically were Department of Corrections records (this is where we got information about incarceration and other predictors), and we tried to link these with acute care, hospitalization, ambulance, and death records, to try to effectively match up the same individual across these databases and say: this individual was released at this time; in the 90 days post-release, do we observe an event in any of these other databases?
Prasad Patil:And so this in and of itself was a fairly difficult process. So we had to do a bunch of merges and transpositions to connect these things together. And, you know, you had individuals who were incarcerated multiple times, so we had to think about how to deal with the fact that we have repeat observations of the same individual. In the end we ended up treating these as independent records, which reduces the amount of information we're working with to some extent, because we can't really compare longitudinally here. But we built these records for each individual that tell you whether or not an event occurred in the 90 days post-release for that given incarceration stay. And then the second piece of this was we fit what are called decision trees (this is a machine learning algorithm) to the entire cohort that we had built, as well as race-stratified subsets, and I can expound upon why we did that.
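To make that pairing step concrete, here is a minimal sketch in Python of this kind of 90-day linkage. The tables, column names (person_id, release_date, event_date, overdose_90d), and values are hypothetical illustrations, not the PHD's actual schema or the authors' code.

```python
import pandas as pd

# Hypothetical stand-ins for two linked sources: one row per incarceration
# release, and one row per overdose-related event (e.g., ambulance or death record).
releases = pd.DataFrame({
    "person_id": [1, 1, 2],
    "release_date": pd.to_datetime(["2016-03-01", "2018-07-15", "2017-01-10"]),
})
events = pd.DataFrame({
    "person_id": [1, 2],
    "event_date": pd.to_datetime(["2016-04-20", "2019-06-01"]),
})

# Pair every release with that person's events, then flag whether any event
# fell inside the 90-day post-release window.
paired = releases.merge(events, on="person_id", how="left")
paired["overdose_90d"] = (paired["event_date"] > paired["release_date"]) & (
    paired["event_date"] <= paired["release_date"] + pd.Timedelta(days=90)
)

# Collapse back to one row per incarceration stay; repeat stays for the same
# person are kept as independent records, as described above.
pairs = paired.groupby(["person_id", "release_date"], as_index=False)["overdose_90d"].max()
print(pairs)
```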
Patrick Sullivan:Yeah, I think it may be helpful for people just to understand. I mean, we see this term, decision tree models, but how would you sort of explain that to, like, an earlier-career epidemiologist who's not so familiar with what a decision tree model does? And why was it a good choice in this particular study?
Prasad Patil:Yeah, absolutely. You know, even for myself, coming from a biostatistics background, I've had to learn a lot about machine learning before I sort of got into all of these applications.
Prasad Patil:But decision tree algorithms, the way these basically work, it's kind of a different way of thinking about trying to find out what variables are important for predicting a given outcome, and so their goal 100% is to try to predict something. So here we're trying to predict whether or not an event occurs after an individual is released from prison. What this algorithm does is you give it the entire data set, you give it all the variables that you've measured as potential predictors and it will go through each one and build a binary decision rule, right? So the decision rule, for example, let's say one of the variables we're including in the model is age, and let's say the range of ages in our data set is 18 to 65. It will go through every value of age that is represented in the data set and build a rule that says let's group our observations into those that are under 20 and over 20, under 21 and over 21, for every value going from 18 to 65.
Prasad Patil:And with this rule it'll then see: how well have I separated this thing that I'm trying to predict? So here I'm trying to predict whether or not an overdose event occurs. If my rule is, everyone less than 20 go to the left, everyone over 20 go to the right, have I separated out the overdoses from the non-overdoses? And you'll have sort of what you call a loss function to measure whether or not you've done that well, and you'll do that for every possible rule you can build for every variable, and pick the one rule that does the separation the best.
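As a rough illustration of that exhaustive search, here is a short sketch for a single predictor, assuming a Gini impurity loss (one common choice; the interview does not specify which loss the study used). All data here are simulated.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a set of 0/1 labels (0 = pure, 0.5 = fully mixed)."""
    if len(labels) == 0:
        return 0.0
    p = labels.mean()
    return 2 * p * (1 - p)

def best_split(x, y):
    """Try every observed cutpoint of one predictor; return the cut whose
    weighted child impurity (the loss) is lowest."""
    best_cut, best_loss = None, np.inf
    for cut in np.unique(x):
        left, right = y[x <= cut], y[x > cut]
        loss = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if loss < best_loss:
            best_cut, best_loss = cut, loss
    return best_cut, best_loss

# Simulated example: age 18-65 as the predictor, overdose (0/1) as the outcome,
# with a noisy threshold effect below age 30.
rng = np.random.default_rng(0)
age = rng.integers(18, 66, size=200)
overdose = ((age < 30) ^ (rng.random(200) < 0.1)).astype(int)
print(best_split(age, overdose))
```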
Patrick Sullivan:The model really looks at all those, optimizes the cut point, for example for continuous variables. You're saying it'll check each possible cut point and see what explains the biggest amount of variance, essentially.
Prasad Patil:Exactly right, or what separates the thing you're trying to classify, in this case. Yeah, and so then, the way this algorithm works is what's called recursive. So now, once you've split the data set into two pieces with this initial rule that you found to be the best, you do it again on each of the two pieces, right? And you continue to split until you've reached some predefined endpoint for this. And so, for those who have a background in regression modeling and things like that, first of all you can see how this is quite different in terms of a data approach, and you can also start to see why some of these algorithms are a little harder to interpret and are a little bit more greedy about trying to find the best possible predictive option.
Prasad Patil:So what you end up with is called a decision tree, because it kind of looks like a tree, right? It starts with a single rule, and then the next level has two rules, and the next level goes on with these binary rules until you end up at some endpoint. And so, for a new observation, you would check where it falls on either side of each rule, right? So for a new observation, let's say they're over 20 years old. So we go to the right in our tree. Then our next rule, let's say, checks how long their length of incarceration was, and you make a decision going left or right, and you continue cascading down the tree until you end up at some endpoint that assigns a prediction for that person. And that endpoint will be either mostly overdose individuals or mostly non-overdose individuals, and that'll determine what your eventual prediction is.
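For readers who want to see the whole fit-then-cascade workflow end to end, here is a brief sketch using scikit-learn's DecisionTreeClassifier on simulated data. The feature names and the depth limit (standing in for "some predefined endpoint") are illustrative assumptions, not the study's actual specification.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Simulated cohort with two of the kinds of predictors discussed.
rng = np.random.default_rng(1)
X = np.column_stack([
    rng.integers(18, 66, size=500),    # age at release
    rng.integers(30, 2000, size=500),  # days at most recent facility
])
y = rng.random(500) < 0.05             # rare outcome, roughly 5% as in the cohort

# max_depth is the "predefined endpoint" that stops the recursive splitting.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The printed rules read exactly like the cascade described above: go left or
# right at each binary rule until you reach a terminal node.
print(export_text(tree, feature_names=["age", "days_at_last_facility"]))
print(tree.predict([[25, 120]]))       # prediction for one new release
```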
Prasad Patil:But again I want to emphasize that the goal is to predict, and so then going back to this and figuring out how things are associated, what interactions look like and things like that is a bit of a challenge with this type of method.
Patrick Sullivan:So, given that goal of prediction, what were some of the main findings and key takeaways from the analysis that you did?
Prasad Patil:Sure.
Prasad Patil:So, to describe the data a little bit, our final data set had about 5% overdoses in it. So we had 14,000 or so observations, and for about 5% of those we were actually able to measure an overdose for that individual in the data.
Prasad Patil:That doesn't necessarily mean that other people didn't have one, it's just that we weren't able to measure it based on what we had. And we fit this decision tree algorithm to the entire data set and found that it exhibited pretty good sensitivity.
Prasad Patil:We did something called case weighting to try to prioritize predicting overdoses over non-overdoses, because we have so few in relation, and we were trying to increase the accuracy and the sensitivity of this method, and the sensitivity overall was pretty good. But we found that this was mostly driven by white, non-Hispanic individuals. They made up the majority of the data set and they made up the majority of the overdoses, and so other individuals were being put into this bucket of no overdose, which we knew was not true. For Black individuals, for Hispanic individuals, for Asian individuals, even in our data set, of which we had a few, there were some overdoses that were not being picked up by this method. And so we fit these race-stratified models to try to understand: does the picture look different if we try to fit a model specifically within these subgroups rather than overall? And we found some risk factors for Black individuals and for Hispanic individuals that looked different than this overall model.
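As a rough sketch of the case weighting and race stratification just described, the example below uses scikit-learn's class_weight option and separate per-group fits on simulated data. The data frame, variable names, and the "balanced" weighting choice are assumptions for illustration, not the authors' exact procedure.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for the analytic cohort.
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "age": rng.integers(18, 66, size=1000),
    "days_at_last_facility": rng.integers(30, 2000, size=1000),
    "race_ethnicity": rng.choice(["White non-Hispanic", "Black", "Hispanic"], size=1000),
    "overdose_90d": rng.random(1000) < 0.05,  # rare outcome, roughly 5%
})
features = ["age", "days_at_last_facility"]

# Case weighting: "balanced" up-weights the rare overdose class so the tree
# is not rewarded for simply predicting "no overdose" for everyone.
overall = DecisionTreeClassifier(class_weight="balanced", max_depth=4, random_state=0)
overall.fit(df[features], df["overdose_90d"])

# Race-stratified models: the same specification fit within each subgroup.
stratified = {
    group: DecisionTreeClassifier(class_weight="balanced", max_depth=4, random_state=0)
        .fit(subset[features], subset["overdose_90d"])
    for group, subset in df.groupby("race_ethnicity")
}
```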
Prasad Patil:So we found the overall one was not working very well, but these metrics were more balanced when we broke it up, stratified in that manner. And overall, across a number of these models, although none of them performed particularly well in terms of their accuracy, I would say, they all found that spending a longer time at the most recent facility was associated with a decreased risk of overdose, and involuntary commitment was associated with an increased risk of overdose.
Patrick Sullivan:So when you say time at the facilities, this is more time at the last sort of incarceration facility.
Prasad Patil:Yeah, that's right. So this was something we found out later in the analysis. So initially this variable was coded as length of stay, effectively, and so we thought that meant the length of their term. What we came to find out after some discussion with the Department of Corrections is that this variable coded the length of stay in the most proximal facility. For those who aren't familiar, people are often moved around from institution to institution within the system, and so this variable just captures how long you were at the last place that you were imprisoned.
Patrick Sullivan:So might that be a marker, shorter duration at the last facility, for individuals who may have complex behavioral or medical problems and get moved around for the purpose of managing those? I mean, is it really about the duration, or do you think that might be a marker for, like, what does it mean to be moved frequently in the system? What is that confounded with? How do you interpret that finding? That's what's going on in my head: what are the characteristics of people who are moved more or less frequently?
Prasad Patil:Right. No, it's absolutely complicated, and I don't think there's any overarching characteristic that would define these people. So, for example, it could be a short stay because it's a short sentence, right? It could just be that you were put in for something that is associated with a short sentence, and then you're let out. Or it could be that you're put in a holding facility and then you're moved to a different facility, depending upon what's going on with your case, or something along those lines. And I have to say I don't want to go too far into this because this is not my expertise; I'm more on the methods side. I definitely don't want to say something wrong, but this is how I understand it.
Patrick Sullivan:But I think, I mean, from a perspective as epidemiologists, I think sometimes the important thing is to identify our findings as hypothesis generating and then hand it over to the folks who know more deeply what that would mean. I think, like, a conversation with folks in the correctional system to say, what are these things associated with? I mean, I think it's just an interesting conversation, and I really appreciate you delineating our roles as epidemiologists and how far our knowledge goes, because some of these are really deeply idiosyncratic questions about how the correctional system works. So this leads back and should raise those questions.
Prasad Patil:For sure, and I think the expertise in our group centered around this notion of instability, which I had mentioned before.
Prasad Patil:Part of what we had been doing in this project as a whole is we actually did a bunch of lit review on identified risk factors for opioid overdose post-incarceration, and we did community outreach. So folks from our research team ran focus groups and showed people who have been in the system or who work in the system, who are social services workers, some of the risk factors we'd identified, and asked them to fill in the gaps and tell us what seems relevant and what seems irrelevant. And they spoke a lot more about more abstract things: instability was a big one, fear, stigma, and how much these things influenced the desire to use again or the risk of overdose. And so part of that made us want to link, you know, moving around, shorter terms, being in and out of incarceration, with this kind of overarching principle of instability.
Patrick Sullivan:It is kind of always just fascinating to me that when you talk about, like, the complex web of causality and these constructs like fear and instability and vulnerability, it's sometimes amazing to me that we find signals at all; we have some pretty crude, pretty distal measures of what the actual levers of change are, and there's a whole other process that sits behind that. So I think about the continuum from the empirical findings through all the other kinds of pieces that you described, focus groups, individual in-depth interviews, expert interviews, to try and figure out what sits behind that. But sometimes it does surprise me that we get strong, clear signals of things that, when you unpack them, are a very complex set of social determinants.
Prasad Patil:Absolutely. And I think, I mean, you know, if you look at this finding on its face, right, basically a rule that was very common across a lot of these decision trees we fit was that if you had a longer length of stay, you were predicted to not have an overdose. Now, if you want to take a very simplistic view on what you should do based on that information, you might conclude that you should assign longer sentences. Right? And, of course, that is incorrect, and it is why we need to partner with people who understand the actual situation and who can actually provide insight on what these different things mean, before we jump to what, on their face, seem like useful conclusions. Right?
Patrick Sullivan:Yeah, so can you sort of talk about, we've gotten into this a little bit, but what do you see as some of the main strengths and some of the important limitations of your study? You talked about them some in the paper, but can you just sort of recap for us what's strong about this method and what are some of the limitations folks need to consider?
Prasad Patil:Yeah, I think the greatest strength of this work was really the ability to work with this sort of state-curated data warehouse, and again, I want to highlight that it's, I think, unique in the country.
Prasad Patil:I don't know many, if any, states that have linked together these types of databases at this level yet, and just the fact that we were able to conduct this analysis, I think, is worth talking about, right? And it's worth highlighting to other state agencies to say that these kinds of things are possible if we start to curate our resources and link them together.
Patrick Sullivan:And shout out to Massachusetts for organizing this, because a lot of states don't. I think it often takes state resources, investment of state resources, to do this and I think in a lot of states there's not the priority given to this. So props to Massachusetts for sure.
Prasad Patil:Absolutely. And I think the other big thing is here we have some semblance of quantitative information that kind of backs some of the, I would say, more qualitative findings in this field previously. So, like I described, a lot of the risk factors come from smaller studies, they come from talking with the community, and so now we wanted to try to supplement that with some of this algorithmic modeling to say like, well, what happens when we actually try to predict something like overdose? What do we find? How does that agree with or disagree with the existing findings? And I think it adds to that conversation to say that you can fit these types of models, these are the associations or the predictions that we're sort of getting, and it kind of shows that there is some efficacy of applying these machine learning types of algorithms to this problem.
Prasad Patil:I think in terms of weaknesses or limitations, well, you know, the models are not good enough to use. We don't understand them well enough, and they don't have prediction metrics that would say, let's apply this to new individuals to predict their risk of overdose. They're not nearly at that point. I mentioned the kind of crude quantification of overdose: we only have those who we can see. So there are a lot of people who have likely overdosed but have not come in contact with the state services, and so we are probably grossly underestimating the overdoses that actually occurred. And there were a number of computational limitations. One of the reasons we used decision trees was because it was the most sophisticated ML technique we were able to apply in this environment. You know, we had to actually go to the state department to run code, or send code to our liaisons at the state department, because all this data is kind of under lock and key. So the process was pretty challenging, and we weren't really able to do something super sophisticated.
Patrick Sullivan:So I want to turn now to a part of the podcast we call Behind the Paper, and it's really just to try, you know, especially for people who are earlier in their career and who see this kind of published work, to help us think about how we actually are able to do this work as humans, you know, as people. So I wonder what the biggest challenge you faced was. It's often getting funding, but in the conduct of the research, what did you find challenging, and how did you overcome that?
Prasad Patil:Yeah, I think, I mean, for me, again coming from a more methods standpoint, it was the data quality and the computational limitations that we faced in trying to work on this problem. I would say we probably didn't really overcome these.
Prasad Patil:We did the best that we could under the, you know, data conditions that we were able to work in. But I think if you read the paper you'll see that, and like I mentioned, these are not actionable findings yet. They're telling us something, but if you compare the application of machine learning methods in other facets, even, of medicine to what we're trying to do here, we're not even close, really. And I think it speaks to this more general issue: this is already a population that's difficult to measure and difficult to survey, and so the data that we get from them are not usually of very high quality for trying to do these analyses. So there's already a gap, and then we can't apply very nice and fancy methods to these data, and I feel like we can't learn as much as we can in other settings, necessarily.
Patrick Sullivan:Yeah, all of our sort of Epi 101 things about misclassification are just baked into the characteristics of these data, it seems like. And there are reasons, I think, for systems not to have the data, and also for individuals; sometimes there are disincentives or incentives to reporting, you know, behavior. So there are all kinds of input issues.
Patrick Sullivan:I do wonder, you've sort of talked about this method, but if there's a listener who wants to learn more about, like, the field of decision tree models, are there any resources or sort of papers or websites? How would someone take a first step into this? Where would you send them to look?
Prasad Patil:The great thing is that almost every resource under the sun on machine learning is available. It's, like, a very popular topic lately, so I mean, you can start off on YouTube or something like that and just look at a few videos to get familiar with what these methods are doing. I think it's not so much about access, right? If you're a student, your school, your university probably has a course, or you have open courseware and things like that that are really quite rich in detail these days. It's more, to me, a question of what it is you want to learn. Do you want to learn how to apply these methods? Do you want to learn how they work?
Prasad Patil:It really depends on what level you want to enter in. I think there's everything from the very theoretical to the very applied. My personal opinion is it's really worth seeking out material where they provide you some semblance of detail of what the algorithm is actually doing, kind of like we talked through what decision trees try to do, because it's very easy to apply these things and it's not so easy to understand what they're doing and what it means. So I think my first question to those who want to learn would be: why do you want to learn? What do you want to do?
Patrick Sullivan:touch on the idea of algorithmic equity. So what is algorithmic equity and why is it important?
Prasad Patil:So as far as I know, this is not a very well-defined term; this is something that I've been thinking about as I worked on this project, and again, to give some background: algorithmic bias is a huge topic of interest in the machine learning world, and folks may have heard about these very, you know, disturbing cases where people have deployed machine learning or artificial intelligence algorithms on data sets where structural biases exist, and those things then get propagated. So, for example, again in this world of incarceration, there was an algorithm that was trained to try to help with prison sentencing, and the goal was basically to help judges decide, you know, what sort of sentence should be assigned given historical information, and that historical information has racial biases in it. Effectively, individuals who were non-white were given longer sentences, even if all their other characteristics were exactly the same as a white individual's, and so the algorithm picked up that pattern and propagated that issue, right? And so there are a lot of people working on, you know, what we do to take these complicated algorithms and reduce the risk of bias, right, to try to hide potentially risky information from these algorithms and try to make them less biased on these sort of societal levels. What I think about as algorithmic equity is something slightly different, which is kind of what I was describing before.
Prasad Patil:So here we're working on opioid overdose in the incarcerated population. This is a vulnerable population that's understudied. There's already a big gap, right, in what they need, and not enough work is being done in this realm. Add to that the fact that the data is really complicated, and it's not a very attractive place for algorithmic innovation either. Right, so we used decision trees here for many reasons, but this is a very old, classical method. There's a lot better stuff out there now that we wish we could have used here, but it's simply not suited for this problem. And so my question really is: how do we get people who are working in methods and working on algorithms to try to improve what we can do for these difficult data settings, and not so much, you know, improve on algorithms that already work really well on rich and clean data sets?
Patrick Sullivan:Yeah, I think this idea that the kind of selection biases or the confounding that occurs in the data sources may be picked up by the algorithms as signal, and given the history of racial inequity in the criminal justice system, this is a special problem.
Patrick Sullivan:So thank you for calling it out, and I think that naming it is often helpful. Sure, yeah, so I'm going to make a hard turn here to one last question. But I'm just very interested in how, as professionals, we navigate our careers and how we make these transitions from being in educational settings, to sometimes being in postdoctoral settings, to being in faculty settings. So I wonder if you could give your younger scientific self, you know, one piece of advice. Thinking back to a point in your training or in your postdoctoral preparation that was a challenging point for you: what insight do you have, being able to do this work now, that you might feed back to that earlier you, at an earlier point in your career, about, you know, what's been helpful, or what seemed hard that didn't matter? Or just, sort of, what encouragement would you give to your younger career self?
Prasad Patil:That's a great question.
Prasad Patil:I think what would be relevant to me back then, and even probably now, is maybe to not be so shy and to try to make connections with other people and really sort of seek that out.
Prasad Patil:I think when I was doing my PhD work, and other PhD students can probably relate, you get very siloed into what you are focused on. And, at least for me, I convinced myself that, you know, I work in a meritocracy: if I do really good work, it'll be recognized, and so I should really just sit here in my room and focus on my work. You know, I work on, like, math; I work on theorems and things like that.
Prasad Patil:But I think, like any other field, it's really important to get to know other people, and I think that's important for science. I think you can really only do so much by yourself, cooped up, and, you know, getting to know others fosters collaboration. It gives you more opportunities to present your work and to let other people know about what you're working on. And so I would tell myself to try to take those opportunities as often as I can, and try not to let what I'm doing impede me from learning about what others are doing, and trying to talk about it with other people.
Patrick Sullivan:What an insightful message. One of my favorite axioms is that the best work is done across disciplines that seem very disparate; the interesting stuff is always at the intersection. So thanks for sharing. These questions are always a little bit vulnerable and feel a little bit on the spot, but I appreciate you sharing that. It's been so nice talking to you. Thank you for your focus on this particular area, and especially for focusing on the health of a really vulnerable population, and for sharing your methods. It was great to have you on the podcast, and thanks again for bringing this work to Annals of Epidemiology.
Prasad Patil:It's my pleasure. Thanks so much for having me.
Patrick Sullivan:I'm your host, Patrick Sullivan. Thanks for tuning in to this episode, and see you next time on EPITalk, brought to you by Annals of Epidemiology, the official journal of the American College of Epidemiology. For a transcript of this podcast, or to read the article featured on this episode and more from the journal, you can visit us online at www.annalsofepidemiology.org.