Season 2 – Episode 24: BD-STEP: A NCI & VA Training Program for Data Scientists

In this episode, we highlight the Big Data Scientist Training Enhancement Program (BD-STEP), a collaboration between the National Cancer Institute (NCI) and the Veterans Health Administration (VHA). We hear from Dr. Michelle Berny-Lang, Director of BD-STEP for the NCI Center for Strategic Scientific Initiatives; Dr. Frank Meng, National Director of BD-STEP for the Department of Veterans Affairs; and Dr. Ted Feldman, Data Scientist, former BD-STEP fellow, and current mentor at the Cooperative Studies Program Informatics Center, Massachusetts Veterans Epidemiology Research and Information Center, Department of Veterans Affairs. The episode discusses the program's goals, structure, and benefits, and the importance of AI in healthcare data science. The guests also give advice to those interested in pursuing a career in data science.

We want to hear from you!

Share your thoughts on the podcast and help us shape its future. Your feedback is invaluable in ensuring we deliver the best possible content for aspiring cancer researchers: https://www.surveymonkey.com/r/NCIICCPodcast

 Listen and Subscribe to Inside Cancer Careers

Episode Guests

Michelle Berny-Lang, PhD

Dr. Michelle Berny-Lang serves as a Program Director in the Center for Strategic Scientific Initiatives (CSSI) at the NCI. In this capacity, she participates in the organization, development, and evaluation of CSSI initiatives, working to facilitate multidisciplinary collaborations among researchers, NCI and NIH colleagues, and external partners.

With the National Institute of Biomedical Imaging and Bioengineering (NIBIB), she coordinates a program in Synthetic Biology and Cancer to bring innovative technology approaches to challenges in cancer research. She is NCI’s lead for the Big Data Scientist Training Enhancement Program (BD-STEP), a collaboration with the Department of Veterans Affairs (VA), which immerses data science fellows into the VA healthcare system to advance research and patient care. Within NCI, she co-chairs the trans-NCI Advisory Committee to CSSI (TACTIC).

Before joining the NCI, Dr. Berny-Lang was a postdoctoral fellow at Harvard Medical School, Boston Children’s Hospital, investigating novel cellular pathways in clinical disorders such as acute coronary syndrome and sickle cell disease. She received her Ph.D. in biomedical engineering from Oregon Health & Science University, with research focused on mechanisms of thrombosis. She holds a bachelor’s degree in bioengineering from Oregon State University.

Frank Meng, PhD

Dr. Frank Meng is the National Director of the Big Data Scientist Training Enhancement Program (BD-STEP), a postdoctoral and post-Master’s fellowship program in the Department of Veterans Affairs (VA). He is also Associate Director for the Data Science Core at the VA Boston Cooperative Studies Program Coordinating Center and Assistant Professor of General Internal Medicine at the Boston University Chobanian and Avedisian School of Medicine. Dr. Meng received his SB from MIT and his Ph.D. from UCLA, both in Computer Science. His research interests include natural language processing of clinical documents and AI validation/safety in healthcare.

Ted Feldman, PhD

Dr. Ted Feldman is a former BD-STEP fellow and current mentor, as well as a Data Scientist at the Cooperative Studies Program Informatics Center within the VA Boston Healthcare System. Previously, he served as a Lecturer in Biomedical Informatics at Harvard Medical School. He received his PhD in Applied Physics from Harvard University.

Show Notes

Ad: The Worta McCaskill-Stevens Career Development Award for Community Oncology and Prevention Research (K12) 

Your Turn Recommendations: 

Episode Transcript

Oliver Bogler: 
Hello and welcome to Inside Cancer Careers, a podcast from the National Cancer Institute where we explore all the different ways people fight cancer and hear their stories. I'm your host, Oliver Bogler from NCI’s Center for Cancer Training. 

Today, we're talking about an innovative training program using data science to advance cancer research and patient care, which is a collaboration between the NCI and the Veterans Health Administration or VHA.

Listen through to the end of the show to hear our guests make some interesting recommendations and where we invite you to take your turn. And of course, we're always glad to get your feedback on what you hear and suggestions on what you might like us to cover. The show's email is NCIICC@nih.gov. 

So it's a pleasure to welcome Dr. Michelle Berny-Lang, the Director of the Big Data Scientist Training Enhancement Program, BD-STEP, for NCI's Center for Strategic Scientific Initiatives. Welcome.

Michelle Berny-Lang: 
Hello, thank you, it's great to be here.

Oliver Bogler: 
Welcome also to Dr. Frank Meng, National Director for BD-STEP from the Department of Veterans Affairs.

Frank Meng: 
Thanks Oliver, thanks for having me.

Oliver Bogler: 
Also a warm welcome to Dr. Ted Feldman, BD-STEP grad and now mentor and data scientist in the Cooperative Studies Program Informatics Center in the Massachusetts Veterans Epidemiology Research and Information Center, part of the VA as well. Welcome.

Ted Feldman: 
Thank you Oliver, great to be here and thank you for the invite.

Oliver Bogler: 
The BD-STEP program is a unique partnership between the VA and the NCI. Michelle and Frank, as leaders of the program, and Ted as an alum, what is the program and what are its goals?

Michelle Berny-Lang: 
Frank, you maybe want to start with what it looks like now and then maybe can pepper in some history?

Frank Meng: 
Sure. So BD-STEP historically was a postdoctoral fellowship program. And currently it's funded by the VA, specifically the Office of Academic Affiliations or OAA. And so we try to bring in postdocs from various backgrounds to come in and train in the VA, on the VA's data, and also within the VA's healthcare system. And one of the main goals is that we can retain some of these fellows to be permanent data scientists within the VA because we feel there is quite a shortage of the skill set within the workforce. Recently, we've also included besides postdoctoral fellows, also post-masters as well.

Michelle Berny-Lang: 
Yeah, and maybe I can take us back a tiny bit to the history of how this got started. So, looking back almost 10 years at this point, back to 2015, just as Frank said, there was the recognition then that we still very much have now of the need for more data experts across biomedicine and in cancer research.

And so we had been doing some collaborations with the VA and thought, you know, maybe that would be a powerful carrot to get individuals access to VA data, allow them to engage in this, you know, really exciting test bed and allow them to conduct impactful research within the healthcare system, hoping that maybe this would hook some of these individuals that were data savvy or had quantitative skill sets and keep them retained in biomedical research careers, which I think we'll be able to share, we've definitely done some of that.

Oliver Bogler: 
And Ted, as an alum of the program, what was your experience? 

Ted Feldman: 
Yeah, well, I think I sort of owe my career to being a BD-STEP fellow. So it very much shaped my path in deciding where to go after my fellowship and what I can do and really molded much of what I think about data science and really what I know of medical informatics as a subspecialty of the field really.

Oliver Bogler: 
So Frank, you mentioned the data resources at the VA that are kind of the centerpiece or one of the centerpieces of the BD-STEP program. What are those? What's the scale and the scope of these data resources and why are they so valuable as both for cancer research and as an opportunity to train?

Frank Meng: 
Yeah, that's a great question. As the audience may or may not know, the VA is actually the only national health care network in the US. There are other large health care systems, but the only one that spans across the nation is the VA health care system. And so the VA has been collecting electronic data for decades. And so one of the unique features of what the VA has in terms of data resources is something called the Corporate Data Warehouse, or CDW, which is a centralized repository of all the EHR data, electronic health record data, of the patients that get their care within the VA. And this spans decades of time, but also spans millions to tens of millions of patients over the course of time. Of course, currently, at any time active, there's probably like a million or so, few million patients.

And so this includes various things like ICD codes, visit information, and also documentation, free text documents on a lot of the patient's lab results. So it includes many structured data elements and also the unstructured in terms of the documents. The one- Yeah.

Oliver Bogler: 
Just for our audience, a couple of things. The ICD codes, what is that exactly?

Frank Meng: 
Yeah, so the ICD codes are usually used to indicate what the patients and diseases they have, the morbidities that they have, and also various other things about the patient. So traditionally it's been used for insurance billing purposes, but they also found uses in just record keeping and bookkeeping since the VA is both provider and insurer in a sense, but this data is kept as a record for what kind of care was provided to the patient.

Oliver Bogler: 
Yeah, it's the closest thing we have in the States to the systems in the Nordic countries, for example, right? Where there's sort of a national health care that encompasses the patient for their entire life. Frank, you also mentioned structured and unstructured data. Just please tell us the difference there.

Frank Meng: 
Sure, so structured data is mostly data that has a specific kind of form or definition, a format that is predefined. And so that kind of data could be numeric, like numbers, for instance, a patient's weight, body mass index, or blood pressure, various things like that are usually numeric in nature. And so those are stored as numbers within the database and various other things like more nominal, very categorical data sets like perhaps ethnicity, gender would be examples of categorical data. And so those are predefined, somewhat understood, and those are more easily processed by data analysis software and computational systems. And that's what we define as structured. 

Unstructured, we would view as data that's less defined in a sense of what it looks like. And so, documents would be a great example of that and imaging would be another great example of that. So documents you have, usually, because we're in America, it's written in English. So anything can be in there, but it takes several more levels or layers of analysis to really understand what's going on. So given a number that's embedded in a document, we don't have any idea whether it's a BMI, it's a weight, it's blood pressure, unless we look at the context that's surrounding it. So there's another layer of analysis.
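
As an illustration of the distinction described here, the following is a minimal Python sketch, not anything taken from the VA's systems. The record fields, the sample note, and the patterns are all hypothetical and invented for this example. It shows the same three values stored once as structured fields and once inside free text, where each number only becomes meaningful through the words around it.

```python
import re

# Structured record: each value sits in a predefined, named field.
# Field names are hypothetical, chosen only for this illustration.
structured_record = {"weight_kg": 82.5, "bmi": 27.1, "systolic_bp": 128}

# Unstructured note: invented free text carrying the same values.
note = "Pt seen today. Weight 82.5 kg, BMI 27.1. BP 128/84, stable."

# Hypothetical context patterns: the surrounding words tell us
# whether a number is a weight, a BMI, or a blood pressure.
PATTERNS = {
    "weight_kg": re.compile(r"weight\s+(\d+(?:\.\d+)?)\s*kg", re.IGNORECASE),
    "bmi": re.compile(r"bmi\s+(\d+(?:\.\d+)?)", re.IGNORECASE),
    "systolic_bp": re.compile(r"bp\s+(\d+)/\d+", re.IGNORECASE),
}


def extract_from_note(text):
    """Return the values whose surrounding context matches a known pattern."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = float(match.group(1))
    return found


extracted = extract_from_note(note)
print(extracted)
```

Real clinical notes vary far more than this one sentence, which is why this extra layer of analysis is usually handled with natural language processing rather than a handful of fixed patterns.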

Oliver Bogler: 
So these are like the notes that people put into the system when they meet with a patient, like a physician's note or a nurse's note like that, right? 

Michelle, so there's this amazing data set that's just phenomenal. What is the reason from the NCI's perspective for this partnership? What are the goals of the BD-STEP program?

Michelle Berny-Lang:
Yeah, that's a great question. I'll give a little fun fact here in that NCI was actually the first investor in the BD-STEP program. We had an interagency agreement to do the first seed funding to get this launched and get, I think, VA really enthusiastic about the program.

So from our perspective, we see a tremendous opportunity for important cancer research analyses, ability to understand how different cancer treatment modalities are delivered and being able to look at a population that represents different levels of diversity than we may see in other clinical trials or other clinical settings. And so, being able to partner with VA, we hope to keep some of the interests focused in cancer research areas and then train at least a portion of the individuals who are going to continue in an academic career and are going to be our future NCI and NIH applicants continuing their cancer research trajectories.

Oliver Bogler: 
Thanks. So Ted, when you were a BD-STEP Fellow, please tell us a little bit about the research that you did and how was that uniquely possible in this constellation?

Ted Feldman: 
Yeah, sure. So, you know, I think Frank touched upon this in the different types of data. And I think one theme coming from being a bench scientist and a technologist prior to being in BD-STEP, you know, I think one theme throughout my career has been kind of trying to find signal in a lot of noise. First as a single molecule biophysicist and now as a data scientist.

And, you know, one approach that we've used, or I've used to do that is really that, you know, if you have a needle in a haystack or you have a rare event, if you have a big enough number, suddenly that rare event has at least enough size that you can say something statistically significant about it. You know, I think in previous episodes you talked about, you've talked with guests about rare cancers. And so, you know, because we have such a large patient population, even if it's sort of a low prevalence event, you have a large enough number, you can find enough patients to say something about that.

I've applied that, not necessarily the most rare cancers, but to answer questions around colorectal cancer and the progression of chronic liver diseases to liver cancer. And so I think one example there is because we have a very large patient population coming with all sorts of different trajectories and medical histories, we can look at different causative factors that might be causing or result in a cancer and say, how do these work together in unique combinations that might not have been able to be studied in other patient populations or with smaller data sets? 

So it really gives us that ability to think about medicine and sort of this systems or network way to say, how are these different unique comorbidities or multimorbidity impacting patient outcomes or even the diagnosis of cancer?
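
The point about rare events and large populations can be made concrete with a little arithmetic. The sketch below is only an illustration: the population sizes and the prevalence are hypothetical round numbers, not VA figures. It also assumes that case counts behave roughly like a Poisson count, so the relative uncertainty shrinks as one over the square root of the expected number of cases.

```python
import math


def expected_cases(population, cases_per_100k):
    """Expected number of patients with a condition at a given prevalence."""
    return population * cases_per_100k / 100_000


def relative_uncertainty(expected):
    """Rough relative standard error of a count, assuming Poisson variation."""
    return 1 / math.sqrt(expected)


# Hypothetical condition affecting 10 in every 100,000 patients.
PREVALENCE_PER_100K = 10

small_cohort = expected_cases(10_000, PREVALENCE_PER_100K)
large_cohort = expected_cases(5_000_000, PREVALENCE_PER_100K)

print(small_cohort, relative_uncertainty(small_cohort))
print(large_cohort, relative_uncertainty(large_cohort))
```

Under these assumed numbers the smaller cohort is expected to contain about one case, which supports almost no statistical conclusion, while the larger one is expected to contain hundreds.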

Oliver Bogler: 
And what kind of opportunities does that unlock for us as a society, as a community of people, when you're looking at these large numbers and these many data points, does that affect policy? Does it change medical practice?

Ted Feldman: 
Yeah, I mean, I think it can do both. You know, one of the parts that I love about being a data scientist, particularly in the VA is this role of getting to not only be the developer but being the translator between the clinical and the technological side of things. And I think, one question or one statement I've heard from the clinicians that I work with is sort of, how do I best treat the patient in front of me. And so the guidelines that a physician or a clinician might be working with are often based on systematic reviews of clinical trials where the goal is to kind of try to reduce as much noise as possible in order to figure out if this drug is safe or effective. And that's great because that's how we get new medicines or new technologies approved through regulatory hurdles. But the reality is, patients in reality in the real world are using real world evidence. All of that so-called noise that you might want to eliminate in a clinical trial is actually the reality of helping or treating a patient. And so it gets into this question of what works best or how do we provide evidence in the real world?

Oliver Bogler: 
Let me, I just want to pursue that a little bit because you mentioned real world evidence, which is a, I guess a relatively new area of a focus, right? For all of us. Help me understand that a little better. What, what is it, how does that distinguish itself from, for example, what you might observe in a clinical trial?

Ted Feldman: 
Yeah, sure. So I could give you one great example that I've worked on personally, which was looking at adverse events in patients in cancer, in liver cancer patients who are treated with a relatively new class of drugs called immune checkpoint inhibitors. One of the challenges there is that the conventional metrics for how we sort of identify an adverse event, for one, don't really necessarily apply directly to immune checkpoint inhibitors because they're expecting a very, they sort of work on the principle that there's a very quick reaction to the drug or the therapy. And in some cases, because these medicines work through the immune system, the reaction can be quite delayed to 14 weeks or more after the initiation of therapy. And so a metric that's based on the chronology or sort of the immediate cause and effect of you gave the patient a drug and you know, they had evidence in their labs of liver inflammation, you know, doesn't necessarily happen right away. And, you know, related to that, it's sort of how do you handle these questions when, okay, it's very difficult to discern an adverse event from the progression of the patient's underlying liver disease or underlying liver cancer.

Oliver Bogler:
So it's really where the reality sets in after the more structured studies, right? You can now learn a lot more that's vital to the effective therapy for these cancers. Thanks. 

Frank, the program, the BD-STEP program is described as providing well-rounded training for data scientists. Can you elaborate on what that means?

Frank Meng: 
Yeah, yeah, sure. So yeah, we try to be as well-rounded as we can. So we have what we call a national curriculum. And so the fellowship is actually structured in a way, I don't know if it's unique, I don't think it's unique, but I think it's a little bit different from a lot of the other kind of more traditional postdoctoral fellowships where funding agencies such as the NIH would give money or folks will apply for money and will get granted money to develop a training program perhaps within their institution, within a lab. This training program, because it's funded by the VA, takes place across the entire VA, so to speak. And so we have a national kind of coordinating center that oversees the program. And so then we have four different sites that the fellows are actually physically located at and actually do their research and other things like that.

From the coordinating center standpoint and from the local site standpoint, I think we provide various different ways and angles and different methods to open up training opportunities to the fellows. For instance, of course postdoctoral fellows come in with their own research ideas. A lot of them also come in with academic mentors that they want to work with. And so that's great. And that opens up a lot of opportunities for the fellowship or the program at large because they come in with a lot of ideas that we would have never thought of, and they work with odd people that we would have not thought about working with.

But on the other hand, our fear would be that they would be stuck in that little corner of research, which is wonderful for the two years that they're here, but not understand the broader picture of what the VA is, because the VA is a huge agency. So what we try to do is one aspect of the program that we have implemented is called the service project. And so they're required for 25% of their time while they're here with us in the program to work on a service project that may not be what they thought they would work on when they come into the program. And so these projects have been vetted by the coordinating center. And we try to collaborate with all sorts of folks, researchers, and others within the VA that perhaps fellows may not have thought about working with. And so our hope is that it would kind of broaden their horizons, open up certain opportunities and landscapes that perhaps they would not have worked on or worked with folks from these different places. And so it would give them a bigger, a broader view of not just the VA, but healthcare data science, clinical data science in general.

And we have other things like monthly seminars, monthly roundtable meetings where they get to present their own things. So we try to cross pollinate as much as we can and also open up various opportunities within the VA and beyond, like working with Michelle at the NCI.

Michelle has provided a lot of opportunities, a lot of resources that they have access to. Our whole point is that our job is to open the doors. Our job is to provide the opportunities, and our hope is that they would take those opportunities. They would grasp these chances that they have while they're in the fellowship to learn as much as they can about the VA, about the NCI and about data science and healthcare in general.

Oliver Bogler: 
So they're not only honing their data science skills in their main project, they're also getting a chance to see the bigger picture. So as they then make the next career step, they have an opportunity to pick the right place for themselves.

Frank Meng: 
Yeah, and as Ted mentioned, he enjoys being that kind of in the middle ground between the data science and the clinicians. And that's one thing we found a lot of our fellows learn as they go along is how to even interface or interact with clinical folks, clinical professionals, because they do come from a different background. They approach the problem from a different angle. And to be able to understand what they're trying to get at, what problems they're trying to solve is a very good skill to pick up while you're a fellow.

Michelle Berny-Lang: 
Maybe I could expand just a little bit on some of the training too. I think one thing we haven't mentioned that's key to the program is a large number of our fellows haven't seen healthcare or clinical data before. And so one important training element that's added is to have them actually exposed to the clinical setting. They might go to a tumor board where the care of an individual is discussed, or they might shadow a clinician and actually get to understand that that patient provider interaction is the downstream data that they're looking at so they can really treat it with you know the respect and the reverence that it deserves and then understand where that's coming from. 

And I think Frank mentioned these roundtables that we have and this is really a way to build this peer community so we really ensure that they're each sharing their research and engaging with each other and we get them together about twice a year so that they can share their research and really ask those critical questions of each other to help develop along their career paths.

And then I'll mention just quickly, we've also been trying to figure out how to use the service projects to tap into questions that NCI has. So we have been doing some looking to see are there great projects that can really be a nice collaboration between something a program director at NCI wants to tackle and a fit for a fellow.

We have done a little bit with FDA where a fellow has been able to come in and look at some data after a treatment's approved and see this from the FDA angle as well. So I think the service project is a great way to expand their awareness and their connection. We haven't talked about it yet, but I think it has been really a key driver of potential future job opportunities because they make these additional connections.

Oliver Bogler: 
So Ted, now I'm curious, what was your service program?

Ted Feldman: 
Yeah, so I worked with actually - talk about forming collaborations. I worked collaboratively with two other fellows on a project looking at how to implement early detection of colorectal cancer in the veteran population. So, you know, given sort of the problem that some of the DNA based tests like Cologuard are quite expensive to implement at scale and colonoscopy, you often have to do thousands of colonoscopies to find a patient with cancer, which is both a good thing and also a logistical challenge for cancer prevention. 

So that was a good example where talk about working between the VA and NCI and being able to work with clinical folks, one of the fellows that joined our project actually was an MD-PhD, so in the fellowship as well as going through a clinical fellowship. And that fellow had particular ideas about, well, do we want to make this kind of a lab-based algorithm versus more of a risk score? And so we explored both approaches and I think got a lot of great feedback from biostatisticians and experts at NCI when we were fortunate to meet there that really helped drive the design and direction of that project.

Oliver Bogler: 
That sounds like an incredibly important piece of work. 

Michelle, you mentioned that many of the fellows haven't yet experienced a clinical environment when they are in the program. Where do you draw your fellows from? What skill set are you looking for, for our listeners if they're interested in this? Where do they come from, the fellows?

Michelle Berny-Lang: 
Yeah, that's a great question and also an evolving question, which I think is exciting for the program. So when we first started out, we were very, and maybe more narrowly, targeting individuals that were coming from an engineering discipline, computer science, mathematics, physics, thinking that they really had developed significant quantitative skill sets. And we're looking for individuals that had some level of working with large data sets, maybe some modeling, and ideally some level of coding that they could bring the skill set in. Because we don't fully train all aspects of data science, what they're really learning is how to integrate within this clinical research environment.

And as the program's continued forward, we've realized those were likely too rigid and too restrictive. And so we've had a lot of exciting additions of new disciplines coming in. People from computational biology, epidemiology, health policy, and economics. I think some of the economics work has been fascinating and not what we anticipated, as well as individuals that are bringing a clinical psychology perspective.

So the disciplines in which they're trained, I think we’re really open and have kept that broad, but really have been stringent on ensuring that they bring in those data skill sets so that they have worked with large data sets. They do have some coding experience in a language or two. And that really enables them to be successful regardless of what their background and training is.

Frank Meng: 
We have a couple of staff data scientists that can help out to bridge some gaps. So if some folks come in maybe a little bit weaker on the SQL side, which is database working with databases. So we can help with some of those. But like Michelle said, mostly if they have some background working with data, then we can start, they can hit the ground running much easier.

Oliver Bogler: 
And Frank, give us a few numbers of how many fellows are in the program at any one time. The program is almost 10 years old, right? How many alums do you have?

Frank Meng: 
Yeah. Right. So started in 2015. So yeah, you're right, Oliver, almost 10 years old. So we have currently we have 14 fellows in the program, but we supported 58 overall. And those are the ones that received VA funding. But there are other phenotypes of fellows that come in that don't get supported by the VA. Maybe they have their own funding. Like we had some on T grants before, as well as like students or even active employees that don't need our funding because they're already getting salaries from somewhere else within the VA. So total participants is around 60. We don't have an exact number, but it's around 60. But yeah, so we've been really fortunate to have a lot of great fellows come in recently. We've had a lot of great candidates come in. So 14 is a great number for us because our max is actually 16 slots. And so we're almost at a full capacity at this time.

Oliver Bogler: 
Ted, you mentioned that you have returned to the program as a mentor now. Tell us a little bit about that.

Ted Feldman: 
Yeah, I mean, I enjoy mentoring fellows. Part of my professional trajectory has included teaching and curriculum design. That's what I was doing as part of my postdoc before I came here. So I really jumped and was excited to work with fellows as a mentor. So it's great to be able to guide them, work with fellows with geography backgrounds, with biostats backgrounds.

So it's oftentimes, as Frank said, the SQL or working with large databases is very new. And that's often just because that's a very kind of expensive architecture to set up. You do that if you have to in like a large healthcare system, but it's not usually something that's seen in academia. So it's great to guide them and be able to connect them to resources. The VA Informatics Research Community has some incredible resources for learning that and learning that efficiently, particularly on healthcare data. 

And we get to one of the facets of the program, which is explaining and giving people clinical experience. I like kind of being a mentor who explains the detective work of, okay, where might we find the data for question X or how does the clinical process tell us about where the data might be most reliable, like for medication, you know, in the VA there's a separate database for when a healthcare practitioner, you know, scans an inpatient wristband and that's going to go one spot versus depending upon how this medication was paid for, that's going to go in a different database. And so being able to provide them the story about, you know, where this data is coming in from the clinical workflow. And so that tells us how reliable or how meaningful it might be. It was particularly cool. And I get to see their excitement about doing this, which keeps me going. And I think one of the things that is great is that excitement of working with trainees and their new fresh eyes and perspectives, and the sort of let's do it attitude.

Oliver Bogler:
Yeah, you need that. You need that attitude for sure.

Michelle Berny-Lang: 
Do you mind if I just jump in quickly with Ted hearing that? Because I think Frank will relate to this. One of the best parts of being someone who helps to administer this program is seeing people like Ted. So now we have fellows that are currently in the program that'll say, wow, Ted's been a great mentor to me. And we hear all of these career advancement stories and see people who are just doing amazing work. And so the downstream outcomes, just hearing Ted talk and knowing, you know, there's this whole cohort out there doing this is really a thrilling part of being part of this program.

Frank Meng: 
Yeah, I think a lot of these young up and coming investigators and scientists, they just need a chance. They need opportunities. And I think the program provides that kind of opportunity for them to get their hands dirty, to understand what's going on in the health care system. So it's not only just the data of the VA, but also they're embedded in an actual living, working health care system. And it happens to be the largest health care system in the US. So they get a front row seat, like I like to say. And of all the action that's going on, and they get a deeper understanding of where the bottlenecks are, where the pinch points are, where the difficulties are and the challenges. And can we use what we know from data science to bring those techniques to bear to solve some of these problems.

Oliver Bogler: 
Right. What are the important questions that are ahead? Yeah. Very, very interesting. My last question on the BD-STEP program. AI is obviously disrupting, hopefully productively, happily, many areas of science. That's a bigger question, but specifically in the context of this training program, how is AI affecting how you're training your fellows and helping them be part of that workforce that will no doubt live in a very AI-rich world.

Frank Meng: 
No doubt, I think everyone would admit AI has revolutionized a lot of things, or at least will start to revolutionize a lot, most, many aspects of our lives. And so healthcare is no exception. And I think what I would want to see from the fellows is I think coming as a data scientist fellow, into a program that is focused on data science, you already have a leg up in terms of understanding some of the basic principles of how AI works, at least the current paradigm in terms of the deep neural networks. And basically, they being a form of statistical modeling. And so I think, of course, to the max on steroids, but they would have a general understanding of how statistical modeling works and how using data to create these kinds of models, what kind of effects will there be? 

And so I would love for our fellows to understand the good and the bad, right? So the good, how can we harness this kind of technology, the powers of this technology to impact care, not just from the outside, but also from the inside, because they have an understanding, a deeper understanding of the data. They have some understanding of statistics, right? They have an understanding of computational methods and things like that. So, and then also what are the side effects? Maybe that are not easily seen just from the outside of viewing these systems as a black box, but knowing that there are biases within data, right? There are inherent other disparities and things like that. 

How can we, as data scientists within healthcare, be able to create systems that minimize these more harmful side effects? So I think that's something that I would hope our fellows would get more into as they move along.

Michelle Berny-Lang: 
And I can think back, our fellows have been maybe early adopters or early users of AI for many years applying natural language processing to extract data from chart notes or different machine learning approaches to look at radiology images. And so I think they've been from the early days involved in this. I would say we've seen even more excitement and maybe even more recruitment to the program, just recognizing this convergence of AI and healthcare. 

And I think something that's really exciting that they're getting out of this program is to have this good solid understanding of you have this tool, it's going to give you an output, and then they're developing the knowledge or the context or the expertise from others that they work with to know what that means in reality. If you get something that's spit out of one of these systems, they can then have that ability to understand if that makes sense, if that's something that's practical. And so I think they're really, like Frank said, getting a really good sense of how to handle this and how to be able to apply these powerful tools moving forward. So I think they're in a good spot.

Ted Feldman: 
Yeah, to piggyback on what Michelle said, I think one of the strengths of the program is that it gives our fellows a tremendous opportunity to capture a deep understanding and deep hands-on experience working with AI and not only working with AI, but working with AI within a healthcare system, which often entails thinking about the regulatory and carefully thinking about the privacy, like what is AI doing with information and what kind of data are we feeding to this algorithm and how is it contained in a secure network that are critical to implementation that might not become apparent if you're learning AI with less sensitive healthcare data or data that's less sensitive than healthcare data. 

I think another exciting facet of AI that I've seen working with fellows and working within the VA is this question of really learning sort of as the technology evolves and as healthcare evolves is really learning where in the evolution do we apply AI and at what level of project. So I think it's great that fellows have an opportunity to work with it, but also work with projects that are, you know, different phases of, or different sort of contact levels to clinical data. And so, you know, one of the things that I think I mean by that is, you know, I probably often have conversations with clinicians where I sort of explain to them why they don't need AI to solve their problem or to, you know, work immediately on a clinical implementation as they might think, because it's this cool thing that's out in the media. 

And, you know, sometimes we need a neural network or, you know, we need a transformer based NLP approach and sometimes, you know, regression is good enough. And so much of that and so much of what I think what we teach fellows is this interface of how do we make this technology explainable and adoptable and believable by clinicians, you know, as Frank said, because it can often be a bit of a black box.

Oliver Bogler: 
Yeah, so the deep understanding you gain of the data and how to use it allows you to select the right tool for the question in hand.

Ted Feldman: 
Right. And how are we going to evaluate this or validate how well it works?

Oliver Bogler: 
Very interesting stuff. All right, we're gonna take a break and when we come back, we're gonna talk career paths. 

[music]
 
Calling all senior oncologists committed to community oncology and reducing health disparities! Are you passionate about shaping the next generation of researchers?  Apply to The Worta McCaskill-Stevens Career Development Award for Community Oncology and Prevention Research (K12) program and mentor promising clinical scientists.

This unique NCI program supports the training of clinical scientists in community cancer prevention, screening, intervention, control, and treatment research.  The program welcomes proposals for innovative research and career development programs with an equity lens and a focus on increasing diversity in clinical trials.   

As a program leader, you would guide clinical scientists from various oncology specialties as they conduct research, potentially even leading their own independent clinical trials.  This is a chance to leverage your expertise and make a lasting impact on cancer research and care in underserved communities.  

For more information about the McCaskill-Stevens K12, including how to apply and our staff contact details, visit our webpage – link is in the show notes.

[music ends]

Oliver Bogler: 
All right, we're back. Always my favorite question. What got you started in science? This is an open question. Anyone can jump in. How did you first know you wanted to do science, math, engineering?

Frank Meng: 
So I was thinking about this. So I think it came down to, so I went to elementary school in a small college town, about an hour and a half outside of Chicago. I think the main thing that happened there was my elementary school back then bought an Apple II computer. And so I don't know, I don't even remember how I started playing with it, but I started, I was able to make some of these graphic things. You know, it was all block based, wasn't anything fancy. And I think that started my journey in computer science and then in fifth grade, you know, joined like a competition and got second place somehow. And so that also helped encourage me to move on. And from there, yeah, I studied computer science both as an undergrad and for my PhD.

Oliver Bogler: 
So that was it. You were just like, wow, I got to do this. This is cool. Yeah.

Frank Meng: 
Yeah, and then high school science fairs and things like that. I really have to thank my dad. You know, every time I needed like a little electronic part or something like that, he was willing to drive me anywhere, at least within the county that we lived in.

Oliver Bogler: 
Michelle, how about you?

Michelle Berny-Lang:
Well, Frank, I might have a dad related story too. So I just kind of naturally liked science and math. And I think a driver of that, my dad was a high school math teacher teaching advanced math. And so I think some of that, you know, I heard it in the house. We’d do math problems on car rides and I liked that. I don't know, strange child. And so as I was growing up, the convergence of those two, he said, maybe you should think about engineering. And I, you know, kind of really started to think about that and stumbled across bioengineering, which to me just seemed like this great way to bring the math and science to biology, and I really liked the problem-solving approaches and the range, that you could be designing a new pancreas or a pipeline for how you might make a new therapy. And so that's the path that I took with my undergrad and then my PhD as well.

Oliver Bogler: 
Fantastic. How about you, Ted?

Ted Feldman: 
Cool, yeah. Sort of similar themes. I think both my mom and dad early on encouraged that. There was a Saturday science program that was run through a local high school that was very hands-on and would have challenges like how can you protect an egg from a six-foot drop using only straws and other household items?

And I think that sort of spurred my early curiosity in science or taking apart, you know, printed circuit boards. There was more taking apart than putting back together, I think at the age of eight. So I think that and, you know, being encouraged to be curious, I often got the, you know, if I asked a question that my parents didn't know an answer to, the answer is usually look it up. So there was a lot of time spent in card catalogs at the library. And we were fortunate to have an Encyclopedia Britannica and later a used World Book Encyclopedia that kind of became a good source of information, which I've evolved to kind of reading Wikipedia for fun at times.

Oliver Bogler: 
Yeah, yeah. That was when “look it up” didn't mean ask Gemini or something, right? It meant actually go somewhere where the information was in a book, something.

Ted Feldman: 
Correct, yes. And I think from there, I was fortunate to grow up near Stony Brook University and Brookhaven National Lab. And through science fairs and school, getting to the regional fair at Brookhaven was always a big deal. Ironically, the year I won a school fair, I had just moved and got there too late to get to Brookhaven, but actually wound up, I did the Intel science competitions and things like that in high school and was in physics labs and wound up working at Brookhaven when I went to Stony Brook, thanks to a program called the Interdisciplinary Biomedical Research Program, which was actually sponsored through NIGMS. And I was specifically encouraged by some mentors in the Honors College at Stony Brook to be a physicist and an engineer working in a bio lab. And so I got my first taste of being a government scientist working at the National Synchrotron Light Source at Brookhaven. So yeah, I guess I got early encouragement to be a government scientist through that and then got to work after undergrad at Lawrence Livermore, while going to grad school.

Oliver Bogler: 
Great experiences. Fantastic. Frank, so you told us you studied computer science and then you were actually a software engineer, right? And a project leader in the tech industry. And then you came back to academia and now you're doing a variety of things, right? You're the Associate Director of Clinical Informatics at the VA Boston and you're also on the Faculty of Medicine at Boston University. How did that path shape?

Frank Meng: 
Yeah, so I think I was always interested in doing research and trying to discover new things. So after the PhD, I thought I'd try industry because back then, you know, there was a lot of stuff going on in Silicon Valley. It's kind of a draw. So I kind of wanted to move up there. I was at UCLA, so we were in Southern Cal, right? And Northern Cal could be another state, actually. And so we just moved, I moved myself up there, worked at IBM. I didn't do a startup. I think I kind of regret I didn't do a startup back then. But then being at IBM, just being surrounded by all the tech culture there and the technology and all the things going on, it was pretty exciting to me. I felt like there was a lot of innovation that I could work on and maybe try to get more involved with. 

But I felt like in industry, at least at the time, I think industry jobs are great. But at the time I just felt like maybe I should try going back to academia and see if I can do some work there. And so I just happened to have a classmate that was in the PhD program with me at UCLA. He ultimately rose through the ranks and became a professor there and he was looking for postdocs. I said, would you consider me? And he said, sure. So I started to do a postdoc in that lab. And that's how I got into health care, or clinical informatics. It was called medical informatics back then. It’s still medical informatics now, but, and so I did a postdoc and then also started, had a permanent position as a researcher at that, in that group. Then I jumped out again. 

So I had an interesting trajectory. I jumped out again to do more government related, more defense related work. But then another opportunity came up at the VA where again, while I was a postdoc, it was a graduate student who rose through the ranks in the VA and became one of the associate directors at MAVERIC where we are now. And he kind of said, yeah, we have openings, so why don’t you come back? And so I wanted to go back to my roots in medical informatics. And so I finally got back to that. I did a stint at UCLA, writing a lot of grants at the Department of Radiology. 

And then Lou Fiore, who's the former director of MAVERIC, I had met up with him because I had continued to work with them. So I was on a business trip back to Boston. I remember just sitting outside. It was like a nice sunny July day in Boston. And if anyone knows Lou, you know, he's got a way with people, a way with words. He's got kind of a larger-than-life personality. He said, why don't you come back and work with us, you know? And he was telling me all the things that he was doing because the Cancer Moonshot was going on. They had found out what they were doing in the VA. You know, a lot of exciting things. I was thinking, wow, the things that this guy sitting in front of me is talking about, that they're already doing, they're only thinking about doing at UCLA at this point. I said, wow, it could be interesting to be a part of that. And so I decided then to come back to Boston and work at the VA at that point. And I've been here ever since. It's been great.

Oliver Bogler: 
Interesting, really shows that you are no longer confined to a particular sector these days. You can move between business and academia and government and just as the opportunities take you. 

Michelle, with your background in biomedical engineering, I'm curious how you took the path to where you are now. You're now working at the Center for Strategic Scientific Initiatives and the BD-STEP is only one part of your portfolio, right? So tell us a little bit about that path and what you're doing in a broader sense at CSSI.

Michelle Berny-Lang: 
Yeah, happy to. So, you know, with my undergrad in bioengineering, I think probably a theme throughout this discussion today, someone took a chance on me and I had a professor that I was able to work in his lab and do some research and that was pretty exciting. So finishing up undergrad, seemed like research was something I wanted to continue. So that really inspired me going on to get my PhD in biomedical engineering at Oregon Health and Science University. And there I was applying a lot of engineering to understand blood clotting processes, how what happens in a test tube is different when it happens in the flow that you have in your vasculature. So I loved my research there and wanted to continue on to research.

So moved over to the Boston area with a postdoc at Harvard Medical School and Boston Children's Hospital, a little bit more in the clinical side of research, working with a pediatric hematologist and I loved the research, I loved what I was doing, but I also had a big interest in being involved in a large breadth of research. I didn't see myself following a single cell or a single molecule for my whole life. I had broad interests in the types of research that I wanted to follow or participate in. And so probably like a lot of people that you hear, I started to ask around and see what can you do with a PhD? And I think at that point, the Boston area was really rising up with a lot of support for postdocs and a lot of connections to help with what careers could be. And so through what seemed like a million informational interviews, I ended up connecting with the deputy director of the Center for Strategic Scientific Initiatives.

And at the time, the center was really focused on bringing technologies, whether it be nanotechnology, proteomics, or physical science approaches, to cancer research. And I really loved that with my engineering background. So I joined about 10 years ago. And I'd say that the portfolio that I have worked on has largely been that. Are there opportunities to bring different disciplines, different communities together for new approaches to cancer? So I think BD-STEP is a good example.

I currently support a research program that's bringing synthetic biology to cancer and so have often worked across institutes or across disciplines to support new areas of cancer research and really love it. It's been a great career so far.

Oliver Bogler: 
Always something new? It sounds fascinating. 
Ted, what is the focus of your work now after all this pathway that you took?

Ted Feldman: 
Yeah, sure. So I work on a few different areas. Still studying the progression of chronic liver disease, liver cancer, a couple of different types of liver cancer. So primary hepatocellular carcinoma as well as biliary tract cancers. You know, it's great that the VA also has some incredible genetic resources that we get to use both clinical targeted sequencing in patients already diagnosed with cancer that I get to work with. 

In one of the operational roles I play, I'm implementing a genetic database and a clinical interpretation tool on VA data, as well as being able to use some of the VA resources like the Million Veteran Program to look at the genetics of liver disease progression to liver cancer.

Oliver Bogler: 
Sounds really important, important questions, important areas. 

My last question to all of you, what advice would you give to someone who's listening and is interested in a career in data science or maybe even more specifically healthcare data science? What are the most important skills they should think about picking up and the qualities for success in the field?

Michelle Berny-Lang: 
Maybe I can start. This will be simple and maybe Frank and Ted will have more technical skills, but I think data science, at least for now, and I hope in the long term, is going to be a very collaborative role. And so, yes, of course, needing the technical expertise, but I think one of the biggest is the soft skills and the ability to work across disciplines to communicate what you know in data science to someone who might be in a different biological area, a different clinical area. So being able to be that convener and communicator, I think will be key along their careers. 

And maybe just one piece of advice to impart in there. I think all of our trajectories were not perfectly linear and had maybe some unexpected steps along the way and something that we talk about a lot with the service projects, those may not always be the dream projects that people envisioned, but I think just the willingness to try something new, do something that may be outside of your wheelhouse or outside of the area that you really want to focus on is a worthwhile endeavor. I don't think I've ever participated in something where it doesn't have something that I learned or gained in the long term. So maybe an openness to the types of projects and the types of collaborations that you participate in along your career.

Ted Feldman: 
I think to jump on that, one piece of advice that I would give to trainees is talk to people. You know, honestly, I became a materials science and engineering and physics double major as an undergrad because my undergrad, I was an engineering major first. My undergrad engineering advisor made a bet with me that I couldn't pull off both majors in four years. So, you know, then I had to do it and that started over a conversation. So essentially, yeah, my career trajectory started over that, you know, and then again, when I was a Curriculum Fellow, which is a postdoctoral program at Harvard Medical School, you know, I got this protected academic time to figure out what it is I wanted to do with the rest of my life. And, you know, I knew I wanted to do something translational and had met some faculty in biomedical informatics through the curriculum side. I just sent a cold email to the chair of the department and he surprisingly wrote back, which turned into a 10-minute conversation in which he basically said, well, what you want to do is an informatics project and connected me with two other people. 

So, that and I think being in that training role where I had some programming skills and, you know, folks around me said, well, you're the quant skills guy because you have this degree in physics, or you have this PhD in applied physics now. We need you to teach this R visualization course in two weeks, and I had never touched R before. It's like, okay, great, well, now I'll have really good hands-on experience with where the students are coming from. I just have to stay one step ahead of them. So, so many of the turns in my career have been nonlinear and just sort of grown out of random conversations. So I would say, don't be afraid to send the email or ask the question.

Oliver Bogler: 
That's great advice. Frank?

Frank Meng: 
Yeah, I'll touch upon first some more of the technical aspects since there was more of the softer skills. Yeah, when I was undergrad, they said if you have your math and you have your physics, a good solid foundation for those two, you can move across many disciplines. And so I think with data science specifically, there's the data aspect of things, you know, knowing about databases, SQL has been a big thing, but just maybe even the programming language that you would know, and then statistics, I think, is a great thing to have to be able to be very successful, to have a solid foundation for this field. 

And then in other words, I would just echo what Michelle and Ted said. I think a lot of innovation comes from being here and then hearing an idea from here and then being able to connect them. And so if you look at AI itself, it's actually a melding of ideas from various different fields, cognitive science, psychology, of course, computational statistics and things like that. And so I think a lot of those ideas had to kind of gel together in the minds of several people or some prominent researchers at the time for them to come up with these ideas and eventually implement them and make them real things. 

And so I think just, you know, a lot of people say, think outside the box. That's great. But I also think you need to be willing to step outside the box yourself to be able to work with people like Michelle said and Ted also mentioned, you know, collaborate with folks that you would normally not collaborate with and look at ideas and fields that normally you wouldn't look at. I think a lot of innovation may come that way. So yeah.

Oliver Bogler: 
Thank you. 

[music]

Oliver Bogler: 
Now it's time for a segment we call Your Turn, your chance to share a recommendation with our audience. If you're listening, then you're invited to take your turn. Record a tip for a book, a video, a podcast, or a talk, or anything that you found inspirational or amusing or interesting, and send it to us at NCIICC@nih.gov. We may just play it in an upcoming episode. 

But I'd like to invite our guests to take their turn. Let's start with you, Michelle.

Michelle Berny-Lang: 
Sure, so this one might be embarrassing because maybe I'm late to the game, but it has changed my happiness so I'll share it. So before the pandemic I had a very reliable commute and then reading would be the bookends to my day. I'm a reader, pretty much anything you throw at me I'd read. And with commute changing, work and life demands changing, I was not much of a reader and it has been really this key missing part of my life. So probably about six months ago, I realized the joy of audiobooks from the library and feel like I've regained, not fully, it's not the same as a book, but regained that part of my life a bit. And so it's been fantastic to be able to do something else and listen to a book at the same time. It's helped round me out a little bit.

Oliver Bogler:
Yeah, I'm a huge fan of audiobooks. I think that's a great recommendation. Frank.

Frank Meng: 
So being someone who's researching natural language processing, I think a great thing is to try to learn a new language or a second or third language. I am culturally Chinese, but as you probably could deduce from what I was talking about in my background, I grew up in the US. So English is my primary language, but we did speak Chinese at home. And so recently, about 10 years ago, I started to try to read because these characters have nothing to do with the alphabet in English. Trying to understand the characters and being able to recognize them and start reading things. And so after 10 years, I think I've gotten okay, not great. But I think, you know, I've found that reading the same thing in English and then translated into another language, sometimes you get a different perspective on things. So I think learning a new language is something that I've enjoyed.

Oliver Bogler: 
That's, yeah, I can only agree, fantastic. Ted, how about you?

Ted Feldman: 
Sure, so I'll continue the theme of reading and maybe provide a couple of specific suggestions. There's a great Harvard Business Review article that a friend introduced me to, it was published in 2005, called What's Your Story? And the advice in there is basically how I've gotten every job that I've had since reading that article. So I would recommend that. 

And as for a book, the one that came to mind is I always find inspiration from previous scientists and Max Perutz curated this collection of stories called I Wish I'd Made You Angry Earlier, which was advice he got from his doctoral advisor, William Bragg.

Oliver Bogler: 
Okay, great. Thanks. We'll put links in the show notes for those. That's fantastic. Well, thank you all very much for spending time, for sharing about the BD-STEP program and about your own careers. Thank you so much.

Michelle Berny-Lang:
Thank you, really appreciate this.

Frank Meng: 
Thank you. Thanks for having us.

Ted Feldman:
Thank you, Oliver.

Oliver Bogler: 
That’s all we have time for on today’s episode of Inside Cancer Careers! Thank you for joining us and thank you to our guests.

We want to hear from you – your stories, your ideas and your feedback are welcome. And you are invited to take your turn and make a recommendation to share with our listeners. You can reach us at NCIICC@nih.gov.

Inside Cancer Careers is a collaboration between NCI’s Office of Communications and Public Liaison and the Center for Cancer Training. It is produced by Angela Jones and Astrid Masfar.

Join us every first and third Thursday of the month wherever you listen – subscribe so you won’t miss an episode.

If you have questions about cancer or comments about this podcast, you can email us at NCIinfo@nih.gov or call us at 800-422-6237. And please be sure to mention Inside Cancer Careers in your query.

We are a production of the U.S. Department of Health and Human Services, National Institutes of Health, National Cancer Institute. Thanks for listening.
