Join us for “Adventures in Text Mining: Applications, Ethics, and Cancer Care”

Join us for our webinar “Adventures in Text Mining: Applications, Ethics, and Cancer Care” on October 16 from 12:00 PM-1:00 PM Eastern Time.

What is Text Mining?
Text mining helps researchers sift through mountains of documents, clinical notes, and research papers to find important patterns and information quickly. Dr. Manika Lamba (Assistant Professor, School of Library and Information Studies, University of Oklahoma) will introduce the topic through the lens of her work in digital libraries and information organization.

Applications in Cancer Care
Dr. Anyi Li (Associate Attending Physicist and Chief of Computer Service, Department of Medical Physics, Memorial Sloan Kettering) will explain how applying text mining technologies to clinical notes at MSK has automated radiation therapy processes, saving clinician time and enabling risk event analysis and mitigation. He will also address the ethical aspects of text mining in healthcare, including patient privacy and responsible data use.

Applications in the Published Literature
Text mining can allow researchers to analyze the vast volume of scientific literature. Dr. Zhiyong Lu (Senior Investigator, NIH/NLM, Deputy Director for Literature Search, NCBI) will showcase his work mining the literature in PubMed, which led to tools including the Best Match algorithm and LitCovid. 

Register now. All registrants will receive a link to the event recording, whether or not they can attend synchronously.

About the speakers:

Dr. Manika Lamba is an Assistant Professor at the School of Library and Information Studies, University of Oklahoma. Previously, she served as a Postdoctoral Research Associate at the HathiTrust Research Center, University of Illinois. Her research broadly falls under computational social science and science of science. She primarily focuses on using computational methods, such as text mining and machine learning, to provide better solutions for information retrieval and organization of digital libraries.

Dr. Anyi Li, Associate Attending Physicist and Chief of Computer Service at the Department of Medical Physics at MSK, leads a talented team comprising mathematicians, physicists, engineers, and data scientists. Together, they collaborate with the Division of Clinical Physics and the Department of Radiation Oncology to harness artificial intelligence, operational research algorithms, and big data. Their objective is to optimize radiation therapy plans, enhance the efficiency of the radiation treatment process from start to finish, develop a data platform for clinical decision support, and improve patient safety by managing accumulated radiation doses. They utilize the latest language models to analyze clinical event timelines and construct workflow knowledge graphs, which improve the radiation therapy workflow and provide valuable insights to the clinical team. With a background as a theoretical nuclear physicist and research scientist tackling NP-hard (nondeterministic polynomial time) problems, Dr. Li transitioned into big data engineering and AI, bringing experience from positions at Yahoo and IBM Watson Health.

Dr. Zhiyong Lu is a tenured Senior Investigator at the NIH/NLM IPR, leading research in biomedical text and image processing, information retrieval, and AI/machine learning. In his role as Deputy Director for Literature Search at NCBI, Dr. Lu oversees the overall R&D efforts to improve literature search and information access in resources like PubMed and LitCovid, which are used by millions worldwide each day. Additionally, Dr. Lu is Adjunct Professor of Computer Science at the University of Illinois Urbana-Champaign (UIUC). With over 400 peer-reviewed publications, Dr. Lu is a highly cited author, and a Fellow of the American College of Medical Informatics (ACMI) and the International Academy of Health Sciences Informatics (IAHSI).

Webinar: The “New” NIH Data Management & Sharing Policy: A Conversation

This webinar is a great opportunity to learn more about the new NIH Data Management & Sharing Policy and its impact on grant applicants. Join us for a conversation that will touch on policy expectations, insights in how to prepare a data management plan, and advice for sharing data responsibly and safely.

A panel of MSK staff from various departments will be sharing their recent experiences with time for attendees to participate in an interactive Q&A discussion.

Date: Thursday, March 23, 2023
Time: 12:00 PM to 1:30 PM, Eastern Time
Location: Zoom Webinar (Register Now)

Panelist Bios:

Roy Cambria, BS, CCRP, CIP, Director, Human Research Protection Program (HRPP), MSK
Roy has been at Memorial Sloan Kettering since 2005 and has held several positions in clinical research throughout his almost 18-year career with the institution. He began at MSK as the Institutional Review Board/Privacy Board (IRB/PB) Coordinator, a role he held until 2008, when he transitioned to project- and program-based positions in the former Office of Clinical Research. He returned to the IRB/PB administration space in 2016 as the Human Research Protection Program Director. The MSK Human Research Protection Program Office is part of the Protocol Activation, Review and HRPP unit in Clinical Research Compliance Administration. As HRPP Director, Roy oversees the daily operations of the HRPP office and MSK’s 3 IRB/PBs. He is responsible for promoting the welfare and rights of human research participants, facilitating excellence in human subjects research, and ensuring timely and high-quality review of research. In addition, he and the HRPP office are responsible for ensuring full compliance with Institutional, AAHRPP (Association for the Accreditation of Human Research Protection Programs), State and Federal regulations, requirements and guidance regarding human subjects’ protection. Roy has served on the MSK IRB/PB since 2008 and is a member of the Society of Clinical Research Associates (SOCRA) and a Certified IRB Professional (CIP).

Anthony Dellureficio, MLS, MSc, Associate Librarian, Data Management, MSK
Anthony joined the MSK Library in 2019 to help launch a new Research Data Management program to support researchers by developing, implementing, and integrating resources that focus on data management plan creation, data discovery, and data as a component of the publication process. Prior to joining MSK, Anthony led The New School Library and Archives systems and technology team for about ten years. He has previously worked as the digital archivist at Cold Spring Harbor Laboratory, rare medical text cataloger at the Johns Hopkins Institute of the History of Medicine, and archivist at the Johns Hopkins Medical archives. His academic area of interest is in the history of classical genetics.

Kelly McConnell, PhD, Associate Attending Psychologist & co-Director of the Psycho-oncology of Aging and Cancer research laboratory in the Department of Psychiatry & Behavioral Sciences, MSK
Dr. Kelly McConnell’s research examines the nature and predictors of distress in older adult patients with cancer and their caregivers, as well as the care received at the end of life. She also examines the efficacy and implementation of interventions to reduce distress and increase rates of advance care planning in patients and caregivers. She has received NIH (K23, R21) and foundation (American Cancer Society, American Federation for Aging Research, RRF Foundation for Aging) grant funding for this research.

Joseph Olechnowicz, MA, Senior Editor, Department of Pediatrics, MSK
Joe Olechnowicz assists investigators with successfully communicating their scientific goals and requesting federal and philanthropic support to achieve them. He joined the Department of Pediatrics in 2008, initially contributing to protocol/project development activities (including the department’s protocol review activities as well as the development of the FDA-approved drug naxitamab) while also assisting with manuscript submissions and with grant applications and reporting.

Joe received his B.S. in biology from John Carroll University and his M.A. from Case Western Reserve University (CWRU) in biomedical ethics while also working in the lab of Dr. Sanford Markowitz on the genetics of familial colon cancer syndromes. He went on to work with Dr. Eric Kodish at the Cleveland Clinic analyzing the use of proxy consent and assent in clinical research involving children (Olechnowicz et al. Pediatrics, 2002). He then attempted to study the philosophy underpinning consent and intentional/free action (along with some other stuff) at Florida State University. He currently is employed using the skills he acquired during his academic activities, namely, writing concisely and precisely and making difficult scientific/conceptual explanations understandable.

Sub-grouping PubMed Records by Their Linkages to Other NIH Resources

The National Library of Medicine (NLM), which is part of the NIH, is responsible for a wide array of information/data resources. In addition to biomedical literature databases like PubMed and PubMed Central and the clinical trial registry ClinicalTrials.gov, NLM also includes computational molecular biology resources and human genome resources among its database offerings, all of which are freely available to everyone.

One of the great strengths of NLM’s resources is that they have been designed with maximum accessibility and linkage in mind. If you are searching in one database and there is information in another NLM resource that might also be relevant, chances are pretty good that the database record you are consulting will include meaningful embedded links out to the other tools.

These connections between resources are particularly valuable for conducting specialized searches of the biomedical literature. The ability to sub-group PubMed records according to their inclusion in a “secondary source” means that you can limit a search within PubMed to a more relevant portion of PubMed, which is a powerful way to increase the precision of your search results.

Following are two different use cases where this sub-grouping functionality can be super-useful if you are carrying out targeted information retrieval projects.

Case 1: ClinicalTrials.gov

In ClinicalTrials.gov, each registered clinical trial record includes a “Study Results” tab where searchers can find publication lists (when available). These lists of article citations link back to PubMed records, which in turn are indexed with ClinicalTrials.gov identifiers. As a result of this setup, a searcher who wishes to start in PubMed and search on their favorite topic across the published clinical trial study results identified in ClinicalTrials.gov can do so by adding the following to their PubMed search strategy:

clinicaltrials.gov[si]

For example: clinicaltrials.gov[si] AND sarcoma

(Note: The ClinicalTrials.gov linkage will appear in the PubMed Abstract record in the “Associated data” section.)
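
For searchers who prefer to build such a search programmatically, the following is a minimal Python sketch rather than an official method. It assumes the public PubMed web address https://pubmed.ncbi.nlm.nih.gov/ and its "term" query parameter; the function name clinical_trials_search_url is hypothetical and exists only to show how the clinicaltrials.gov[si] tag is combined with a topic.

```python
from urllib.parse import urlencode

# Assumption: the public PubMed web interface, which accepts a "term" parameter.
PUBMED_BASE = "https://pubmed.ncbi.nlm.nih.gov/"

# Search tag described above: limits results to records with a
# ClinicalTrials.gov identifier in the secondary source ID field.
CT_FILTER = "clinicaltrials.gov[si]"


def clinical_trials_search_url(topic: str) -> str:
    """Build a PubMed URL that restricts `topic` to ClinicalTrials.gov-linked records.

    Hypothetical helper for illustration only.
    """
    # Parentheses keep a multi-word or Boolean topic grouped together.
    query = f"{CT_FILTER} AND ({topic})"
    return PUBMED_BASE + "?" + urlencode({"term": query})


if __name__ == "__main__":
    print(clinical_trials_search_url("sarcoma"))
```

Pasting the printed address into a browser should open the same kind of result set as typing the search string into the PubMed search box.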

Case 2: GeneRIF (Gene Reference into Function)

Another specialized literature search that is often tricky to carry out is one that limits the search results to those publications that describe a gene’s function. Luckily, NLM already has a program called GeneRIF (Gene Reference into Function) that “provides a simple mechanism to allow scientists to add to the functional annotation of genes described in Gene.” By leveraging these gene-PMID connections developed for the Gene database, PubMed searchers can limit their search results to only those PubMed records that have been tagged with a GeneRIF identifier. They can do this by adding the following to their PubMed search strategy:

“pubmed gene rif” [Filter]

For example: “pubmed gene rif” [Filter] AND sarcoma

(Note: The GeneRIF linkage will appear in the PubMed Abstract record in the “Related information” section.) 
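
The same filter can, in principle, be sent to NCBI's E-utilities ESearch service to retrieve matching PubMed IDs in a script. The sketch below only builds the request address and does not contact NCBI. It assumes the filter string behaves through ESearch as it does in the PubMed search box, uses straight quotation marks in place of the curly ones shown above, and introduces a hypothetical helper name, generif_esearch_url.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Filter described above, written with straight quotes (an assumption made
# so the string survives copy and paste into code).
GENERIF_FILTER = '"pubmed gene rif"[Filter]'


def generif_esearch_url(topic: str, retmax: int = 20) -> str:
    """Build an ESearch URL limiting `topic` to PubMed records linked to a GeneRIF.

    Hypothetical helper for illustration only; it returns the URL as text
    and performs no network request.
    """
    params = {
        "db": "pubmed",                             # search the PubMed database
        "term": f"{GENERIF_FILTER} AND ({topic})",  # filter plus topic
        "retmode": "json",                          # ask for a JSON response
        "retmax": retmax,                           # maximum number of IDs returned
    }
    return ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    print(generif_esearch_url("sarcoma"))
```

Anyone planning to run many such requests should first consult NCBI's usage guidelines for E-utilities.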

If you have any questions or would like some additional guidance on designing specialized literature searches, feel free to Ask Us at the MSK Library.