Diseño y desarrollo de una herramienta basada en procesamiento de lenguaje natural mediado por Deep Learning para evaluación de registros clínicos electrónicos

dc.contributor.advisor: Delgado Roman, Carlos Ignacio
dc.contributor.author: Bonilla Arandia, Daniel
dc.date.accessioned: 2024-07-16T00:52:06Z
dc.date.available: 2024-07-16T00:52:06Z
dc.date.issued: 2024-06
dc.description.abstract: Evaluation activities for electronic health records (EHR), such as measuring adherence to Clinical Practice Guidelines (CPGs), present a challenge for healthcare providers. When properly implemented, they help reduce variability in care and healthcare costs, yet they require extensive record review and constant feedback to be considered successful. This research presents the design and development of a prototype, based on Natural Language Processing (NLP) mediated by Deep Learning, that evaluates adherence to CPGs in electronic health records. The prototype uses a pre-trained large language model (LLM) and combines Prompt Engineering with Retrieval Augmented Generation (RAG) to analyze medical records and determine adherence.
dc.description.sponsorship: Clinica Universidad de la Sabana
dc.identifier.uri: https://hdl.handle.net/20.500.12495/12672
dc.language.iso: es
dc.rights: Attribution-NonCommercial-ShareAlike 4.0 International
dc.rights.accessrights: http://purl.org/coar/access_right/c_abf2
dc.rights.accessrights: info:eu-repo/semantics/openAccess
dc.rights.local: Open access
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.subject: Natural Language Processing
dc.subject: Clinical Practice Guidelines
dc.subject: Large Language Model
dc.subject: Retrieval Augmented Generation
dc.subject: Electronic Health Records
dc.subject: NLP
dc.subject: CPG
dc.subject: LLM
dc.subject: RAG
dc.subject: EHR
dc.title: Diseño y desarrollo de una herramienta basada en procesamiento de lenguaje natural mediado por Deep Learning para evaluación de registros clínicos electrónicos
dc.title.translated: Design and development of a tool based on Natural Language Processing mediated by Deep Learning for the evaluation of electronic health records

Files

Original bundle
Trabajo de grado.pdf (1.55 MB, Adobe Portable Document Format)

License bundle
license.txt (1.95 KB, Item-specific license agreed upon to submission)
Carta de autorizacion.pdf (2.23 MB, Adobe Portable Document Format)
Anexo 1 Probatus.pdf (169.04 KB, Adobe Portable Document Format)