Diseño y desarrollo de una herramienta basada en procesamiento de lenguaje natural mediado por Deep Learning para evaluación de registros clínicos electrónicos
dc.contributor.advisor | Delgado Roman, Carlos Ignacio | |
dc.contributor.author | Bonilla Arandia, Daniel | |
dc.date.accessioned | 2024-07-16T00:52:06Z | |
dc.date.available | 2024-07-16T00:52:06Z | |
dc.date.issued | 2024-06 | |
dc.description.abstract | Las actividades de evaluación de registros clínicos electrónicos (EHR), como la medición de adherencia a Guías de Práctica Clínica (GPC), presentan un reto para los prestadores de servicios de salud, puesto que su adecuada implementación permite reducir la variabilidad y los costos de la atención. Sin embargo, es una tarea que requiere una revisión extensa de registros y retroalimentación constante para considerarse exitosa. Esta investigación presenta el diseño y desarrollo de un prototipo basado en Procesamiento de Lenguaje Natural (NLP) mediado por Deep Learning para evaluar la adherencia a GPC en registros clínicos electrónicos, haciendo uso de un modelo grande de lenguaje (LLM) preentrenado y la combinación de Prompt Engineering y Retrieval Augmented Generation (RAG) para analizar las historias clínicas y determinar la adherencia. | |
dc.description.abstractenglish | Evaluation activities for electronic health records (EHR), such as measuring adherence to Clinical Practice Guidelines (CPGs), present a challenge for healthcare providers, as their proper implementation helps reduce variability in care and healthcare costs. However, the task requires extensive record review and constant feedback to be considered successful. This research presents the design and development of a prototype based on Natural Language Processing (NLP) mediated by Deep Learning to evaluate adherence to CPGs in electronic health records. It uses a pre-trained large language model (LLM) and combines Prompt Engineering with Retrieval Augmented Generation (RAG) to analyze medical records and determine adherence. | |
dc.description.sponsorship | Clínica Universidad de La Sabana | |
dc.identifier.uri | https://hdl.handle.net/20.500.12495/12672 | |
dc.language.iso | es | |
dc.relation.references | Abid, A., Abdalla, A., Abid, A., Khan, D., Alfozan, A., & Zou, J. (2019). Gradio: Hassle-Free Sharing and Testing of ML Models in the Wild. arXiv. https://doi.org/10.48550/arXiv.1906.02569 | |
dc.relation.references | Almazrouei, E., Alobeidli, H., Alshamsi, A., Cappelli, A., & Cojocaru, R. (2023). The Falcon Series of Open Language Models. arXiv. https://doi.org/10.48550/arXiv.2311.16867 | |
dc.relation.references | Anyscale. (2024). Anyscale. https://www.anyscale.com/ | |
dc.relation.references | Baker, R., Camosso-Stefinovic, J., Gillies, C., Shaw, E. J., Cheater, F., Flottorp, S., & Robertson, N. (2010). Tailored interventions to overcome identified barriers to change: effects on professional practice and health care outcomes. In The Cochrane Collaboration (Ed.), Cochrane Database of Systematic Reviews (p. 1). Wiley. https://doi.org/10.1002/14651858.CD005470.pub2 | |
dc.relation.references | Balaguer, A., Benara, V., Cunha, R. L. F., Filho, R. M. E., & Hendry, T. (2024). RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture. arXiv. https://doi.org/10.48550/arXiv.2401.08406 | |
dc.relation.references | Beeching, E., Fourrier, C., Habib, N., & Han, S. (2023). Open LLM Leaderboard. Hugging Face. https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard | |
dc.relation.references | Belkada, Y., Dettmers, T., Pagnoni, A., Gugger, S., & Mangrulkar, S. (2023, May 24). Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA. Hugging Face. https://huggingface.co/blog/4bit-transformers-bitsandbytes | |
dc.relation.references | Burgers, J. S. (2003). Towards evidence-based clinical practice: an international survey of 18 clinical guideline programs. International Journal for Quality in Health Care, 15(1), 31–45. https://doi.org/10.1093/intqhc/15.1.31 | |
dc.relation.references | Cabana, M. D., Rand, C. S., Powe, N. R., Wu, A. W., Wilson, M. H., Abboud, P. A. C., & Rubin, H. R. (1999). Why Don’t Physicians Follow Clinical Practice Guidelines?: A Framework for Improvement. JAMA, 282(15), 1458. https://doi.org/10.1001/jama.282.15.1458 | |
dc.relation.references | Committee on Standards for Developing Trustworthy Clinical Practice Guidelines. (2011). Clinical Practice Guidelines We Can Trust. National Academies Press. https://doi.org/10.17226/13058 | |
dc.relation.references | Congreso de Colombia. (2015). Ley 1751 de 2015. Ministerio de Salud y Protección Social. https://www.minsalud.gov.co/normatividad_nuevo/ley%201751%20de%202015.pdf | |
dc.relation.references | Consejo Asesor Para el Sistema Único de Acreditación en Salud. (2011). Manual de Acreditación en Salud Ambulatorio y Hospitalario de Colombia. Estándares de Acreditación. https://acreditacionensalud.org.co/estandares-de-acreditacion/ | |
dc.relation.references | Cuenta de Alto Costo. (2023, June 9). Día mundial de la prevención del cáncer de próstata 2023. Cuenta de Alto Costo. https://cuentadealtocosto.org/cancer/dia-mundial-de-la-prevencion-del-cancer-de-prostata-2023 | |
dc.relation.references | Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023). QLoRA: Efficient Finetuning of Quantized LLMs. arXiv. https://doi.org/10.48550/arXiv.2305.14314 | |
dc.relation.references | Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv. https://doi.org/10.48550/arXiv.1810.04805 | |
dc.relation.references | Douze, M., Guzhva, A., Deng, C., Johnson, J., & Szilvasy, G. (2024). The Faiss library. arXiv. https://doi.org/10.48550/arXiv.2401.08281 | |
dc.relation.references | Essel, V., Vuuren, U. V., Sa, A. D., Govender, S., Murie, K., Schlemmer, A., Gunst, C., Namane, M., Boulle, A., & Vries, E. D. (2015). Auditing chronic disease care: Does it make a difference? African Journal of Primary Health Care & Family Medicine, 7(1), 7. https://doi.org/10.4102/phcfm.v7i1.753 | |
dc.relation.references | Fischer, F., Lange, K., Klose, K., Greiner, W., & Kraemer, A. (2016). Barriers and Strategies in Guideline Implementation—A Scoping Review. Healthcare, 4(3), 36. https://doi.org/10.3390/healthcare4030036 | |
dc.relation.references | Fleming, S. L., Lozano, A., Haberkorn, W. J., Jindal, J. A., & Reis, E. P. (2023). MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records. arXiv. https://doi.org/10.48550/arXiv.2308.14089 | |
dc.relation.references | Gao, Y., Xiong, Y., Gao, X., Jia, K., & Pan, J. (2023). Retrieval-Augmented Generation for Large Language Models: A Survey. arXiv. https://doi.org/10.48550/arXiv.2312.10997 | |
dc.relation.references | Grol, R., & Grimshaw, J. (2003). From best evidence to best practice: effective implementation of change in patients’ care. The Lancet, 362(9391), 1225–1230. https://doi.org/10.1016/S0140-6736(03)14546-1 | |
dc.relation.references | Grol, R., Wensing, M., Eccles, M., & Davis, D. (2013). Improving Patient Care: The Implementation of Change in Health Care (Vol. 1). Wiley. https://doi.org/10.1002/9781118525975 | |
dc.relation.references | Gugger, S., Debut, L., Wolf, T., Schmid, P., & Mueller, Z. (2022). Accelerate: Training and inference at scale made simple, efficient and adaptable. [Computer software]. Hugging Face. https://github.com/huggingface/accelerate | |
dc.relation.references | Gunasekar, S., Zhang, Y., Aneja, J., Mendes, C. C. T., & Del Giorno, A. (2023). Textbooks Are All You Need. arXiv. https://doi.org/10.48550/arXiv.2306.11644 | |
dc.relation.references | HuggingFace. (n.d.). Sentence Transformers. HuggingFace. https://huggingface.co/sentence-transformers | |
dc.relation.references | Hysong, S. J., Best, R. G., & Pugh, J. A. (2006). Audit and feedback and clinical practice guideline adherence: Making feedback actionable. Implementation Science, 1(1), 9. https://doi.org/10.1186/1748-5908-1-9 | |
dc.relation.references | Kong, A., Zhao, S., Chen, H., Li, Q., Qin, Y., Sun, R., & Zhou, X. (2023). Better Zero-Shot Reasoning with Role-Play Prompting. arXiv. https://doi.org/10.48550/arXiv.2308.07702 | |
dc.relation.references | Kwak, G. H., & Hui, P. (2019). DeepHealth: Review and challenges of artificial intelligence in health informatics. arXiv. https://doi.org/10.48550/ARXIV.1909.00384 | |
dc.relation.references | LangChain. (n.d.). LangChain. LangChain. https://github.com/langchain-ai | |
dc.relation.references | Llanio Navarro, R., Perdomo González, G., & Arús Soler, E. (2007). Propedéutica clínica y semiología médica. Tomo 1. Editorial Ciencias Médicas. Editorial Pueblo y Educación. | |
dc.relation.references | Lewis, P., Perez, E., Piktus, A., Petroni, F., & Karpukhin, V. (2024). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. arXiv. https://doi.org/10.48550/arXiv.2005.11401 | |
dc.relation.references | Li, J., Dada, A., Kleesiek, J., & Egger, J. (2023). ChatGPT in Healthcare: A Taxonomy and Systematic Review. medRxiv. https://doi.org/10.1101/2023.03.30.23287899 | |
dc.relation.references | López-Ramos, H., Gómez Cusnir, P., Moreno, M., Patiño, G., Rasch-Isla, A., Dallos, A., Fernández, N., Jaramillo, A., & Vega, J. (2015). Guía de manejo de la hiperplasia prostática benigna. Sociedad Colombiana de Urología 2014. Urología Colombiana, 24(3), 187.e1–187.e32. https://doi.org/10.1016/j.uroco.2015.04.005 | |
dc.relation.references | Louis, A. (2020a, July 7). A Brief History of Natural Language Processing — Part 1. Medium. https://medium.com/@antoine.louis/a-brief-history-of-natural-language-processing-part-1-ffbcb937ebce | |
dc.relation.references | Louis, A. (2020b, July 7). A Brief History of Natural Language Processing — Part 2. Medium. https://medium.com/@antoine.louis/a-brief-history-of-natural-language-processing-part-2-f5e575e8e37 | |
dc.relation.references | Lugtenberg, M., Zegers-van Schaick, J. M., Westert, G. P., & Burgers, J. S. (2009). Why don’t physicians adhere to guideline recommendations in practice? An analysis of barriers among Dutch general practitioners. Implementation science: IS, 4, 54. https://doi.org/10.1186/1748-5908-4-54 | |
dc.relation.references | Mashette, N. (2023). Building a Document-based Question Answering System with LangChain using Open Source LLM model…. Medium. https://medium.com/@nageshmashette032/building-a-document-based-question-answering-system-with-langchain-using-open-source-llm-model-346efe676be6 | |
dc.relation.references | McGlynn, E. A., Asch, S. M., Adams, J., Keesey, J., Hicks, J., DeCristofaro, A., & Kerr, E. A. (2003). The Quality of Health Care Delivered to Adults in the United States. New England Journal of Medicine, 348(26), 2635–2645. https://doi.org/10.1056/NEJMsa022615 | |
dc.relation.references | Meta Research. (2024a). faiss [Computer software]. Meta Research. https://github.com/facebookresearch/faiss | |
dc.relation.references | Meta Research. (2024b). Getting to know Llama 2: Everything you need to start building. GitHub. https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/Getting_to_know_Llama.ipynb | |
dc.relation.references | Mickan, S., Burls, A., & Glasziou, P. (2011). Patterns of ‘leakage’ in the utilisation of clinical guidelines: a systematic review. Postgraduate Medical Journal, 87(1032), 670–679. https://doi.org/10.1136/pgmj.2010.116012 | |
dc.relation.references | Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. arXiv. https://doi.org/10.48550/arXiv.1301.3781 | |
dc.relation.references | Ministerio de Salud y Protección Social. (1993). Ley 100 de 1993. Función Pública. https://www.funcionpublica.gov.co/eva/gestornormativo/norma.php?i=5248 | |
dc.relation.references | Ministerio de Salud y Protección Social. (2010). Seguridad del paciente y la atención segura. Ministerio de Salud y Protección Social. https://www.minsalud.gov.co/sites/rid/Lists/BibliotecaDigital/RIDE/DE/CA/Guia-buenas-practicas-seguridad-paciente.pdf | |
dc.relation.references | Ministerio de Salud y Protección Social. (2016). Sistema Obligatorio de Garantía de Calidad en Salud (SOGCS). Ministerio de Salud y Protección Social. https://www.minsalud.gov.co/salud/PServicios/Paginas/sistema-obligatorio-garantia-calidad-SOGC.aspx | |
dc.relation.references | Ministerio de Salud y Protección Social. (2018). Guía práctica del sistema obligatorio de garantía de la calidad en salud SOGCS. Gobernación de Boyacá. https://boyaca.gov.co/SecSalud/images/Documentos/cartilla_didactica.pdf | |
dc.relation.references | Moutawwakil, I., & Pierrard, R. (2023). LLM-Perf Leaderboard. Hugging Face. https://huggingface.co/spaces/optimum/llm-perf-leaderboard | |
dc.relation.references | OpenAI, Achiam, J., Adler, S., Agarwal, S., & Ahmad, L. (2023). GPT-4 Technical Report. arXiv. https://doi.org/10.48550/arXiv.2303.08774 | |
dc.relation.references | Ovadia, O., Brief, M., Mishaeli, M., & Elisha, O. (2024). Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs. arXiv. https://doi.org/10.48550/arXiv.2312.05934 | |
dc.relation.references | Rasmussen, J. N., Chong, A., & Alter, D. A. (2007). Relationship Between Adherence to Evidence-Based Pharmacotherapy and Long-term Mortality After Acute Myocardial Infarction. JAMA, 297(2), 177. https://doi.org/10.1001/jama.297.2.177 | |
dc.relation.references | Roehrborn, C. G. (2005). Benign prostatic hyperplasia: an overview. Reviews in Urology, 7(9), 3–14. https://pubmed.ncbi.nlm.nih.gov/16985902/ | |
dc.relation.references | Rudin, R. S., Fischer, S. H., Shi, Y., & Shekelle, P. (2019). Trends in the Use of Clinical Decision Support by Health System–Affiliated Ambulatory Clinics in the United States, 2014-2016. American Journal of Managed Care, 12(19), 4–10. https://researchgate.net/publication/338989890_Trends_in_the_Use_of_Clinical_Decision_Support_by_Health_System-Affiliated_Ambulatory_Clinics_in_the_United_States_2014-2016 | |
dc.relation.references | Sinha, R. K. (2010). Impact of Health Information Technology in Public Health. Sri Lanka Journal of Bio-Medical Informatics, 1(4), 223. https://doi.org/10.4038/sljbmi.v1i4.2239 | |
dc.relation.references | Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to Sequence Learning with Neural Networks. arXiv. https://doi.org/10.48550/arXiv.1409.3215 | |
dc.relation.references | Tang, R., Han, X., Jiang, X., & Hu, X. (2023). Does synthetic data generation of llms help clinical text mining? arXiv. https://doi.org/10.48550/arXiv.2303.04360 | |
dc.relation.references | Tayefi, M., Ngo, P., Chomutare, T., Dalianis, H., Salvi, E., Budrionis, A., & Godtliebsen, F. (2021). Challenges and opportunities beyond structured data in analysis of electronic health records. WIREs Computational Statistics, 13(6). https://doi.org/10.1002/wics.1549 | |
dc.relation.references | Touvron, H., Martin, L., Stone, K., Albert, P., & Almahairi, A. (2023). Llama 2: Open Foundation and Fine-Tuned Chat Models. arXiv. https://doi.org/10.48550/arXiv.2307.09288 | |
dc.relation.references | Tunstall, L., Werra, L. V., & Wolf, T. (2022). Question Answering. In Natural language processing with transformers: building language applications with Hugging Face (p. 1). O’Reilly Media. https://www.oreilly.com/library/view/natural-language-processing/9781098136789/ | |
dc.relation.references | Unstructured Technologies Inc. (2024). unstructured [Computer software]. Unstructured. https://github.com/Unstructured-IO/unstructured | |
dc.relation.references | Vajjala, S., Majumder, B., Gupta, A., & Surana, H. (2020). NLP: A Primer. In Practical Natural Language Processing (p. 1). O’Reilly Media, Inc. https://www.oreilly.com/library/view/practical-natural-language/9781492054047/ | |
dc.relation.references | Vasilakes, J., Zhou, S., & Zhang, R. (2021). Natural language processing. In Machine Learning in Cardiovascular Medicine (pp. 123–148). Elsevier. https://doi.org/10.1016/B978-0-12-820273-9.00006-3 | |
dc.relation.references | Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention is All you Need. arXiv. https://doi.org/10.48550/arXiv.1706.03762 | |
dc.relation.references | Verhamme, K. M. C., Dieleman, J. P., Bleumink, G. S., van der Lei, J., & Sturkenboom, M. C. J. M. (2002). Incidence and Prevalence of Lower Urinary Tract Symptoms Suggestive of Benign Prostatic Hyperplasia in Primary Care—The Triumph Project. European Urology, 42(4), 323–328. https://doi.org/10.1016/S0302-2838(02)00354-8 | |
dc.relation.references | Vuichoud, C., & Loughlin, K. R. (2015). Benign prostatic hyperplasia: epidemiology, economics and evaluation. The Canadian Journal of Urology, 22(1), 1–6. https://pubmed.ncbi.nlm.nih.gov/26497338/ | |
dc.relation.references | Wei, J., Wang, X., Schuurmans, D., Bosma, M., & Ichter, B. (2023). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv. https://doi.org/10.48550/arXiv.2201.11903 | |
dc.relation.references | Wolf, T., Debut, L., Sanh, V., Chaumond, J., & Delangue, C. (2020). Transformers: State-of-the-Art Natural Language Processing [Computer software]. Hugging Face. https://huggingface.co/docs/transformers/index | |
dc.relation.references | Yang, S., Zhu, F., Ling, X., Liu, Q., & Zhao, P. (2021). Intelligent Health Care: Applications of Deep Learning in Computational Medicine. Frontiers in Genetics, 12. https://doi.org/10.3389/fgene.2021.607471 | |
dc.rights | Attribution-NonCommercial-ShareAlike 4.0 International | en |
dc.rights.accessrights | http://purl.org/coar/access_right/c_abf2 | |
dc.rights.accessrights | info:eu-repo/semantics/openAccess | |
dc.rights.local | Open access | |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | |
dc.subject | Procesamiento de lenguaje natural | |
dc.subject | Guías de práctica clínica | |
dc.subject | Modelo grande de lenguaje | |
dc.subject | Generación aumentada por recuperación | |
dc.subject | Registros clínicos electrónicos | |
dc.subject | NLP | |
dc.subject | GPC | |
dc.subject | LLM | |
dc.subject | RAG | |
dc.subject | EHR | |
dc.subject.keywords | Natural Language Processing | |
dc.subject.keywords | Clinical Practice Guidelines | |
dc.subject.keywords | Large Language Model | |
dc.subject.keywords | Retrieval Augmented Generation | |
dc.subject.keywords | Electronic Health Records | |
dc.subject.keywords | NLP | |
dc.subject.keywords | CPG | |
dc.subject.keywords | LLM | |
dc.subject.keywords | RAG | |
dc.subject.keywords | EHR | |
dc.title | Diseño y desarrollo de una herramienta basada en procesamiento de lenguaje natural mediado por Deep Learning para evaluación de registros clínicos electrónicos | |
dc.title.translated | Design and development of a tool based on Natural Language Processing mediated by Deep Learning for the evaluation of electronic health records |
Files
Original bundle
- Name: Trabajo de grado.pdf
- Size: 1.55 MB
- Format: Adobe Portable Document Format
License bundle
- Name: license.txt
- Size: 1.95 KB
- Format: Item-specific license agreed upon to submission
- Name: Carta de autorizacion.pdf
- Size: 2.23 MB
- Format: Adobe Portable Document Format
- Name: Anexo 1 Probatus.pdf
- Size: 169.04 KB
- Format: Adobe Portable Document Format