Contributed by:

Shashwath P
Shashank Ashok
Akilan Yohendiran

Total downloads all time - 1126

Model Card for Idefic_medical_VQA_merged_4bit

This is an experimental fine-tuned version of IDEFICS 9B for medical Visual Question Answering (VQA). It was trained on a dataset that combines SLAKE and VQA-RAD. Notebooks for training, merging, and inference are available in the following repository: https://github.com/Shashwathp/Idefic_medical_vqa

Model Description

This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

Developed by: @Shashwath01, @Akill19, @Shashank91097
Model type: Multimodal, Visual Question Answering
Language(s) (NLP): English
License: Apache-2.0
Finetuned from model: IDEFICS 9B

Dataset

https://huggingface.co/datasets/Shashwath01/VQARAD_SLAKE

Model Sources

Repository: https://github.com/Shashwathp/Idefic_medical_vqa
Paper: https://ieeexplore.ieee.org/document/10616779

How to Get Started with the Model

See the inference notebook to get started: https://github.com/Shashwathp/Idefic_medical_vqa/blob/main/inference.ipynb
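For orientation, here is a minimal inference sketch, not the authors' code; the notebook above remains the reference. It assumes that the repository ID shown on this page is correct, that the checkpoint loads through the standard IDEFICS classes in transformers (with bitsandbytes and accelerate installed for the 4-bit weights), that the repository ships its own processor files (if it does not, loading the processor from the base checkpoint may be needed), and that a plain "Question: ... Answer:" prompt is acceptable. The prompt template, the helper names `build_prompt` and `answer`, and the generation settings are illustrative assumptions; the template actually used for fine-tuning may differ.

```python
from typing import Any, List

# Repository ID as shown on this page.
MODEL_ID = "Shashwath01/Idefic_medical_VQA_merged_4bit"


def build_prompt(image: Any, question: str) -> List[List[Any]]:
    """Build one IDEFICS-style prompt: a list mixing an image and text.

    The "Question: ... Answer:" wording is an assumption, not the
    documented training template.
    """
    return [[image, f"Question: {question.strip()} Answer:"]]


def answer(image_path: str, question: str, max_new_tokens: int = 32) -> str:
    """Hypothetical helper: load the model and answer one question."""
    # Heavy imports stay local so importing this file is cheap.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, IdeficsForVisionText2Text

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = IdeficsForVisionText2Text.from_pretrained(MODEL_ID, device_map="auto")

    image = Image.open(image_path).convert("RGB")
    inputs = processor(build_prompt(image, question), return_tensors="pt").to(model.device)

    # Keep the model from emitting image placeholder tokens in its answer.
    bad_words_ids = processor.tokenizer(
        ["<image>", "<fake_token_around_image>"], add_special_tokens=False
    ).input_ids

    with torch.no_grad():
        generated = model.generate(
            **inputs, bad_words_ids=bad_words_ids, max_new_tokens=max_new_tokens
        )
    return processor.batch_decode(generated, skip_special_tokens=True)[0]
```

Calling `answer("scan.png", "Is there a fracture?")` with a local image path (hypothetical file name) would return the decoded text, which includes the prompt followed by the model's answer. As an experimental model, its output should not be used for clinical decisions.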

Citation

[1] S. Punneshetty, S. Ashok, M. Niranjanamurthy, and S. V. N. Murthy, "Fine Tuning Idefic 9b With LORA for Multimodal Medical VQA," in Proceedings of the 2024 International Conference on Knowledge Engineering and Communication Systems (ICKECS), India, Apr. 2024, pp. 1-8. DOI: 10.1109/ICKECS61492.2024.10616779.

Shashwath01/Idefic_medical_VQA_merged_4bit