This is just a toy experiment to see how to deploy a Fast.AI model with Voilà.
I took a ResNet18 classifier pre-trained on ImageNet and adapted it (through "transfer learning") to a small set of images downloaded using the Microsoft Bing Internet Search API.
To build this toy dataset for fine-tuning, I used the following search terms (each treated as a class):
- X-ray of lungs with SARS-CoV2
- cancer lungs x-ray
- covid-19 lungs x-ray
- healthy lungs x-ray
- normal lungs x-ray
- pneumonia lungs x-ray
- selfie
Obviously, some labels refer to the same class (for example, "healthy lungs x-ray" and "normal lungs x-ray"). The class "selfie" is my background class, i.e., any sample that does not contain a lung X-ray is expected to be classified as "selfie".
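To make the overlap between labels concrete, here is a minimal sketch of how the seven search terms could be collapsed into five merged classes, and how an error rate could be recomputed after merging. The grouping and the merged class names (`covid`, `healthy`, `background`, ...) are my assumptions for illustration; they were not part of the experiment, and `merge_label` and `grouped_error_rate` are hypothetical helpers.

```python
# Hypothetical grouping of the seven search terms into five merged classes.
# The merged names on the right are illustrative assumptions.
MERGED_CLASS = {
    "X-ray of lungs with SARS-CoV2": "covid",
    "covid-19 lungs x-ray": "covid",
    "healthy lungs x-ray": "healthy",
    "normal lungs x-ray": "healthy",
    "cancer lungs x-ray": "cancer",
    "pneumonia lungs x-ray": "pneumonia",
    "selfie": "background",
}


def merge_label(search_term: str) -> str:
    """Map a raw search-term label to its merged class."""
    return MERGED_CLASS[search_term]


def grouped_error_rate(true_labels, predicted_labels):
    """Fraction of samples whose merged prediction differs from the merged truth."""
    pairs = list(zip(true_labels, predicted_labels))
    errors = sum(merge_label(t) != merge_label(p) for t, p in pairs)
    return errors / len(pairs)
```

With this mapping, predicting "normal lungs x-ray" for an image labeled "healthy lungs x-ray" no longer counts as an error, while confusing "selfie" with any X-ray class still does.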
A total of 1134 images were obtained, of which 80% were used for training and the remaining 20% for validation. Fine-tuning was done with a batch size of 64 samples for 10 epochs, giving an error rate of 40.97% and the confusion matrix below:
Even if classes with the same meaning are grouped, this is clearly not a great result, which illustrates how challenging this problem is, particularly with a small and noisy dataset. The training set was not curated; I just used whatever Bing Search returned, i.e., this is NOT a serious experiment for COVID-19 detection.
A more serious effort, with proper training samples labeled by experts, is being developed by my colleague Flavio Vidal and his team; see Projeto XRAI at https://x-rai.redes.unb.br/.
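As a rough sanity check on the numbers reported above, the sketch below works through the arithmetic of the 80/20 split and the error rate. It assumes the validation set size is 20% of 1134 rounded to the nearest integer; the count of 93 misclassified images is inferred from the reported percentage, not taken from the experiment's logs.

```python
import math

TOTAL_IMAGES = 1134
VALID_FRACTION = 0.20
BATCH_SIZE = 64

# Assumption: the validation size is rounded to the nearest integer.
n_valid = round(TOTAL_IMAGES * VALID_FRACTION)  # 226.8 rounds to 227
n_train = TOTAL_IMAGES - n_valid                # 907

# Number of batches needed to cover the training set once
# (the last batch is only partially filled).
batches_per_epoch = math.ceil(n_train / BATCH_SIZE)  # 15

# Inferred, not reported: 93 errors out of 227 matches the stated error rate.
n_errors = 93
error_rate = n_errors / n_valid
print(f"{n_train} train / {n_valid} valid, error rate {error_rate:.2%}")
```

Under these assumptions, roughly 93 of about 227 validation images were misclassified, which gives a sense of how few samples the reported percentage rests on.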