Photography-based diagnostic models
Author, year | Task; classes (n) | Feature extractors/Features extracted | Classifier | Accuracy | Specificity (TNR) | Sensitivity (recall) | Precision (PPV) | AUC | F1-score or Jaccard index |
---|---|---|---|---|---|---|---|---|---|
Camalan et al. [1], 2021 | Classification; suspicious (54) and normal (54) ROIs in photographic images | - | Inception ResNet-v2 | 86.5% | - | - | - | - | - |
 | | - | ResNet-101 | 79.3% | - | - | - | - | - |
Figueroa et al. [2], 2022 | Classification; suspicious (i.e., OSCC and OPMD) (~ 2,800) and normal (~ 2,800) photographic images | - | GAIN network | 84.84% | 89.3% | 76.6% | - | - | - |
Flügge et al. [3], 2023 | Classification; OSCC (703) and normal (703) photographic images | - | Swin-transformer DL network | 0.98 | 0.98 | 0.98 | - | - | 0.98 |
Jubair et al. [4], 2022 | Classification; suspicious [i.e., OSCC and OPMD (236)] and benign (480) photographic images | - | EfficientNetB0 | 85% | 84.5% | - | - | 0.92 | - |
Jurczyszyn et al. [5], 2020 | Classification; OSCC (35) and normal (35) photographic images (one normal and one leukoplakia image from the same patient) | MaZda software/textural features: run-length matrix (two), co-occurrence matrix (two), Haar wavelet transform (two) | Probabilistic neural network | - | 97% | 100% | - | - | - |
Lim et al. [6], 2021 | Classification; no referral (493), refer (cancer/high-risk) (636), refer (low-risk) (685), and refer (other reasons) (641) | - | ResNet-101 | - | - | 61.70% | 61.96% | - | 61.68% |
Shamim et al. [7], 2019 | Classification; benign and precancerous (200) photographic images | - | VGG19 | 98% | 97% | 89% | - | - | - |
 | | - | AlexNet | 93% | 94% | 88% | - | - | - |
 | | - | GoogLeNet | 93% | 88% | 80% | - | - | - |
 | | - | ResNet50 | 90% | 96% | 84% | - | - | - |
 | | - | Inceptionv3 | 93% | 88% | 83% | - | - | - |
 | | - | SqueezeNet | 93% | 96% | 85% | - | - | - |
 | Classification; types of tongue lesions (300) photographic images | - | VGG19 | 97% | - | - | - | - | - |
 | | - | AlexNet | 83% | - | - | - | - | - |
 | | - | GoogLeNet | 88% | - | - | - | - | - |
 | | - | ResNet50 | 97% | - | - | - | - | - |
 | | - | Inceptionv3 | 92% | - | - | - | - | - |
 | | - | SqueezeNet | 90% | - | - | - | - | - |
Sharma et al. [8], 2022 | Classification; OSCC (121), OPMD (102) and normal (106) photographic images | - | VGG19 | 76% | - | OSCC: 0.43; normal: 1; OPMD: 0.78 | OSCC: 0.76; normal: 0.9; OPMD: 0.7 | OSCC: 0.92; normal: 0.99; OPMD: 0.88 | OSCC: 0.45; normal: 0.95; OPMD: 0.74 |
 | | - | VGG16 | 72% | - | - | - | OSCC: 0.94; normal: 0.96; OPMD: 0.92 | - |
 | | - | MobileNet | 72% | - | - | - | OSCC: 0.88; normal: 0.99; OPMD: 0.80 | - |
 | | - | InceptionV3 | 68% | - | - | - | OSCC: 0.88; normal: 0.1; OPMD: 0.88 | - |
 | | - | ResNet50 | 36% | - | - | - | OSCC: 0.43; normal: 0.33; OPMD: 0.42 | - |
Song et al. [9], 2021 | Classification; malignant (911), premalignant (1,100), benign (243) and normal (2,417) polarized white light photographic images | - | VGG19 | 80% | - | 79% | 83% | - | 81% |
Song et al. [10], 2023 | Classification; suspicious (1,062) and normal (978) photographic images | - | SE-ABN | 87.7% | 88.6% | 86.8% | 87.5% | - | - |
 | | - | SE-ABN + manually edited attention maps | 90.3% | 90.8% | 89.8% | 89.9% | - | - |
Tanriver et al. [11], 2021 | Segmentation, object detection and classification; carcinoma (162), OPMD (248) and benign (274) photographic images | - | EfficientNet-b4 | - | - | 85.5% | 86.9% | - | 85.8% |
 | | - | Inception-v4 | - | - | 85.5% | 87.7% | - | 85.8% |
 | | - | DenseNet-161 | - | - | 84.1% | 87.9% | - | 84.4% |
 | | - | ResNet-152 | - | - | 81.2% | 82.6% | - | 81.1% |
 | | - | Ensemble | - | - | 84.1% | 84.9% | - | 84.3% |
Thomas et al. [12], 2013 | Classification; 192 sections of photographic images from 16 patients | GLCM, GLRL and intensity-based first-order features (eleven selected features) | Backpropagation-based ANN | 97.92% | - | - | - | - | - |
Warin et al. [13], 2021 | Object detection and classification; OPMD (350) and normal (350) photographic images | - | DenseNet-121 | - | 100% | 98.75% | 99% | 0.99 | 99% |
Warin et al. [14], 2022 | Object detection and classification; OPMD (315) and OSCC (365) photographic images | - | DenseNet-169 | - | OSCC: 99%; OPMD: 97% | OSCC: 99%; OPMD: 95% | OSCC: 98%; OPMD: 95% | OSCC: 1; OPMD: 0.98 | OSCC: 98%; OPMD: 95% |
 | | - | ResNet-101 | - | OSCC: 94%; OPMD: 94% | OSCC: 92%; OPMD: 97% | OSCC: 96%; OPMD: 97% | OSCC: 0.99; OPMD: 0.97 | OSCC: 94%; OPMD: 97% |
Warin et al. [15], 2022 | Object detection and classification; OPMD (300) and normal (300) photographic images | - | DenseNet-121 | - | 90% | 100% | 91% | 0.95 | 95% |
 | | - | ResNet-50 | - | 91.67% | 98.39% | 92% | 0.95 | 95% |
Welikala et al. [16], 2020 | Object detection and classification; referral (1,054) and non-referral (379) photographic images | - | ResNet-101 | - | - | 93.88% | 67.15% | - | 78.30% |
Xue et al. [17], 2022 | Classification; ruler (440) and non-ruler (2,377) photographic images; first batch (2,817 images/250 patients), second batch (4,331 images/168 patients) | - | ResNeSt | 99.6% | 99.6% | 100% | 97.9% | 99.6% | 98.9% |
 | | - | ViT | 99.8% | 99.8% | 100% | 0.98 | 99.8% | 99.5% |
ANN: artificial neural network; AUC: area under the curve; DL: deep learning; GAIN: guided attention inference network; GLCM: gray-level co-occurrence matrix; GLRL: gray-level run-length matrix; OPMD: oral potentially malignant disorders; OSCC: oral squamous cell carcinoma; PPV: positive predictive value; ROI: region of interest; TNR: true negative rate
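The accuracy, specificity (TNR), sensitivity (recall), precision (PPV), and F1-score columns above all follow the standard binary confusion-matrix definitions. A minimal sketch of those definitions, using hypothetical counts not drawn from any of the listed studies:

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute the tabulated metrics from binary confusion-matrix counts:
    tp/fp = true/false positives, tn/fn = true/false negatives."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)      # recall, true positive rate
    specificity = tn / (tn + fp)      # true negative rate (TNR)
    precision = tp / (tp + fp)        # positive predictive value (PPV)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision, "f1": f1}

# Hypothetical counts for illustration only
metrics = binary_metrics(tp=90, fp=10, tn=85, fn=15)
print({name: round(value, 3) for name, value in metrics.items()})
```

Note that AUC cannot be recovered from a single confusion matrix; it summarizes sensitivity and specificity across all decision thresholds, which is why some studies above report it alongside (or instead of) threshold-dependent metrics.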