Risk of bias of included studies.
| Author/Year | Country | Intervention/AI approach | Setting/timing | Outcomes measurement | Validation of tool (Y/N) | Quality assessment (RoB 2 overall) |
|---|---|---|---|---|---|---|
| Bargshady et al. [17], 2024 | Australia/USA | Vision transformer | Acute pain datasets | Accuracy, comparison with baselines | Y | Low risk (well-reported external datasets) |
| Bargshady et al. [18], 2020 | Australia/Netherlands | Ensemble CNN + RNN | Lab datasets | Accuracy, ROC | Y | Some concerns (no external clinical validation) |
| Bellal et al. [19], 2024 | France | NEVVA® device (AI facial) | ICU pilot | Device calibration vs. experts | Y | Some concerns (small sample, feasibility only) |
| Benavent-Lledo et al. [20], 2023 | Spain | Transformer-based CV | Lab datasets | Accuracy, F1 | Y | Low risk (robust datasets, transparent methods) |
| Cascella et al. [21], 2024 | Italy | Binary AU-based classifier | Oncology outpatient | Accuracy, AUROC | Y | Some concerns (limited clinical cohort) |
| Cascella et al. [22], 2024 | Italy | Multimodal (speech + facial) | Clinical trial NCT04726228 | Classification accuracy | Y | Low risk (registered trial, multimodal) |
| Cascella et al. [23], 2023 | Italy | YOLOv8 | Lab/clinical feasibility | Detection metrics | Y | Some concerns (pilot, limited validation) |
| Casti et al. [24], 2019 | Italy | DL pain intensity system | Lab | Accuracy, calibration | Y | Low risk (strong methodological rigor) |
| Casti et al. [25], 2021 | Italy | Transfer entropy + ML | Lab | Accuracy, robustness | Y | Low risk |
| Chen et al. [26], 2022 | USA | AU combinations + MIL | Clinical + lab | Accuracy, AUC | Y | Low risk |
| Dutta and M [27], 2018 | India | Hybrid DL | Lab + simulated | Accuracy, computational metrics | Y | Some concerns (older methods, limited clinical data) |
| Ghosh et al. [28], 2025 | India/Switzerland | Multimodal (facial + audio) | Lab datasets | Accuracy (2–5 classes) | Y | Low risk |
| Guo et al. [29], 2021 | China | CNN/LSTM | Cold pressor | F1 score | Y | Some concerns (small sample) |
| Heintz et al. [30], 2025 | USA multicenter | CNN-based | Perioperative | AUROC, Brier score | Y | Low risk (robust clinical dataset) |
| Mao et al. [31], 2025 | China | Conv-Transformer multitask | Lab | Regression + classification | Y | Low risk |
| Mieronkoski et al. [32], 2020 | Finland | sEMG + ML | Experimental pain | c-index, features | Y | Some concerns (small sample, modest accuracy) |
| Morsali and Ghaffari [33], 2025 | Iran/UK | ErAS-Net | Lab datasets | Accuracy, cross-dataset | Y | Low risk |
| Park et al. [34], 2024 | Korea | ML (facial, ANI, vitals) | Postoperative | AUROC | Y | Low risk (clinical real-world) |
| Pikulkaew et al. [35], 2021 | Thailand | CNN | Lab | Precision, accuracy | Y | Low risk |
| Rezaei et al. [36], 2021 | Canada | DL | Long-term care | Sensitivity, specificity | Y | Low risk (validated on target population) |
| Rodriguez et al. [37], 2022 | Spain/Denmark | CNN + LSTM | Lab | AUC, accuracy | Y | Low risk |
| Semwal and Londhe [38], 2024 | India | Spatio-temporal network | Lab | Accuracy | Y | Some concerns (no external validation) |
| Tan et al. [39], 2025 | Singapore | STA-LSTM | Clinical | Accuracy, F1 | Y | Low risk |
| Yuan et al. [40], 2024 | China | AU-guided CNN | ICU, ventilated patients | Accuracy, regression | Y | Low risk |
| Zhang et al. [41], 2025 | China | VGG16 pretrained | Postoperative | AUROC, F1 | Y | Low risk |
AI: artificial intelligence; RoB 2: risk of bias tool, version 2; CNN: convolutional neural network; RNN: recurrent neural network; ROC: receiver operating characteristic; NEVVA: Non-Verbal Visual Analog device; ICU: intensive care unit; CV: computer vision; AU: action unit; AUROC: area under the receiver operating characteristic curve; YOLOv8: You Only Look Once version 8; ML: machine learning; MIL: multiple instance learning; AUC: area under the curve; DL: deep learning; LSTM: long short-term memory; sEMG: surface electromyography; ErAS-Net: enhanced residual attention-based subject-specific network; ANI: analgesia nociception index; STA-LSTM: spatio-temporal attention long short-term memory.
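For orientation on the outcome measures that recur in the table (accuracy, F1, AUROC, Brier score), the minimal sketch below shows how they are typically computed for a binary pain/no-pain classifier. It is purely illustrative and assumes scikit-learn; the arrays `y_true` and `y_prob` are hypothetical examples, not data from any included study.

```python
# Illustrative sketch only: computing the performance metrics commonly
# reported by the included studies for a binary pain/no-pain classifier.
# y_true and y_prob are hypothetical example values, not study data.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score, brier_score_loss

y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])                    # ground-truth pain labels
y_prob = np.array([0.2, 0.9, 0.7, 0.4, 0.6, 0.1, 0.8, 0.55])   # predicted pain probabilities
y_pred = (y_prob >= 0.5).astype(int)                            # thresholded class predictions

print("Accuracy:", accuracy_score(y_true, y_pred))   # proportion of correct predictions
print("F1 score:", f1_score(y_true, y_pred))          # harmonic mean of precision and recall
print("AUROC:   ", roc_auc_score(y_true, y_prob))     # threshold-free discrimination
print("Brier:   ", brier_score_loss(y_true, y_prob))  # calibration of predicted probabilities
```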