Affiliation:
¹Department of Science, Termodiagnose Institute Brazil - Infrared Thermography, Medical Thermology and Pain Medicine, Itu, SP 13300-025, Brazil
²Department of Research and Development, Division of Data Science and Analytics, Takotsubo Life Sciences, São Paulo, SP 01310-200, Brazil
Email: joaoalberto@institutotermodiagnose.com.br
ORCID: https://orcid.org/0000-0002-9158-8908
Affiliation:
³Department of Biology, Universidade Cruzeiro do Sul - CEUNSP (Centro Educacional Nossa Senhora do Patrocínio), Itu, SP 13300-023, Brazil
ORCID: https://orcid.org/0000-0002-7647-9092
Explor Musculoskeletal Dis. 2026;4:1007122 DOI: https://doi.org/10.37349/emd.2026.1007122
Received: November 13, 2025 Accepted: March 16, 2026 Published: April 01, 2026
Academic Editor: Philippe Gorce, University of Toulon, France
The article belongs to the special issue Prevalence and Risk Factors of Work-related Musculoskeletal Disorders
Work-related musculoskeletal disorders (WMSDs) are a major occupational health burden, yet early functional alterations are often difficult to capture with symptom-only screening or with predominantly structural imaging. Infrared thermography (IRT) provides a noncontact, nonionizing, physiologically grounded readout of superficial heat exchange that is strongly influenced by microperfusion and autonomic vasomotor control. We conducted an integrative narrative review with traceable study-level compilation to synthesize physiological foundations for thermal-signal interpretation, minimum requirements for acquisition standardization and Quality Control, and occupational applications for screening, risk characterization, and longitudinal monitoring, including multimodal integration. The final included corpus comprised 247 studies spanning diverse designs and contexts, with substantial heterogeneity in devices, regions of interest (ROIs), environmental conditions, thermal metrics, and reporting completeness. Across the evidence, interpretability was consistently dependent on protocol stability and on ROI-based, within-subject metrics [bilateral asymmetry, task-induced temperature difference (ΔT), and recovery dynamics] rather than isolated absolute thresholds. Occupational applications have most often targeted repetitive upper-limb demands, computer-based work, and cold challenge/rewarming paradigms in vibration-exposed populations. We provide an operational checklist aligned with guideline recommendations and propose a pragmatic multimodal workflow integrating IRT with functional measures [surface electromyography (sEMG), strength], structural/perfusion modalities (ultrasonography), and patient-reported outcomes. Future priorities include multicenter harmonization, occupation- and task-specific reference profiles, and prospective validation of decision rules under real-world conditions.
Work-related musculoskeletal disorders (WMSDs) represent a major occupational health problem, associated with pain, functional limitation, and productivity losses in economically active populations [1, 2]. From a clinical-epidemiological standpoint, World Health Organization (WHO) estimates indicate that approximately 1.71 billion people live with musculoskeletal (MSK) conditions that impair mobility, productivity, and well-being, underscoring the urgency and scientific relevance of this topic [3, 4]. In occupational settings, the occurrence and severity of WMSDs reflect a multifactorial exposure matrix, in which biomechanical demands and organizational and work-process determinants modulate risk and outcomes. This supports the emphasis on identifying incidence and risk factors as a necessary step to design and implement preventive interventions [5–7].
Within the current occupational health care model, identifying early functional alterations is often challenging, since initial surveillance typically relies on screening instruments and self-report measures whose effectiveness and standardization vary across occupational contexts and applications [8–10]. Pain-based measures and intensity scales are inherently subjective and may exhibit measurement variability across different instruments and populations, which limits longitudinal comparability and reduces precision for detecting subtle changes at the onset of symptoms [11–13]. Clinical examination, although indispensable, may underestimate subclinical inflammation/activity or discrete changes when compared with complementary imaging and functional assessment methods, highlighting limitations for early characterization in low-clinical-expression scenarios [10, 14]. Predominantly anatomic imaging modalities, such as magnetic resonance imaging (MRI) and ultrasonography (US), provide excellent structural/anatomic characterization; however, they do not always, on their own, capture early physiological dysfunction and the functional dynamics associated with occupational overload [15–17]. In addition, objective functional measures such as surface electromyography (sEMG) require technical rigor in acquisition and preprocessing and can be sensitive to collection conditions, which complicates their routine implementation in large-scale occupational surveillance and screening programs [18].
Infrared thermography (IRT) records the infrared (IR) radiation emitted by the skin and estimates surface temperature as an integrative outcome of the balance between tissue heat production and heat dissipation via convection, conduction, and evaporation, processes that are strongly modulated by cutaneous blood flow [19, 20]. Under physiological conditions, skin temperature is determined primarily by microperfusion and autonomic thermoregulatory mechanisms that adjust vasomotor tone and redistribute blood flow in response to local and systemic stressors [20, 21]. Experimental human studies show that small nerve fibers (Aδ/C) participate in the regulation of cutaneous blood flow during local thermal perturbations, enabling quantitative modeling of the temperature-perfusion relationship under heating/cooling protocols [21]. In addition, dynamic assessments with local heating indicate that axon-reflex-mediated vasodilation is detectable by thermography and exhibits measurable reproducibility, supporting the notion that the thermal “time profile” can function as a physiological proxy parameter for vasodilatory function and small-fiber integrity [22]. Direct comparisons between thermography and microvascular methods, such as laser Doppler, demonstrate that temperature and perfusion are not equivalent, but are related through physiological mechanisms (including vasomotor reactivity and thermoregulation), which requires interpreting the thermal signal within the clinical-functional context and under controlled acquisition conditions [23].
From a neurohumoral perspective, the skin is a target organ where sympathetic vasoconstrictor activity, axon reflexes, inflammatory mediators, and nociceptive modulation converge; thus, disturbances along these axes may manifest as regional thermal patterns [asymmetry, gradients, and task-induced temperature difference (ΔT)] [20, 24–27]. In pain syndromes characterized by autonomic and vasomotor dysfunction, clinical and pathophysiological studies have demonstrated relationships between sympathetic vasoconstrictor activity and pain/hyperalgesia, supporting the plausibility of thermal “signatures” when peripheral autonomic-vascular balance is disrupted [24, 25, 28]. Under repetitive occupational overload, early mechanisms often involve neurovascular and neuroinflammatory responses (changes in microperfusion and vasomotor reactivity) that may precede structural alterations detectable by predominantly anatomic methods, making IRT conceptually well suited to complement clinical assessment in early phases [1, 2, 13, 14]. In parallel, central and peripheral control of cutaneous blood flow and interindividual variability (conditioning status, sex, age, inflammatory state, and/or vasoactive substances) underscore that the value of IRT depends on standardization and on an interpretive model capable of distinguishing expected physiology from potentially pathological patterns [20, 22, 29].
In occupational settings, the relevance of a physiologically grounded rationale translates into applications aimed at identifying early functional alterations in risk scenarios (repetitive tasks and sustained postures) before established disability distorts the natural course and responsiveness to interventions [9, 26, 30]. Studies in office workers have used IRT to measure skin temperature on the dorsum of the hand and to examine its relationship with upper-limb MSK outcomes, suggesting that thermal measures may capture functional components associated with symptoms and severity [9, 26]. In parallel, rapid thermography-based screening proposals for assessing risk arising from repetitive upper-limb actions seek to operationalize this principle for occupational surveillance and prevention, provided that protocols reduce environmental and technical variability [30, 31]. This methodological requirement is consistent with recommendations and guidelines emphasizing control of ambient conditions, acclimatization, distance/angle, emissivity settings, calibration, and standardized regions of interest (ROIs), since these factors determine the physiological validity of the thermal signal and its longitudinal comparability [32, 33].
Despite the growing clinical and occupational applications of IRT, the literature still shows substantial heterogeneity in protocols, environmental conditions, thermal metrics, and interpretive criteria, which limits comparability across studies and hinders translation into standardized occupational surveillance [19, 31, 34–37]. In occupational contexts—including repetitive work and office-based activities—evidence supports the applicability of IRT for characterizing thermal patterns associated with overload and for proposing rapid risk-screening approaches; however, operational variability and the lack of a consistent interpretive model across tasks and populations remain prominent limitations [9, 30, 38]. In addition, reviews of screening tools and methods for assessing biomechanical workplace exposures report variability in effectiveness and methodological performance, reinforcing the need for more robust and transparent workflows when the goal is to inform large-scale preventive decision-making [8, 39, 40]. Finally, although established methods such as US and sEMG provide relevant structural and functional information, it remains insufficiently specified—at an operational and reproducible level—how to integrate thermal (physiological) findings with functional/structural measures and patient-reported outcomes (PROs) to support clinical and occupational screening and monitoring decisions [36, 37, 41].
To delineate the utility of IRT, under standardized protocols, as a complementary tool for screening, risk assessment, and monitoring of WMSDs in occupational health, including its application within multimodal interpretation models.
Describe the physiological rationale (microperfusion, autonomic modulation, and vasomotor reactivity) supporting IRT as a functional marker in occupational overload.
Establish minimum standardization requirements (environmental control, acclimatization, emissivity, calibration, ROI definition, and documentation) necessary to ensure reproducibility and comparability of the thermal signal.
Map occupational applications of IRT by task/activity and body region for screening, risk stratification, and follow-up, highlighting limitations, confounders, and divergent evidence.
Present a multimodal workflow for integrating IRT with functional/structural measures (sEMG, strength, and US) and PROs, including decision points and quality assumptions.
Accordingly, this review examines whether IRT, when acquired under standardized protocols, can support occupational screening/risk assessment and intervention monitoring for WMSDs, either as a stand-alone tool or within multimodal workflows (sEMG, strength testing, US, and PROs).
This study was designed as an integrative narrative review with a systematized and traceable compilation of evidence on the use of IRT in WMSDs. A narrative approach was selected to enable critical integration and applied interpretation across three domains: physiological foundations relevant to interpreting the thermal signal; standardization and quality-control requirements; and occupational/ergonomic scenarios. The study identification and selection process was kept sufficiently transparent for methodological audit. Manuscript preparation followed SANRA (Scale for the Assessment of Narrative Review Articles) principles [42], emphasizing an explicit rationale, a reproducibly described search, a consistent logical structure, and balanced use of directly relevant evidence. The identification, screening, and eligibility process was documented using Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 [43] with the specific purpose of recording the selection pathway; this flow documentation does not, in itself, imply quantitative synthesis. Search strategies were reported in a format compatible with PRISMA-S (the PRISMA extension for reporting literature searches) [44], with full search strings and parameters provided in the Supplementary material.
The review question was structured using the Population-Concept-Context (PCC) framework [45] to explicitly define scope and eligibility within an integrative narrative review with traceable study selection. The guiding question was:
In workers exposed to occupational risk factors for WMSDs, to what extent can IRT—when acquired under standardized protocols—support screening and risk assessment/stratification, as well as monitoring of interventions, either as a stand-alone measure or integrated with functional/structural measures and patient-reported outcomes?
For the P (Population) component, we considered workers and populations in occupational settings (repetitive activities, intensive computer use, industrial sectors, and services), including asymptomatic individuals at risk, symptomatic individuals, and those diagnosed with WMSDs. For the C (Concept) component, the focus was on IRT applied to MSK conditions/WMSDs for screening, risk assessment, and monitoring, including studies employing thermal metrics and/or static/dynamic protocols under controlled conditions. For the C (Context) component, the review was delimited to occupational health and ergonomics applications, including monitoring of interventions (e.g., ergonomic measures, active breaks, and rehabilitation) and, when available, integration with complementary modalities (sEMG, strength/dynamometry, and US with or without Doppler) and PROs.
To maximize coverage and sensitivity, international and regional bibliographic databases with complementary scope were searched: PubMed/MEDLINE (Medical Literature Analysis and Retrieval System Online), Embase (Elsevier), Scopus, Web of Science Core Collection, and SciELO. In addition, technical reference documents and consensus statements focused on standardization and quality assurance in clinical thermography were included because they are directly pertinent to reproducibility and the operational validity of thermal-signal acquisition and interpretation in neuromusculoskeletal (NMSK) applications, including the International Academy of Clinical Thermology (IACT) Quality Assurance Guidelines (2015) and the American Academy of Thermology (AAT) NMSK Thermography Guidelines (2024) [32, 33]. As an additional identification method, snowballing (citation tracking) was performed, including systematic screening of reference lists (backward citation searching) from eligible studies and key reviews/guidelines, as well as forward citation searching in citation-indexed databases (Scopus/Web of Science) [46–48].
Bibliographic searches were conducted on February 12, 2026, with no lower date limit and with pre-specified language eligibility for English, Portuguese, and Spanish. Strategies combined controlled vocabulary (MeSH–Emtree) and free-text terms, organized into four conceptual blocks:
infrared thermography (IRT);
MSK conditions/WMSDs;
occupational/ergonomics context; and,
screening, surveillance, risk assessment, and monitoring/interventions.
Searches were run in PubMed/MEDLINE, Embase, Scopus, Web of Science Core Collection, and SciELO, using platform-specific syntax and fields compatible with each search engine (ti/ab/kw; TITLE-ABS-KEY; TS). In PubMed/MEDLINE, the strategy was operationalized in three complementary sets (CORE; IRT multimodality with sEMG/strength/US; and standardization/protocols). To reduce document noise without compromising sensitivity, database-specific filters and exclusions were applied: in PubMed/MEDLINE, animal-only records were excluded using the NOT operator (animals[MeSH] NOT humans[MeSH]), while preserving records indexed with humans; in Embase, limits were applied to humans and conference publication types were excluded (conference abstract, conference paper, and conference review); in SciELO, Portuguese and Spanish strategies were used with semantic equivalents of the conceptual blocks.
Identification was supplemented by snowballing, and records retrieved through this method were subjected to the same import, normalization, and deduplication workflow prior to screening, and were counted in the PRISMA 2020 [43] flow diagram as records identified from other sources. Full, verbatim search strings—by database and by set (including fields, operators, filters, and original syntax)—were preserved and made available in Supplementary material. All retrieved records were exported in raw format (plain text or .csv when available) and integrated into a deterministic R-based pipeline [49] prior to screening and eligibility steps, preventing double-counting and ensuring consistency with the PRISMA 2020 flow diagram.
All records retrieved from electronic searches and additional methods (including snowballing) were processed through a deterministic pipeline designed to ensure provenance, traceability, reproducibility, and Quality Control (QC) prior to screening. Outputs from each source were preserved as immutable raw files, maintaining fidelity to the format and content exported by each platform. Input integrity was documented via a cryptographic manifest containing, for each file, its metadata (name/relative path, size, and timestamp) and a Secure Hash Algorithm 256-bit (SHA-256) digest, enabling audit of changes and a verifiable chain linking inputs to intermediate workflow products.
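The manifest step can be illustrated with a minimal sketch. The authors' pipeline is R-based and is not reproduced here; the Python below is an illustration only, and the function names (`sha256_of_file`, `build_manifest`, `changed_files`) and manifest column names are assumptions introduced for this example.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(raw_dir):
    """List every file under raw_dir with relative path, size, timestamp, and hash.

    Column names are hypothetical; the source only states which metadata
    the manifest contains, not how the columns are named.
    """
    raw_dir = Path(raw_dir)
    rows = []
    for path in sorted(p for p in raw_dir.rglob("*") if p.is_file()):
        stat = path.stat()
        rows.append({
            "relative_path": path.relative_to(raw_dir).as_posix(),
            "size_bytes": stat.st_size,
            "modified_utc": datetime.fromtimestamp(
                stat.st_mtime, tz=timezone.utc).isoformat(),
            "sha256": sha256_of_file(path),
        })
    return rows


def changed_files(raw_dir, manifest):
    """Return relative paths that are missing or whose hash no longer matches."""
    raw_dir = Path(raw_dir)
    changed = []
    for row in manifest:
        path = raw_dir / row["relative_path"]
        if not path.is_file() or sha256_of_file(path) != row["sha256"]:
            changed.append(row["relative_path"])
    return changed
```

Re-running `changed_files` against a stored manifest is one simple way to detect whether a raw export was altered after import; sorting the paths keeps the manifest order stable across runs.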
Import and parsing were performed using source-specific routines to accommodate heterogeneity in structures and fields. Each record was assigned explicit provenance fields (source/database, input file, and import position/line), preserving intra- and inter-source traceability. Next, canonical normalization of identifiers and bibliographic metadata [DOI (Digital Object Identifier), PMID (PubMed identifier), title, year, and journal] was applied, including syntactic standardization (removal of DOI prefixes/URLs when present, normalization of case and whitespace, harmonization of types/formats, and consistent handling of characters), to maximize matching sensitivity without inducing spurious merges. Records with missingness in critical fields were flagged for review (QC file), and, when strictly necessary for identifiability, targeted corrections to critical metadata could be made after verification against the original record, while maintaining traceability and consistency of the analytic dataset.
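As an illustration of the syntactic standardization described above, the following Python sketch normalizes DOIs and titles into matching keys. It is not the authors' R code: the function names are hypothetical, and the specific choices (lowercasing, stripping accents, collapsing punctuation to single spaces) are assumptions about what "consistent handling of characters" might involve.

```python
import re
import unicodedata

# Matches common DOI prefixes: resolver URLs and a leading "doi:" label.
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi(raw):
    """Strip URL/label prefixes and lowercase; return None when empty."""
    if raw is None:
        return None
    doi = _DOI_PREFIX.sub("", raw.strip()).strip().lower()
    return doi or None


def normalize_title(raw):
    """Lowercase, remove diacritics, and collapse non-alphanumerics to spaces.

    Accent stripping is an assumption made here so that Portuguese/Spanish
    titles exported with and without diacritics still match.
    """
    if raw is None:
        return None
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return text or None
```

For example, `normalize_doi("https://doi.org/10.37349/EMD.2026.1007122")` yields `"10.37349/emd.2026.1007122"`. Returning `None` for empty values keeps missing identifiers distinguishable from matched ones, so they can be flagged for QC review rather than merged.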
Deduplication was conducted hierarchically and deterministically using matching keys with priority given to persistent identifiers: DOI > PMID > normalized title + year > normalized title. Records were grouped into duplicate clusters by key type, and each cluster received a stable identifier derived from a hash. To select the retained record within each cluster, a deterministic retention rule based on bibliographic completeness and quality was applied, combining:
a metadata completeness score (prioritizing identifiers, year, journal, authorship, and abstract when available); and,
a fixed, a priori source hierarchy for tie-breaking (PubMed/MEDLINE > Embase > Web of Science > Scopus > SciELO > snowballing).
If ties persisted, a stable ordering based on provenance fields (source/file/import order) was used, ensuring reproducibility of the retained-record selection.
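The keying and retention rules above can be sketched as follows. This Python example is a simplified illustration, not the authors' R implementation: each record is keyed once by its highest-priority available identifier (a more thorough variant would link records across key types), and the field names, source labels, equal weighting of completeness fields, and 12-character cluster identifier are all assumptions.

```python
import hashlib
from collections import defaultdict

# A priori source hierarchy from the text; the short labels are hypothetical.
SOURCE_RANK = {name: rank for rank, name in enumerate(
    ["pubmed", "embase", "wos", "scopus", "scielo", "snowballing"])}
# Fields counted toward completeness (equal weights are an assumption).
COMPLETENESS_FIELDS = ("doi", "pmid", "year", "journal", "authors", "abstract")


def match_key(rec):
    """Return (key_type, key_value) using DOI > PMID > title+year > title."""
    if rec.get("doi"):
        return ("doi", rec["doi"])
    if rec.get("pmid"):
        return ("pmid", str(rec["pmid"]))
    if rec.get("title") and rec.get("year"):
        return ("title_year", f"{rec['title']}|{rec['year']}")
    if rec.get("title"):
        return ("title", rec["title"])
    return None


def cluster_id(key):
    """Derive a stable cluster identifier from the matching key."""
    return hashlib.sha256(f"{key[0]}:{key[1]}".encode("utf-8")).hexdigest()[:12]


def retention_sort_key(rec):
    """Order: most complete first, then source hierarchy, then provenance."""
    completeness = sum(1 for field in COMPLETENESS_FIELDS if rec.get(field))
    source_rank = SOURCE_RANK.get(rec["source"], len(SOURCE_RANK))
    return (-completeness, source_rank, rec["file"], rec["position"])


def deduplicate(records):
    """Return (retained, removed, unkeyed); removed rows map to the kept record."""
    clusters, unkeyed = defaultdict(list), []
    for rec in records:
        key = match_key(rec)
        if key is None:
            unkeyed.append(rec)  # no usable key: left for manual QC
        else:
            clusters[cluster_id(key)].append(rec)
    retained, removed = [], []
    for cid in sorted(clusters):
        ranked = sorted(clusters[cid], key=retention_sort_key)
        keep = ranked[0]
        retained.append({**keep, "cluster_id": cid})
        for dup in ranked[1:]:
            removed.append({**dup, "cluster_id": cid,
                            "retained_source": keep["source"],
                            "retained_file": keep["file"],
                            "retained_position": keep["position"]})
    return retained, removed, unkeyed
```

Because every sort key ends in provenance fields (file and import position), the retained record is the same on every run regardless of input order, which is the property the tie-breaking rule is meant to guarantee.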
To ensure auditability, the pipeline generated:
a deduplicated master table for screening;
a table of removed records with explicit mapping between each excluded record and the retained record within the cluster; and,
audit tables by cluster and by key type, enabling systematic inspection of deduplication decisions.
This design ensures that each study contributes only once to subsequent stages (screening and eligibility), preventing count inflation and preserving consistency with the PRISMA flow diagram (including the contribution of snowballing).
Study selection was conducted in two stages [TA (Title/Abstract) and Full Text] using the master dataset generated after cross-source integration and deduplication, while preserving traceability from each record to its unique identifier and retrieval source (bibliographic databases vs. snowballing). All decisions were recorded in canonical spreadsheets and processed through deterministic routines, such that selection-flow counts were programmatically derived from the decision files and reported in the PRISMA 2020 flow diagram; final selection numbers are presented in the Results section (Figure 1).

Figure 1. PRISMA 2020 flow diagram of the identification, screening, eligibility, and inclusion process (including the “other methods” pathway for snowballing). Adapted from [43]. © Author(s) (or their employer(s)) 2021. CC BY.
Initial TA screening was performed in duplicate by two independent reviewers (fields REV_1 and REV_2). Decisions were captured in a structured format and binary-coded (include/exclude), with automatic identification of disagreements (ta_conflict) and generation of a preliminary decision (ta_decision) based on a deterministic rule. To reduce the risk of premature exclusion, records with disagreement were conservatively retained at the preliminary stage (conflicts temporarily coded as include for retention) until formal resolution through adjudication in a dedicated worksheet (REV_Decision). Adjudication was mandatory: conflicts without an adjudicator’s decision were listed in an audit file, and the pipeline was halted in strict mode, preventing completion with unresolved conflicts. As a quality-control measure, interrater agreement was summarized using the kappa statistic and exported as an audit artifact; data-entry/labeling inconsistencies and pending adjudications were treated as auditable events (backups, pending-item lists, and hard failures), rather than as silently ignored exceptions.
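The screening rule and agreement summary can be sketched in Python as below. This is an illustration rather than the authors' R code. The text does not specify which kappa variant was used; Cohen's kappa for two raters is assumed here, and the function names are hypothetical.

```python
def preliminary_decision(rev_1, rev_2):
    """Return (ta_conflict, ta_decision) for two include/exclude votes.

    Disagreements are conservatively coded as "include" until adjudicated.
    """
    conflict = rev_1 != rev_2
    decision = "include" if conflict or rev_1 == "include" else "exclude"
    return conflict, decision


def final_decision(rev_1, rev_2, adjudication=None):
    """Resolve a record; raise in strict mode if a conflict lacks adjudication."""
    conflict, decision = preliminary_decision(rev_1, rev_2)
    if not conflict:
        return decision
    if adjudication not in ("include", "exclude"):
        raise ValueError("unresolved conflict: adjudication is required")
    return adjudication


def cohen_kappa(votes_1, votes_2):
    """Cohen's kappa for two raters over the same records (assumed variant)."""
    n = len(votes_1)
    if n == 0 or n != len(votes_2):
        raise ValueError("both raters must vote on the same non-empty set")
    observed = sum(a == b for a, b in zip(votes_1, votes_2)) / n
    labels = set(votes_1) | set(votes_2)
    # Chance agreement: product of each rater's marginal proportion per label.
    expected = sum((votes_1.count(lab) / n) * (votes_2.count(lab) / n)
                   for lab in labels)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

As a worked check, with votes `["include", "include", "exclude", "exclude"]` and `["include", "exclude", "exclude", "exclude"]`, observed agreement is 0.75, chance agreement is 0.5, and kappa is 0.5. Raising an error on unadjudicated conflicts mirrors the hard-failure behavior described above, so a run cannot complete silently with open disagreements.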
Records remaining eligible after TA underwent full-text retrieval, with explicit recording of retrieval status (retrieved vs. not retrieved). Full-text eligibility was determined according to a priori criteria, and exclusions were documented using standardized reasons from a controlled vocabulary, supplemented by notes when necessary. Logical validation routines were applied to prevent inconsistent combinations (e.g., disallowing “include” when fulltext_retrieved = no), thereby reinforcing internal integrity of the final decision set.
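A logical validation routine of this kind might look like the following Python sketch (the authors' pipeline is R-based; this is not their code). Only `fulltext_retrieved` is a field name taken from the text; `ft_decision`, `exclusion_reason`, and the function name are hypothetical, and the reason vocabulary is passed in because the source names only two of its entries (out_of_scope and duplicate).

```python
def validate_fulltext_row(row, allowed_reasons):
    """Return a list of consistency errors for one full-text decision row."""
    errors = []
    retrieved = row.get("fulltext_retrieved")
    decision = row.get("ft_decision")  # hypothetical field name
    if retrieved not in ("yes", "no"):
        errors.append("fulltext_retrieved must be 'yes' or 'no'")
    # A report that was never retrieved cannot be included.
    if decision == "include" and retrieved == "no":
        errors.append("'include' is not allowed when fulltext_retrieved = no")
    # Every exclusion needs a reason from the controlled vocabulary.
    if decision == "exclude" and row.get("exclusion_reason") not in allowed_reasons:
        errors.append("exclusion requires a standardized reason")
    return errors


def validate_all(rows, allowed_reasons):
    """Map row index to its errors, keeping only rows that fail validation."""
    report = {}
    for index, row in enumerate(rows):
        errors = validate_fulltext_row(row, allowed_reasons)
        if errors:
            report[index] = errors
    return report
```

Returning the full list of errors per row, rather than stopping at the first, makes it easier to export an audit file of all inconsistencies in one pass.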
Data extraction was performed from the included full-text reports using a canonical spreadsheet with pre-specified fields and an associated data dictionary. For each included study, the domains and variables listed in Table 1 were extracted.
Table 1. Domains and variables used for data extraction.
| Domain | Variables/Description |
|---|---|
| Study identification and bibliographic metadata | Study identifier (study_id), title, abstract, year, DOI, PMID, journal, and authors (as indexed). |
| Source and provenance | Retrieval source (source_db), contributing sources within cluster (sources), file provenance (files), and snowball presence flag (has_snowball). |
| Study design | Study design and analytic design (e.g., observational/experimental; cross-sectional/longitudinal; presence of comparator and/or follow-up) (study_design). |
| Population and occupational context | Population and occupational context (occupation/sector, predominant task/exposure, and inclusion/exclusion criteria when reported) (population_occupation). |
| Anatomic region/ROI | Anatomic region assessed and region-of-interest definition/segmentation, including laterality when reported (body_region). |
| Conditions/WMSD focus | Target condition, case definition, or WMSD-related construct assessed (condition_wmsd). |
| Comparators/reference measures | Comparator and/or reference assessments used (e.g., clinical examination, questionnaires, other imaging or physiologic measures) (comparator_reference). |
| IRT device characteristics | IRT device model/specifications when reported (irt_device). |
| IRT acquisition protocol | Key acquisition/protocol elements (environmental control, acclimatization, distance/angle, emissivity/calibration, static/dynamic procedures, pre/post task) (protocol_key_points). |
| Multimodality | Complementary modalities integrated with IRT (sEMG, strength/dynamometry, ultrasonography with/without Doppler, PROs) (multimodal_components). |
| Outcomes and measures | Thermal and non-thermal outcomes/measures extracted (e.g., absolute temperature, asymmetry, task-induced ΔT, gradients, recovery kinetics; statistical endpoints when reported) (outcomes_measures). |
| Key findings and limitations | Summary of principal results (direction/magnitude) and reported limitations, bias/confounding considerations, and reproducibility/QC notes (key_findings; limitations_bias_notes). |
PROs: patient-reported outcomes; ROI: regions of interest; sEMG: surface electromyography; WMSD: work-related musculoskeletal disorder; ΔT: temperature difference; DOI: Digital Object Identifier; PMID: PubMed identifier; QC: Quality Control.
Missingness was handled using explicit, non-overlapping codes. Fields not reported in the source article were coded as not reported (NR), with no imputation or inference, thereby preserving the distinction between absence of evidence and evidence of absence. Fields that were not applicable (NA) by study design or context (i.e., the variable could not logically be collected or did not apply) were coded as NA. Information that was sparsely and inconsistently described across the corpus (adverse events/intercurrent events) was captured only when explicitly stated in the full text, but it was not operationalized as a structured domain for cross-study synthesis.
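The two missingness codes can be expressed as a small coding rule. The Python sketch below is illustrative only; the function name and the `applicable` flag are assumptions, and the judgment of whether a field applies to a given study remains a reviewer decision rather than something the code infers.

```python
def code_field(value, applicable=True):
    """Code an extracted field as NA, NR, or its trimmed reported value.

    NA: the variable does not apply to this study design or context.
    NR: the variable applies but was not reported; nothing is imputed.
    """
    if not applicable:
        return "NA"
    if value is None or not str(value).strip():
        return "NR"
    return str(value).strip()
```

Checking applicability before missingness keeps the codes non-overlapping: a field that cannot logically be collected is always NA, even if the cell is empty.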
Critical appraisal in this review was conducted at two complementary levels:
reporting quality and editorial consistency of the manuscript; and,
operational reproducibility of the primary evidence, with emphasis on elements that determine the validity of IRT applications in MSK and occupational contexts.
At the manuscript level, SANRA was used as an editorial checklist for narrative reviews. Because this review is an integrative narrative review with a systematized and traceable evidence compilation, without quantitative synthesis, a formal risk-of-bias instrument with high/low judgments by domain was not applied across all studies. Instead, a descriptive appraisal oriented toward reproducibility and plausible sources of systematic variation in the thermal signal was adopted, focusing on the practical applicability of IRT in WMSDs.
At the primary-evidence level, descriptive appraisal of studies prioritized in the synthesis (occupational/ergonomics, standardization, and/or multimodal studies) was guided by pre-specified criteria directly linked to IRT validity and interpretability, including:
reported environmental control and acclimatization;
sample size and implications for estimate stability;
presence of objective comparators and operational consistency across measures; and,
explicit quality-control and reproducibility procedures (device calibration/checks, ROI standardization, repetition/test-retest, and blinding of reading/analysis when applicable).
This appraisal informed the interpretive weight assigned to findings in the compilation, without aggregated scoring and without automatic exclusion based solely on reporting limitations.
QC of the corpus and consistency between decisions and counts were ensured by the deterministic R pipeline, with explicit normalization of identifiers and fields, hierarchical deduplication, canonical decision files (TA and full text), and automated consistency validations (preventing logically invalid combinations between full-text retrieval status and eligibility decision; requiring standardized reasons for exclusions; and recomputing PRISMA counts from the decision spreadsheets). With respect to blinding, reviewers were not blinded to authors, journals, or affiliations; mitigation focused on structured screening, explicit recording of decisions/reasons, and deterministic consistency checks.
Synthesis of the included studies was conducted through thematic integration, structured a priori into axes aligned with the PCC framework and the study objectives, while preserving linkages among IRT acquisition conditions and technical variables, the thermal metrics used, and the occupational/ergonomic context (exposure, task, anatomic region, and screening/monitoring purpose). Findings were organized into four main axes:
Axis A: physiological foundations and interpretation of the thermal signal;
Axis B: standardization and QC;
Axis C: occupational health/ergonomics applications; and,
Axis D: multimodality, describing the rationale and operational sequencing for integration with functional/structural measures and PROs.
A meta-analysis was not planned because the research question and scope encompass study designs, populations/occupations, anatomic regions, acquisition protocols, and thermal metrics that are often heterogeneous and not directly comparable under a single clinical/methodological estimator. The thematic strategy was adopted as the core approach to enable applied interpretation while maintaining traceability between inferences and source studies.
To ensure audit-ready transparency and end-to-end traceability, all workflow stages (searching, import/parsing, normalization, deduplication, screening, and eligibility) were implemented deterministically, with preservation of canonical artifacts and an audit trail. Complete, verbatim search strategies by database and by set were provided in Supplementary material in a PRISMA-S-compatible format. PRISMA flow counts and summaries of full-text exclusion reasons were automatically recomputed from the canonical decision files, avoiding manual transcription of numbers; final values are presented in the Results. Scripts are available as Supplementary material via a repository.
A total of 1,128 records were identified, including 1,083 from bibliographic databases and 45 from other methods (snowballing). Deterministic deduplication removed 66 records, leaving 1,062 for TA screening. At this stage, 761 records were excluded, and 301 reports were sought for retrieval. Of these, 27 reports could not be retrieved (i.e., were unavailable for full-text evaluation), leaving 274 full-text reports assessed for eligibility. A further 27 reports were excluded after full-text review, using standardized exclusion reasons; out_of_scope was the predominant reason (n = 25), followed by duplicate (n = 2). Ultimately, 247 studies were included in the synthesis (Figure 1).
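The internal consistency of these counts can be checked with simple arithmetic. The Python sketch below recomputes each stage from the totals reported in this paragraph; it is a verification aid written for illustration (the function name is hypothetical) and does not reproduce the authors' R routine, which derives the counts from the decision files themselves.

```python
def prisma_flow(db_records, other_records, duplicates, ta_excluded,
                not_retrieved, ft_excluded_by_reason):
    """Recompute PRISMA stage counts from stage-level totals."""
    identified = db_records + other_records
    screened = identified - duplicates
    sought = screened - ta_excluded
    assessed = sought - not_retrieved
    ft_excluded = sum(ft_excluded_by_reason.values())
    included = assessed - ft_excluded
    return {
        "identified": identified,
        "screened": screened,
        "sought_for_retrieval": sought,
        "assessed_full_text": assessed,
        "excluded_full_text": ft_excluded,
        "included": included,
    }


# Totals as reported in the text above.
flow = prisma_flow(
    db_records=1083,
    other_records=45,
    duplicates=66,
    ta_excluded=761,
    not_retrieved=27,
    ft_excluded_by_reason={"out_of_scope": 25, "duplicate": 2},
)
```

With the reported inputs, `flow` evaluates to 1,128 identified, 1,062 screened, 301 sought for retrieval, 274 assessed, 27 excluded at full text, and 247 included, matching the values stated above.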
This section provides a traceable, study-level descriptive profile of the final included corpus (n = 247), with emphasis on:
a) population/context;
b) anatomical region/ROI;
c) study design;
d) key elements of the IRT acquisition protocol;
e) comparators and/or multimodal components; and,
f) outcomes.
All variables reported below were derived directly from the master data extraction spreadsheet (Supplementary material), built from full-text data extraction. Items not explicitly reported in the source material were coded as NR rather than inferred, preserving the distinction between non-reporting and absence of effect.
The included studies spanned the period from 1978 to 2026, with a median publication year of 2020 (interquartile range 2012–2023). Persistent bibliographic identifiers were available for most records, with a DOI present in 233/247 (94.3%) and a PMID present in 209/247 (84.6%), supporting traceable linkage between narrative claims and their corresponding primary sources.
Consistent with the deterministic deduplication-and-retention rule, the source of the retained “representative” record for included clusters was predominantly PubMed (n = 187), followed by snowballing (n = 26), Embase (n = 23), Scopus (n = 6), Web of Science (n = 4), and SciELO (n = 1).
Importantly, when snowballing was evaluated as a contributor to identification at the cluster level (i.e., present among contributing sources regardless of which record was retained), 41/247 clusters (16.6%) included snowballing as a contributing identification method. This distinction avoids conflating record provenance (retained sources) with identification pathways (contributing sources).
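The difference between the retained source and the contributing sources can be made concrete with a minimal Python sketch. The priority order, the helper name `retained_source`, and the toy clusters below are assumptions made for illustration; the review's actual retention rule is the one documented in its Supplementary material.

```python
from collections import Counter

# Hypothetical priority order, used only for this illustration.
SOURCE_PRIORITY = ["PubMed", "Embase", "Scopus", "Web of Science",
                   "SciELO", "snowballing"]


def retained_source(contributing):
    """Pick one representative source per cluster using a fixed priority order."""
    return min(contributing, key=SOURCE_PRIORITY.index)


# Toy clusters: each set lists every source that identified the same study.
clusters = [
    {"PubMed", "Embase"},
    {"PubMed", "snowballing"},
    {"snowballing"},
    {"Scopus", "Web of Science"},
]

# Record provenance: one retained source per cluster.
retained = Counter(retained_source(c) for c in clusters)

# Identification pathways: every source that contributed to any cluster.
contributed = Counter(source for c in clusters for source in c)

print("retained:", dict(retained))
print("contributed:", dict(contributed))
```

In this toy data, snowballing is the retained source for one cluster but a contributing source for two, which mirrors why the two tallies are reported separately.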
Study contexts were heterogeneous, reflecting the integrative scope of this review. Based on the setting descriptors recorded in the master data extraction spreadsheet (Supplementary material), studies were most frequently conducted in clinical settings (85/247; 34.4%) or laboratory settings (60/247; 24.3%), while workplace-based studies accounted for 15/247 (6.1%). Secondary evidence and guidance documents (reviews, syntheses, and standards/guidance-oriented publications) represented 75/247 (30.4%). The remaining studies were coded as other/unclear settings (12/247; 4.9%).
With respect to anatomical targets, ROI descriptions allowed studies to be categorized by primary region: upper limb (46/247; 18.6%), lower limb (44/247; 17.8%), trunk/spine (25/247; 10.1%), neck/shoulder (15/247; 6.1%), head/face (28/247; 11.3%), multiple/systemic (46/247; 18.6%), other/undefined (33/247; 13.4%), and NR (10/247; 4.0%). This distribution underscores the need to interpret thermal metrics in light of regional physiology, exposure patterns, and protocol comparability, all of which were explicitly tracked in the master data extraction spreadsheet (protocol fields and ROI descriptors).
Extracted study designs demonstrated substantial heterogeneity. When grouped into broad descriptive categories, the corpus included systematic reviews (n = 25; 10.1%), narrative reviews (n = 31; 12.6%), and guideline/consensus/standards-oriented documents (n = 7; 2.8%), in addition to multiple primary-study designs, including experimental/laboratory studies (n = 41; 16.6%), randomized trials (n = 19; 7.7%), cross-sectional studies (n = 34; 13.8%), case-control studies (n = 15; 6.1%), and other observational and/or validation formats (n = 75; 30.4%) (master data extraction spreadsheet; study_design field).
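Because the design categories above are presented as mutually exclusive, their counts should sum to the corpus size and the percentages should follow from the counts. A minimal Python check, using the reported counts with shortened, illustrative category labels, might look like this:

```python
TOTAL = 247

# Counts as reported in the text; labels are shortened for illustration.
study_design_counts = {
    "systematic review": 25,
    "narrative review": 31,
    "guideline/consensus/standards": 7,
    "experimental/laboratory": 41,
    "randomized trial": 19,
    "cross-sectional": 34,
    "case-control": 15,
    "other observational/validation": 75,
}


def percent(n, total=TOTAL):
    """Percentage of the corpus, rounded to one decimal place."""
    return round(100 * n / total, 1)


category_sum = sum(study_design_counts.values())

for label, n in study_design_counts.items():
    print(f"{label}: {n}/{TOTAL} ({percent(n)}%)")
print("sum of categories:", category_sum)
```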
Comparator/reference methods and multimodal components were variably reported. Among included studies, data extraction indicated integration with ultrasound (with or without Doppler) in 39/247 (15.8%), sEMG in 23/247 (9.3%), dynamometry/strength testing (DYNA) in 11/247 (4.5%), and nerve conduction testing/electrodiagnostic assessments in 14/247 (5.7%); MRI/computed tomography (CT) was referenced in 28/247 (11.3%). PROs, including validated instruments and rating scales, were also frequently captured in the extracted outcome fields (92/247; 37.2%), with visual analog scale (VAS) specifically recorded in 30/247 (12.1%). Collectively, these distributions support the treatment of multimodality as a distinct descriptive axis within the corpus, while also indicating substantial variability in how comparator structures, complementary modalities, and outcome layers were implemented and reported across studies. This variability was evident not only in the choice of complementary measures, but also in their timing and analytical role relative to the thermal signal, limiting direct comparability of integrated protocols across the corpus.
Within the included corpus, IRT is used as a functional readout of cutaneous heat exchange, in which surface temperature reflects the dynamic balance between heat production and heat dissipation, under the strong influence of microperfusion and autonomic vasomotor control [33, 50, 51]. Consequently, interpretation of thermal findings in WMSDs tends to rely more on spatial and temporal patterns within ROIs than on isolated absolute values, provided that acquisition occurs under controlled and well-documented conditions [33, 52, 53].
A recurrent interpretive axis can be described by the operational contrast of hot versus cold [10, 40, 54]. Patterns of regional temperature elevation are generally discussed as compatible with increased superficial perfusion and/or inflammatory processes in more superficial tissues, whereas patterns of regional temperature reduction are discussed as compatible with vasoconstriction, hypoperfusion, and/or autonomic vasomotor dysfunction [10, 40, 54]. The corpus indicates that these patterns should not be treated as mutually exclusive categories, because they may coexist in complex presentations, particularly when occupational exposures combine sustained static load, repetition, and stress-related psychophysiological factors [9, 33].
Three operational characteristics derived from the evidence inform interpretation of the thermal signal:
a) bilateral asymmetry in homologous ROIs;
b) task- or stimulus-induced variation relative to baseline (ΔT); and,
c) temporal recovery dynamics following standardized tasks or provocations [38, 54, 55].
In occupationally focused applications, these characteristics are used predominantly to describe functional deviations associated with exposure and demand, rather than to replace clinical assessment or to infer structural changes in isolation [9, 33, 56]. This framing is consistent with the frequently multifactorial nature of WMSDs and with the need for contextualization by task, anatomical segment, and protocol comparability [9, 13, 33].
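The three characteristics listed above reduce to simple within-subject arithmetic on mean ROI temperatures. The following Python sketch shows one possible formulation. The function names, the sign convention, and the temperature values are assumptions made for illustration, and the recovery fraction shown here is one simple descriptor rather than a metric prescribed by the corpus.

```python
def bilateral_asymmetry(roi_c, contralateral_roi_c):
    """Signed difference between homologous ROIs, in degrees Celsius."""
    return roi_c - contralateral_roi_c


def delta_t(post_c, baseline_c):
    """Task- or stimulus-induced change relative to baseline, in degrees Celsius."""
    return post_c - baseline_c


def recovery_fraction(baseline_c, post_c, recovery_c):
    """Share of the induced change that has been reversed at a recovery time point."""
    change = post_c - baseline_c
    if change == 0:
        raise ValueError("no task-induced change to recover from")
    return (post_c - recovery_c) / change


# Illustrative mean ROI temperatures in degrees Celsius (made-up values).
right = {"baseline": 32.0, "post_task": 33.0, "recovery": 32.5}
left = {"baseline": 31.5, "post_task": 31.5, "recovery": 31.5}

asym_baseline = bilateral_asymmetry(right["baseline"], left["baseline"])
asym_post = bilateral_asymmetry(right["post_task"], left["post_task"])
dt_right = delta_t(right["post_task"], right["baseline"])
recovered = recovery_fraction(right["baseline"], right["post_task"],
                              right["recovery"])

print(asym_baseline, asym_post, dt_right, recovered)
```

All three quantities compare a participant with themselves (side to side, or over time), which is why they are less sensitive to between-subject differences in absolute skin temperature.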
The thematic compilation indicates that interpretability of the IRT thermal signal, especially for between-subject comparisons and longitudinal monitoring, depends critically on standardization and QC [33, 52, 53] (Table 2). Because cutaneous temperature is sensitive to environmental variables and multiple behavioral confounders, gaps in standardization can compromise the physiological attribution of observed thermal differences [33, 52, 53].
Table 2. Standardized environmental and laboratory conditions for clinical infrared thermography (IACT, 2015) [32, 33].
| Parameter | Requirement |
|---|---|
| Environment and laboratory | |
| Ambient temperature | The room should be maintained between 18°C and 23°C. |
| Temperature stability | The room temperature should not vary by more than 1.0°C during the examination. |
| Humidity control | Humidity should be controlled to avoid moisture accumulation on the skin, perspiration, or vapor. |
| Airflow | The room should be free of drafts. Doors and windows should be sealed. |
| Infrared (IR) radiation sources | Windows should be covered or shielded to prevent entry of external IR radiation. Heating or air-conditioning ducts should be kept away from the patient or turned off. |
| Lighting | Incandescent lighting (which produces IR radiation) should not be used during the examination. Standard fluorescent lighting is adequate. |
| Patient preparation | |
| Acclimatization (equilibration) | The patient should undergo a minimum acclimatization period of 15 minutes in the examination room environment. |
| Skin exposure | The area to be imaged must remain completely uncovered by clothing or jewelry. For breast examinations, the breasts must remain uncovered throughout the entire acclimatization and imaging period. |
| Sun exposure | The area to be analyzed should not be exposed to sunbathing/tanning in the 5 days prior to the examination. |
| Topical products | Do not use lotions, oils, creams, powders, makeup, deodorants, or antiperspirants (for trunk/breast examinations) on the day of the exam. |
| Physical stimuli | Do not undergo physical therapy, EMS, TENS, ultrasound, acupuncture, chiropractic treatment, sauna, or use hot/cold packs within 24 hours before the examination. |
| Physical exercise | Do not exercise on the day of the examination. |
| Bathing | If showering, it must be at least 1 hour before the exam. Immersion baths (bathtub) are prohibited in the 24 hours prior to the exam. |
| Minimum equipment requirements | |
| Spectral range | Detector response must be greater than 5 µm and less than 15 µm, encompassing the 8–10 µm region. |
| Emissivity setting | Emissivity must be set to 0.98 (human skin). |
| Absolute resolution | At least 19,200 temperature measurement points per image frame. |
| Spatial resolution | 1 mm² at 40 cm (2.5 mrad IFOV). |
| Thermal sensitivity (NETD) | Less than 80 mK. |
| Precision (repeatability) | Ability to detect a temperature difference of 0.1°C. |
| Absolute accuracy | ± 2°C or ± 2% of the reading, whichever is smaller. |
| Image acquisition protocol | |
| Personnel qualification | Image acquisition should be performed only by personnel credentialed as a certified clinical thermography technician or certified clinical thermologist. |
| Positioning (angle) | The detector(s) should be as perpendicular as possible to the surface being imaged. |
| Positioning (bilateral views) | If non-perpendicular views are necessary, the angle must be kept exactly the same for comparable bilateral views. |
| Settings (bilateral views) | Equipment settings (color scale and temperature range) must not be changed between the two views. |
| Field of view (FOV) | The body region of interest should be brought close enough to the detector to fill the visible image area. |
This table summarizes minimum environmental control variables, acclimation and participant preparation requirements, operational specifications for the acquisition environment, and restrictions required to ensure repeatability and comparability of thermal image acquisition in clinical and research settings [32, 52]. IACT: International Academy of Clinical Thermology; EMS: electrical muscle stimulation; IFOV: instantaneous FOV.
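Several of the numeric requirements in the table lend themselves to an automated pre-analysis check. The Python sketch below is a hypothetical illustration: the `AcquisitionLog` fields and the `qc_issues` helper are not part of any guideline or of the authors' scripts, and only the thresholds are taken from the table. The `ifov_mrad` helper reproduces the small-angle relation behind the spatial-resolution row (a 1 mm spot at 40 cm corresponds to 2.5 mrad).

```python
import math
from dataclasses import dataclass


@dataclass
class AcquisitionLog:
    """Hypothetical record of one thermographic session (names are illustrative)."""
    room_temp_start_c: float
    room_temp_end_c: float
    acclimation_min: float
    emissivity: float
    netd_mk: float
    pixels_per_frame: int


def qc_issues(log):
    """Return the minimum requirements from the table that a session fails to meet."""
    issues = []
    temps = (log.room_temp_start_c, log.room_temp_end_c)
    if not all(18.0 <= t <= 23.0 for t in temps):
        issues.append("ambient temperature outside 18-23 C")
    if abs(temps[1] - temps[0]) > 1.0:
        issues.append("room temperature varied by more than 1.0 C")
    if log.acclimation_min < 15:
        issues.append("acclimatization shorter than 15 min")
    if not math.isclose(log.emissivity, 0.98, abs_tol=1e-6):
        issues.append("emissivity not set to 0.98")
    if log.netd_mk >= 80:
        issues.append("NETD not below 80 mK")
    if log.pixels_per_frame < 19_200:
        issues.append("fewer than 19,200 measurement points per frame")
    return issues


def ifov_mrad(spot_size_mm, distance_mm):
    """Small-angle IFOV in milliradians for a given spot size and distance."""
    return 1000.0 * spot_size_mm / distance_mm


# Made-up sessions for illustration.
compliant = AcquisitionLog(21.0, 21.5, 15, 0.98, 50, 320 * 240)
deficient = AcquisitionLog(24.5, 26.0, 10, 0.95, 100, 80 * 60)

print(qc_issues(compliant))
print(qc_issues(deficient))
print(ifov_mrad(1.0, 400.0))
```

Items such as airflow, lighting, and participant preparation are not numeric and would still need to be documented by the operator.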
Two pillars emerge consistently across the included set:
a) control of the acquisition environment (thermal stability, humidity management, minimization of airflow, and control of external sources of IR radiation and lighting); and,
b) control of the participant and the acquisition system (acclimation with adequate exposure of ROIs, pre-test restrictions for factors that alter microcirculation and thermoregulation, equipment configuration and specifications, reproducible positioning geometry, and structured documentation of the protocol) [33, 52, 53].
These pillars are consolidated in normative documents and clinical guidelines (Table 3) included in the corpus and support the traceability of findings reported in primary and secondary studies [32, 33].
Table 3. Operational checklist for clinical infrared thermography (IACT/AAT).
| Domain | Mandatory (minimum for clinical validity) | Recommendation—good practices |
|---|---|---|
| Environment | Stable temperature (ΔT ≤ 1.0°C); absence of drafts; non-reflective surfaces | Natural convection < 0.2 m/s; relative humidity < 70%; continuous recording of temperature and relative humidity using a calibrated thermo-hygrometer. |
| Room temperature | 20–24°C (adjusted to the region under evaluation) | 20–21°C for inflammatory processes; 22–24°C for extremities |
| Relative humidity | Controlled; avoid sweating/condensation | 50–60% as the comfort zone |
| Acclimatization | ≥ 15 min, target area exposed throughout the entire period | 20 min if ambient temperature is > 21°C; repeated series for medicolegal purposes |
| Equipment | Emissivity ε = 0.98; NEDT/NETD ≤ 80 mK (AAT: ≤ 50 mK); resolution ≥ 19,200 pixels | Annual calibration check; reference blackbody when available |
| Acquisition | Perpendicular positioning; symmetric bilateral views; identical scales between views | Fixed tripod; standardized distance; discrete anatomical markers |
| Patient | No creams/oils; no exercise on the day of the exam; no TENS/EMS/ultrasound in the previous 24 h | Avoid caffeine/nicotine for 4 h; withhold vasoactive/opioid medications for > 24 h (when clinically safe) |
| Documentation | Record temperature and relative humidity, acclimatization time, distance, and task/stress protocol | Standardized template plus checklist signed by the operator |
AAT: American Academy of Thermology; IACT: International Academy of Clinical Thermology; ΔT: temperature difference; EMS: electrical muscle stimulation; NETD (also written NEDT): noise-equivalent temperature difference; TENS: transcutaneous electrical nerve stimulation.
The checklist integrates minimum criteria and best practices for the environment, participant preparation, equipment configuration, acquisition geometry, and documentation. It is intended to standardize the application of IRT across laboratory, clinical, and occupational settings [32, 33, 52].
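The documentation row of the checklist can be operationalized as a structured record in which missing items are coded explicitly instead of being left blank. The Python sketch below is a hypothetical illustration: the field names and the `NR` sentinel handling are assumptions that mirror the not-reported coding used in this review's data extraction, not a published template.

```python
# Hypothetical field names for the documentation items in the checklist.
REQUIRED_FIELDS = (
    "room_temp_c",
    "relative_humidity_pct",
    "acclimation_min",
    "distance_m",
    "task_protocol",
)


def code_record(raw):
    """Return a record with every required field; missing or empty values become 'NR'."""
    return {
        field: raw[field] if raw.get(field) not in (None, "") else "NR"
        for field in REQUIRED_FIELDS
    }


def completeness(record):
    """Fraction of required documentation fields that were actually reported."""
    reported = sum(1 for field in REQUIRED_FIELDS if record[field] != "NR")
    return reported / len(REQUIRED_FIELDS)


# Made-up session notes with two missing items and one empty item.
record = code_record({"room_temp_c": 22.0, "acclimation_min": 15,
                      "task_protocol": ""})

print(record)
print(completeness(record))
```

Coding gaps explicitly keeps non-reporting distinguishable from a measured value, which matters when protocols are later compared across sessions or sites.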
Taken together, axis B establishes the operational foundation for interpreting the applied axes: signal quality and its clinical and occupational interpretability are contingent on protocol stability and reporting completeness, both of which are traceable in Supplementary material [33, 52, 53].
Within the occupational scope of the included corpus, IRT appears in three applied functions that are often combined: screening/surveillance to identify ROIs with atypical thermal patterns under standardized tasks; risk characterization by linking thermal patterns to ergonomic demands and relevant exposures; and monitoring, when protocols are repeated to document thermal variation in response to task changes, interventions, or recovery strategies [9, 13, 38]. Across these functions, the synthesis uses as reference the key occupational determinants associated with WMSDs, including repetition, sustained postures, load intensity, and opportunities for recovery, because these factors modulate both microvascular demand and superficial autonomic and inflammatory responses [9, 13, 38].
Across studies conducted in office contexts and in tasks involving continuous keyboard and mouse use, IRT is applied predominantly to upper-limb ROIs (hand, wrist, forearm) and, in some protocols, to regions related to postural load associated with sedentary work [9, 13, 38]. The evidence describes thermal patterns associated with prolonged and repetitive work, most often analyzed using asymmetry metrics and changes relative to baseline under controlled conditions [9, 13, 38]. In protocols that incorporate recovery strategies, IRT is used as an adjunct objective measure to document thermal variation associated with active breaks and temporal reorganization of exposure [9, 13, 38].
A recurrent application also occurs in experimental paradigms that emulate industrial work demands, including repetitive tasks involving reaching, gripping, manipulation, and overhead work [53, 56]. In these models, IRT is used to map regional thermal distribution and its evolution across baseline, post-task, and recovery time points, enabling functional characterization of ROIs under controlled demand [53, 56]. The synthesis indicates that such paradigms are used as a methodological bridge between occupational exposure and observable physiological responses, provided that environmental and protocol standardization are explicitly reported [33, 53].
The corpus includes robust evidence related to hand-arm vibration exposure and characterization of peripheral vasomotor function using cold provocation and rewarming protocols [10, 40, 54]. In these studies, IRT is integrated with clinical and occupational assessment procedures and, in some designs, with vascular and ultrasonographic modalities, focusing on dynamic rewarming parameters and acral/peripheral thermal patterns [10, 40, 54]. This application reinforces the need to interpret cold patterns and recovery dynamics as expressions of neurovascular and microcirculatory regulation, with relevance for occupational surveillance and functional evaluation, particularly when aligned with occupational history and complementary measures [10, 40, 54].
A subset of the corpus addresses simulated procedural tasks in professionals, in which demand is characterized by sustained posture, repetitive fine movements, and prolonged load on specific segments [56–58]. In these contexts, IRT is used to describe thermal responses in upper-limb ROIs related to effort and task duration, often in combination with functional and neuromuscular measures [56–58]. The synthesis positions these designs as examples of application in high-precision scenarios in which task-, time-, and ROI-level comparability is essential for interpretation [56, 57].
The included evidence describes multimodal integration heterogeneously, but with recurrent operational patterns [51, 56, 59, 60]. In these designs, IRT tends to provide a functional layer related to microperfusion and superficial autonomic regulation, complementing electrical measures (surface EMG and/or electrodiagnostics), mechanical measures (dynamometry, strength, torque, and functional measures), structural/perfusion measures (MSK ultrasound, in some cases with Doppler), and self-reported outcomes (PROs), including symptom and function scales and instruments [51, 56, 59, 60] (Table 4).
Table 4. Clinical synthesis of modality roles when IRT is used as a functional marker in WMSD-oriented assessments (corpus-derived from the master data extraction spreadsheet; n = 247).
| Modality | Clinical “signal” captured (as recorded in outcomes_measures/comparator_reference) | Typical clinical targets in the corpus (body region/condition_wmsd) | How it is used with IRT in the corpus (comparator_reference/multimodal_components) | Practical constraints and failure modes most commonly reported (limitations_bias_notes/protocol_key_points) |
|---|---|---|---|---|
| IRT | Surface thermal metrics (absolute temperature; side-to-side asymmetry; ΔT/change/gradients; hot/cold patterning; rewarming/kinetic metrics when dynamic protocols are used) | Upper limb (hand/wrist/forearm), lower back, knee, shoulder; occupationally framed symptom/risk contexts and mixed clinical MSK conditions (including neuropathic/vascular phenotypes in some subdomains) | Primary functional layer; used as baseline + post-task/provocation + recovery mapping; frequently paired with patient-reported measures and, less often, with objective neuromuscular or imaging comparators | High sensitivity to environment/protocol; incomplete protocol reporting; ROI/operator dependence; small samples/pilot designs; cross-sectional or short follow-up commonly noted |
| MSK ultrasound (with and without Doppler) | Structural/soft-tissue and (when Doppler is present) perfusion-adjacent signals explicitly described as “ultrasound/sonography” and “Doppler/PD” | Joints and peripheral segments (e.g., hand/wrist/fingers; foot; elbow), with recurring rheumatologic/degenerative and regional pain contexts in the corpus | Used as anatomical/structural comparator or complementary component in multimodal designs; typically deployed to contextualize IRT patterns with segment-level structural information | Operator dependence and ROI localization; heterogeneity of acquisition/reporting; limited direct workplace deployment in the corpus; protocol/environment sensitivity still reported in multimodal subsets |
| MRI | Deep structural imaging explicitly described as MRI/magnetic resonance | Regional MSK pain syndromes, degenerative spinal contexts, and injury-related contexts in the corpus; multiple anatomical regions represented | Used as a structural comparator in multimodal/secondary synthesis contexts (i.e., to anchor structural interpretation when IRT is used functionally) | Not interchangeable with IRT (different signal domain); limited occupational/workplace deployment in the corpus; frequent reliance on secondary synthesis designs; “specificity/gold standard” concerns occasionally noted |
| EMG/NCS (electrodiagnostics) | Neuromuscular activation and peripheral nerve function explicitly described as EMG and/or NCS | Upper-limb neuropathic/compressive phenotypes dominate (hand/wrist/fingers; CTS and other compressive neuropathies appear repeatedly in condition_wmsd) | Used to provide an objective functional comparator for neuromuscular/nerve involvement when IRT is used as a surface-functional marker; occasionally embedded as a multimodal layer in diagnostic/assessment workflows | Operator/placement dependence; protocol sensitivity; small samples; heterogeneity of thresholds and reference standards; limited workplace implementation in the corpus (mostly clinical/lab or secondary synthesis) |
| DYNA (strength testing) | Force/strength endpoints explicitly described as dynamometry, grip strength, torque/isokinetic testing | Frequently linked to functional capacity contexts (upper limb/shoulder; low back in some designs; healthy/task paradigms also present) | Used as objective functional output alongside IRT (e.g., workload/fatigue/provocation paradigms) to triangulate functional impact | Small sample/pilot designs; variability in task standardization; protocol sensitivity; limited direct occupational/workplace deployment in the corpus |
| PROs (VAS and questionnaires) | Patient-reported pain/function metrics explicitly described as VAS and/or questionnaires/scales | Broad across ROIs; recurrent in upper limb, knee, shoulder, and exercise/overuse-type paradigms; includes occupationally relevant symptom/function capture | Most common complementary layer paired with IRT; used to align functional thermal patterns with symptom burden and perceived function | Instrument heterogeneity; cross-sectional designs; limited uniformity in timing relative to IRT acquisition; reporting variability in multimodal subsets |
IRT: infrared thermography; MSK: musculoskeletal; MRI: magnetic resonance imaging; EMG: electromyography; NCS: nerve conduction studies; DYNA: dynamometry/strength testing; PROs: patient-reported outcomes; VAS: visual analog scale; ROIs: regions of interest; ΔT: temperature difference; PD: power Doppler.
Thematic compilation allows the integration to be organized into four operational dimensions observable in the corpus:
Integration is described predominantly as anatomical cohesion across modalities; that is, thermal ROIs defined by anatomical landmarks are linked to measurements collected from the same segment [56, 59, 60]. This ROI-based alignment is central to interpretive traceability and depends on explicit ROI definition and reproducible positioning [33, 53, 56].
In protocols involving standardized tasks or provocations, multimodal integration tends to be built around comparable time points: baseline acquisition, immediate post-task/stimulus acquisition, and assessments during recovery [54–56]. In follow-up applications, repetition of a temporal design under standardized conditions is used to document changes associated with time, exposure, or interventions [13, 56].
The synthesis identifies recurrent use of within-subject and within-session metrics for interpretation: ΔT relative to baseline, bilateral asymmetry in homologous ROIs, and parameters derived from dynamic curves (particularly in cold/rewarming protocols) [38, 54, 55]. In multimodal designs, these approaches are described as operational tools to reduce interindividual variability and to facilitate coherence across signals collected by distinct modalities [33, 53, 60].
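As one example of a parameter derived from a dynamic curve, the time needed to recover a given fraction of a challenge-induced temperature change can be estimated by linear interpolation between samples. The Python sketch below makes several assumptions: the first sample is taken immediately after the challenge, the pre-challenge baseline is known, and the function name and temperature series are illustrative. The corpus does not prescribe this specific parameter.

```python
def time_to_recovery(times_s, temps_c, baseline_c, fraction=0.5):
    """Estimate when `fraction` of the challenge-induced change has been recovered.

    times_s[0] and temps_c[0] are assumed to be the first post-challenge sample.
    Returns None if the target is not reached within the recording.
    """
    if len(times_s) != len(temps_c) or len(times_s) < 2:
        raise ValueError("need at least two paired samples")
    start = temps_c[0]
    span = baseline_c - start
    if span == 0:
        raise ValueError("no challenge-induced change to recover from")
    # Progress runs from 0 (post-challenge value) to 1 (back at baseline).
    progress = [(y - start) / span for y in temps_c]
    for i in range(1, len(progress)):
        p0, p1 = progress[i - 1], progress[i]
        if p0 < fraction <= p1:
            weight = (fraction - p0) / (p1 - p0)
            return times_s[i - 1] + weight * (times_s[i] - times_s[i - 1])
    return None


# Made-up rewarming series after a cold challenge (baseline 32.0 degrees Celsius).
times = [0, 60, 120, 180, 240]
temps = [24.0, 26.0, 30.0, 31.0, 32.0]

half_time = time_to_recovery(times, temps, baseline_c=32.0)
slow = time_to_recovery([0, 60, 120], [24.0, 25.0, 26.0], baseline_c=32.0)

print(half_time, slow)
```

Normalizing progress to the individual's own baseline and post-challenge value keeps the parameter within-subject, in line with the metrics described above.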
Multimodal integration is reported as a strategy to increase interpretive coherence when multiple domains converge on the same ROI and within the same temporal design [56, 59, 60]. In studies using EMG and mechanical measures, IRT is analyzed in parallel with indicators of activation/fatigue and functional performance; in studies using US, IRT is discussed in relation to structural and perfusion findings; and in studies using PROs, the thermal signal is contextualized by self-reported symptom and functional burden [10, 51, 56, 59]. Overall, the corpus describes multimodality as an operational triangulation mechanism for the phenomenon of interest, reducing interpretive dependence on any single measure [56, 59, 60].
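The ROI-level and time-point alignment described in this section can be sketched as a merge of per-modality readings on a shared `(roi, timepoint)` key. The Python example below is a hypothetical illustration: the function names, modality labels, and values are assumptions and do not represent a protocol from any included study.

```python
from collections import defaultdict


def align_by_roi_and_timepoint(**modalities):
    """Merge per-modality readings keyed by (roi, timepoint) into one row per key."""
    rows = defaultdict(dict)
    for name, readings in modalities.items():
        for key, value in readings.items():
            rows[key][name] = value
    return dict(rows)


def complete_rows(rows, required):
    """Keep only the keys for which every required modality was recorded."""
    return {k: v for k, v in rows.items() if all(m in v for m in required)}


# Made-up readings for one ROI across three time points.
irt_c = {
    ("right_forearm", "baseline"): 32.0,
    ("right_forearm", "post_task"): 33.0,
    ("right_forearm", "recovery"): 32.5,
}
semg_rms_mv = {
    ("right_forearm", "baseline"): 0.10,
    ("right_forearm", "post_task"): 0.18,
}
vas = {("right_forearm", "post_task"): 4}

rows = align_by_roi_and_timepoint(irt_c=irt_c, semg_rms_mv=semg_rms_mv, vas=vas)
triangulated = complete_rows(rows, required=("irt_c", "semg_rms_mv", "vas"))

print(sorted(triangulated))
```

Only time points at which all required modalities were collected for the same ROI survive the filter, which makes gaps in timing or ROI coverage explicit before any joint interpretation.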
In the included corpus, IRT is described as a noninvasive, noncontact method based on capturing IR emission from the body surface to estimate skin temperature, without exposing participants to ionizing radiation [20]. Guidelines and methodological reviews included in the corpus reiterate that the safety and interpretability of the examination depend primarily on environmental control and participant preparation, with explicit documentation of acquisition conditions and potential confounders [31–33, 61]. When used in dynamic protocols (cold thermal challenge and evaluation of rewarming curves), IRT is applied as time-series measurements under controlled conditions, using short-duration stimuli and execution guided by participant tolerance and procedural standardization [32, 55, 62]. Among studies that explicitly reported safety items and adverse event monitoring, mentions of adverse events referred to concomitantly evaluated therapeutic interventions rather than to the thermographic acquisition procedure itself [63–65].
The included literature showed heterogeneous and generally nonstandardized reporting of adverse events associated with interventions monitored with IRT support, with frequent absence of explicit adverse-event descriptions in nonpharmacological intervention studies [31]. In a randomized pilot trial of osteopathic intervention for chronic thoracolumbar pain, safety was treated as a feasibility outcome, with adverse events recorded (6%) in the context of the intervention, alongside the use of thermography as an ROI-based objective measure [63]. In acupuncture interventions, included protocols incorporated adverse event monitoring as an outcome, but the availability of safety results was variable, including studies published in protocol format [64, 65]. Similarly, in protocols involving transcutaneous electrical stimulation applied to chronic pain conditions, monitoring of side effects was planned as part of the outcome set, without uniform reporting across studies [66]. In low-complexity workplace interventions, such as active breaks during prolonged sitting, IRT was used to monitor thermal changes in trunk ROIs; however, adverse events were not explicitly reported at the study level in the extraction [13].
The corpus described a broad set of modifiers with the potential to alter baseline skin temperature, response amplitude (ΔT), and/or recovery dynamics, with direct impact on standardization and interpretation of the thermal signal. These modifiers include intrinsic variables [age, sex, body composition/body mass index (BMI), circadian rhythmicity, and cutaneous characteristics] and extrinsic variables [smoking/nicotine, caffeine, food intake, recent physical activity, use of medications with vasomotor effects, and prior therapies such as transcutaneous electrical nerve stimulation (TENS)/electrical muscle stimulation (EMS), therapeutic ultrasound, and acupuncture], as well as environmental variables (temperature, relative humidity, airflow, external IR sources, and ambient reflectance) [31–33, 61]. With respect to age and sex, the corpus included populations with wide age variation, including studies specifically conducted in older adults with chronic low back pain, reinforcing the relevance of considering aging and comorbidities as potential physiological modifiers of thermal patterns [67]. In experimental protocols with environmental control, evaluation of sex as a modifier was inconsistent; when tested, sex showed no significant interaction with the thermal trajectory in an exercise-to-exhaustion protocol, with analyses adjusted for covariates such as BMI and workload (repetitions) [29]. Sex-specific variables, such as menstrual cycle and menopause, were described as potential sources of variability in methodological reviews, but with heterogeneous control and reporting across the study set [31]. Metabolic and vascular conditions (diabetes and vasospastic phenomena) were addressed in specific subcorpora, indicating that microvascular and perfusion alterations may modify thermal patterns and dynamic parameters, particularly in cold challenge and rewarming protocols [62, 68].
Operationally, the included guidelines and reviews converge in recommending that these modifiers be handled as part of the protocol (eligibility criteria, pre-examination restrictions, documentation of comorbidities and medications) and as interpretive elements (preference for within-subject comparisons, use of bilateral asymmetry and ΔT rather than universal absolute thresholds, when applicable) [31–33, 61]. Recent experimental studies also indicated that technical factors and cutaneous characteristics, including skin tone, may influence thermal measures under controlled conditions, reinforcing the need for technical documentation and caution when extrapolating reference values across subgroups [52].
IRT should be interpreted primarily as a functional marker of superficial thermal phenomena related to cutaneous perfusion and autonomic regulation, with greater utility when grounded in ROI-based patterns (side-to-side asymmetry, ΔT, and dynamic responses) rather than in isolated absolute thresholds [31–33, 61]. From an applied standpoint, the included studies suggest that IRT may function as a complementary tool for operational screening and longitudinal monitoring of responses associated with repetitive exposures and sustained postures, with evidence in computer-based work settings (continuous mouse/keyboard use) and in work-organization interventions (active breaks) monitored through segmental thermal maps [9, 13, 38]. Accordingly, the primary practical implication is to reposition IRT as an adjunct, repeatable, protocol-driven objective measure, for which environmental control, participant preparation, and procedural documentation constitute minimum requirements for validity and comparability across assessments [31–33].
IRT is positioned as an indirect functional marker of superficial thermal phenomena, with interpretation dependent on the physiological context (microperfusion/vasomotor regulation), the definition of ROIs, and, most critically, acquisition standardization [19, 31–33, 37, 61]. Across reviews and normative documents included in the corpus, IRT’s primary comparative advantage lies in its ability to capture relatively large-area thermal maps rapidly and without contact, which favors pattern-based interpretation (asymmetry, regional distribution, and changes relative to baseline) in screening and monitoring scenarios [19, 36, 37, 69]. In terms of applicability, included reviews also describe the expansion of portable platforms (including smartphone-based thermography) as a potential logistical enabler for serial monitoring, provided that minimum requirements for environmental control, device configuration, and protocol documentation are met [36, 69].
These operational advantages, however, coexist with structural limitations that clearly delineate what IRT does not do. The corpus converges in recognizing that IRT does not provide deep anatomical information and should not be interpreted as a substitute for structural imaging; moreover, because it is highly sensitive to environmental and technical variables (ambient temperature and stability, humidity, airflow, external IR sources, emissivity, distance/angle, and sensor parameters), IRT can yield false-positive findings when applied without rigorous standardization, with incomplete reporting, or in uncontrolled environments [19, 20, 31–33, 37, 52]. Controlled studies and methodological reviews included in the corpus further emphasize that technical differences and cutaneous characteristics can influence thermal measurements and their comparability, reinforcing the need for complete documentation and caution when extrapolating thresholds across devices and subgroups [31, 52].
In comparison with US, the corpus indicates a functional-structural contrast. US is used for targeted anatomical assessment and, when applicable, for characterizing structural and/or inflammatory/perfusion-related changes within the evaluated segment, whereas IRT adds a superficial functional layer with rapid ROI-based spatial readout [70–72]. Studies combining IRT and US suggest practical complementarity when IRT is used to map segment-level patterns and to guide ROI selection for focused ultrasonographic assessment, rather than to pursue direct diagnostic equivalence [71, 72]. This positioning is consistent with corpus syntheses that treat US and MRI as complementary structural modalities, while reserving for IRT an adjunct role contingent on standardization and on the specific clinical or occupational question [37, 70].
In comparison with sEMG and mechanical measures (dynamometry/strength), the corpus suggests that IRT captures a distinct physiological domain with its own kinetics, which may be useful for contextualizing repetitive demands and regional patterns of superficial response. Included multimodal studies describe protocols in which IRT is analyzed in parallel with EMG (activation/fatigue) and/or strength, indicating that sEMG and dynamometry are more directly aligned with quantifying neuromuscular activity and mechanical performance, whereas IRT provides complementary information when interpreted as ROI-based patterns under standardized tasks and controlled conditions [56, 60, 73]. Accordingly, the corpus-supported critical comparison indicates that integration tends to be more informative than isolated use, particularly when the objective is operational triangulation (superficial functional signal + neuromuscular/mechanical load + self-reported outcomes), rather than structural inference [37, 56, 60, 73].
The contrast with MRI is even more pronounced. The corpus recognizes that MRI provides deep anatomical information and is appropriate when the clinical question requires structural/anatomical characterization, whereas IRT does not access deep tissues and does not provide direct anatomical evidence [37, 70, 74]. In addition, included reviews note that degenerative findings on structural imaging are common in asymptomatic populations and increase with age, requiring clinical correlation and limiting interpretations based exclusively on structural imaging [17]. In the context of low back pain, the corpus includes evidence of limited diagnostic accuracy for thermography relative to clinical reference standards and guidelines that do not recommend its use as a diagnostic test for low back disorders, reinforcing the risk of low specificity when it is applied as a standalone tool [75, 76].
WMSDs represent a multifactorial occupational outcome in which biomechanical determinants (repetition, force, sustained postures, and misalignment), organizational determinants (work pace, breaks, duration of exposure), and psychosocial determinants (occupational stressors) modulate risk by body segment and work context, with substantial economic impact on the workforce [1, 2, 5–7]. Within this framework, IRT’s specific contribution to the theme of “prevalence/risk factors” is not to replace epidemiologic methods or exposure assessment, but to add an objective functional layer that may be useful for characterizing ROI-based patterns in surveillance scenarios and for supporting the prioritization of interventions in tasks or environments with a higher likelihood of superficial physiologic overload [1, 5, 6].
In the domains of surveillance and risk mapping, evidence from the corpus suggests that IRT can be positioned as a complementary component to established approaches for workplace exposure assessment and risk-factor screening tools, provided it is embedded within programs that include explicit acquisition standardization and documentation to reduce variability and misinterpretation [8, 32, 33, 39]. Operationally, such integration is most consistent when:
• the unit of analysis is defined by task, department, or job function (rather than diagnosis alone);
• metrics and ROIs are prespecified and reproducible; and,
• interpretation prioritizes within-subject and/or within-task comparisons (pre- versus post-exposure, or serial assessments on comparable days), preserving methodological traceability [32, 33].
Sector- and occupation-specific applicability within the corpus is illustrated by studies in intensive computer users and office work, in which IRT was used to characterize upper-limb thermal patterns under standardized tasks and in association with MSK complaint severity [9, 26, 38]. In scenarios related to postural load and sedentary work, IRT was used to monitor trunk thermal responses during prolonged sitting and to track changes under workstation-level mitigation strategies [13]. In repetitive tasks and production-line analog paradigms, experimental and reliability studies demonstrated the technical feasibility of IRT for monitoring upper-limb ROIs under repetitive demands and overhead tasks, providing a translational basis for task-based risk-mapping programs in industrial sectors [53, 77]. For groups exposed to hand-arm vibration, the corpus included specific evidence for surveillance using cold provocation testing and rewarming assessment, emphasizing protocol heterogeneity and the need for standardized interpretive criteria in occupational health [35, 40, 54, 78]. Finally, the inclusion of studies in high-precision professions with substantial upper-limb demand (e.g., surgeons performing simulated laparoscopic tasks and pianists) broadens the spectrum of applicability and reinforces the potential of IRT as a complementary measure for function- or activity-based functional mapping in specific occupational subgroups [56, 58].
With respect to emerging conditions such as telework and hybrid arrangements, the corpus literature does not coalesce as a dedicated subfield; however, findings from office work, continuous computer use, and prolonged-sedentary protocols provide a direct translational basis for designing surveillance in remote contexts [9, 13, 26, 38]. In this extension, the key implication is that the intrinsic environmental variability of home settings is likely to amplify confounding and noise, making adoption of minimum standardization requirements (recording temperature/humidity, acclimation, and pre-examination restrictions) and the use of comparative metrics (asymmetry and relative changes) even more critical for exposure-oriented interpretation [32, 33].
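One way to make these minimum requirements auditable is to check each acquisition record for the documented items before its thermal metrics are interpreted. The Python sketch below is illustrative only: the field names and the example record are hypothetical and do not correspond to any published schema.

```python
# Hypothetical field names for the minimum items listed above
REQUIRED_FIELDS = (
    "ambient_temp_c",
    "relative_humidity_pct",
    "acclimation_min",
    "pre_exam_restrictions_confirmed",
)


def missing_fields(record):
    """Return the required fields that are absent or None in a session record."""
    return [name for name in REQUIRED_FIELDS if record.get(name) is None]


# Example home-setting session with incomplete documentation
session = {"ambient_temp_c": 23.5, "acclimation_min": 15}
gaps = missing_fields(session)
print(gaps)
```

A record with a non-empty list of gaps could be marked as not reported (NR) for those items rather than silently included in between-session comparisons.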
This review was designed as an integrative narrative review with a traceable compilation, with the aim of clinically integrating physiological foundations, standardization/quality-control requirements, and occupational applications of IRT in WMSDs. This deliberately broadened scope entailed inclusion of diverse study designs (experimental, observational, diagnostic-accuracy, intervention studies, reviews, and normative documents), which supports thematic synthesis and critical comparison but limits direct cross-study comparability and precludes quantitative outcome-level pooling across much of the corpus [19, 31, 32, 79, 80]. The secondary literature included in the corpus itself reports substantial heterogeneity in devices, ROIs, protocols, environmental conditions, and thermal metrics, with inconsistencies that prevent meta-analysis and constrain the development of universal parameters [79, 80].
At the operational level, a recurrent methodological limitation in the corpus (and therefore inherited by this synthesis) is incomplete reporting of determinants required for reproducibility and for interpretation of the thermal signal. Included guidelines and reviews emphasize that environmental variables, acclimation, acquisition geometry, equipment configuration, and ROI definition are decisive for validity and comparability [1–3]. In the master data extraction spreadsheet (Supplementary material), IRT device reporting contained at least one NR element in 19/247 studies, and comparator/reference information was absent or NR in 18/247; by contrast, key protocol points were available for all included studies (0/247 coded as NR). These reporting patterns restrict critical appraisal of confounding risk factors and increase the likelihood of systematic between-study variation, particularly for a method that is sensitive to technical and environmental factors [19, 31, 32, 52].
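For orientation, the reporting counts above can be restated as proportions of the 247 included studies. The short Python sketch below only re-expresses the figures given in the text; the dictionary keys are shorthand labels introduced here.

```python
TOTAL_STUDIES = 247

# Counts of studies with at least one NR element, as stated in the text
nr_counts = {
    "device_reporting": 19,
    "comparator_reference": 18,
    "key_protocol_points": 0,
}

# Percentage of the corpus affected, rounded to one decimal place
nr_percent = {
    item: round(100 * count / TOTAL_STUDIES, 1)
    for item, count in nr_counts.items()
}
print(nr_percent)
```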
With respect to the exclusion of evidence from animal models, the search and eligibility strategy was designed to prioritize literature with direct clinical and occupational applicability, consistent with the objective of guiding the use and interpretation of IRT in WMSDs within occupational health contexts. For this reason, a dedicated translational synthesis based on animal models was not undertaken, acknowledging that such literature may offer mechanistic insights under highly controlled experimental conditions but falls outside the applied and clinical scope that structured this manuscript. Accordingly, this methodological choice entails a deliberate loss of mechanistic depth and constitutes an accepted limitation to be addressed by future targeted translational reviews.
Finally, the absence of meta-analysis resulted not only from heterogeneity in protocols and outcomes, but also from the multimodal and frequently exploratory nature of part of the corpus, with nonuniform measures and comparators [79, 80]. As a methodological choice aligned with this scenario, appraisal was conducted descriptively and oriented toward reproducibility (emphasizing environmental controls, technical documentation, and comparators), rather than through a single standardized, domain-based risk-of-bias judgment applied uniformly across all included designs, recognizing that this option limits quantitative inference and reinforces the interpretive character of the compilation [19, 31, 32, 79, 80]. In addition, although snowballing broadened coverage (41/247 included studies had snowballing as a contributing identification method), this procedure may favor citation networks and therefore requires cautious interpretation regarding potential visibility bias; this risk was mitigated by applying the same screening/eligibility and deduplication workflow to records retrieved via additional methods.
Establishing IRT as a clinically useful and operationally reproducible tool for WMSDs requires a research program centered on standardization, protocol traceability, and external validation. Included guidelines and methodological reviews converge in emphasizing that environmental variables, participant preparation, equipment specifications, acquisition geometry, and ROI definition determine interpretive validity and comparability across assessments; therefore, future studies should report these components systematically and in an auditable manner [19, 31–33]. In addition, evidence indicates that technical factors and cutaneous characteristics can influence thermographic measurements even under controlled conditions, reinforcing the need for inter-device harmonization and caution when extrapolating thresholds across populations and contexts [52].
A cross-cutting gap in the corpus is the scarcity of operationally useful reference parameters stratified by occupation, task, and body segment, particularly for surveillance and monitoring applications. Studies in office workers indicate that distal thermal patterns may be associated with continuous mouse/keyboard use and symptom severity, suggesting that relevant thermal profiles are sensitive to the task, microenvironment, and occupational behavior [9, 26, 38]. In parallel, evidence on prolonged sedentary exposure and organizational interventions (active breaks) supports that thermal responses in trunk ROIs vary with exposure and with work modification, underscoring the need to model response trajectories (baseline, post-task, and recovery) by exposure intensity and duration [13]. Future studies should therefore build occupational databases using standardized acquisition and prespecified ROIs, enabling derivation of task-specific reference distributions and stratification by relevant modifiers documented a priori in the protocol [31–33].
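A response trajectory of this kind can be summarized with a simple recovery parameter. The Python sketch below defines one possible parameter, the fraction of the task-induced change that has reversed by the recovery acquisition; this definition, the function name, and the example values are assumptions made for illustration and are not a metric prescribed by the included studies.

```python
def recovery_fraction(baseline_c, post_task_c, recovery_c):
    """Fraction of the task-induced change reversed at the recovery acquisition.

    Inputs are ROI mean temperatures in degrees Celsius.
    Returns None when post-task equals baseline (no change to recover from).
    """
    delta_t = post_task_c - baseline_c
    if delta_t == 0:
        return None
    return (post_task_c - recovery_c) / delta_t


# Hypothetical ROI means for one worker and one task
trajectory = {"baseline": 31.0, "post_task": 32.0, "recovery": 31.25}
fraction = recovery_fraction(
    trajectory["baseline"], trajectory["post_task"], trajectory["recovery"]
)
print(fraction)
```

Because the parameter is normalized by the individual's own ΔT, it applies equally to post-task warming and cooling and remains a within-subject quantity.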
Generalizability depends on multicenter studies that reduce center effects and allow reproducibility to be estimated across different environments and equipment. Accordingly, the agenda should prioritize harmonized protocols (environmental control, acclimation, emissivity, distance/angle, and scales), as well as explicit calibration and cross-camera comparability strategies, with complete reporting of the setup and quality-control procedures [19, 31–33, 52]. This step is critical for translating isolated findings into parameters applicable to occupational health programs operating across different infrastructures.
Although part of the corpus describes screening and monitoring applications, evidence remains limited regarding the longitudinal predictive value of thermal patterns for incident and clinically meaningful occupational outcomes [19, 31–33]. Prospective studies should test whether metrics such as bilateral asymmetry, post-task ΔT, and recovery parameters provide incremental information when analyzed alongside occupational exposure and symptoms, and should estimate within-subject stability and responsiveness to organizational and ergonomic interventions [9, 13, 26, 38]. Prioritizing standardized follow-up is particularly important to distinguish physiologic variability from persistent signals associated with exposure.
The heterogeneity of multimodal designs in the corpus suggests potential, but also highlights the absence of uniform operational criteria for integrating IRT with measures that are already well established in the literature (EMG/dynamometry) and with person-centered outcomes [56, 60]. Based on the recurrent operational patterns identified in the synthesis, Figure 2 presents a conceptual workflow for screening and initial management of WMSDs in occupational settings. Within this framework, IRT is positioned as a complementary functional layer to be interpreted in conjunction with objective and person-centered measures, rather than as a standalone diagnostic method [31–33, 56, 60].

Figure 2. Conceptual flow diagram (operational hypothesis) for screening and initial management of WMSDs in occupational settings, using infrared thermography as a functional marker and integrating objective measures (EMG/dynamometry) and person-centered outcomes (VAS/PROs). WMSDs: work-related musculoskeletal disorders; ΔT: temperature difference; EMG: electromyography; VAS: visual analog scale; PROs: patient-reported outcomes.
The proposed workflow synthesizes recurring elements described across the corpus: brief clinical pre-screening, minimum environmental standardization and acclimation, baseline acquisition under reproducible views/scales, standardized task or exposure, post-task reacquisition, ROI-based interpretation using asymmetry and relative changes (ΔT), and stepwise integration of additional modalities according to the anatomical segment and the clinical/occupational question [31–33, 53, 56, 60]. This structure is consistent with the operational dimensions identified in the multimodal axis, namely ROI- and segment-level alignment, a shared temporal structure, normalization through within-subject comparisons, and interpretive convergence across modalities [33, 53, 56, 60].
Accordingly, a central agenda item is to validate, in defined occupational cohorts, multimodal workflows with:
• prespecified ROIs;
• a shared temporal structure (baseline, post-task, recovery/follow-up);
• normalization using relative measures (ΔT and asymmetry); and,
• explicit evidence-convergence rules for risk stratification and decision-making [31–33, 56, 60].
Such validation should address operational feasibility, reproducibility, subgroup performance, and the impact of environmental and technical confounders, which are recognized in the corpus methodological literature as critical determinants [19, 31–33, 52]. Importantly, Figure 2 does not represent validated evidence of effectiveness, diagnostic accuracy, or occupational impact; rather, it should be interpreted as a synthesis-derived operational hypothesis to be tested in multicenter and longitudinal studies with prespecified criteria and traceable decision rules [19, 31–33, 52, 56, 60].
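To make the notion of an explicit evidence-convergence rule concrete, the Python sketch below classifies a set of per-modality flags by how many agree. It is a hypothetical placeholder, not a validated decision rule: the tier labels, the two-modality criterion, and the way each flag would be derived are all assumptions that would need to be prespecified and tested in the studies proposed above.

```python
def convergence_tier(flags):
    """Classify per-modality boolean flags by how many modalities agree.

    `flags` maps a modality name to True (finding present) or False.
    Tier labels and the two-modality rule are hypothetical placeholders.
    """
    flagged = sorted(name for name, present in flags.items() if present)
    if not flagged:
        return "no flag", flagged
    if flagged == ["IRT"]:
        return "isolated thermal finding", flagged
    if len(flagged) >= 2:
        return "convergent findings", flagged
    return "single non-thermal finding", flagged


tier, modalities = convergence_tier({"IRT": True, "sEMG": True, "PROs": False})
print(tier, modalities)
```

Keeping an isolated thermal finding in its own tier reflects the position that IRT should not be interpreted as a standalone diagnostic method.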
Within the corpus, some studies apply artificial intelligence (AI)-based classification methods to thermographic images, and reviews synthesize the state of the art of AI in upper-limb conditions (carpal tunnel syndrome) [81, 82]. Taken together, these works indicate that computational models can be used to identify thermal patterns and support between-group discrimination in specific scenarios, provided they are trained and validated on datasets acquired under standardized protocols and with consistent clinical labeling, minimizing technical variation and selection bias [81, 82]. However, the AI agenda should prioritize external validation, bias control, and pipeline traceability (including technical and environmental variability); otherwise, there is a risk of amplifying false positives and limiting transportability across devices and occupational environments [31–33, 52, 81, 82]. In occupational health applications, integration with AI is more plausible as a decision-support tool when coupled to quality-controlled protocols and functional/relational comparators (EMG, strength, and PROs), rather than operating as a standalone classifier based solely on thermal imaging [31–33, 52, 81, 82].
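One simple way to probe transportability across devices is to hold out all images from one camera at a time, so that a model is always evaluated on a device it has not seen during training. The Python sketch below shows such a leave-one-device-out split using only the standard library; the record structure, the `device` key, and the example data are hypothetical, and the approach is offered as an illustration rather than as the method used in the cited studies.

```python
def leave_one_device_out(samples):
    """Yield (held_out_device, train, test) with no device shared between sets.

    Each sample is a dict with at least a 'device' key (hypothetical schema).
    """
    devices = sorted({sample["device"] for sample in samples})
    for held_out in devices:
        train = [s for s in samples if s["device"] != held_out]
        test = [s for s in samples if s["device"] == held_out]
        yield held_out, train, test


# Hypothetical image records acquired with two cameras
samples = [
    {"id": 1, "device": "camera_A", "label": 0},
    {"id": 2, "device": "camera_A", "label": 1},
    {"id": 3, "device": "camera_B", "label": 0},
    {"id": 4, "device": "camera_B", "label": 1},
]

splits = list(leave_one_device_out(samples))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

The same grouping idea extends to acquisition site or occupational environment when those are the sources of variation of interest.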
Based on the studies selected for compilation, IRT emerges as a complementary functional marker for WMSDs, useful for characterizing superficial thermal patterns and their dynamics in response to occupational exposure when applied with explicit ROI definition, environmental control, acclimation, and sufficient technical reporting to ensure reproducibility. In response to the guiding question, the compiled evidence supports that IRT can assist with screening and risk stratification and, most notably, longitudinal monitoring of change over time when interpreted using relative, within-subject metrics [bilateral asymmetry (side-to-side ΔT), pre- to post-task ΔT, and recovery parameters], contextualized by the task and the anatomical segment. At the same time, the body of evidence converges in indicating that IRT does not provide deep anatomical information and should not be used as a standalone diagnostic method, given the risk of nonspecific findings and its sensitivity to individual and environmental confounders.
The compilation also indicates that multimodal integration is promising but remains heterogeneous and incompletely standardized. When aligned by ROIs and shared temporal windows (baseline, post-exposure/provocation, recovery/follow-up), combining IRT with objective measures (sEMG/electrodiagnostics, strength/dynamometry, US with and without Doppler, and/or structural methods) and with self-reported outcomes may increase interpretive coherence and reduce reliance on any single source of information. For robust translation to occupational health, priorities include multicenter and longitudinal studies, inter-device harmonization, occupation- and task-specific reference values, evaluation in real-world work settings, and prospective validation of multimodal workflows and decision rules. Until these gaps are addressed, the most consistent application of IRT in occupational contexts is as an additional functional layer integrated with clinical assessment and complementary measures, under quality-controlled and traceable protocols.
AAT: American Academy of Thermology
AI: artificial intelligence
BMI: body mass index
CT: computed tomography
DOI: Digital Object Identifier
DYNA: dynamometry/strength testing
EMG: electromyography
EMS: electrical muscle stimulation
FOV: field of view
IACT: International Academy of Clinical Thermology
IFOV: instantaneous field of view
IR: infrared
IRT: infrared thermography
MEDLINE: Medical Literature Analysis and Retrieval System Online
MRI: magnetic resonance imaging
MSK: musculoskeletal
NA: not applicable
NCS: nerve conduction studies
NEDT: noise-equivalent temperature difference
NMSK: neuromusculoskeletal
NR: not reported
PCC: Population-Concept-Context
PD: power Doppler
PMID: PubMed identifier
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
PRISMA-S: Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for reporting literature searches
PROs: patient-reported outcomes
QC: Quality Control
ROIs: regions of interest
SANRA: Scale for the Assessment of Narrative Review Articles
sEMG: surface electromyography
SHA-256: Secure Hash Algorithm 256-bit
TA: Title/Abstract
TENS: transcutaneous electrical nerve stimulation
US: ultrasonography
VAS: visual analog scale
WHO: World Health Organization
WMSDs: work-related musculoskeletal disorders
ΔT: temperature difference
The supplementary materials for this article are available at: https://www.explorationpub.com/uploads/Article/file/1007122_sup_1.pdf; and https://www.explorationpub.com/uploads/Article/file/1007122_sup_2.xlsx.
João Alberto de Souza Ribeiro acknowledges the Library Division of the University of São Paulo, Luiz de Queiroz College of Agriculture (USP/ESALQ), for technical support in bibliographic retrieval and access to scientific databases, and CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior; Brazilian Federal Agency for the Support and Evaluation of Graduate Education) for institutional academic support.
JASR: Conceptualization, Methodology, Investigation, Data curation, Formal analysis, Writing—original draft. LAG: Investigation, Writing—review & editing. Both authors approved the final version of the manuscript and agree to be accountable for all aspects of the work.
The authors declare that there are no conflicts of interest.
This study relied exclusively on secondary bibliographic metadata from commercial databases. No individual-level clinical data or personal identifiers were collected. In line with standard practice for review studies using non-personal data, ethics committee approval was not required. The study was conducted in accordance with principles of scientific integrity, transparency and reproducibility, and in compliance with the database terms of use.
Not applicable.
Not applicable.
Analysis scripts are available at Zenodo (DOI: https://doi.org/10.5281/zenodo.18988677) and in the GitHub repository (https://github.com/joaotakotsubo/emd_prisma_pipeline - release v1.0.0). Subscription database exports cannot be shared due to licensing restrictions; full search strategies are provided in Supplementary materials (1007122_sup_1).
Not applicable.
© The Author(s) 2026.
Open Exploration maintains a neutral stance on jurisdictional claims in published institutional affiliations and maps. All opinions expressed in this article are the personal views of the author(s) and do not represent the stance of the editorial team or the publisher.
Copyright: © The Author(s) 2026. This is an Open Access article licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.