Characteristics and conclusions of included studies

References | Objective | Setting and study population | Analysing technique | Metric used | Conclusion | Limitations | Ethics
1. Zhu et al. [26] (2022) Australia | To estimate the prevalence of agitated behaviors in people with dementia in NHs | Nursing notes from EHRs regarding NH residents with dementia (n = 3,528) | Rule-based NLP to detect health terminology, terminology regarding dementia, and agitation-related terms | F-score | NLP can be valuable in evaluating agitation in people with dementia, and the identified behaviors can inform improvements in aged care and nursing | Relies on the accuracy and completeness of EHRs. The NLP methodology could not capture the entire diversity of writing styles | Ethical approval was obtained
2. Wang et al. [35] (2022) China | To develop an early diagnostic tool for Alzheimer’s disease using ML and non-imaging factors | NHs in Hangzhou, China (n = 4). NH residents aged 65 or older (n = 654). Community members (n = 1,100) | Logistic regression, SVM, neural network, random forest, XGBoost, LASSO, and best subset models | Sensitivity, specificity, accuracy, AUROC | The developed non-imaging-based diagnostic tool effectively predicts dementia outcomes and can be easily integrated into clinical practice. Its online implementation eliminates barriers to usage, thereby improving dementia diagnosis and care quality and reducing associated costs | Limited study sites | Ethical approval was obtained
3. Huang et al. [34] (2022) China | To use AI to improve the time required for nurse-patient interaction | NH residents (n = 32) | Real-time analysis of streamed video data through CNN | Accuracy | Automatic monitoring effectively improved the efficiency of nurse-patient interaction. The system achieved an abnormal status recognition accuracy of up to 96.53% | Video data could raise privacy concerns | Ethical approval was obtained
4. Boyce et al. [25] (2022) US | To develop and validate a novel predictive model that forecasts the risk of falls for NH residents 90 days in advance, utilizing data from the LTC MDS and drug therapy records | NH residents (n = 3,985) in 2011, 2012, 2013, and 2016–2018 from the University of Pittsburgh Medical Center Senior Communities NHs | An ML approach known as CART was used | Precision, recall, specificity, balanced F-measure, threshold | The study successfully developed a novel, easily interpretable fall prediction model using MDS and drug dispensing/administration data, capable of guiding clinicians and NH staff in identifying individual residents’ fall risk within 90 days | The model, trained and tested within a single health system, may require additional testing and potential retraining for use in other settings, and it does not currently incorporate promising data from wearable sensors for real-time fall prediction | Not mentioned
5. Ritchie et al. [42] (2022) United Kingdom | To determine the prevalence of AF and temporal trends by year of care home entry, and associations between AF and adverse health outcomes including stroke, TIA, major bleeding, MI, cardiovascular hospitalization, and mortality | NH residents in Wales between 2003 and 2018 (n = 86,602) | Unadjusted logistic regression models to investigate associations with oral anticoagulant usage | 95% confidence interval, P-values | The study highlights the need for appropriate blood-thinning medications for stroke prevention and effective management of related heart conditions while emphasizing the need for improved data quality | Certain diagnoses were possibly missed due to positive recordings of diagnoses | Not mentioned
6. Hacking et al. [43] (2022) Netherlands | To explore different text-mining methods to analyze the quality of care in an NH setting | Interviews with residents (n = 39), family members (n = 37), and care professionals (n = 49) | Word frequency analyses, correlation analyses, deep learning-based sentiment analysis, and topic clustering using k-means clustering of word2vec vectors | Not mentioned | The study demonstrates the usefulness of text-mining to extend our knowledge regarding the quality of care in an NH setting | Deep learning is less explainable compared to more traditional techniques. Unigram and bigram models do not offer many insights as they contain many words with little significance | Ethical approval was obtained
7. McGarry et al. [24] (2022) US | To examine the association of state COVID-19 vaccine mandates with staff vaccination coverage and staffing shortages at NHs | Data on state COVID-19 vaccine mandate policies were collected from a number of sources, including internet searches using Google, state websites, state memos, and news reports | This study used event study models and linear regressions to analyze the association of state mandates with staff vaccination coverage and staffing shortages in NHs | Not mentioned | State vaccine mandates for NH staff were associated with increased staff vaccine coverage without exacerbating staffing shortages | Data self-reported by NHs, potentially leading to biases. Facilities might underreport staffing shortages due to fear of deficiency citations. Measures might not detect staff departures accurately | Not mentioned
8. Shen et al. [23] (2022) US | To investigate the association of severe outbreaks with staffing measures, such as hires, absences, and departures | Daily shifts (n = 333 million) for staff members (n = 3.6 million) at facilities (n = 15,518) each year on average | This study employs an event study framework with multivariable linear regressions, facility and calendar-time fixed effects, and sensitivity analyses to examine staffing pattern changes during and after a severe outbreak | Not mentioned | Severe COVID-19 outbreaks in NHs lead to significant and lasting reductions in nursing staffing levels, with CNAs experiencing the greatest losses, raising concerns about the potential impact on resident quality of life, morbidity, and mortality | Inability to observe reasons for changes in absences, departures, and new hires. Uncertainty about whether lowered staffing levels were intentional or due to turnover and hiring constraints. Early outbreaks not captured by the NHSN data are missing | Requested per Harvard institutional review board policy, but was not required because this study uses publicly available data
9. Tadokoro et al. [32] (2022) Japan | To evaluate the therapeutic effect of makeup therapy | Female NH residents with dementia (n = 34) | Faces were photographed at baseline and after 3 months and were analyzed with AI software (version of Microsoft Azure Face modified for Japanese patients) | P-values, correlation coefficients | Makeup therapy had a chronic beneficial effect on the cognitive function of female patients. The AI facial emotion analysis may be superior to self-reported scales because of its independence from verbal ability and cognition | Small sample. Limited study sites | Ethical approval was obtained
10. Reddy et al. [44] (2022) Ireland | To measure and map US county-level spatial accessibility to high-quality NH care. To discover the most relevant socio-demographic variables associated with these levels | Certified NHs in the US | Random forest approaches were used to impute data. A LASSO approach was used to select variables for the predictive model | Std. error, t-value, P-value | Spatial accessibility was high in the Midwest and low in the Southwest and along the Pacific coast. Factors such as the size of the county, ethnicity, and patterns in local employment were related to high-quality care. The ML approach can be used to cast a wide net and select the most important variables | Use of county centroids to represent a county’s location. Access to public transport was not considered | NR
11. Withall et al. [29] (2022) Australia | To examine the characteristics of victims and persons of interest regarding domestic violence | A total of 492,393 de-identified, police-recorded domestic violence events from the New South Wales Police Force for the period of January 2005 to December 2016 | A rule-based text-mining approach was used to extract data | Percentages | This method demonstrated high precision and recall, highlighting the presence of mental illnesses, types of abuse, and sustained injuries in these narratives | The study is based on police-recorded domestic violence data and may not fully represent the prevalence of elder abuse, especially in NHs, due to potential underreporting | Not mentioned
12. Tadokoro et al. [31] (2021) Japan | To evaluate the immediate effect of makeup therapy on dementia patients | Female NH residents (n = 36) | Faces were photographed before and after treatment and were analyzed with AI software (version of Microsoft Azure Face modified for Japanese patients) | P-values, correlation coefficients | Makeup therapy is a promising non-pharmacological approach for the immediate alleviation of behavioral and psychological symptoms of dementia. The AI software quickly and quantitatively evaluated the beneficial effects of makeup therapy | Number of participants was small. Pathological background of dementia was not investigated. Age in the makeup group was higher than in the control group. Total treatment duration was different between the makeup group and the control group | Ethical approval was obtained
13. Lee et al. [45] (2021) Canada | To determine predictors associated with 30-day mortality after a positive SARS-CoV-2 test | Residents in LTC homes (n = 84.142) | Random survival forest model | AUC (ROC) | Residents’ characteristics related to functional status, comorbidities, and routine laboratory measures were major factors associated with mortality | Asymptomatic transmission of SARS-CoV-2 was not considered. No information on public vs. for-profit homes was included. No data on the severity of comorbidity was included | This study did not require approval by a research ethics board and did not require individual consent
14. Garcés-Jiménez et al. [41] (2021) Spain | It was hypothesized that anticipating an infectious disease diagnosis by a few days could significantly improve a patient’s well-being and reduce the burden on emergency health systems | Residents (n = 60) in NHs (n = 2) | Data was analyzed using three ML algorithms: Naive Bayes, filter classifier, and random forest | P-values | Infectious diseases can be predicted based on the vital signs collected. Its cost-effective implementation allows disadvantaged areas and less accessible populations to be reached | Need to extend the period of sampling | “Ethical consideration for setting clear limits for the research and protecting people’s privacy was implemented”
15. Lee et al. [37] (2021) Korea | To compare a variety of ML methods in terms of their accuracy, sensitivity, specificity, positive predictive values, and negative predictive values by validating real datasets in order to predict factors for pressure ulcers | NHs (n = 60). NH residents (n = NR) | Representative ML algorithms (random forest, logistic regression, linear SVM, polynomial SVM, radial SVM, and sigmoid SVM) were used to develop a prediction model | Accuracy, sensitivity, specificity, negative predictive values, and positive predictive values | The random forest model had the greatest accuracy and was powerful. ML methods were able to identify many factors that predict pressure ulcers in NHs, including both NH characteristics (e.g., hours per resident day of director and number of current residents) and resident characteristics | NR | Ethical approval was obtained
16. Lee et al. [36] (2020) Korea | To compare different ML methods for predicting falls | NHs (n = 60). NH residents (n = NR) | Representative ML algorithms (random forest, logistic regression, linear SVM, polynomial SVM, radial SVM, and sigmoid SVM) were applied to a pre-processed NH dataset to develop a prediction model | Accuracy, sensitivity, specificity, negative predictive values, and positive predictive values | The random forest model was the most accurate and is therefore a powerful algorithm to discern predictors of falls in NHs. Organizational characteristics (e.g., current number of residents) as well as personal factors should be considered for effective fall management | The number of falls may have been overestimated or underestimated as self-collected data from NHs was used. No differentiations were made in the type of falls, slips, and/or fall-related injuries. Relatively small sample size to train a stable ML model. Parameter tuning was not included | Ethical approval was obtained
17. Ambagtsheer et al. [28] (2020) Australia | To assess the effectiveness of AI algorithms compared to the electronic Frailty Index in accurately identifying frailty, based on a routinely-collected residential aged care administrative dataset. To identify best-performing candidate algorithms | RCFs (n = 10). RCF residents (n = 592) | A frailty prediction system was designed based on the electronic Frailty Index identification of frailty. Classification algorithms used are k-nearest neighbors, decision tree, and SVM | Accuracy | AI techniques show potential in accurately identifying frailty in RCFs based on data held in administrative databases. An SVM algorithm was found to be the best-performing. Frailty identification may enable service providers to anticipate and avoid potentially harmful impacts on residents | Most data extractions were performed manually using formulas in MS Excel. An NLP technique would be more efficient and accurate. Data came from a single aged care service provider. The dataset was relatively small | Ethical approval was obtained
18. Buisseret et al. [46] (2020) Belgium | To design a method combining clinical tests and motion capture sensors in order to optimize fall risk prediction. To assess the ability of AI to predict the risk of falls solely from raw sensor data | NHs (n = 4). NH residents (n = 73) | A Timed Up and Go test and a six-min walking test were performed while residents wore a homemade wearable sensor gathering kinematic data. An AI algorithm based on deep learning was created. Models based on CNN were trained and tested in order to find the optimal accuracy of fall risk prediction | Accuracy, confusion matrices, P-values | The Timed Up and Go test was able to predict falls and the homemade wearable sensor was able to measure differences between fallers and non-fallers. It is shown that the combination improves the accuracy of fall risk prediction at six months and that the AI algorithm trained on raw sensor data has an accuracy of 75% in fall prediction | Small size of the dataset | Ethical approval was obtained
19. Cheng and Cui [33] (2020) China | To optimize the configuration of RCFs, while considering the demand of three stakeholders (government, elderly, investor), by development of a multi-objective spatial optimization model | RCFs in the Jing’an district of Shanghai | A multi-objective spatial optimization model was developed with the goals of maximizing the efficiency and equity of RCF configuration, minimizing travel costs of the elderly, and maximizing the profits of investors | Not mentioned | A significant gap was found between the service supply of RCFs and the demand of the elderly. Overall, the optimization model improved efficiency and equity, reduced the travel costs of the elderly, and increased the profits of investors | Policy and resource constraints were not considered. Predictions of the elderly population in the future were not considered | NR
20. Sun et al. [20] (2020) US | To inform about preventive measures for COVID-19 infection by identifying and assessing risk and possible vectors of infection, using an ML approach | NHs (n = 1,146). NH residents (n = NR) | A self-constructed dataset including information on the NHs’ facility and community characteristics was used to create predictive features. A tree-based gradient boosting algorithm was used | ROC (AUC), sensitivity, specificity | An ML gradient boosting model is useful to quantify and predict the risk of infection in NHs. Several risk factors of infection were identified (e.g., NH county infection rate, NH size, and the number of separate units). The historical percentage of non-Hispanic white residents was found to be a protective factor | COVID-19 outcomes were inconsistently reported across states. Model performance can be inconsistent in diverse geographic areas. Data was gathered from historical reports, therefore it may not reflect real-time NH characteristics | NR
21. Suzuki et al. [30] (2020) Japan | To assess whether a CNN is able to predict the time of falling based on multiple complicating factors (such as age, severity of dementia, lower extremity strength, and physical function) | NH (n = 1). NH residents with Alzheimer’s disease (n = 42) | Residents were classified into three groups: those who fell within 150 days, within 300 days, and those who did not fall. Lower extremity strength, severity of dementia, and physical dysfunction were assessed using suitable measures. A CNN was created which focused on multiple complicating factor patterns | Accuracy | An accuracy of 65% was found. A deep learning CNN method based on multiple complicating factors is able to predict the time of falling among NH residents with Alzheimer’s disease | Some information may be lacking, e.g., about the various types of dementia, medication use, depressive symptoms, or the fall history of residents. These factors have been associated with an increased risk of falling. A larger number of participants and an addition of important covariates, such as the ones previously listed, could have led to a more accurate prediction | Ethical approval was obtained
22. Gannod et al. [21] (2019) US | To explore the application and utility of a recommender system to preference assessment, based on data mining and ML techniques | NHs (n = 28). NH residents (n = 255) | NH residents’ preferences were gathered using the PELI-NH interview and section F of the MDS 3.0. The information gathered was used to develop an ML recommender system, using an apriori algorithm and logistic regression | Precision, recall, accuracy, F1-score | A reasonable rate of accuracy and precision was found regarding the provision of recommendations on potential preferences for a resident. The ML recommender system has the potential to reduce the time needed to complete the PELI-NH interview, while simultaneously still incorporating important individualized preferences of residents | Learning approach was evaluated using a relatively small transaction dataset. Only cognitively capable participants were included. The preferences of individuals with some form of cognitive impairment or those who are not able to communicate were not considered | NR
23. Delespierre et al. [38] (2017) France | To illustrate how text-mining of clinical narratives can enhance EMR data. To demonstrate the convergence of information between clinical narrative extracted data and EMR data | NHs (n = 127). NH residents (n = 1,015) | Textual data was extracted from physiotherapy narratives. Data mining techniques were combined. Structured query language and text-mining were used to build physiotherapy variables. Meaningful words were extracted. Principal components and multiple correspondence analyses were performed | Not mentioned | It is demonstrated that data mining and text-mining techniques can add new, usable, and simple data to EHRs with the goal of improving residents’ health and the quality of care | Merely a selected sample of clinical narratives was used. Matching residents with their associated clinical narratives relied on physiotherapy care observations that varied between NHs | Ethical approval was obtained
24. Jiang et al. [27] (2017) Australia | To identify risk factors related to medication management using text-mining | Residential aged care homes (n = 3,607) | Data in the form of reports were collected from the website of the Australian Aged Care Quality Agency. The text data was classified and labeled with representative keywords. Apache OpenNLP was used to extract a word cloud indicating the most frequently used words in text reports about medication management | Not mentioned | Using text data mining, 21 risk factors for failure in medication management were identified. “Ineffective monitoring process”, followed by “noncompliance with professional standards and guidelines”, were found to be the biggest risk factors. The gained information may be useful to improve medication management in residential aged care homes | Evidence may be limited due to the relatively low sample size. The reports used possessed inadequate details about why the failure happened | NR
25. Fernández-Llatas et al. [40] (2013) Spain | To present a set of algorithms based on process mining that may help professionals infer and compare individualized visual models of human behavior | NH (n = 1). NH residents (n = 9) | The eMotiva process mining framework combining algorithms and visualization interfaces was used. Process mining algorithms were used that filter, infer, and visualise workflows. These workflows were inferred from data collected using indoor location systems and bracelets. PALIA was the main algorithm in the framework | Not mentioned | The process mining technology was useful for inferring and presenting individual models to experts, representing human behavior in a visual and understandable manner | Only a limited number of cases were used for observation | NR
26. Lapidus and Carrat [39] (2010) France | To develop a computerized algorithm able to identify the likeliest transmission paths during a person-to-person transmitted illness outbreak | NH residents (n = NR) | A computerized algorithm was built using information about the natural history of disease and a dataset about the population structure and chronology of observed symptoms. A simulator was used to assess the efficacy and was compared with reference methods | Proportion of infected subjects | The algorithm was able to provide information on the dynamics of an outbreak and may help identify sources of infection in order to take the right preventive actions | Unclear how the algorithm would deal with missing data | NR
27. Volrathongchai et al. [22] (2005) US | To evaluate the application of a KDD process using a likelihood-based pursuit data mining technique able to predict the likelihood of falls | LTC facility residents (n = 9,980) | KDD was applied to data from the MDS. A likelihood-based pursuit technique was used to construct models able to predict the likelihood of falls and the variables contributing to this likelihood. Four variables known to be associated with falls and two variables known to not be associated with falls were included | L1 norm of error, P-values | The likelihood-based pursuit technique was able to identify which of the variables were associated with falls and was able to make fall likelihood predictions based on these variables. It has the potential to be useful in assessing fall risk due to its ability to provide probabilities based on the exact combination of variables present in an individual resident | Models constructed using the likelihood-based pursuit technique required that there is little correlation among the predictor variables. Only five or six variables were included in the likelihood-based pursuit technique | Ethical approval was obtained

CNN: convolutional neural network; EMR: electronic medical record; KDD: knowledge discovery in databases; NR: not reported; NLP: natural language processing; PALIA: parallel activity-based log inference algorithm; RCFs: residential care facilities; SVM: support vector machine; XGBoost: extreme gradient boosting; LASSO: least absolute shrinkage and selection operator; CART: classification and regression tree; AF: atrial fibrillation; TIA: transient ischaemic attack; PELI: preferences for everyday living inventory; ROC: receiver operating characteristic; AUC: area under the curve; MI: myocardial infarction; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2; AUROC: area under the receiver operating characteristic curve; NHSN: National Healthcare Safety Network; COVID-19: coronavirus disease 2019; OpenNLP: open natural language processing; MDS: Minimum Data Set