From:  Impact of smoking on subtypes and molecular profile of breast cancer: a systematic review

 Molecular characteristics and survival outcomes in studies assessing smoking history and breast cancer prognosis.

First author, year | Country/Study population | Study design & sample (molecular subset) | Molecular data type (genes) | Platform/Assay | Bioinformatics/Statistical methods | Smoking exposure definition & analysis | Molecular outcomes/markers | Key findings on smoking-molecular associations | Survival/Recurrence outcomes
Andres et al., 2015 [32]
Country/Study population: USA; women with primary invasive breast carcinoma whose frozen tumors were archived in the University of Louisville breast cancer biorepository (diagnosed 1988–1996)
Study design & sample (molecular subset): Retrospective prognostic study using tumor specimens with long-term follow-up; gene discovery in 247 LCM-procured carcinoma samples (microarray, 22,000 genes), of which 165 had known smoking status; validation set of 98 tumors analysed by RT-qPCR (48 cigarette smokers, 50 never-smokers) with complete outcome data
Molecular data type (genes): Tumor mRNA expression (RT-qPCR) of 23 candidate genes: NAT1, NAT2, COMT, SOD1, SOD2, BRCA1, BRCA2, CYP1A1, APOC1, ARID1B, CTNNBL1, MSX1, UBE2F, IRF2, NCOA1, LECT2, THAP4, RIPK1, AGPAT1, C7orf23, CENPN, CETN1, YTHDC2; candidates selected from microarray contrasts of smokers vs. non-smokers and recurrent vs. disease-free cases plus literature-based smoking/breast-cancer genes
Platform/Assay: Frozen tumor sections (median 60% carcinoma cells) → RNA isolation (RNeasy), quality check (Agilent Bioanalyzer), reverse transcription (Superscript III); SYBR Green RT-qPCR with gene-specific primers (Primer Express), ACTB as reference and Universal Human Reference RNA as calibrator; expression quantified as −ΔΔCt (log2 scale)
Bioinformatics/Statistical methods: Univariate Cox models for OS and DFS for each gene and clinical covariate; BH correction for multiple testing. For prognostic modelling, data repeatedly split (1,000×) into 70% training/30% test sets separately for smokers and never-smokers; LASSO-penalized Cox used for variable selection; permutation tests to define significance thresholds for gene selection; multivariable Cox models (gene-only, genes + clinical covariates, clinical-only) fitted; L2 penalty added when needed for convergence; predictive performance assessed by C-index and Kaplan-Meier curves for high- vs. low-risk groups; additional models in combined cohort used interaction terms between smoking status and gene expression to test effect modification
Smoking exposure definition & analysis: Smoking history was obtained from clinical records. In the discovery microarray cohort, 66 women were confirmed cigarette smokers with pack-year data, and 99 were never-smokers. In the qPCR cohort, women were classified as cigarette smokers versus never-smokers; analyses were stratified by smoking status rather than dose. No information on secondhand smoke exposure was reported.
Molecular outcomes/markers: Continuous tumor expression of individual genes (−ΔΔCt; evaluated per two-fold increase) for 23 candidates; multigene prognostic signatures (6–8 genes) for OS and DFS in smokers and never-smokers; focus on genes repeatedly selected in LASSO models: in smokers, CYP1A1, CETN1, NCOA1, IRF2, CENPN, LECT2, NAT1, RIPK1; in never-smokers, IRF2, CYP1A1, CETN1, NAT2; interaction terms for smoking × CYP1A1, LECT2, CETN1 in combined models
Key findings on smoking-molecular associations: In smokers, higher expression of CYP1A1 was strongly associated with lower mortality (HR per doubling 0.66, 95% CI 0.51–0.85) and lower recurrence risk (HR 0.77, 95% CI 0.66–0.90); higher CENPN expression was associated with increased mortality (HR 1.71, 95% CI 1.15–2.54) and recurrence (HR 1.37, 95% CI 1.03–1.84); IRF2 showed a protective effect for OS (HR 0.78, 95% CI 0.62–0.98). In never-smokers, IRF2 remained protective for OS (HR 0.81, 95% CI 0.69–0.95) and DFS (HR 0.86, 95% CI 0.75–0.99), while CETN1 showed a weaker, borderline adverse association with DFS. Smoking-gene interaction analyses in the combined cohort identified significant interactions for smoking status with CYP1A1, LECT2, and CETN1, indicating that the prognostic effect of these genes differs between smokers and never-smokers. Multigene signatures (≈ 7–8 genes) in smokers achieved high predictive accuracy (median C-index ≈ 0.8 for OS and ≈ 0.73 for DFS), whereas analogous signatures in never-smokers had only modest discrimination (C-index ≈ 0.59)
Survival/Recurrence outcomes: Outcomes: overall survival (time to death) and disease-free survival (time to first recurrence or death). Smoking status alone was not significantly associated with OS or DFS (HR for smokers vs. never-smokers ≈ 0.8–0.95), but integrating smoking history with tumor gene expression substantially improved prognostic stratification among smokers; gene-based models outperformed Adjuvant! Online risk scores in smokers (C-index ~0.79 vs. 0.69 for OS; ~0.75 vs. 0.68 for DFS), while in never-smokers gene-based and Adjuvant! Online models performed similarly (C-indices ~0.59–0.62)

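The Andres et al. entry reports expression as −ΔΔCt on a log2 scale and hazard ratios "per doubling". As a minimal sketch of that arithmetic (not the authors' code), the helpers below assume 100% amplification efficiency and a log-linear Cox model; the function names and the Ct values in the example are hypothetical.

```python
import math


def neg_delta_delta_ct(ct_gene_sample, ct_ref_sample,
                       ct_gene_calibrator, ct_ref_calibrator):
    """Return -ddCt, i.e. log2 expression relative to the calibrator RNA."""
    # Normalise the target gene to the reference gene (ACTB in the study).
    delta_ct_sample = ct_gene_sample - ct_ref_sample
    delta_ct_calibrator = ct_gene_calibrator - ct_ref_calibrator
    return -(delta_ct_sample - delta_ct_calibrator)


def fold_change(neg_ddct):
    """Convert -ddCt (log2 scale) to a linear fold change."""
    return 2.0 ** neg_ddct


def hazard_ratio_for_fold(hr_per_doubling, fold):
    """Scale a per-doubling HR to an arbitrary fold change (log-linear assumption)."""
    return hr_per_doubling ** math.log2(fold)


# Hypothetical Ct values: tumor sample vs. calibrator.
value = neg_delta_delta_ct(24.0, 18.0, 26.0, 18.0)
print(value, fold_change(value))
# HR per doubling of 0.66 (CYP1A1, OS in smokers) applied to a four-fold increase.
print(round(hazard_ratio_for_fold(0.66, 4), 4))
```

Because −ΔΔCt is already on a log2 scale, a one-unit increase corresponds to a two-fold increase in expression, which is why the table's hazard ratios can be read "per doubling".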
Goldvaser et al., 2017 [18]
Country/Study population: Israel; women with early-stage ER-positive, HER2-negative breast cancer treated at Davidoff Cancer Center, Rabin Medical Center, whose tumors were sent for Oncotype DX testing (diagnosed 4/2005–3/2012)
Study design & sample (molecular subset): Retrospective single-centre cohort of 662 women (median age 61 years; 78% postmenopausal). All had ER+/HER2− early breast cancer (stage I–IIIA) with available pathology and Oncotype DX results
Molecular data type (genes): Multigene expression-based recurrence score: Oncotype DX 21-gene RT-PCR assay (reported as continuous RS and categorized low 0–17, intermediate 18–30, high ≥ 31). Conventional IHC markers: ER, PR, HER2, Ki-67, p53; histologic subtype and grade
Platform/Assay: Oncotype DX is performed on FFPE tumor tissue (central RT-PCR assay; ER/PR/HER2 measured by RT-PCR as part of ODX). Local pathology: IHC for ER/PR using modified H-score (0–3), Ki-67 (% positive nuclei), HER2 by IHC with reflex FISH for 2+ cases per ASCO guidelines; standard histopathology for angiolymphatic and perineural invasion
Bioinformatics/Statistical methods: Continuous variables: mean ± SD, compared by t-tests; categorical variables: χ² or Fisher’s exact tests. OS: Kaplan-Meier and log-rank. DFS and breast cancer–specific survival (BCSS): Cox proportional hazards models with Fine & Gray competing-risk correction (competing events: non-cancer death for BCSS, death without recurrence for DFS). Multivariable models adjusted for age, menopausal status, ethnicity, tumor size, nodal status and grade
Smoking exposure definition & analysis: Smoking data obtained at first oncology visit. Current smokers: actively smoking at diagnosis. Ever smokers: any history of smoking; never smokers: no history. Pack-years were recorded and used to define heavy smokers (≥ 30 pack-years) vs. < 30. Groups were compared as: ever vs. never; current vs. former + never; ≥ 30 vs. < 30 pack-years. No data on secondhand smoke, age at initiation, or pack-years before first pregnancy; smoking status not updated during follow-up
Molecular outcomes/markers: Oncotype DX RS (median 17; range 0–88; 50.6% low, 38.3% intermediate, 11.1% high). Tumor size, nodal status, stage, histologic type (IDC, ILC, other), grade, ER/PR IHC intensity, Ki-67 (%), p53 (%), angiolymphatic and perineural invasion
Key findings on smoking-molecular associations: Smoking history, status, and pack-years were not associated with tumor size, nodal involvement, stage, histologic type, grade, ER/PR intensity, Ki-67, p53, or Oncotype DX RS (mean RS similar across all smoking groups). The only significant differences were higher angiolymphatic invasion (10.4% vs. 5.1%, p = 0.045) and perineural invasion (8.3% vs. 3.5%, p = 0.031) in current smokers vs. others; no clear pack-year gradient. Authors concluded that smoking had no clinically significant influence on tumor burden, Oncotype RS, or most pathological characteristics in this ER+/HER2− cohort
Survival/Recurrence outcomes: Overall 5-year DFS 95.7%, BCSS 98.5%, OS 98.5%. Recurrence rates: 3.6% (low RS), 5.3% (intermediate), 11% (high; p = 0.036); high RS was associated with worse DFS (HR 0.40 for low vs. high RS, 95% CI 0.16–0.98). Smoking variables were not associated with DFS, BCSS, or OS: ever vs. never smokers: DFS HR 0.73 (95% CI 0.32–1.68), BCSS HR 0.53 (0.12–2.38), OS HR 0.80 (0.22–2.92); heavy vs. lighter smokers: DFS HR 0.85 (0.26–2.80), BCSS HR 0.73 (0.09–5.70), OS HR 1.50 (0.33–6.78). Multivariable analyses adjusted for clinicopathologic factors yielded similar non-significant results, indicating no detectable impact of smoking on prognosis in this subgroup

Loroña et al., 2024 [28]
Country/Study population: USA; women 20–69 years with first primary invasive breast cancer diagnosed 2004–2015 in Seattle-Puget Sound SEER region (King, Pierce, Snohomish counties); oversampling of TN and HER2-overexpressing tumors
Study design & sample (molecular subset): Population-based prospective cohort of 3,876 cases (2,153 luminal ER+, 1,252 TN, 471 HER2-overexpressing); outcomes: any and site-specific recurrence, breast cancer-specific mortality, all-cause mortality; recurrence subset n = 3,197 (excludes stage IV, very-early recurrences, and those without medical record review)
Molecular data type (genes): Clinical molecular subtypes by ER/PR/HER2: luminal (ER+; luminal A and B combined), triple-negative (ER−/PR−/HER2−), HER2-overexpressing (ER−/HER2+); no genomic or epigenomic profiling
Platform/Assay: ER, PR, HER2 status abstracted from clinical pathology; HER2 determined by IHC with reflex FISH for equivocal cases; subtype assignment based on joint ER/PR/HER2 status per standard criteria
Bioinformatics/Statistical methods: Multivariable Cox proportional hazards models for recurrence, breast cancer-specific mortality and all-cause mortality, fitted overall and separately by subtype; models mutually adjusted for alcohol and smoking and for age, year of diagnosis, stage, hypertension, diabetes, race/ethnicity and BMI; subtype used as stratification variable in overall analyses; cause-specific Cox models for site-specific recurrences (locoregional, distant, lung, liver, bone, brain); inverse probability weighting to account for under-sampling of luminal cases; multiple imputation by chained equations for covariates; tests for trend across pack-year categories and for interaction with alcohol; time-dependent coefficient models to compare early (< 5 y) vs. late (≥ 5 y) hazards
Smoking exposure definition & analysis: Smoking at or just before diagnosis from structured interviewer questionnaires, supplemented by medical records when needed; status categorized as never, former, current smoker; “ever smoker” = former + current. Pack-years (total lifetime) were calculated and analysed among ever smokers in categories (< 10 vs. ≥ 10 pack-years; trend test) and as dose-response within subtypes; primary exposure in the main models was smoking status at diagnosis; limited post-diagnostic data used only for concordance checks (91% same status pre/post)
Molecular outcomes/markers: Molecularly defined subtypes (luminal, TN, HER2-overexpressing); any recurrence and site-specific recurrence (locoregional, distant, bone, lung, liver, brain); breast cancer-specific death; all-cause death; also timing of recurrence and death by subtype (e.g., luminal cases more likely to recur ≥ 5 years after diagnosis)
Key findings on smoking-molecular associations: Among TN cases, ever smoking vs. never was associated with a higher risk of any recurrence (HR 1.33, 95% CI 1.01–1.74) and current smoking with an even higher risk (HR 1.59, 95% CI 1.07–2.35). Current smoking was associated with increased distant recurrence overall (HR 1.53, 95% CI 1.02–2.30). Ever and current smoking were associated with > 50–80% higher risk of recurrence to bone among luminal cases and overall (cause-specific models). Pack-years ≥ 10 tended to show higher recurrence and mortality risks than < 10 pack-years, particularly for breast cancer-specific and all-cause mortality, although trends were sometimes imprecise. No clear effect modification by menopausal status or year of diagnosis; no joint effect between alcohol and smoking.
Survival/Recurrence outcomes: Ever smoking was associated with ~30–50% higher breast cancer-specific mortality across all cases, luminal, and TN subtypes and ~53–61% higher all-cause mortality across subtypes, with larger HRs for ≥ 10 pack-years. Current smoking at diagnosis was associated with 50–131% higher breast cancer-specific mortality and 122–158% higher all-cause mortality across subtypes. Associations were similar over time, with somewhat stronger effects of current smoking on early (< 5 y) breast cancer death in luminal tumors and late (≥ 5 y) death in TN tumors. Overall, smoking history at diagnosis was a consistent adverse prognostic factor; no survival benefit observed in any subtype.

Persson et al., 2016 [30]
Country/Study population: Sweden; women with first primary breast cancer operated at Skåne University Hospital, Lund (2002–2012)
Study design & sample (molecular subset): Population-based prospective cohort of 1,116 women; 51 with preoperative therapy excluded → 1,065 analysed for patient/tumour characteristics; 1,016 with invasive non-metastatic tumours included in survival analyses; 891 had ER+ tumours (endocrine-treatment subset)
Molecular data type (genes): Routine clinical markers: ER, PR, HER2 (no Ki-67 routinely); used to define ER+ endocrine-treated cohorts; no genomic or epigenomic profiling
Platform/Assay: Pathology at regional hospital lab: ER and PR by IHC (positive if > 10% nuclei stained, per Swedish clinical practice); HER2 amplification assessed from 2005 onwards in patients < 70 years with invasive tumours; standard histopathology for size, grade, nodal status
Bioinformatics/Statistical methods: Patient/Tumour characteristics compared by smoking with χ²/Fisher’s exact tests and Mann-Whitney U; Kaplan-Meier curves and log-rank tests for breast cancer events, distant metastasis, and all-cause death; Cox proportional hazards models to estimate adjusted HRs (95% CIs) overall and within treatment strata; adjustment for tumour size ≥ 21 mm or muscle/skin involvement, any nodal involvement, grade III, ER status, age (continuous), BMI ≥ 25 kg/m², and, in extended models, receipt of radiotherapy, chemotherapy, tamoxifen, and aromatase inhibitors; stratified analyses by endocrine treatment type (tamoxifen vs. AIs) and age (< 50 vs. ≥ 50 years)
Smoking exposure definition & analysis: Smoking status self-reported preoperatively via questionnaire (self-defined non-smoker, smoker, or occasional smoker; plus cigarette categories for last week: 0, 1–5, 6–10, 11–15, 16–20, ≥ 20); “Smokers” = those identifying as smokers/occasional smokers or reporting any cigarettes in the prior week; 223/1,065 (21%) were smokers at baseline; follow-up questionnaires at 3–6 months and 1 year showed < 1% of preoperative non-smokers started smoking and ~10% of smokers quit, indicating stable status; analyses use preoperative smoking (yes/no); no pack-years or former-smoking history, and former smokers grouped with never smokers
Molecular outcomes/markers: ER, PR, HER2 status and standard pathological features; primary outcomes: “breast cancer events” (local/regional recurrence, new breast cancer, or distant metastasis), distant metastasis alone, and death from any cause; endocrine-response analyses restricted to ER+ tumours, with subgroups by endocrine therapy: tamoxifen ever, AI ever, neither; additional stratification by age (< 50 vs. ≥ 50 years) and by receipt of chemotherapy or radiotherapy
Key findings on smoking-molecular associations: Smokers were younger, leaner, had smaller breast volume, fewer children and younger age at first full-term pregnancy, and were more often hormone receptor-negative (lower ER+ and PR+ frequencies) than non-smokers. Overall, preoperative smoking showed no significant association with risk of breast cancer events (adjHR 1.45, 95% CI 0.95–2.20), but was linked to ~two-fold higher all-cause mortality (adjHR 2.03, 95% CI 1.29–3.21). In ER+ patients ≥ 50 years treated with aromatase inhibitors (AIs) (n = 309), smoking was strongly associated with worse outcomes: breast cancer events (adjHR 2.97, 95% CI 1.44–6.13), distant metastases (adjHR 4.19, 95% CI 1.81–9.72), and death (adjHR 3.52, 95% CI 1.59–7.81); absolute event rates were 17.5 vs. 48.2 per 1,000 person-years in non-smokers vs. smokers. Among ≥ 50-year ER+ patients treated with tamoxifen (TAM) (n = 408), smoking was not significantly associated with breast cancer events (adjHR 1.58, 95% CI 0.76–3.30). In chemotherapy-treated and in radiotherapy-only groups overall, smoking did not materially affect events, though a weak association in radiotherapy-treated patients disappeared after excluding AI users; within radiotherapy + AI-treated subgroup, smokers had ~four-fold higher event risk (adjHR 4.13, 95% CI 1.66–10.26). After excluding all AI-treated patients, smoking was not associated with breast cancer events or distant metastasis, but showed a borderline increased all-cause mortality (adjHR 1.82, 95% CI 1.01–3.26). Authors conclude that smoking may specifically impair AI effectiveness, whereas it does not appear to influence response to tamoxifen
Survival/Recurrence outcomes: Outcomes: breast cancer events (122 events; 76 distant metastases) and 97 deaths over median 5.1 years’ follow-up; overall, smokers had about double the risk of all-cause death vs. non-smokers, but no clear increase in breast cancer events. In ER+ ≥ 50-year AI-treated patients, smoking was associated with markedly shorter event-free, metastasis-free and overall survival (adjHRs ~3–4), while in TAM-treated patients survival did not differ by smoking status. Study suggests smoking is an adverse prognostic factor particularly in AI-treated women, and may need to be considered when choosing endocrine therapy

Schmidt et al., 2020 [31]
Country/Study population: Germany; women with TxNxM0 triple-negative breast cancer (TNBC) treated at the Department of Gynecology, Obstetrics and Reproductive Medicine, University Medical School of Saarland (2004–2018)
Study design & sample (molecular subset): Retrospective single-centre chart review of 197 TNBC patients; all ER−/PR−/HER2− at diagnosis; 84 received neoadjuvant chemotherapy (NACT), 87 adjuvant chemotherapy, 26 no chemotherapy; median follow-up 41.4 months
Molecular data type (genes): No genomic/epigenomic profiling; TNBC defined by lack of ER, PR, HER2 expression in routine pathology
Platform/Assay: Standard institutional pathology; ER/PR/HER2 IHC used only to establish TNBC status (all triple negative); no additional molecular assays reported
Bioinformatics/Statistical methods: Descriptive statistics for baseline characteristics; OS and DFS analysed by Kaplan-Meier curves; group comparisons by log-rank test with two-sided α = 0.05; no multivariable Cox models; secondary analyses of OS/DFS by weight change (> 3 kg) and parity (> 3 pregnancies), and of pathologic complete response (pCR) rates by BMI, smoking, alcohol, physical activity and parity
Smoking exposure definition & analysis: Smoking habit was recorded as yes/no at baseline; smokers further described as regular (n = 35) or occasional (n = 12), total 47/197 (23.9%); no pack-years, intensity, age at initiation, or former-smoker category; exposure used as binary variable (smoker vs. non-smoker) in Kaplan-Meier/log-rank analyses of OS and DFS; no information on secondhand smoke
Molecular outcomes/markers: Clinical TNBC only (no additional markers); primary endpoints: overall survival (OS) and disease-free survival (DFS) by BMI, smoking, alcohol, physical activity and parity; secondary outcome: pCR (ypT0 ypN0/pN0) after NACT by these lifestyle factors; also explored impact of weight change > 3 kg and > 3 pregnancies on OS/DFS
Key findings on smoking-molecular associations: Smoking habit did not influence OS or DFS. Log-rank p-values for smokers vs. non-smokers: OS p = 0.9892, DFS p = 0.6040 (Kaplan-Meier curves B1/B2 on page 4). Similarly, BMI, alcohol, physical activity and parity showed no significant association with OS or DFS (e.g., OS by BMI p = 0.4720; DFS p = 0.2272; alcohol OS p = 0.6515, DFS p = 0.7460). None of these lifestyle factors, including smoking, affected the probability of achieving pCR after NACT (34/84, 40.48% overall), nor did weight change > 3 kg or > 3 pregnancies impact outcomes. Authors conclude that in this TNBC cohort, smoking and other lifestyle factors studied were not prognostic for disease course.
Survival/Recurrence outcomes: During follow-up, 34/197 (17.3%) had recurrence, 51 (25.9%) developed metastases, and 51 (25.9%) died. Neither OS nor DFS differed significantly by smoking status or other lifestyle factors; survival curves by smoking and alcohol on page 4 show almost overlapping trajectories. Thus, there was no evidence that smoking at diagnosis alters recurrence risk or survival in women with TNBC in this series.

Seibold et al., 2014 [29]
Country/Study population: Germany; postmenopausal women aged 50–74 years with invasive breast cancer in the population-based MARIE/MARIEplus study (Hamburg and Rhein-Neckar–Karlsruhe regions), diagnosed 2001–2005
Study design & sample (molecular subset): Prospective cohort; 3,340 women with invasive breast cancer (after excluding in situ and prior non-breast malignancy) for mortality analyses; 2,857 with stages I–IIIA and clear staging (excludes NACT, stage IIIB–IV, early events) for recurrence analyses; NAT2 genotyped in 2,399
Molecular data type (genes): Clinicopathologic subtypes based on ER, PR, HER2, and grade: luminal A-like (ER/PR+, HER2−, grade 1–2), luminal B-like (ER/PR+, any HER2, grade 3), HER2+ non-luminal (ER−/PR−/HER2+), triple-negative (ER−/PR−/HER2−). Germline NAT2 genotype (slow vs. fast acetylator) from 5 polymorphisms (rs1041983, rs1799929, rs1799930, rs1208, rs17999312)
Platform/Assay: ER/PR status and grade from routine histopathology; HER2 by IHC ± confirmatory testing per local practice; NAT2 genotyping on blood DNA using Sequenom MassARRAY (iPLEX GOLD / hME Assay); NAT2 alleles classified into rapid (*4, *12A/B, *13A) vs. non-rapid (slow) acetylators
Bioinformatics/Statistical methods: Delayed-entry multivariable Cox regression (PHREG, SAS) with age at diagnosis and region as strata; endpoints: all-cause, breast cancer-specific, non-breast-cancer mortality, and recurrence (local/regional/contralateral/distant). Models adjusted for tumour size, nodal status, metastasis, grade, joint ER/PR status, BMI, alcohol, mode of detection, radiotherapy, HRT at diagnosis, CVD, diabetes; proportional hazards checked with martingale residuals. Effect modification was tested by stratified Cox and interaction terms for NAT2 status, BMI (< 25 vs. ≥ 25 kg/m²), alcohol (< 12 vs. ≥ 12 g/day), and molecular subtype
Smoking exposure definition & analysis: Smoking before diagnosis from standardized face-to-face interview. Ever smokers: ≥ 100 cigarettes lifetime. Current smokers: smoked within the year before diagnosis; former smokers: previously smoked but not in last year; never smokers: < 100 cigarettes lifetime. Pack-years = packs/day × years; categories: never, < 10, 10–20, ≥ 20 pack-years. Cigarettes/day: < 10 vs. ≥ 10. Time since cessation in former smokers: < 10, 10–20, ≥ 20 years. Main exposure for effect-modification analyses: current vs. never/former combined
Molecular outcomes/markers: Molecular/Clinical markers: ER/PR/HER2/grade-based subtypes (luminal A-like, luminal B-like, HER2+ non-luminal, triple-negative); NAT2 slow vs. fast acetylator status; standard TNM; outcomes: all-cause, breast cancer-specific, non-breast-cancer mortality, and any recurrence
Key findings on smoking-molecular associations: Overall, current vs. never/former smoking was associated with higher all-cause mortality (HR 1.39, 95% CI 1.10–1.76) and non-breast-cancer mortality (HR 1.96, 95% CI 1.28–2.99), with non-significant trends for breast cancer-specific mortality (HR 1.23, 95% CI 0.93–1.64) and recurrence (HR 1.16, 95% CI 0.86–1.57). Risk of non-breast-cancer death increased with dose (per 5 pack-years HR 1.12, 95% CI 1.07–1.18; per 5 cigarettes/day HR 1.22, 95% CI 1.11–1.34). Effect-modification analyses showed substantially stronger smoking effects in NAT2 slow acetylators: current vs. never/former HRs for slow vs. fast acetylators were 1.93 vs. 1.28 (all-cause), 1.77 vs. 1.09 (breast cancer-specific), and 2.76 vs. 1.83 (non-breast-cancer mortality), though heterogeneity tests were underpowered. By subtype, current smoking doubled all-cause mortality in luminal A-like (HR 2.08, 95% CI 1.40–3.10) and triple-negative tumours (HR 1.93, 95% CI 1.02–3.65), and markedly increased recurrence risk in HER2+ non-luminal tumours (HR 3.64, 95% CI 1.22–10.8); no clear associations were seen in luminal B-like tumours. For non-breast-cancer mortality, risks were strongly elevated only in normal-weight women (BMI < 25 kg/m²: HR 2.52, 95% CI 1.52–4.15; BMI ≥ 25: HR 0.94, 95% CI 0.38–2.36; Phet = 0.04), and particularly in women consuming ≥ 12 g/day of alcohol (HR 3.38, 95% CI 1.32–8.69). Authors conclude smoking is an adverse prognostic factor, especially in NAT2 slow acetylators and in luminal A-like, HER2+ and triple-negative subtypes
Survival/Recurrence outcomes: Over median 5.7 years’ follow-up, 449 deaths (323 breast cancer-related) and 322 recurrences occurred. Current smoking increased all-cause mortality and non-breast-cancer mortality overall, with strongest absolute and relative excess in NAT2 slow acetylators and in certain molecular subtypes (luminal A-like, triple-negative, HER2+). Associations for breast cancer-specific mortality and recurrence were weaker overall but became more evident in these subgroups. Findings support targeted emphasis on smoking cessation in breast cancer patients, particularly those with NAT2 slow acetylator status or with luminal A-like, HER2+ or triple-negative tumours.

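The exposure definitions in the Seibold et al. entry (the 100-cigarette lifetime threshold, the one-year window for current smoking, and the pack-year categories) can be written out as a short sketch. This is an illustration, not the study's code: the function names are hypothetical, a pack is assumed to hold 20 cigarettes, and the boundary handling (10 to under 20 as "10-20") is an assumption, since the table does not say which category exactly 20 pack-years falls into beyond "≥ 20".

```python
def pack_years(cigarettes_per_day, years_smoked, cigarettes_per_pack=20):
    """Pack-years = packs/day x years (20 cigarettes per pack assumed)."""
    return cigarettes_per_day / cigarettes_per_pack * years_smoked


def smoking_status(lifetime_cigarettes, smoked_in_year_before_diagnosis):
    """Classify as never / former / current using the definitions in the table."""
    if lifetime_cigarettes < 100:
        return "never"
    return "current" if smoked_in_year_before_diagnosis else "former"


def pack_year_category(status, py):
    """Map a smoker's pack-years to the categories never, <10, 10-20, >=20."""
    if status == "never":
        return "never"
    if py < 10:
        return "<10"
    if py < 20:  # assumed half-open interval [10, 20)
        return "10-20"
    return ">=20"


# Hypothetical patient: 10 cigarettes/day for 30 years, quit 5 years ago.
status = smoking_status(lifetime_cigarettes=109_500, smoked_in_year_before_diagnosis=False)
py = pack_years(10, 30)
print(status, py, pack_year_category(status, py))
```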
Ferreira et al., 2024 [15]
Country/Study population: Brazil; women with breast carcinoma treated in 2 public hospitals in São Paulo state
Study design & sample (molecular subset): Longitudinal cohort of 208 women with breast cancer (age 25–65, all parous with ≥ 1 month breastfeeding); 80 smokers and 128 non-smokers; all had core biopsy with anatomopathology and immunohistochemistry, and were followed for 17 months
Molecular data type (genes): Immunohistochemistry-based molecular subtypes (gene expression surrogates): luminal A, luminal B, luminal hybrid, HER2 overexpression, triple-negative, and “others”
Platform/Assay: Standard IHC on histological sections with automated system: antigen retrieval in PTLink (Dako), incubation/development/counterstaining in AutoStainer Link; highly sensitive polymer detection and ready-to-use FLEX antibodies; molecular subtype assignment based on established IHC surrogate criteria from microarray gene-expression-defined subtypes
Bioinformatics/Statistical methods: Descriptive statistics with Kolmogorov-Smirnov test for normality; continuous variables as mean ± SD; group comparisons by ANOVA; categorical variables by chi-square; odds ratio for severe vs. non-severe cancer (smokers vs. non-smokers, “neoadjuvant chemotherapy groups”) with 95% CI; Kaplan-Meier curves for survival by smoking status, log-rank test; p < 0.05 considered significant
Smoking exposure definition & analysis: Smoking was defined as regular use of ≥ 1 cigarette/day; 80 women classified as smokers and 128 as non-smokers; no information on duration, intensity, or pack-years; smoking status assessed at baseline (diagnosis) and used as binary exposure (smoker vs. non-smoker) in all analyses
Molecular outcomes/markers: Tumor molecular subtype by IHC (luminal A, luminal B, luminal hybrid, HER2 overexpression, triple-negative); clinical stage (TNM, grouped as early 0–IIB vs. late III–IV); “severe cancer” operationalized via molecular profile and need for neoadjuvant chemotherapy; mortality during 17-month follow-up
Key findings on smoking-molecular associations: Molecular profile distribution differed by smoking: among smokers, luminal A 24.0%, luminal B 31.3%, luminal hybrid 14.4%, HER2 overexpression 7.2%, triple-negative 19.0%, others 4.1%; among non-smokers, luminal A 35.9%, luminal B 35.9%, luminal hybrid 11.7%, HER2 overexpression 6.3%, triple-negative 10.1%, others 0.1%. Smokers had significantly lower luminal A (p = 0.035) and higher triple-negative frequency (p = 0.030). Triple-negative smokers were younger (mean 48.2 years) than triple-negative non-smokers (52.6 years, p = 0.005). Risk of more severe cancer (defined by neoadjuvant chemotherapy groups/molecular severity) was 5.5-fold higher in smokers than non-smokers (OR 5.5; 95% CI 3.0–10.0). Clinical stage distribution (I–IV) did not differ significantly between smokers and non-smokers
Survival/Recurrence outcomes: Over 17 months, mortality was 39.5% in smokers vs. 20% in non-smokers; Kaplan-Meier analysis showed significantly lower survival among smokers (log-rank p = 0.01), with an estimated risk of death 2.2 times higher in smokers (95% CI 1.19–4.58); mean survival time for non-smokers was ~240 days; no multivariable survival modeling reported beyond smoking status

Takada et al., 2020 [16]
Country/Study population: Japan; women with resectable primary breast cancer undergoing curative surgery at Osaka City University Hospital (2007–2018); subset with biopsy/resection of recurrent lesions and known smoking history
Study design & sample (molecular subset): Single-centre retrospective cohort of 989 primary breast cancer patients; recurrences in 77, of whom 50 (with paired primary–recurrent tissue and recorded smoking history) were included for molecular/smoking analyses; all were preoperative systemic-therapy-naïve
Molecular data type (genes): Protein expression of ER, PR, HER2 and Ki-67 in primary and recurrent tumors by immunohistochemistry; tumors classified into intrinsic subtypes: HRBC (ER and/or PR+), HER2BC (ER−/PR−/HER2+), TNBC (ER−/PR−/HER2−)
Platform/Assay: Standard immunohistochemistry on surgical and recurrent biopsy/resection specimens in institutional pathology lab; Ki-67 proliferation index evaluated with a 14% cutoff; imaging (US, CT, bone scintigraphy) used for staging but not for molecular classification
Bioinformatics/Statistical methods: Concordance/Discordance in receptor status (ER, PR, HER2) between primary and recurrent tumors evaluated; chi-square tests for associations between receptor conversion and clinicopathological factors; logistic regression to estimate ORs and 95% CIs for positive HER2 conversion by smoking status and pack-year categories; Kaplan-Meier curves and log-rank tests for progression-free survival (PFS) and post-recurrence survival (PRS); Cox proportional hazards models for univariate and multivariate prognostic analyses
Smoking exposure definition & analysis: Smoking history was recorded at the first visit (cigarettes/day and years of smoking); pack-years were calculated as (cigarettes per day ÷ 20) × years; patients classified as smokers (any history) vs. non-smokers; 14/50 (28%) were smokers with median 30 pack-years (range 1.4–150); for HER2-conversion analyses, smokers were further grouped by pack-years (≤ 25, 25–50, > 50) vs. non-smokers; smoking assessed only up to surgery (no longitudinal updates)
Molecular outcomes/markers: Changes in IHC status of ER, PR, and HER2 between primary and recurrent tumors; intrinsic subtype change (HRBC/HER2BC/TNBC) at recurrence; observed conversion rates: ER negative conversion 3/50 (6%), ER positive conversion 1/50 (2%); PR negative conversion 15/50 (30%); HER2 positive conversion 6/50 (12%), no HER2 negative conversion; intrinsic subtype change in 5/50 (10%)
Key findings on smoking-molecular associations: Positive HER2 conversion at recurrence was significantly more frequent in smokers (4/14; 28.6%) than in non-smokers (2/36; 5.6%) (p = 0.024); logistic regression showed smokers vs. non-smokers had higher odds of HER2 positive conversion (OR 6.8, 95% CI 1.082–42.731), with ORs increasing across higher pack-year categories (up to OR 17.0 for > 50 pack-years vs. non-smokers, albeit with wide CIs); smoking was not significantly associated with ER or PR conversion, intrinsic subtype change, or other clinicopathological variables
Survival/Recurrence outcomes: PFS (from recurrence to progression or death) and PRS (from recurrence to death) were defined and analysed; median postoperative follow-up for the 50 recurrent cases was 2,128 days; no significant difference in PFS (p = 0.102, log-rank) or PRS (p = 0.140, log-rank) between smokers and non-smokers; in univariate Cox models, worse PFS was associated with adjuvant chemotherapy after surgery (HR 3.734, 95% CI 1.316–10.115) and intrinsic subtype change at recurrence (HR 3.889, 95% CI 1.083–11.236), and worse PRS with biopsied distant metastasis (HR 8.527, 95% CI 1.114–52.010), but smoking history was not an independent prognostic factor in multivariate analyses

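The HER2-conversion counts reported for Takada et al. (4 of 14 smokers vs. 2 of 36 non-smokers) are enough to recompute a crude odds ratio. The sketch below uses the cross-product ratio with a Wald (log-scale) confidence interval, which is what an unadjusted logistic regression with a single binary predictor yields; whether the authors computed it exactly this way is an assumption, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_wald_ci(exposed_events, exposed_nonevents,
                       unexposed_events, unexposed_nonevents, level=0.95):
    """Crude odds ratio from a 2x2 table with a Wald CI on the log scale."""
    a, b, c, d = exposed_events, exposed_nonevents, unexposed_events, unexposed_nonevents
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Smokers: 4 conversions, 10 without; non-smokers: 2 conversions, 34 without.
or_, lo, hi = odds_ratio_wald_ci(4, 10, 2, 34)
print(f"OR {or_:.1f}, 95% CI {lo:.3f}-{hi:.3f}")
```

With these counts the crude odds ratio is 6.8 and the Wald interval comes out close to the reported 1.082–42.731, which illustrates how wide the interval is when only six conversions are observed.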
Wang et al., 2021 [17]
Country/Study population: TCGA pan-cancer cohort (BLCA, CESC, ESCA, HNSC, KIRP, LUAD, LUSC); 2,317 tumor patients with recorded smoking history and multi-omics data
Study design & sample (molecular subset): Retrospective multi-omics analysis of TCGA level-3 data across 7 smoking-related cancers; integrated RNA-seq, miRNA, DNA methylation, SNVs, CNVs and clinical data (OS, DSS, PFI, stage, age, sex)
Molecular data type (genes): Multi-omics: mRNA expression (RNA-seq), miRNA expression, lncRNA expression, DNA methylation (Illumina HumanMethylation450), somatic SNVs, CNVs, immune/stromal scores, stemness indices; identification of 11 smoking-related methylation driver genes (EIF5A2, GBP6, HGD, HS6ST1, ITGA5, NR2F2, PLS1, PPP1R18, PTHLH, SLC6A15, YEATS2) and a 46-gene smoking-related prognostic signature; ceRNA network involving miRNAs (e.g., miR-193b-3p, miR-301b, miR-205-5p, miR-132-3p, miR-212-3p, miR-1271-5p, miR-137)
Platform/Assay: Public TCGA pipelines: RNA-seq [log2(TPM + 1)], Illumina 450K methylation, VarScan2 SNVs, masked CNV segments; CNVs summarized with GISTIC2.0; immune and stromal contexture from ssGSEA and ESTIMATE; chemotherapeutic response predicted using GDSC IC50 modeling (ridge regression via “pRRophetic”)
Bioinformatics/Statistical methods: Survival differences by smoking history evaluated with Kaplan-Meier curves and Cox regression; multi-variable Cox models including smoking (non/former/current coded 0/1/2), age, sex, and stage; ssGSEA for 29 immune signatures; ESTIMATE for stromal/immune/estimate scores and tumor purity; BCR diversity, leukocyte fraction, neoantigens, HRD, CTA scores from published TCGA resources; stemness indices (mRNAsi, mDNAsi, DMPsi, ENHsi, EREG-mRNAsi, EREG-mDNAsi) from Malta et al.; mutation and CNV burden and landscapes analyzed with “maftools”; differential expression via edgeR; ceRNA network using miRcode, miRDB, TargetScan, miRTarBase; methylation driver genes defined by inverse correlation (R < −0.4, p < 0.05) between methylation and expression; 46-gene prognostic model built with univariate Cox + LASSO + multivariate Cox; ROC curves and C-index for model performance; nomograms with calibration for each cancer type
Smoking exposure definition & analysis: Smoking history derived from TCGA clinical data; patients categorized as non-smokers, former smokers, and current smokers; in Cox models coded as 0, 1, 2, respectively; no pack-years, intensity or duration data; all analyses stratified/comparative across these three smoking-history groups (non vs. former vs. current) across tumor types
Molecular outcomes/markers: Multi-omics endpoints comparing non-, former-, and current smokers: 29 immune signatures; ESTIMATE immune/stromal/estimate scores and tumor purity; BCR richness/Shannon, leukocyte fraction, neoantigen load, intratumor heterogeneity, HRD and CTA scores; stemness indices; TMB; SNV and CNV landscapes and burdens; differentially expressed mRNAs/lncRNAs/miRNAs and ceRNA network; 11 DNA methylation driver genes and their expression; a 46-gene smoking-related risk score; predicted IC50 to multiple targeted and cytotoxic agents
Key findings on smoking-molecular associations: Current smokers had the worst OS and DSS, former smokers intermediate, non-smokers best; smoking history was an independent prognostic factor for OS and DSS (current > former > never risk); former smokers showed highest immune cell infiltration and immune/ESTIMATE scores and lowest tumor purity; smokers (current and former) had higher BCR diversity, leukocyte fraction, neoantigen load, intratumor heterogeneity, HRD and CTA scores than non-smokers; smoking was associated with higher stemness indices (mRNAsi, mDNAsi, etc.), higher TMB, and increased SNV incidence in multiple genes (e.g., TP53, TTN, MUC16, CSMD3, RYR2, LRP1B, USH2A, SYNE1, ZFHX4, FLG, XIRP2, PCLO) and higher CNV gain/loss burden at key loci (e.g., 3q26, 8q24, 9p21 CDKN2A/B), with partial reduction but not complete reversal after cessation; smokers had higher predicted IC50 (reduced sensitivity) for many targeted and cytotoxic drugs, with non-smokers generally most sensitive and former smokers intermediate; ceRNA network highlighted several miRNAs as potential mediators of tobacco-related 
tumor biology; 11 methylation driver genes showed inverse methylation-expression relationships and were linked to smoking status; 46-gene model risk scores were highest in current smokers, intermediate in former smokers, lowest in non-smokersOS and DSS significantly differed by smoking group with graded worsening from never to former to current smokers; no significant overall difference in PFI between smoking groups, although 10–15-year PFI tended to be best in non-smokers; smoking history remained an independent predictor of OS and DSS in multivariate Cox models; the 46-gene risk score was an independent risk factor for OS across cancer types and showed good predictive accuracy for 1-, 3-, 5-year OS (and also DSS and PFI), with nomograms (risk score + clinical variables) achieving good calibration and C-indices across tumor types

ACTB: beta-actin; AGPAT1: 1-acylglycerol-3-phosphate O-acyltransferase 1; AI: aromatase inhibitor; AIs: aromatase inhibitors; APOC1: apolipoprotein C1; ARID1B: AT-rich interaction domain-containing protein 1B; ASCO: American Society of Clinical Oncology; BCSS: breast cancer-specific survival; BH: Benjamini-Hochberg (multiple-testing correction); BRCA2: breast cancer 2, early-onset; C7orf23: chromosome 7 open reading frame 23; CENPN: centromere protein N; CETN1: centrin 1; COMT: catechol-O-methyltransferase; CTNNBL1: catenin beta-like 1; CVD: cardiovascular disease; DFS: disease-free survival; ER+: estrogen receptor-positive; FISH: fluorescence in situ hybridisation; HRT: hormone replacement therapy; IDC: invasive ductal carcinoma; ILC: invasive lobular carcinoma; IRF2: interferon regulatory factor 2; LCM: laser-capture microdissection; LECT2: leukocyte cell-derived chemotaxin 2; MARIE: population-based German breast cancer study (MARIE study); MARIEplus: Extended MARIE cohort; MSX1: Msh homeobox 1; NAT1: N-acetyltransferase 1; NAT2: N-acetyltransferase 2; NACT: neoadjuvant chemotherapy; NCOA1: nuclear receptor coactivator 1; Oncotype DX: 21-gene RT-PCR recurrence score assay; p53: tumor protein p53; pCR: pathological complete response; RNeasy: RNeasy RNA extraction kit; RIPK1: receptor-interacting serine/threonine-protein kinase 1; RS: recurrence score; RT: reverse transcription; RT-qPCR: reverse transcription quantitative polymerase chain reaction; RT-PCR: reverse transcription polymerase chain reaction; SOD1: superoxide dismutase 1; SOD2: superoxide dismutase 2; SYBR: SYBR Green fluorescent dye; TAM: tamoxifen; THAP4: THAP domain-containing protein 4; UBE2F: ubiquitin-conjugating enzyme E2 F; USA: United States of America.
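
Note on the C-index reported in the table: the concordance index summarizes how well a risk score orders patients by survival time. The sketch below is a minimal pure-Python illustration of Harrell's C-index, not the implementation used by any of the included studies. The function name `harrell_c_index` and the toy values are hypothetical. Pairs with tied follow-up times are skipped, which is a simplification. The smoking codes 0/1/2 (non/former/current) follow the coding described for the Cox models in Wang et al. and stand in here for a risk score.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times       -- follow-up time per patient
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the earlier patient had an observed
        # event; tied times are skipped (assumption made for simplicity).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk failed earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: smoking coded 0 = non, 1 = former, 2 = current.
toy_times = [2, 4, 6, 8, 10]      # years of follow-up
toy_events = [1, 1, 0, 1, 0]      # 1 = death observed, 0 = censored
toy_smoking = [2, 2, 1, 1, 0]     # used directly as the risk score
print(harrell_c_index(toy_times, toy_events, toy_smoking))  # 0.9375
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. Established packages handle tied times and large cohorts more carefully than this sketch does.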