Assessing the Predictive Efficacy of European-Based Systolic Blood Pressure Polygenic Risk Scores in Diverse Brazilian Cohorts
Recent research has shown a significant association between higher polygenic risk scores (PRS) and both systolic blood pressure (SBP) and hypertension in two admixed Brazilian samples. The associations were weaker in these Brazilian cohorts, particularly in the Baependi study, echoing results reported for more homogeneous African or Asian populations whose genetic backgrounds differ from the European samples used to derive the scores. This recurring pattern underscores the limitations of applying European-based PRS to admixed populations such as those in Brazil. These findings matter for designing disease prevention and management strategies tailored to ethnically diverse populations.
The UK Biobank (UKB) is a large prospective study holding genetic and phenotypic information on nearly half a million UK residents. Our investigation drew on data from 434,181 participants of both sexes, aged 38-73 at recruitment, all of whom had at least one automated blood pressure measurement. Participants gave electronic consent, completed comprehensive questionnaires on lifestyle and health, underwent physical evaluations, and provided biological samples (urine, saliva, and blood) for genotyping. Our study was conducted under application number 14654 of the UKB resource.
SBP was calculated as the mean of two automated blood pressure readings. For participants taking antihypertensive medication, 15 mmHg was added to the measured SBP. Genotyping used two arrays: the UK BiLEVE Axiom Array chip for 46,578 individuals and the Affymetrix UKB Axiom Array chip for 408,268 participants. SNPs were imputed against a combined reference panel from UK10K, 1000 Genomes phase 3, and the Haplotype Reference Consortium, yielding over 93 million autosomal variants.
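The adjustment described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function name adjusted_sbp, its parameters, and the example readings are hypothetical, and only the averaging step and the 15 mmHg offset come from the text.

```python
from statistics import mean

# From the text: 15 mmHg is added to SBP for participants on antihypertensive drugs.
MEDICATION_OFFSET_MMHG = 15


def adjusted_sbp(readings, on_antihypertensives):
    """Average the automated SBP readings, then add the offset if the person is treated."""
    if not readings:
        raise ValueError("at least one SBP reading is required")
    sbp = mean(readings)
    if on_antihypertensives:
        sbp += MEDICATION_OFFSET_MMHG
    return sbp


# Hypothetical readings, for illustration only.
untreated = adjusted_sbp([128, 132], on_antihypertensives=False)
treated = adjusted_sbp([128, 132], on_antihypertensives=True)
```

Adding a fixed offset is a way to approximate the blood pressure a treated participant would have without medication, so that treated and untreated participants can be analyzed on the same scale.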
The Baependi study is a longitudinal, family-based Brazilian study of genetic and environmental influences on cardiovascular risk. It recruited 1,695 individuals in 2005 and had expanded to 2,495 individuals by 2010. Blood pressure was measured with a digital sphygmomanometer, and BP adjustment methods similar to those used in the UKB were applied. Genotyping was performed on 2,113 individuals using two distinct array chips, resulting in 39,127,678 SNPs after imputation.
The research is part of the EpiGen-Brazil consortium, a major Latin American initiative in population genomics and genetic epidemiology that studies 6,487 individuals from three Brazilian population-based cohorts. Especially relevant here is the Pelotas cohort, which has followed 5,914 individuals born in 1982 into adulthood. This analysis used data from 3,736 individuals collected in 2012, when participants were 30 years old.
For all three cohorts, hypertension was defined using the following criteria: SBP ≥ 130 mmHg, DBP ≥ 80 mmHg, use of antihypertensive medication, or registry-based hypertension. The study also defined stages of hypertension from “Elevated” to “Stage 2,” based on SBP and DBP thresholds.
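The binary hypertension definition above amounts to a logical OR over four criteria, which the following Python sketch makes explicit. The function name is_hypertensive and its parameters are hypothetical. The text does not say whether the thresholds were applied to raw or medication-adjusted SBP, so the sketch simply takes whatever values it is given, and it does not attempt the staging scheme, whose exact cut-offs are not stated.

```python
SBP_THRESHOLD_MMHG = 130  # from the text
DBP_THRESHOLD_MMHG = 80   # from the text


def is_hypertensive(sbp, dbp, on_medication=False, registry_diagnosis=False):
    """Return True if any one of the four criteria listed in the text is met."""
    return bool(
        sbp >= SBP_THRESHOLD_MMHG
        or dbp >= DBP_THRESHOLD_MMHG
        or on_medication
        or registry_diagnosis
    )


# Hypothetical participants, for illustration only.
normal = is_hypertensive(118, 76)
high_dbp_only = is_hypertensive(122, 84)
treated_with_normal_bp = is_hypertensive(118, 76, on_medication=True)
```

Because the criteria are combined with OR, a participant with controlled blood pressure who takes antihypertensive medication is still classified as hypertensive.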
Standard quality control was applied to the genotyped datasets, aligned to the human genome reference GRCh37, using PLINK2 to filter unreliable variants. Because the cohorts differ in ancestry, principal component analysis was used to assess population structure, with tools tailored for admixed populations.
Global ancestry was inferred using ADMIXTURE, with reference samples from the Human Genome Diversity Project and the 1000 Genomes Project phase 3. The reference groups included European, South Asian, East Asian, African, and Native American populations.
Genetic variants from all datasets were analyzed, and association studies were performed using tools such as BOLT-LMM and GARSA. Independent SNPs reaching genome-wide significance were identified and formed the basis of the PRS.
The PRS was modeled using the LDpred2-auto algorithm, a Bayesian approach that iteratively re-estimates SNP effect sizes. Using the LDpred2-adjusted effect sizes, the PRS was calculated in a UKB validation cohort and then tested in the Brazilian cohorts.
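Once adjusted effect sizes are available, scoring an individual is conventionally a weighted sum of effect-allele dosages, as in the Python sketch below. It illustrates only this scoring step, not LDpred2 itself, and is not the study's pipeline: the function name polygenic_score, the three weights, and the dosages are hypothetical values chosen for illustration.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0 to 2 per SNP) and per-SNP effect sizes."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same SNPs, in the same order")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical adjusted effect sizes for three SNPs (mmHg per effect allele).
weights = [0.5, -0.25, 1.0]

# Hypothetical effect-allele counts for one individual at the same three SNPs.
dosages = [2, 1, 0]

score = polygenic_score(dosages, weights)  # 2*0.5 + 1*(-0.25) + 0*1.0
```

The same weights are applied unchanged to every cohort, which is why effect sizes estimated in a European sample can transfer poorly when allele frequencies and linkage patterns differ in the target population.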
Baseline characteristics of each cohort, both continuous and categorical, were summarized using standard statistical methods, and the SBP heritability attributable to the genetic variants used in the PRS was estimated. The analysis also tested the association of the PRS with hypertension, providing insight into the model’s predictive power.
Predictive models for hypertension were developed separately for the UKB, Baependi, and Pelotas datasets, incorporating clinical data and the PRS. Model performance was assessed with several measures, including AUC and F1 score, and the DeLong test was used to determine whether differences in performance were statistically significant.
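For readers less familiar with the two metrics named above, the following Python sketch computes them from first principles. It is an illustration under stated assumptions, not the study's evaluation code: the function names roc_auc and f1_score, the labels, the predicted risks, and the 0.5 classification threshold are all hypothetical, and the DeLong test is not implemented here.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random case scores above a random control (ties count 0.5)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("both cases and controls are required")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def f1_score(labels, predictions):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical hypertension labels (1 = hypertensive) and model-predicted risks.
labels = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(labels, risks)
predictions = [1 if r >= 0.5 else 0 for r in risks]  # assumed threshold of 0.5
f1 = f1_score(labels, predictions)
```

AUC depends only on how the model ranks individuals, whereas F1 depends on the chosen classification threshold, so the two measures can disagree about which model performs better.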
Overall, the study highlights the importance of adapting genetic screening tools such as PRS to the diverse genetic backgrounds of admixed populations in order to improve their predictive efficacy and reliability.