Telomere-to-telomere Genome Assembly of Sorghum: Pioneering a Gap-free Genetic Blueprint

In an ambitious stride towards understanding the genetic intricacies of CHBZ, a variety of sorghum, researchers have leveraged cutting-edge sequencing technologies to produce the first telomere-to-telomere (T2T) gap-free genome assembly. This monumental effort not only paves the way for revolutionary agricultural advancements but also provides an invaluable framework for future genetic studies of sorghum and other crops.

The journey began at the Center for Agricultural Genetic Resources Research, Shanxi Agricultural University, Taiyuan, China, where CHBZ sorghum was cultivated. Young seedlings were flash-frozen in liquid nitrogen and stored at -80 °C. DNA was extracted using the cetyltrimethylammonium bromide (CTAB) method, and its quality and concentration were then assessed with NanoDrop One and Qubit 3.0 instruments.

The study combined Pacific Biosciences’ high-fidelity (HiFi) sequencing, Oxford Nanopore Technologies’ (ONT) ultra-long sequencing, and high-throughput chromosome conformation capture (Hi-C) sequencing. In total, the sequencing efforts yielded 304.06 Gb of ONT reads, 28.65 Gb of PacBio HiFi CCS reads, and 304.93 Gb of Hi-C data, which together provided deep, complementary coverage of the sorghum genome.
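To put those yields in perspective, sequencing depth can be approximated as total bases sequenced divided by genome size. The sketch below uses the three yields quoted above and the 724.85 Mb assembly size reported in this article; the function name `depth` and the 1 Gb = 1000 Mb conversion are illustrative assumptions, not something taken from the study.

```python
# Approximate sequencing depth = total bases sequenced / genome size.
# Assumption: the 724.85 Mb assembly size stands in for the genome size.
ASSEMBLY_SIZE_MB = 724.85

# Yields in gigabases, as quoted in the article.
yields_gb = {
    "ONT ultra-long": 304.06,
    "PacBio HiFi": 28.65,
    "Hi-C": 304.93,
}


def depth(yield_gb: float, genome_mb: float) -> float:
    """Return fold-coverage, converting Gb to Mb with 1 Gb = 1000 Mb."""
    return yield_gb * 1000.0 / genome_mb


for platform, gb in yields_gb.items():
    print(f"{platform}: ~{depth(gb, ASSEMBLY_SIZE_MB):.0f}x")
```

By this rough estimate, the HiFi data corresponds to about 40x coverage, while the ONT and Hi-C datasets each exceed 400x.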

The initial assemblies were constructed with hifiasm from HiFi and ONT reads together with Hi-C data. This was followed by gap closing and polishing, including the use of LR_Gapcloser with error-corrected ONT reads and several rounds of polishing, to reach a refined T2T assembly. The end product was a genome assembly of approximately 724.85 Mb organized into 10 pseudochromosomes, matching sorghum's haploid chromosome number.
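In practice, "gap-free" means that no stretches of unknown bases (runs of `N`) remain in any pseudochromosome. A minimal sketch of such a check is shown below; the helper names `read_fasta` and `count_gaps` and the toy sequences are hypothetical, and this is not the validation procedure used in the study.

```python
import re


def read_fasta(text: str) -> dict[str, str]:
    """Parse FASTA-formatted text into a {record name: sequence} mapping."""
    chunks: dict[str, list[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def count_gaps(sequence: str) -> int:
    """Count runs of one or more N/n characters (assembly gaps)."""
    return len(re.findall(r"[Nn]+", sequence))


# Toy example: chr1 has two gaps, chr2 is gap-free.
toy = ">chr1\nACGTNNNNACGT\nACGNNACGT\n>chr2\nACGTACGT\n"
gaps = {name: count_gaps(seq) for name, seq in read_fasta(toy).items()}
print(gaps)
```

For a real assembly the file would be read in streaming fashion rather than loaded as one string, but the test is the same: every chromosome should report zero gaps.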

A critical part of this study involved annotating repeat elements and genes within the assembled genome. RepeatModeler, LTR_FINDER, RepeatMasker, and Tandem Repeats Finder were used to characterize the repetitive landscape, revealing that repeats make up approximately 70.41% of the sequence. In parallel, gene prediction combined transcriptome-based, homology-based, and ab initio approaches, yielding 32,855 protein-coding genes.
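Two quick calculations help make these figures concrete: the amount of sequence the repeat fraction represents and the average gene density. The sketch below uses only numbers quoted in this article. It also includes a hypothetical helper, `masked_fraction`, that measures the repeat fraction of a soft-masked sequence, on the assumption that repeats are written in lowercase (as RepeatMasker does when run with `-xsmall`); whether the study used soft masking is not stated here.

```python
ASSEMBLY_SIZE_MB = 724.85   # assembly size quoted in the article
REPEAT_FRACTION = 0.7041    # 70.41% repeats, as quoted
GENE_COUNT = 32_855         # protein-coding genes, as quoted

# Sequence occupied by repeats, in megabases.
repeat_mb = ASSEMBLY_SIZE_MB * REPEAT_FRACTION

# Average number of protein-coding genes per megabase.
genes_per_mb = GENE_COUNT / ASSEMBLY_SIZE_MB

print(f"Repeats: ~{repeat_mb:.2f} Mb")
print(f"Gene density: ~{genes_per_mb:.1f} genes per Mb")


def masked_fraction(sequence: str) -> float:
    """Fraction of bases in lowercase, i.e. soft-masked as repeats."""
    if not sequence:
        return 0.0
    masked = sum(1 for base in sequence if base.islower())
    return masked / len(sequence)


# Toy sequence: 4 of 10 bases are soft-masked.
print(masked_fraction("ACGTacgtAC"))
```

In other words, roughly 510 Mb of the assembly is repetitive, with about 45 genes per megabase on average.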

Gene expression profiling and synteny analysis, both important for understanding gene function and evolutionary history, identified 24,685 orthologous gene pairs between the CHBZ and BTx623 sorghum genomes. The study also examined presence/absence variants (PAVs), identifying genes that distinguish the CHBZ genome from other sorghum genomes.
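Conceptually, a first pass at presence/absence variation can be framed as set arithmetic: genes in one genome that have no ortholog in the other are PAV candidates. The sketch below illustrates only that idea; the gene identifiers and the function `split_by_presence` are hypothetical, and the article does not describe how the study actually called PAVs.

```python
def split_by_presence(genes_a, genes_b, ortholog_pairs):
    """Return genes lacking an ortholog in the other genome.

    ortholog_pairs is an iterable of (gene_in_a, gene_in_b) tuples.
    """
    paired_a = {a for a, _ in ortholog_pairs}
    paired_b = {b for _, b in ortholog_pairs}
    only_a = sorted(set(genes_a) - paired_a)
    only_b = sorted(set(genes_b) - paired_b)
    return only_a, only_b


# Hypothetical gene identifiers, for illustration only.
chbz_genes = ["chbz_g1", "chbz_g2", "chbz_g3", "chbz_g4"]
btx623_genes = ["btx_g1", "btx_g2", "btx_g3"]
pairs = [("chbz_g1", "btx_g1"), ("chbz_g2", "btx_g2")]

chbz_only, btx623_only = split_by_presence(chbz_genes, btx623_genes, pairs)
print(chbz_only)
print(btx623_only)
```

A gene without an ortholog is only a candidate: confirming true absence usually requires checking that the sequence itself is missing from the other genome, not just the annotation.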

The researchers used quarTeT to identify centromeres and telomeres. The centromeric regions were dominated by repetitive sequences, and the telomere calls underline the completeness of the telomere-to-telomere assembly.
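The idea behind telomere detection can be sketched simply: look for tandem copies of the telomeric repeat at each chromosome end. The example below assumes the canonical plant (Arabidopsis-type) motif `TTTAGGG`, which appears as its reverse complement `CCCTAAA` at the start of the forward strand. It is not quarTeT's algorithm, and the window size, copy threshold, and function names are illustrative assumptions.

```python
import re

# Assumption: canonical plant telomere repeat (Arabidopsis-type).
PLANT_TELOMERE = "TTTAGGG"


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def telomere_copies(window: str, motif: str) -> int:
    """Length, in copies, of the longest tandem run of motif in window."""
    runs = re.findall(f"(?:{motif})+", window.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


def has_both_telomeres(chrom: str, window: int = 200, min_copies: int = 3) -> bool:
    """Check both chromosome ends for tandem telomeric repeats."""
    left = telomere_copies(chrom[:window], revcomp(PLANT_TELOMERE))
    right = telomere_copies(chrom[-window:], PLANT_TELOMERE)
    return left >= min_copies and right >= min_copies


# Toy chromosomes: one capped at both ends, one missing its right telomere.
body = "ACGTACGTAGCTAGCTAG" * 3
complete = "CCCTAAA" * 5 + body + "TTTAGGG" * 5
truncated = "CCCTAAA" * 5 + body

print(has_both_telomeres(complete))
print(has_both_telomeres(truncated))
```

In a T2T assembly, one would expect a check along these lines to succeed for every pseudochromosome; real telomeres span far more copies than this toy example.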

This study’s implications extend far beyond its immediate findings. By achieving a gap-free assembly of the CHBZ sorghum genome, the researchers have set a new benchmark for genetic studies in agriculture. Furthermore, the methodology and insights gleaned from this research can propel genomics research into a new era of precision and comprehensiveness. As we look towards future applications, the integration of this high-quality genomic data holds promise for enhancing crop resilience, improving yield, and fostering a deeper understanding of plant genetics.

In the field of agricultural genetics, the completion of the CHBZ sorghum T2T genome assembly marks a significant milestone. It not only enriches our genetic knowledge bank but also equips breeders and researchers with the tools to innovate and improve crop varieties, ensuring food security and sustainability for future generations.
