Figure 1. Overview of Genome and Multi-omics Analysis in the Tohoku Medical Megabank Project. The TMM Project recruited 80,000 participants for its community-based cohort and 70,000 participants for its Birth-and-Three-Generation cohorts. As illustrated, various types of samples have been collected along with data on participants’ health, lifestyle, and medical history. Follow-up surveys are conducted every 5 years. Utilizing these samples and their associated rich datasets, genome and multi-omics analyses are actively being carried out.
TMM: Tohoku Medical Megabank.
From: Advancements in Whole-Genome Sequencing Protocols: A Decade of In-House Operations and Quality Controls at the Tohoku Medical Megabank

Figure 3. Annual progress of the TMM WGS project. The TMM genome reference panels have been released continuously since 2015 (red), and the TMM repositories, including samples related to family pedigrees, have been available since 2022 (blue). The schematic diagram on the left shows the HiSeq 2500, which was used at the project’s inception, while the diagram on the right shows the NovaSeq 6000, which accelerated the project. In March 2024, we reached the milestone of completing WGS of 100,000 participants. Informatics work is underway to release a new reference panel, TMM-61KJPN, and a repository based on this dataset in 2025.
TMM: Tohoku Medical Megabank; WGS: whole-genome sequencing.

Figure 7. Representative data of iDeal-based sequencing. A total of 96 pooled libraries were sequenced on the NovaSeq 6000 using three S4 flow cells. In the first run, the libraries were pooled in equal volumes and sequenced to obtain the relative concentration of each library from its index ratio. Using these relative concentrations, the 96 libraries were then re-pooled with adjusted volumes and sequenced on the remaining two flow cells so that the final data yield was consistent across libraries. The number of reads from each library in the first run is shown in blue, while those from the second and third runs are shown in orange and red, respectively.
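The volume adjustment described in this caption can be illustrated with a minimal sketch. It assumes that each library's read count in the first run is proportional to its effective concentration and that yield scales linearly with pooled volume; the caption does not give the actual calculation, so the function name `repool_volumes` and the `base_volume_ul` parameter are hypothetical.

```python
def repool_volumes(first_run_reads, base_volume_ul=10.0):
    """Return per-library volumes intended to equalize read yield.

    first_run_reads: dict mapping library name -> read count from the
    equal-volume first run (used as a proxy for relative concentration).
    base_volume_ul: hypothetical volume given to a library with exactly
    average yield.
    """
    if not first_run_reads:
        raise ValueError("no libraries given")
    if any(n <= 0 for n in first_run_reads.values()):
        raise ValueError("every library needs a positive read count")

    mean_reads = sum(first_run_reads.values()) / len(first_run_reads)
    # A library that yielded fewer reads than average gets proportionally
    # more volume, and vice versa (assumed linear relationship).
    return {
        lib: base_volume_ul * mean_reads / n
        for lib, n in first_run_reads.items()
    }


# Illustrative read counts, not data from the figure.
reads = {"lib_A": 8_000_000, "lib_B": 10_000_000, "lib_C": 12_000_000}
volumes = repool_volumes(reads)
```

Under these assumptions, the product of adjusted volume and first-run read count is the same for every library, which is the condition for an even yield in the later runs.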

Figure 8. Snapshot of mean coverage data at the CYP2D6 gene locus. The data are displayed as shown in the JBrowse genome browser embedded in jMorp. The chromosome position and RefSeq genes at the locus are shown at the top. The mean coverage, restricted to reads with a mapping quality score (MAPQ) ≥20, is calculated from the WGS data of 1000 samples sequenced on each platform and protocol shown on the left. Regions inaccessible to all protocols are indicated by blue arrows, while a region accessible with the 161 or 162 bp PE protocol is marked with red arrows. Regions where DNBSEQ-T7 shows low mean coverage are indicated by green arrowheads.
PE: paired-end.
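As a rough illustration of the mean coverage track in this figure, the sketch below computes per-position depth from reads with MAPQ ≥20 and averages it across samples. It is a toy model only: reads are represented as hypothetical `(start, end, mapq)` tuples with 0-based half-open coordinates, and the function names are assumptions rather than the pipeline used for jMorp.

```python
def depth_per_position(reads, region_start, region_end, min_mapq=20):
    """Count reads covering each position of [region_start, region_end).

    reads: iterable of (start, end, mapq) tuples, 0-based half-open.
    Reads with mapq below min_mapq are ignored.
    """
    depth = [0] * (region_end - region_start)
    for start, end, mapq in reads:
        if mapq < min_mapq:
            continue
        # Clip the read to the region before counting.
        lo = max(start, region_start)
        hi = min(end, region_end)
        for pos in range(lo, hi):
            depth[pos - region_start] += 1
    return depth


def mean_coverage(samples, region_start, region_end, min_mapq=20):
    """Average the per-position depth over all samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    per_sample = [
        depth_per_position(reads, region_start, region_end, min_mapq)
        for reads in samples
    ]
    return [sum(column) / len(per_sample) for column in zip(*per_sample)]


# Two toy samples; the second read of sample 1 is dropped (MAPQ 10 < 20).
samples = [
    [(0, 4, 60), (2, 6, 10)],
    [(1, 5, 30)],
]
coverage = mean_coverage(samples, 0, 6)
```

Positions where the averaged depth stays near zero for every protocol would correspond to the inaccessible regions marked in the figure.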
