Corresponding author: Kenji Takehara, firstname.lastname@example.org
Received: November 25, 2022
Accepted: April 12, 2023
Advance Publication: June 19, 2023
Published: July 14, 2023
Cite this article as:
Suto M, Sugiyama T, Imai K, Furuno T, Hosozawa M, Ichinose Y, Ihana-Sugiyama N, Kodama T, Koizumi R, Shimizu-Motohashi Y, Murata S, Nakamura Y, Niino M, Sato M, Taguchi R, Takegami M, Tanaka M, Tsutsumimoto K, Usuda K, Takehara K, Iso H. Studies of Health Insurance Claims Data in Japan: A Scoping Review. JMA J. 2023;6(3):233-245.
Background: Health insurance claims data are used in various research fields; however, an overview on how they are used in healthcare research is scarce in Japan. Therefore, we conducted a scoping review to systematically map the relevant studies using Japanese claims data.
Methods: MEDLINE, EMBASE, and Ichushi-Web were searched up to April 2021 for studies using Japanese healthcare claims data. We abstracted the data on study characteristics and summarized target diseases and research themes by the types of claims database. Moreover, we described the results of studies that aimed to compare health insurance claims data with other data sources narratively.
Results: A total of 1,493 studies were included. Overall, the most common disease classifications were “Diseases of the circulatory system” (18.8%, n = 281), “Endocrine, nutritional, and metabolic diseases” (11.5%, n = 171; mostly diabetes), and “Neoplasms” (10.9%, n = 162), and the most common research themes were “medical treatment status” (30.0%, n = 448), “intervention effect” (29.9%, n = 447), and “clinical epidemiology, course of diseases” (27.9%, n = 417). Frequent diseases and themes varied by type of claims databases. A total of 19 studies aimed to assess the validity of the claims-based definition, and 21 aimed to compare the results of claims data with other data sources. Most studies that assessed the validity of claims data compared to medical records were hospital-based, with a small number of institutions.
Conclusions: Claims data are used in various research areas and will increasingly provide important evidence for healthcare policy in Japan. It is important to use previous claims database studies and share information on methodology among researchers, including validation studies, while informing policymakers about the applicability of claims data for healthcare planning and management.
Key words: health insurance claims, validation studies, healthcare policy, Japan, scoping review
Understanding disease and treatment patterns is important for developing appropriate health policies at the national and regional levels. For healthcare research, the use of healthcare databases that represent routine clinical practice has several advantages: an adequate population to study rare events, the reflection of real-world effectiveness and practice patterns, and a relatively low cost and short time (1).
One of the most used healthcare databases involves health insurance claims data for healthcare services, procedures, and pharmaceuticals (1). These claims data are used in various research areas, such as health service utilization, cost analysis, intervention and evaluation studies, drug risk assessment, health policy research, and guideline adherence (2), (3), (4). These data have the potential to provide important evidence for Japanese healthcare policies. To conduct future claims database studies, a systematic summary of previous studies would be a useful resource from several perspectives. First, a research overview of claims database studies will reveal the well-addressed research areas (e.g., targeted diseases and research themes) and those that warrant further research. Second, the list of relevant previous studies will provide methodological guidance for future research, including how to define diseases using diagnostic and procedure codes. In addition, findings regarding what data sources have been used for comparison are important in claims database studies (1), (5).
Some review studies and data profiles have described the research areas and validation studies for Japanese claims database studies. However, these reviews included only certain claims databases, such as the National Database of Health Insurance Claims (NDB) (6) and JMDC Claims Database (JMDC) (7); the overall distribution of research areas, as well as differences by type of claims database, has not yet been revealed. Regarding validation studies, one review study searched PubMed and reported the studies published in English (5). However, a comprehensive review, including Japanese electronic sources, remains lacking. A thorough literature review of previous claims database studies, using multiple electronic sources, is required to facilitate claims database study in Japan. Therefore, this scoping review, which systematically mapped the studies using claims data in Japan, aimed to investigate (1) the distribution of target diseases in each study, (2) the details and distribution of research themes, and (3) the types of studies that aimed to assess the validity of claims data or compare their results with other data sources, such as medical records, registries, and surveillance data.
Studies using Japanese health insurance claims data published after 2010 in Japanese or English were included.
Studies were excluded if they used only health checkup data or long-term care insurance claims data; used publicly available data that were freely accessible on the Internet (e.g., NDB Open Data, which are summary tables of NDB data compiled by the government); combined claims data with primary research (including controlled trials, cohort studies, and surveys); did not report original data (e.g., editorials, commentaries, reviews, and conference abstracts); or were theses. Hospital-based studies involving ten or fewer institutions were also excluded.
Studies that compared results of claims data with other data sources were included regardless of the number of institutions that participated in each study. This was because most studies that assessed the validity of claims data were hospital-based, with fewer institutions, and did not meet the abovementioned inclusion criteria.
To identify potentially relevant documents, electronic sources such as MEDLINE, EMBASE, and Ichushi-Web were searched up to April 2021. Two experienced information specialists assessed the search strategies (Supplementary File 1 shows the complete electronic search strategies).
A total of 14 reviewers working in pairs independently assessed the titles and abstracts retrieved from the electronic searches for review inclusion, using the Rayyan software. We sourced and assessed full papers when their eligibility for this review was unclear from the title and abstract alone. One of the reviewers conducted full-text screening and extracted data from the studies that potentially met our inclusion criteria, using the data-extracting form and manual developed for this review through discussion. Another reviewer (MS) confirmed the decisions regarding inclusion/exclusion and results of data extraction throughout the entire study to ensure consistency of categories.
We abstracted data on publication year, type of claims database (classified as NDB; JMDC; Medical Data Vision EBM Provider (MDV); Diagnosis Procedure Combination Database (DPC); National Health Insurance (NHI), including Kokuho Database (KDB)/Latter-Stage Elderly Healthcare System (LSEHS); Japan Health Insurance Association (JHIA); and Other/Multiple), study setting (national, regional, and others), age of study sample (children, older persons, and others), targeted disease (International Classification of Diseases-10 (ICD-10) chapter classification), and research theme (classified into 12 categories, drawing on related studies (2), (3), (4)). For targeted disease and research themes, we selected up to two categories for each study (studies that targeted more than two diseases were classified as “others”).
We summarized the study characteristics (setting and age of study sample) (Table 1), target diseases (Table 2), and research themes (Table 3) by type of claims database. We identified studies that aimed to assess the validity of claims data or compare their results with other data sources and described their results narratively. We conducted this review according to the PRISMA-ScR reporting guideline (Supplementary File 2) (8).
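The coding rule for targeted diseases can be sketched in a few lines of Python. This is a minimal illustration, not the extraction form used in the review: the function name and the list-of-strings input are assumptions, and treating a study with no specific disease as “Others” follows the label of the “Others” row in Table 2.

```python
from typing import List


def assign_disease_categories(icd10_chapters: List[str], max_categories: int = 2) -> List[str]:
    """Return up to two ICD-10 chapter labels, or ["Others"] otherwise.

    Hypothetical helper: the name and input format are illustrative only.
    """
    # Drop duplicate chapters while keeping the order in which they were recorded.
    unique = list(dict.fromkeys(icd10_chapters))
    # No specific disease, or more than two diseases, is coded as "Others".
    if not unique or len(unique) > max_categories:
        return ["Others"]
    return unique


two = assign_disease_categories(
    ["Diseases of the circulatory system", "Endocrine, nutritional, and metabolic diseases"]
)
three = assign_disease_categories(
    ["Neoplasms", "Diseases of the respiratory system", "Diseases of the digestive system"]
)
print(two)
print(three)
```

Because a study can carry two categories, the category counts in the tables can sum to more than the number of included studies.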
Table 1. Characteristics of Included Studies.
NDB, National Database of Health Insurance Claims; JMDC, JMDC Claims Database; MDV, Medical Data Vision EBM Provider; DPC, Diagnosis Procedure Combination Database; NHI, National Health Insurance, including Kokuho Database (KDB)/Latter-Stage Elderly Healthcare System (LSEHS); JHIA, Japan Health Insurance Association
Table 2. Disease Characteristics by Type of Claims Databases.
|Disease classification (ICD-10 chapter)||Total, n||%||NDB, n||%||JMDC, n||%||MDV, n||%||DPC, n||%||NHI/LSEHS, n||%||JHIA, n||%||Other/Multiple, n||%||Children, n||%||Older persons, n||%||Others (age), n||%|
|1) Certain infectious and parasitic diseases||117||7.8||15||14.7||29||8.5||16||8.6||41||7.0||4||2.5||0||0.0||12||11.1||14||14.9||0||0.0||103||8.1|
|3) Diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism||19||1.3||0||0.0||0||0.0||3||1.6||15||2.6||0||0.0||0||0.0||1||0.9||4||4.3||0||0.0||15||1.2|
|4) Endocrine, nutritional, and metabolic diseases||171||11.5||11||10.8||62||18.2||40||21.4||9||1.5||21||13.1||3||25.0||25||23.1||5||5.3||4||3.0||162||12.8|
|5) Mental, behavioral, and neurodevelopmental disorders||110||7.4||13||12.7||47||13.8||2||1.1||22||3.8||13||8.1||3||25.0||10||9.3||6||6.4||13||9.7||91||7.2|
|6) Diseases of the nervous system||69||4.6||4||3.9||18||5.3||10||5.3||24||4.1||7||4.4||0||0.0||6||5.6||3||3.2||9||6.7||57||4.5|
|7) Diseases of the eye and adnexa||22||1.5||1||1.0||13||3.8||1||0.5||3||0.5||0||0.0||0||0.0||4||3.7||0||0.0||0||0.0||22||1.7|
|8) Diseases of the ear and mastoid process||4||0.3||0||0.0||3||0.9||0||0.0||1||0.2||0||0.0||0||0.0||0||0.0||3||3.2||0||0.0||1||0.1|
|9) Diseases of the circulatory system||281||18.8||11||10.8||49||14.4||41||21.9||138||23.6||21||13.1||3||25.0||18||16.7||3||3.2||11||8.2||267||21.1|
|10) Diseases of the respiratory system||135||9.0||11||10.8||38||11.2||10||5.3||56||9.6||13||8.1||0||0.0||7||6.5||24||25.5||22||16.4||89||7.0|
|11) Diseases of the digestive system||115||7.7||6||5.9||24||7.1||8||4.3||61||10.4||11||6.9||1||8.3||4||3.7||6||6.4||8||6.0||101||8.0|
|12) Diseases of the skin and subcutaneous tissue||17||1.1||2||2.0||9||2.6||1||0.5||3||0.5||1||0.6||0||0.0||1||0.9||3||3.2||1||0.7||13||1.0|
|13) Diseases of the musculoskeletal system and connective tissue||93||6.2||7||6.9||25||7.4||16||8.6||36||6.2||5||3.1||0||0.0||4||3.7||5||5.3||7||5.2||81||6.4|
|14) Diseases of the genitourinary system||75||5.0||4||3.9||15||4.4||9||4.8||34||5.8||7||4.4||1||8.3||5||4.6||1||1.1||5||3.7||69||5.5|
|15) Pregnancy, childbirth, and the puerperium||25||1.7||4||3.9||9||2.6||1||0.5||11||1.9||0||0.0||0||0.0||0||0.0||0||0.0||0||0.0||25||2.0|
|16) Certain conditions originating in the perinatal period||8||0.5||0||0.0||3||0.9||1||0.5||3||0.5||0||0.0||0||0.0||1||0.9||4||4.3||0||0.0||4||0.3|
|17) Congenital malformations, deformations, and chromosomal abnormalities||13||0.9||0||0.0||3||0.9||1||0.5||8||1.4||0||0.0||0||0.0||1||0.9||11||11.7||0||0.0||2||0.2|
|18) Injury, poisoning, and certain other consequences of external causes||110||7.4||11||10.8||9||2.6||3||1.6||77||13.2||9||5.6||0||0.0||1||0.9||7||7.4||22||16.4||81||6.4|
|19) Others (multidisease, not focused on specific diseases)||238||15.9||21||20.6||41||12.1||21||11.2||69||11.8||68||42.5||0||0.0||18||16.7||16||17.0||50||37.3||172||13.6|
|*Up to two diseases could be chosen.|
Table 3. Research Theme by Type of Claims Databases.
|Research theme||Total, n||%||NDB, n||%||JMDC, n||%||MDV, n||%||DPC, n||%||NHI/LSEHS, n||%||JHIA, n||%||Other/Multiple, n||%|
|Medical treatment status||448||30.0||40||39.2||149||43.8||73||39.0||106||18.2||43||26.9||2||16.7||35||32.4|
|Clinical epidemiology, course of diseases||417||27.9||33||32.4||92||27.1||50||26.7||187||32.0||36||22.5||4||33.3||15||13.9|
|Health policy evaluation and utilization||186||12.5||9||8.8||23||6.8||8||4.3||78||13.4||49||30.6||4||33.3||15||13.9|
|Quality of care||91||6.1||11||10.8||30||8.8||9||4.8||29||5.0||8||5.0||0||0.0||4||3.7|
|Patient health service utilization||76||5.1||1||1.0||31||9.1||6||3.2||3||0.5||16||10.0||2||16.7||17||15.7|
|*Up to two themes could be chosen.|
After duplicates were removed, 3,943 citations were identified from electronic searches. Based on the title and abstract, 1,992 were excluded. A total of 1,951 sources were assessed for eligibility in the full-text screening, and 458 were excluded. The remaining 1,493 studies were considered eligible for this review. Figure 1 shows the study selection flowchart; Supplementary File 3 shows the excluded studies, with reasons for exclusion in the full-text screening.
The number of published studies using health insurance claims data has increased since 2010 (Figure 2). We grouped these studies by the type of claims database and described the study setting and age of study sample for each group (Table 1). Supplementary File 4 shows characteristics of the individual included studies. The largest number of studies used DPC (n = 584), followed by JMDC (n = 340). We found 102 NDB studies, including 21 studies using NDB sampling data and 12 using accumulated NDB data. In the DPC studies, several types of databases were found: (1) data collected by DPC study groups and institutions, such as DPC Research Institute, Quality Indicator/Improvement Project (QIP) database, and National Hospital Organization and (2) data combined with specific disease registries, such as Hospital-based Cancer Registries, J-ASPECT Study (nationwide stroke registry), and JROAD-DPC (Japanese Registry Of All cardiac and vascular Diseases). The “Other” category included (1) health insurance society-based claims databases, such as JammNet claims database, MinaCare, and other corporate health insurance societies; (2) pharmacy claims data, such as IQVIA NPA data, Medi-Trend (Kyowa Kikaku), and Nihon-Chouzai pharmacy claims database; (3) databases sourced from medical institutions; and (4) studies using multiple claims databases together (e.g., JMDC and MDV).
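The selection counts above can be checked with simple arithmetic. The short Python sketch below uses only the numbers reported in this paragraph; the variable names are illustrative and do not come from the review.

```python
# Counts reported for the study selection process.
identified_after_deduplication = 3943
excluded_on_title_abstract = 1992
excluded_on_full_text = 458

# Records that went on to full-text screening.
full_text_assessed = identified_after_deduplication - excluded_on_title_abstract
# Studies that remained after full-text exclusions.
included = full_text_assessed - excluded_on_full_text

print(f"Full texts assessed: {full_text_assessed}")
print(f"Studies included: {included}")
```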
NDB, JMDC, MDV, and DPC were mostly analyzed at the nationwide level, while NHI/LSEHS and JHIA were examined at the regional level. Regarding the age of the study sample, half of the studies (80/160 studies) using NHI and LSEHS, which are municipality-based claims databases and do not include the employee health insurance claims data, targeted the older population (several studies combined with long-term care insurance data), and only three studies targeted children. On the other hand, in JMDC studies, which collected data from corporate health insurance societies, 33 studies targeted children and only 9 targeted the older population.
We summarized target diseases for each type of claims database (Table 2). Overall, the most common disease classifications were “Diseases of the circulatory system” (18.8%, n = 281/1493), “Endocrine, nutritional, and metabolic diseases” (11.5%, n = 171/1493; the vast majority dealt with diabetes), and “Neoplasms” (10.9%, n = 162/1493). On the other hand, the number of studies regarding blood and immune system diseases, as well as eye, ear, skin, and perinatal diseases, was small.
The frequency of diseases varied depending on the type of claims database. There were only a few studies on “Endocrine, nutritional, and metabolic diseases,” including diabetes, in DPC (1.5%, n = 9/584), while there were many such studies in JMDC and MDV. For “Neoplasms,” there were few studies in NDB (5.9%, n = 6/102), JMDC (6.5%, n = 22/340), and NHI/LSEHS (2.5%, n = 4/160).
Of the included studies, 238 were classified as “Others.” This category included studies on topics such as antimicrobial use, healthcare costs, and hospitalization or intensive care, as well as studies dealing with multiple diseases. Many NHI/LSEHS studies, especially those targeting the older population, did not focus on a specific disease. Instead, they dealt with healthcare costs or delivery systems, such as hospitalization and home healthcare.
In studies of children, respiratory and infectious diseases, such as asthma, upper respiratory tract infection, and influenza, were more common. In contrast, pneumonia and fractures were more common in studies of the older population. Supplementary File 5 shows a list of frequently occurring diseases by ICD-10 chapter classification.
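The percentages quoted above are shares of the studies within each database (or of all 1,493 included studies), rounded to one decimal place. The Python sketch below recomputes them from the counts given in the text; the helper name `share` and the tuple layout are assumptions made for illustration.

```python
def share(n: int, total: int) -> float:
    """Percentage of studies, rounded to one decimal place as reported in the text."""
    return round(100 * n / total, 1)


# (label, number of studies, denominator, percentage reported in the text)
reported = [
    ("Circulatory system, all databases", 281, 1493, 18.8),
    ("Endocrine/metabolic, all databases", 171, 1493, 11.5),
    ("Neoplasms, all databases", 162, 1493, 10.9),
    ("Endocrine/metabolic, DPC", 9, 584, 1.5),
    ("Neoplasms, NDB", 6, 102, 5.9),
    ("Neoplasms, JMDC", 22, 340, 6.5),
    ("Neoplasms, NHI/LSEHS", 4, 160, 2.5),
]

# Collect any rows where the recomputed share differs from the reported one.
mismatches = [row for row in reported if share(row[1], row[2]) != row[3]]
for label, n, total, pct in reported:
    print(f"{label}: {n}/{total} = {share(n, total)}% (reported {pct}%)")
```

Because each study could be assigned up to two disease categories, these shares are not expected to sum to 100% within a database.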
We summarized research themes for each type of claims database (Table 3). Overall, the most common research objectives were “medical treatment status” (describe the patterns of providing medical care, such as diagnosis, treatment, tests, and prescriptions; 30.0%, n = 448/1493), “intervention effect” (examine the effects and risks of treatment, such as surgery, prescriptions, rehabilitation; 29.9%, n = 447/1493), and “clinical epidemiology, course of diseases” (examine the prevalence, risk factors, prognosis, etc.; 27.9%, n = 417/1493). These three research themes were common to all disease categories (Supplementary File 6). Table 4 shows more specific research themes.
Table 4. List of Research Theme.
|Medical treatment status||Diagnosis pattern|
|Screening, test, monitoring (frequency, rate)|
|Hospitalization, readmission, NICU|
|Intervention effect||Effectiveness (treatment, prevention): surgery, procedures, drug, vaccine, periodontal management, follow-up, rehabilitation, health guidance, etc.|
|Clinical epidemiology, course of diseases||Prevalence, incidence, number of patients|
|Association, risk factor, predictor, prognostic factor|
|Clinical characteristics (age, sex)|
|Comorbidity, underlying conditions|
|Mortality, survival, prognosis|
|Surveillance, seasonality, yearly change|
|Factors associated with medical expenditure|
|Medical billing issues|
|Health policy evaluation and utilization||Health policy impact: medical subsidy, payment system, drug approval, regulatory action on drug, labeling change on prescriptions, OTC switching, clinical guideline, information campaign, etc.|
|Volume-outcome relationship (other healthcare factors)|
|Supply and demand for healthcare|
|Analysis of healthcare region|
|Use for healthcare planning|
|Evaluation of resource consumption|
|Analysis of home care service utilization|
|Quality of care||Adherence to clinical guideline|
|Research methodology||Validation study|
|Comparison with other data sources|
|Definition of death|
|Correction methods for medical fee revisions|
|Development of database|
|Development of indicator/algorithm|
|Patient health service utilization||Adherence, compliance, persistence|
|Patient choice, selection|
|Socioeconomic comparison||Regional variations|
|Age and sex distribution|
|Residence (home, care facilities)|
|Type of health insurance|
|Prediction model||Predict survival/mortality|
|Severe adverse events|
|High-need high-cost patients|
|Prediction model of infectious disease|
|COVID-19||Trends in hospitalizations (NICU) during the COVID-19 outbreak|
|Changes in intervention/care practice|
|Adverse events of COVID-19 vaccines|
|School closure and social distancing for COVID-19|
|Economic impact of COVID-19 pandemic|
|Others||Effect of earthquakes|
By type of claims database, DPC was characterized by a large number of studies on “intervention effect” (44.3%, n = 259/584) and few on “medical treatment status” (18.2%, n = 106/584). Studies using NHI/LSEHS, a municipality-based claims database, often aimed at “health economics” and “health policy evaluation and utilization.” The number of studies also varied by prefecture. The most common region was “Fukuoka,” with 32 studies; although some studies anonymized municipality names, no studies were reported for several prefectures. The claims databases used in several studies on COVID-19 were MDV and DPC (QIP database) (literature search conducted in April 2021).
A total of 19 studies aimed to assess the validity of claims-based definition (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (22), (23), (24), (25), (26), (27) and 21 aimed to compare results of claims data with other data sources, to evaluate the usefulness of claims databases as statistics or survey data (28), (29), (30), (31), (32), (33), (34), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48). Depending on the purpose of the study, the following data sources were used as comparative data for claims data: to assess the accuracy of diagnoses and procedure records in claims data, 13 studies compared the data to medical records and laboratory data (chart review) (10), (11), (12), (13), (17), (19), (20), (21), (22), (23), (24), (26), (27), and 2 compared them to disease registries, linking claims data with individual-based information (9), (25). Two studies assessed claims-based definitions of death using enrollment data (14), (16). In addition, a study conducted a validity assessment of self-reported medication use that was collected in an annual health checkup, by comparing claims data to pharmacy insurance claims (18); one study examined the association between prognostic burn index (from DPC) and mortality (15). On the other hand, to understand the utility of claims data as statistical and survey data, such as the number of patients, incidence of events, and medication use, 4 studies made comparisons with electronic medical records data (29), (35), (47), (48), 4 with government statistics (28), (32), (43), (46), and 13 with disease registries, epidemiological studies, surveillance, post-marketing surveillance, and sales data (30), (31), (33), (34), (36), (37), (38), (39), (40), (41), (42), (44), (45).
The target diseases were mostly related to diabetes and/or cardiovascular diseases (13 studies). Studies that linked and compared results from claims data with individual data from medical records and disease registries were hospital-based, involving a single or few centers (five or fewer), with the exception of one study on cardiovascular disease in diabetic patients (20).
Disease classifications with the highest number of studies were “Diseases of the circulatory system,” “Endocrine, nutritional, and metabolic diseases” (the vast majority of studies dealt with diabetes), and “Neoplasms.” These three classifications cover four of the five major diseases in Japan (cancer, stroke, acute myocardial infarction, diabetes, and psychiatric diseases), which affect a large number of patients and for which local governments are required to monitor healthcare indicators.
This study showed that there were still some disease classifications for which few previous studies existed. Among the five diseases, the number of studies on psychiatric diseases was relatively low. In addition, the number of studies on more specific diseases, such as blood and immune system diseases, as well as eye, ear, skin, and perinatal diseases, was low. For these disease classifications, it would be desirable to promote studies using claims data.
Some disease areas have specific reasons and technical barriers that explain the small number of studies. In recent years, with the increase in the number of DPC studies, studies on diseases with fewer hospitalizations are likely to account for a smaller percentage of total claims database studies. Moreover, in practice areas where there are nationwide databases, such as the National Clinical Database (with support from the Japan Surgical Society) (49), there would be less demand for the use of claims data. There may also be technical barriers to claims database studies in specific disease areas, such as those with ambiguous diagnostic criteria. Large hospital-related variations make it difficult to determine the case definitions in claims data and describe the disease and treatment pattern. In addition, normal pregnancies and vaginal deliveries are not covered by public health insurance, which makes it difficult to obtain the whole picture of perinatal disease. As described above, it should be noted that studies using a claims database may not be appropriate for some diseases.
The frequency of diseases also varied depending on the type of claims database. In DPC, the number of studies on “Endocrine, nutritional, and metabolic diseases,” including diabetes, was low. JMDC claims databases sourced from health insurance societies include health checkup data, including special health checkups, which help ascertain the health status of insured persons, and they allow patient-based tracking as long as the person remains insured by the same health insurance society (7). These strengths make the database well suited to studies of chronic diseases.
Claims data were used for various research themes. Overall, the most common research objectives were “medical treatment status,” “intervention effect,” and “clinical epidemiology, course of diseases.” These three themes were considered to take advantage of claims data: (1) the studies involved a large population to understand real-world effectiveness and practice patterns for rare diseases (50), (51), (52), (2) these studies aimed to examine the external validity of clinical trials in a real-world setting with claims data (53), and (3) they provided a more detailed picture of the actual state of disease, combining the results of epidemiological studies or other data sources (42).
Studies that used NHI/LSEHS, a municipality-based claims database, often aimed at “health economics” and “health policy evaluation and utilization.” This review found various highly practical studies for municipal administrative planning and management. For instance, a study aimed to identify the number of vulnerable people in neighborhood units by using the NHI database to create an evacuation support plan (54). Some studies proposed to use claims database research as fundamental data for monitoring healthcare indicators in regional healthcare or cost moderation plans (55), (56), (57). While the number of reported papers varies among municipalities, it is important to provide information to policymakers and promote collaboration with researchers to use a claims database in each region.
The distribution of research themes by claims database was affected by the information contained in each database. For instance, DPC data (acute inpatient database) was characterized by many studies on the “intervention effect.” DPC databases include detailed information about patients and hospitalization, making it easier to determine the disease severity and clinical presentation; DPC data comprise basic and clinical information, including the day-to-day status of patients on Form 1 and the H-file (58). Likewise, MDV includes results of blood tests and other laboratory tests, in addition to DPC data information (31); JMDC claims databases sourced from medical institutions also include DPC assessment forms and clinical laboratory test values (59). Although NDB contains limited outcome data or clinical information, such as laboratory test results, compared to the above databases, it includes almost all health insurance claims in Japan and can be used to describe diseases and treatment patterns at the national level. JMDC claims databases sourced from health insurance societies can link household members, which makes it possible to examine the impact of medication during pregnancy on infant outcomes (60), (61) and to analyze patient/spouse pairs (62), (63). NHI and LSEHS, sourced from municipal health insurance societies, also include health checkup results and enable individual-level linkage with long-term care insurance data, thereby expanding research possibilities (64). As of April 2021, MDV and DPC (QIP database) have been used in several studies on COVID-19 (65), (66), (67), (68), (69), (70), (71), (72), (73); these databases are capable of rapid analysis in accordance with social conditions. It is important to consider the characteristics, strengths, and information in claims databases when a researcher plans to conduct a study using these databases.
Research using claims data is expected to significantly contribute to healthcare research. Nevertheless, a major challenge in using claims data for research purposes involves ensuring the data’s validity. Claims data are not collected for the primary purpose of research, so their quality may not be as robust as primary data collection (74). Therefore, validation studies are important to ensure the credibility of results.
We identified two types of studies that compared claims data with other data sources: (1) studies that aimed to assess the validity of claims data and (2) those that aimed to evaluate the utility of claims data as statistical or survey data. For studies that aimed to assess the validity of data, the comparisons were conducted either at the aggregate level, such as using surveillance and statistical data, or at the individual data level, such as using medical records. Most of the studies that conducted chart reviews with medical records were hospital-based, with a small number of institutions. In these hospital-based studies, the authors mentioned a limitation in that it was unclear whether these results could be generalized to other hospitals. To ensure the credibility of results from claims data, more extensive support is needed (5).
Our scoping review has several limitations. First, although the search strategies were determined by experienced information specialists, our review failed to include some claims database studies. The search strategies in this review contained typical claims database names, such as NDB, DPC, JMDC, MDV, and KDB, and some related terms with “health insurance claims.” However, there are many ways to describe the names of each database and claims data. It was not possible to use all terms in our search strategies (e.g., we could not detect the studies described as “administrative data” in the title and abstract). This causes relevant records to be missed while acting as a barrier to finding previous claims data studies. It may be necessary to unify the description method, such as including the database name in the title and abstract. In addition, our review did not include relevant studies not listed in the three electronic sources (MEDLINE, EMBASE, and Ichushi-Web). Second, full-text screening and data extraction were not conducted independently by two reviewers due to the large number of retrieved studies. Among several reviewers, one conducted full-text screening and data extraction. A second reviewer confirmed the results of screening and data extraction throughout the study to minimize misclassification, considering whether the criteria differed among the reviewers.
Despite these limitations, we believe that this review contributes to an overview of claims database studies in Japan, as it drew on multiple electronic sources for the literature search and covered all types of claims databases. Our findings showed that it is important to consider the strengths and limitations of each claims database when a researcher plans to conduct a study with them. In addition, when planning a new claims data study, the list of included studies in this review can serve as an index to previous claims data studies, allowing researchers to refer to methodological issues, such as claims-based case definition. To facilitate healthcare research and evidence-based policy development, it is important to use previous studies using claims data and share information on methodology among researchers in each disease area and across diseases, including validation studies, while informing policymakers across the country about the applicability of claims data for healthcare planning and management.
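As a rough illustration of what an individual-level comparison against chart review involves, the Python sketch below computes sensitivity, specificity, and positive predictive value for a claims-based case definition, taking medical records as the reference standard. The review does not state which measures each included study reported, and the counts used here are hypothetical numbers chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ValidationCounts:
    """2x2 counts for a claims-based definition versus chart review (reference)."""
    true_pos: int   # claims positive, chart positive
    false_pos: int  # claims positive, chart negative
    false_neg: int  # claims negative, chart positive
    true_neg: int   # claims negative, chart negative


def sensitivity(c: ValidationCounts) -> float:
    # Share of chart-confirmed cases that the claims definition captures.
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ValidationCounts) -> float:
    # Share of chart-confirmed non-cases that the claims definition excludes.
    return c.true_neg / (c.true_neg + c.false_pos)


def positive_predictive_value(c: ValidationCounts) -> float:
    # Share of claims-identified cases that chart review confirms.
    return c.true_pos / (c.true_pos + c.false_pos)


# Hypothetical counts, not taken from any included study.
counts = ValidationCounts(true_pos=80, false_pos=20, false_neg=10, true_neg=890)
print(f"Sensitivity: {sensitivity(counts):.3f}")
print(f"Specificity: {specificity(counts):.3f}")
print(f"PPV: {positive_predictive_value(counts):.3f}")
```

Estimates of this kind depend on the case mix and coding practice of the hospitals sampled, which is why results from a single or a few centers may not generalize to other hospitals.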
This work was supported by Research Project for the Establishment of an NDB Research System for Health Policy and Other Purposes through 6NC Collaboration (2019-(1)-3).
The authors thank Mr. Masahiko Watanabe and Ms. Chiemi Kataoka for developing and executing the search strategy. We express our gratitude to Dr. Sho Nakakubo, Dr. Satoshi Kurita, Dr. Yuto Kiuchi, and Dr. Kazuhei Nishimoto for their support in data extraction.
MaS, TS, KI, TF, MH, YI, NI, TK, RK, YM, SM, YN, MN, MiT, KoT, KU, KeT, and HI designed the study. MaS, TS, KI, TF, MH, NI, RK, SM, YN, MN, MiT, MoT, KoT, and KeT conducted title and abstract screening. MaS, TS, KI, TF, MH, YI, NI, RK, YM, SM, YN, MN, MiS, RT, MiT, MoT, KoT, KU, and KeT conducted full-text screening and data extraction for included studies. MaS and KeT drafted the initial manuscript. All authors reviewed and approved the final manuscript.
Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol. 2005;58(4):323-37.
Martin-Latry K, Bégaud B. Pharmacoepidemiological research using French reimbursement databases: yes we can! Pharmacoepidemiol Drug Saf. 2010;19(3):256-65.
Tricco AC, Pham B, Rawson NS. Manitoba and Saskatchewan administrative health care utilization databases are used differently to answer epidemiologic research questions. J Clin Epidemiol. 2008;61(2):192-7.
Hoffmann F. Review on use of German health insurance medication claims data for epidemiological research. Pharmacoepidemiol Drug Saf. 2009;18(5):349-56.
Koram N, Delgado M, Stark JH, et al. Validation studies of claims data in the Asia-Pacific region: a comprehensive review. Pharmacoepidemiol Drug Saf. 2019;28(2):156-70.
Hirose N, Ishimaru M, Morita K, et al. A review of studies using the Japanese National Database of Health Insurance Claims and Specific Health Checkups. Ann Clin Epidemiol. 2020;2(1):13-26.
Nagai K, Tanaka T, Kodaira N, et al. Data resource profile: JMDC claims database sourced from health insurance societies. J Gen Fam Med. 2021;22(3):118-27.
Tricco AC, Lillie E, Zarin W, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467-73.
Takaoka M, Okuyama A, Mekata E, et al. Staging discrepancies between Hospital-Based Cancer Registry and Diagnosis Procedure Combination data. Jpn J Clin Oncol. 2016;46(8):788-91.
Yamana H, Moriwaki M, Horiguchi H, et al. Validity of diagnoses, procedures, and laboratory data in Japanese administrative data. J Epidemiol. 2017;27(10):476-82.
Iwamoto M, Higashi T, Miura H, et al. Accuracy of using diagnosis procedure combination administrative claims data for estimating the amount of opioid consumption among cancer patients in Japan. Jpn J Clin Oncol. 2015;45(11):1036-41.
Yamana H, Horiguchi H, Fushimi K, et al. Comparison of procedure-based and diagnosis-based identifications of severe sepsis and disseminated intravascular coagulation in administrative data. J Epidemiol. 2016;26(10):530-7.
Ishikawa T, Oyanagi G, Obara T, et al. Validity of congenital malformation diagnoses in healthcare claims from a university hospital in Japan. Pharmacoepidemiol Drug Saf. 2021;30(7):975-8.
Ooba N, Setoguchi S, Ando T, et al. Claims-based definition of death in Japanese claims database: validity and implications. PLoS One. 2013;8(5):e66116.
Tagami T, Matsui H, Fushimi K, et al. Validation of the prognostic burn index: a nationwide retrospective study. Burns. 2015;41(6):1169-75.
Sakai M, Ohtera S, Iwao T, et al. Validation of claims data to identify death among aged persons utilizing enrollment data from health insurance unions. Environ Health Prev Med. 2019;24(1):63.
Fujihara K, Yamada-Harada M, Matsubayashi Y, et al. Accuracy of Japanese claims data in identifying diabetes-related complications. Pharmacoepidemiol Drug Saf. 2021;30(5):594-601.
Fujita M, Sato Y, Nagashima K, et al. Validity assessment of self-reported medication use by comparing to pharmacy insurance claims. BMJ Open. 2015;5(11):e009490.
Okui T, Nojiri C, Kimura S, et al. Performance evaluation of case definitions of type 1 diabetes for health insurance claims data in Japan. BMC Med Inform Decis Mak. 2021;21(1):52.
Ono Y, Taneda Y, Takeshima T, et al. Validity of claims diagnosis codes for cardiovascular diseases in diabetes patients in Japanese Administrative Database. Clin Epidemiol. 2020;12:367-75.
Imai S, Yamana H, Inoue N, et al. Validity of administrative database detection of previously resolved hepatitis B virus in Japan. J Med Virol. 2019;91(11):1944-8.
Nakai M, Iwanaga Y, Sumita Y, et al. Validation of acute myocardial infarction and heart failure diagnoses in hospitalized patients with the nationwide claim-based JROAD-DPC database. Circ Rep. 2021;3(3):131-6.
Ando T, Ooba N, Mochizuki M, et al. Positive predictive value of ICD-10 codes for acute myocardial infarction in Japan: a validation study at a single center. BMC Health Serv Res. 2018;18(1):895.
Tanaka S, Hagino H, Ishizuka A, et al. Validation study of claims-based definitions of suspected atypical femoral fractures using clinical information. Jpn J Pharmacoepidemiol. 2016;21(1):13-9.
Sato I, Yagata H, Ohashi Y. The accuracy of Japanese claims data in identifying breast cancer cases. Biol Pharm Bull. 2015;38(1):53-7.
Shima D, Ii Y, Higa S, et al. Validation of novel identification algorithms for major adverse cardiovascular events in a Japanese claims database. J Clin Hypertens (Greenwich Conn). 2021;23(3):646-55.
Tamiya R, Miyake M, Kido A, et al. Validation study of the claims-based definition for age-related macular degeneration at a single university hospital in Japan. Jpn J Ophthalmol. 2021;65(3):388-94.
Shibata A, Katanoda K, Matsuda T, et al. [Usefulness of a healthcare insurance claims database for statistical data in cancer patients]. J Health Welf Stat. 2014;61(12):6-12. Japanese.
Toba M, Moriwaki M, Sase Y, et al. [Selection of a method for detecting cases of severe pneumothorax associated with central venous catheterization using information on fee-for-medical-service potential utilizability of information from fee-for-medical for internal audit]. J Jpn Soc Healthc Admin. 2016;53(4):217-25. Japanese.
Hirano Y, Asami Y, Kuribayashi K, et al. Possibility of database research as a means of pharmacovigilance in Japan based on a comparison with sertraline postmarketing surveillance. Value Health Reg Issues. 2018;15:1-5.
Hashikata H, Harada KH, Kagimura T, et al. Usefulness of a large automated health records database in pharmacoepidemiology. Environ Health Prev Med. 2011;16(5):313-9.
Yanase T, Takenoshita H, Murase K, et al. [Utility study of a basic database collected from patients with diabetes mellitus in 15 hospitals by retrospective analysis]. Pharm Med. 2011;29(1):145-52.
Asami Y, Kuribayashi K, Hirano Y, et al. [A perspective of pharmacovigilance using the healthcare claims database, based on comparison with sertraline post marketing surveillance]. Pharm Med Dev Regul Sci. 2016;47(6):466-71.
Sasaki N, Kunisawa S, Otsubo T, et al. The relationship between the number of cardiologists and clinical practice patterns in acute heart failure: a cross-sectional observational study. BMJ Open. 2014;4(12):e005988.
Tanaka C, Kusama Y, Muraki Y, et al. Evaluation of the usefulness of antimicrobial use survey using claims data. Jpn J Chemother. 2019;67(6):640-4.
Ohkusa Y, Sugawara T, Takahashi K, et al. Comparative study of preciseness in the regional variation of influenza in Japan among the National Official Sentinel Surveillance of Infectious Diseases and the National Database of Electronic Medical Claims. Biosci Trends. 2018;12(6):636-40.
Yamasaki D, Tanabe M, Muraki Y, et al. The first report of Japanese antimicrobial use measured by national database based on health insurance claims data (2011-2013): comparison with sales data, and trend analysis stratified by antimicrobial category and age group. Infection. 2018;46(2):207-14.
Nakamura Y, Kawanohara H, Kamei M. Evaluation of the number of varicella patients estimated by prescription surveillance. Kansenshogaku Zasshi. 2015;89(1):23-9. Japanese.
Nakamura Y, Sugawara T, Kawanohara H, et al. Evaluation of estimated number of influenza patients from national sentinel surveillance using the national database of electronic medical claims. Jpn J Infect Dis. 2015;68(1):27-9.
Nakamura Y, Kawanohara H, Kamei M. Evaluation of estimated number of influenza patients from prescription surveillance using the national database of electronic medical claims. J Health Welf Stat. 2015;62(2):1-6.
Tanihara S, Okamoto E, Imatoh T, et al. Evaluating measles surveillance: comparison of sentinel surveillance, mandatory notification, and data from health insurance claims. Epidemiol Infect. 2011;139(4):516-23.
Iwanaga T, Anzawa K, Mochizuki T. [An investigation of the clinical practice for dermatophytosis treatment using reimbursement data from health insurance societies in Japan]. Jpn J Dermatol. 2015;125(12):2289-99. Japanese.
Tsuneishi M, Yamamoto T, Yamaguchi T. [To presence of teeth type using the dental notation of periodontitis patients: a cross-sectional study using the receipt and health checkup information database in Japan]. Jpn J Dent Pract Admin. 2019;54(3):184-90. Japanese.
Koretsune Y, Yamashita T, Yasaka M, et al. Usefulness of a healthcare database for epidemiological research in atrial fibrillation. J Cardiol. 2017;70(2):169-79.
Kakizaki M, Sawada N, Yamagishi K, et al. [Study on the probability of incident stroke and acute myocardial infarction using DPC data]. Nihon Koshu Eisei Zasshi. 2018;65(4):179-86. Japanese.
Tanihara S, Tsuji M, Kawazoe M, et al. Number of claims data by disease category: comparison of Statistics of Medical Care Activities in public health insurance, and claims data from corporate health insurance societies. J Health Welf Stat. 2017;64(13):1-8.
Takeda T, Mihara N, Murata T, et al. Estimating the ratio of patients with a certain disease between hospitals for the allocation of patients to clinical trials using health insurance claims data in Japan. Stud Health Technol Inform. 2016;228:537-41.
Toba M, Moriwaki M, Yokouchi K, et al. [Develop a monitoring procedure for cases of bone fractures and intracranial bleeding due to falls based on integrating adverse event data extracted from medical fee information and other data]. Jpn J Qual Saf Healthc. 2017;12(3):270-8. Japanese.
Gotoh M, Miyata H, Hashimoto H, et al. National Clinical Database feedback implementation for quality improvement of cancer treatment in Japan: from good to great through transparency. Surg Today. 2016;46(1):38-47.
Matsubayashi K, Kawakami K. Prevalence, incidence, comorbidities, and treatment patterns among Japanese patients with acromegaly: a descriptive study using a nationwide claims database. Endocr J. 2020;67(10):997-1006.
Ogino M, Okamoto S, Ohta H, et al. Prevalence, treatments and medical cost of multiple sclerosis in Japan based on analysis of a health insurance claims database. Clin Exp Neuroimmunol. 2017;8(4):318-26.
Ishikawa H, Ohbe H, Omachi N, et al. Spinal cord infarction after bronchial artery embolization for hemoptysis: a nationwide observational study in Japan. Radiology. 2021;298(3):673-9.
Kawasaki R, Konta T, Nishida K. Lipid-lowering medication is associated with decreased risk of diabetic retinopathy and the need for treatment in patients with type 2 diabetes: A real-world observational analysis of a health claims database. Diabetes Obes Metab. 2018;20(10):2351-60.
Fujiu M, Morisaki Y, Takayama J, et al. Evaluation of regional vulnerability to disasters by people of Ishikawa, Japan: a cross sectional study using National Health Insurance Data. Int J Environ Res Public Health. 2018;15(3):507.
Matsuda S, Fujimori K. Analysis of disease structure for the Regional Health Care Plan based on the National Database. Asian Pacific J Dis Manag. 2015;6(3):61-6.
Okumura Y, Sakata N, Takahashi K, et al. Epidemiology of overdose episodes from the period prior to hospitalization for drug poisoning until discharge in Japan: an exploratory descriptive study using a nationwide claims database. J Epidemiol. 2017;27(8):373-80.
Nishi T, Maeda T, Babazono A. [Study of the indicators of affordable healthcare planning using electronic medical claim data]. Jpn J Health Care Manag Mark. 2012;7(1):1-8. Japanese.
Hayashida K, Murakami G, Matsuda S, et al. History and profile of Diagnosis Procedure Combination (DPC): development of a real data collection system for acute inpatient care in Japan. J Epidemiol. 2021;31(1):1-11.
Nagai K, Tanaka T, Kodaira N, et al. Data resource profile: JMDC claims databases sourced from medical institutions. J Gen Fam Med. 2020;21(6):211-8.
Michihata N, Shigemi D, Sasabuchi Y, et al. Safety and effectiveness of Japanese herbal Kampo medicines for treatment of hyperemesis gravidarum. Int J Gynaecol Obstet. 2019;145(2):182-6.
Obara T. Elucidation of the relationship between use of psychotropic drugs in pregnant women and malformation of infants. Adv Pharm Sci. 2020;(36):99-107.
Akechi T, Mishiro I, Fujimoto S, et al. Risk of major depressive disorder in spouses of cancer patients in Japan: a cohort study using health insurance-based claims data. Psychooncology. 2020;29(7):1224-7.
Matsubayashi K, Kawakami K. Syphilis testing among spouses of patients with syphilis in Japan: an epidemiological study using an administrative claims database. Int J STD AIDS. 2020;31(3):214-20.
Nakatani E, Tabara Y, Sato Y, et al. Data resource profile of Shizuoka Kokuho Database (SKDB) using integrated health- and care-insurance claims and health checkups: the Shizuoka Study. J Epidemiol. 2022;32(8):391-400.
Abe K, Miyawaki A, Nakamura M, et al. Trends in hospitalizations for asthma during the COVID-19 outbreak in Japan. J Allergy Clin Immunol Pract. 2021;9(1):494-6.e1.
Sano K, Nakamura M, Ninomiya H, et al. Large decrease in paediatric hospitalisations during the COVID-19 outbreak in Japan. BMJ Paediatr Open. 2021;5(1):e001013.
Maeda Y, Nakamura M, Ninomiya H, et al. Trends in intensive neonatal care during the COVID-19 outbreak in Japan. Arch Dis Child Fetal Neonatal Ed. 2021;106(3):327-9.
Ikesu R, Miyawaki A, Sugiyama T, et al. Trends in diabetes care during the COVID-19 outbreak in Japan: an observational study. J Gen Intern Med. 2021;36(5):1460-2.
Miyawaki A, Tomio J, Nakamura M, et al. Changes in surgeries and therapeutic procedures during the COVID-19 outbreak: a longitudinal study of acute care hospitals in Japan. Ann Surg. 2021;273(4):e132-4.
Kishimoto K, Bun S, Shin J-H, et al. Early impact of school closure and social distancing for COVID-19 on the number of inpatients with childhood non-COVID-19 acute infections in Japan. Eur J Pediatr. 2021;180(9):2871-8.
Nagano H, Takada D, Shin J-H, et al. Hospitalization of mild cases of community-acquired pneumonia decreased more than severe ones during the COVID-19 epidemic. Int J Infect Dis. 2021;106:323-8.
Shin J-H, Takada D, Morishita T, et al. Economic impact of the first wave of the COVID-19 pandemic on acute care hospitals in Japan. PLoS One. 2020;15(12):e0244852.
Morishita T, Takada D, Shin J-H, et al. Trends, treatment approaches, and in-hospital mortality for acute coronary syndrome in Japan during the coronavirus disease 2019 pandemic. J Atheroscler Thromb. 2022;29(5):597-607.
Milea D, Azmi S, Reginald P, et al. A review of accessibility of administrative healthcare databases in the Asia-Pacific region. J Mark Access Health Policy. 2015;3(1):28076.