RVENet

A Large Echocardiographic Dataset for the Deep Learning-Based Assessment of Right Ventricular Function

BACKGROUND

Two-dimensional (2D) echocardiography is the most frequently performed imaging test to assess right ventricular (RV) function. However, conventional 2D parameters are unable to reliably capture RV dysfunction across the entire spectrum of cardiac diseases. Three-dimensional (3D) echocardiography-derived RV ejection fraction (RVEF) – a sensitive and reproducible parameter that has been validated against cardiac magnetic resonance imaging – can bypass most of their limitations. Nonetheless, 3D echocardiography has limited availability, is more time-consuming, and requires significant human expertise. Therefore, novel automated tools that utilize readily available and routinely acquired 2D echocardiographic recordings to predict RVEF and detect RV dysfunction reliably would be highly desirable. To enable the implementation of such innovative solutions, publicly available and sufficiently large dedicated datasets would be pivotal. Motivated by this, we created the RVENet dataset comprising 3,583 labeled echocardiographic videos of 831 individuals.

PURPOSE

The RVENet dataset was primarily designed to enable the training and evaluation of deep learning models that predict RVEF from 2D echocardiographic videos. Each 2D video is labeled with a 3D echocardiography-derived RVEF value, which makes the dataset one of a kind. Beyond serving as a benchmark for this task, the RVENet dataset may be a valuable resource for other research projects at the intersection of computer vision and cardiovascular imaging.

DATASET

The RVENet dataset consists of two major components: (i) a large set of echocardiographic videos (in DICOM format) and (ii) the corresponding labels and additional patient or video-related data (in a single separate CSV file).

The RVENet dataset contains 3,583 2D apical four-chamber view echocardiographic videos from 944 examinations of 831 individuals in DICOM format. Each subject underwent one or more 3D transthoracic echocardiographic examinations between November 2013 and March 2021 at the Heart and Vascular Center of Semmelweis University. The dataset comprises ten distinct subgroups of subjects: (i) healthy adult volunteers (n=192), (ii) healthy pediatric volunteers (n=54), (iii) elite athletes (n=139), (iv) patients with heart failure and reduced left ventricular EF (LVEF, n=98), (v) patients with LV non-compaction cardiomyopathy (n=27), (vi) patients with aortic valve disease (n=85), (vii) patients with mitral valve disease (n=70), (viii) patients who underwent orthotopic heart transplantation (n=87), (ix) pediatric patients who underwent kidney transplantation (n=23), and (x) others (n=56).
Except for removing DICOM tags containing protected health information, no preprocessing was performed on the videos. Files were named according to the following naming convention: [patient hash]_[# of the echocardiographic examination of the given patient]_[# of the video in the given examination].dcm.
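
The naming convention above can be unpacked programmatically. The following is a minimal sketch, not part of the published code base: the function and class names are hypothetical, the example file name is made up, and it assumes that the examination and video numbers are plain integers.

```python
from pathlib import PurePath
from typing import NamedTuple


class VideoId(NamedTuple):
    patient_hash: str
    exam_index: int
    video_index: int


def parse_video_filename(filename: str) -> VideoId:
    """Split '[patient hash]_[exam #]_[video #].dcm' into its three parts."""
    path = PurePath(filename)
    if path.suffix.lower() != ".dcm":
        raise ValueError(f"expected a .dcm file, got {filename!r}")
    # Split from the right so that a hash containing underscores stays intact.
    parts = path.stem.rsplit("_", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected file name layout: {filename!r}")
    patient_hash, exam, video = parts
    # Assumption: both counters are integers; int() raises ValueError otherwise.
    return VideoId(patient_hash, int(exam), int(video))


# Hypothetical file name, for illustration only.
print(parse_video_filename("a1b2c3_2_5.dcm"))
```

Grouping videos by the first two fields recovers the examination to which each video belongs.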

Apical four-chamber view video of a healthy individual
Apical four-chamber view video of a heart failure patient

A comprehensive list and description of the labels are provided in Table 1. RV end-diastolic and end-systolic volumes, as well as RVEF, were computed from 3D echocardiographic recordings using a commercially available software solution (4D RV Function 2, TomTec Imaging, Unterschleissheim, Germany). These parameters were calculated only once for each echocardiographic examination. However, an examination may contain multiple 2D apical four-chamber view videos; thus, the same label was linked to all 2D videos within that given examination. Of note, the 3D recordings are not published as part of the dataset.
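
For orientation, ejection fraction is conventionally defined as the stroke volume divided by the end-diastolic volume. The sketch below applies that standard definition to the RV volumes; the function name is hypothetical and the example volumes are illustrative, not taken from the dataset. Whether the published RVEF labels match this value exactly when recomputed from the published volumes (for example, after rounding) is an assumption that should be checked against the data.

```python
def rvef_percent(rvedv_ml: float, rvesv_ml: float) -> float:
    """Ejection fraction (%) from end-diastolic and end-systolic volumes (mL)."""
    if rvedv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    # Stroke volume (EDV - ESV) expressed as a percentage of EDV.
    return (rvedv_ml - rvesv_ml) / rvedv_ml * 100.0


# Illustrative volumes only: 120 mL at end-diastole, 54 mL at end-systole.
ef = rvef_percent(120.0, 54.0)
print(round(ef, 1))
```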
All of the 2D echocardiographic videos were reviewed by a single experienced echocardiographer who (i) assessed the image quality using a 5-point Likert scale (1 – non-diagnostic, 2 – poor, 3 – moderate, 4 – good, 5 – excellent), (ii) labeled videos as either standard or RV-focused, and (iii) determined LV/RV orientation (Mayo – RV on the right side and LV on the left side; Stanford – LV on the right side and RV on the left side). These annotations are also provided in a tabular format, along with the primary diagnosis, age, biological sex, and the train-validation split (80:20 ratio) that we used for training and evaluating the models in our experiments. In addition, the ultrasound system used for video acquisition, the frame rate, and the total number of frames are reported for each video.
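
Because the two orientations are mirror images of each other, a horizontal flip converts one into the other. The sketch below shows one possible preprocessing step that brings every video into Mayo orientation. It is an illustration rather than the authors' pipeline: the function name is hypothetical, frames are represented as plain nested lists of pixel values, and the label strings "Mayo" and "Stanford" are assumed to be spelled as in the description above.

```python
from typing import List, Sequence

Frame = List[List[int]]  # one grayscale frame as rows of pixel values


def to_mayo(frames: Sequence[Frame], orientation: str) -> List[Frame]:
    """Return a copy of the frames in Mayo orientation.

    Stanford videos are mirrored left-right; Mayo videos are copied unchanged.
    """
    if orientation == "Mayo":
        return [[list(row) for row in frame] for frame in frames]
    if orientation == "Stanford":
        # Reversing each row flips the image horizontally.
        return [[list(reversed(row)) for row in frame] for frame in frames]
    raise ValueError(f"unknown orientation: {orientation!r}")


# A single 2x3 toy frame, for illustration only.
video = [[[1, 2, 3], [4, 5, 6]]]
print(to_mayo(video, "Stanford"))
```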

Table 1. The list and description of labels and other variables provided for each video of the dataset

Variable          Description
----------------  ------------------------------------------------------------
FileName          Hashed file name used to link videos and labels
PatientHash       Hashed patient name
PatientGroup      Patient subgroup referring to the primary diagnosis
Age               Age in years, rounded to the nearest year
Sex               Sex reported in the medical record (M – male, F – female)
UltrasoundSystem  Ultrasound system used for video acquisition
FPS               Frames per second (1/s)
NumFrames         Number of frames in the whole video
VideoViewType     Standard or RV-focused apical four-chamber view
VideoOrientation  LV/RV orientation (Mayo or Stanford)
VideoQuality      2D video quality on a 5-point scale (1 – non-diagnostic, 2 – poor, 3 – moderate, 4 – good, 5 – excellent)
RVEDV             3D echocardiography-derived RV end-diastolic volume (mL)
RVESV             3D echocardiography-derived RV end-systolic volume (mL)
RVEF              3D echocardiography-derived RV ejection fraction (%)
Split             Train-validation splitting used in our experiments
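
The variables in Table 1 can be read with standard tooling. The following is a minimal sketch that uses only the Python standard library and a small made-up sample in place of the real CSV file. The column names come from Table 1; the sample values, the split labels "train" and "val", the quality threshold, and the helper names are all assumptions chosen for illustration. The last helper is a sanity check that no patient appears in more than one split, which is worth verifying before training because several videos share the same patient and label.

```python
import csv
import io
from collections import defaultdict

# Made-up rows using column names from Table 1; the values are placeholders.
SAMPLE_CSV = """FileName,PatientHash,VideoQuality,RVEF,Split
p1_1_1,p1,4,55.2,train
p1_1_2,p1,2,55.2,train
p2_1_1,p2,5,41.0,val
p3_1_1,p3,3,60.3,train
"""


def load_rows(handle):
    """Read label rows and convert the numeric columns used below."""
    rows = []
    for row in csv.DictReader(handle):
        row["VideoQuality"] = int(row["VideoQuality"])
        row["RVEF"] = float(row["RVEF"])
        rows.append(row)
    return rows


def patients_in_multiple_splits(rows):
    """Return the patient hashes that occur in more than one split."""
    splits = defaultdict(set)
    for row in rows:
        splits[row["PatientHash"]].add(row["Split"])
    return sorted(p for p, s in splits.items() if len(s) > 1)


rows = load_rows(io.StringIO(SAMPLE_CSV))
# Keep videos rated at least "moderate"; the threshold is an arbitrary example.
usable = [r for r in rows if r["VideoQuality"] >= 3]
leaks = patients_in_multiple_splits(rows)
print(len(usable), leaks)
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an open file handle to the downloaded CSV.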

Prior to publication, all DICOM files of the RVENet dataset were processed to remove any protected health information. We also ensured that no protected health information was included among the published labels. Thus, the RVENet dataset complies with the General Data Protection Regulation (GDPR) of the European Union.
The publication of the RVENet dataset and the protocol of our studies using the dataset conform to the principles outlined in the Declaration of Helsinki and were approved by the Semmelweis University Regional and Institutional Committee of Science and Research Ethics (approval No. 190/2020).

REQUEST ACCESS

Please read the Research Use Agreement below for the official terms and conditions. If you agree to these terms of access, please apply for access by filling out this form.

Research Use Agreement

By requesting access to the RVENet dataset, you agree to the following Research Use Agreement:


1. Permission is granted to view and use the RVENet dataset without charge for personal, non-commercial research purposes only. Any commercial use, sale, or other monetization is prohibited.
2. Other than the rights granted herein, the authors and Semmelweis University retain all rights, title, and interest in the RVENet dataset.
3. You may make a verbatim copy of the RVENet dataset for personal, non-commercial research use as permitted in this Research Use Agreement. If another user within your organization wishes to use the RVENet dataset, they must register as an individual user and comply with all the terms of this Research Use Agreement.
4. YOU MAY NOT DISTRIBUTE, PUBLISH, OR REPRODUCE A COPY of any portion or all of the RVENet dataset to others without specific prior written permission from the authors and Semmelweis University.
5. YOU MAY NOT SHARE THE DOWNLOAD LINK to the RVENet dataset with others. If another user within your organization wishes to use the RVENet dataset, they must register as an individual user and comply with all the terms of this Research Use Agreement.
6. You must not modify, reverse engineer, decompile, or create derivative works from the RVENet dataset. You must not remove or alter any copyright or other proprietary notices in the RVENet dataset.
7. The RVENet dataset is for non-clinical, Research Use Only. In no event shall data or images generated through the use of the RVENet dataset be used or relied upon in the diagnosis or provision of patient care.
8. THE RVENET DATASET IS PROVIDED “AS IS,” AND THE AUTHORS, SEMMELWEIS UNIVERSITY, AND ITS COLLABORATORS DO NOT MAKE ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, NOR DO THEY ASSUME ANY LIABILITY OR RESPONSIBILITY FOR THE USE OF THE RVENET DATASET.
9. You will not make any attempt to re-identify any of the individual data subjects. Re-identification of individuals is strictly prohibited. Any re-identification of any individual data subject shall be immediately reported to the authors.
10. Any violation of this Research Use Agreement or other impermissible use shall be grounds for immediate termination of use of the RVENet dataset. In the event that the authors or Semmelweis University determines that the recipient has violated this Research Use Agreement or other impermissible use has been made, they may direct that the undersigned data recipient immediately return all copies of the RVENet dataset and retain no copies thereof, even if you did not cause the violation or impermissible use.
11. You agree to cite the papers listed among the publications in any work using the RVENet dataset.

In consideration for your agreement to the terms and conditions contained here, the authors and Semmelweis University grant you permission to view and use the RVENet dataset for personal, non-commercial research. You may not otherwise copy, reproduce, retransmit, distribute, publish, commercially exploit, or otherwise transfer any material.

Limitation of Use

You may use the RVENet dataset for legal purposes only.
You agree to indemnify and hold the authors and Semmelweis University harmless from any claims, losses, or damages, including legal fees, arising out of or resulting from your use of the RVENet dataset or your violation or role in the violation of these Terms. You agree to fully cooperate in the authors’ and Semmelweis University’s defense against any such claims. These Terms shall be governed by and interpreted in accordance with the laws of Hungary and the European Union.

PUBLICATIONS

Deep Learning-Based Prediction of Right Ventricular Ejection Fraction Using 2D Echocardiograms [Publication]
Tokodi M, Magyar B, Soós A, Takeuchi M, Tolvaj M, Lakatos BK, Kitano T, Nabeshima Y, Fábián A, Szigeti MB, Horváth A, Merkely B, Kovács A

RVENet: A Large Echocardiographic Dataset for the Deep Learning-Based Assessment of Right Ventricular Function [Publication] [PDF]
Magyar B, Tokodi M, Soós A, Tolvaj M, Lakatos BK, Fábián A, Surkova E, Merkely B, Kovács A, Horváth A

CODES

The demo code for Deep Learning-Based Prediction of Right Ventricular Ejection Fraction Using 2D Echocardiograms is available here.

The source code for RVENet: A Large Echocardiographic Dataset for the Deep Learning-Based Assessment of Right Ventricular Function is available here.

CONTACT


For inquiries related to the dataset, contact Attila Kovács, M.D., Ph.D. (attila.kovacs@med.semmelweis-univ.hu; kovatti@gmail.com) and Márton Tokodi, M.D., Ph.D. (tokmarton@gmail.com).

For inquiries related to the code base and technical implementation, contact Bálint Magyar, M.Sc. (magyar.balint@itk.ppke.hu).

For inquiries related to our publications, contact the corresponding authors.