Boosting segmentation accuracy of the deep learning models based on the synthetic data generation

Bibliographic Details
Parent link:The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS Archives)
Vol. XLIV-2/W1-2021 : 4th International Workshop on Photogrammetric and computer vision techniques for video surveillance, biometrics and biomedicine, 26–28 April 2021, Moscow, Russia.— 2021.— [P. 33-40]
Corporate authors: National Research Tomsk Polytechnic University, Engineering School of Information Technologies and Robotics, Research and Education Laboratory for Big Data Processing and Analysis; National Research Tomsk Polytechnic University, Engineering School of Information Technologies and Robotics, Division of Information Technologies
Other authors: Danilov V. V. Vyacheslav Vladimirovich, Gerget O. M. Olga Mikhailovna, Kolpashchikov D. Yu. Dmitry Yurjevich, Laptev N. V. Nikita Vitalievich, Manakov R. Roman, Hernandez-Gomez L. A., Alvarez F. Frederic, Ledesma-Carbayo M. J.
Summary: Title screen
In the era of data-driven machine learning, data is the new oil. Machine learning algorithms need large, heterogeneous datasets that, crucially, are correctly labeled. However, collecting and labeling data is time-consuming and labor-intensive. The particular task we address is the segmentation of medical devices in echocardiographic images acquired during minimally invasive surgery. The lack of data for this task motivated us to develop an algorithm that generates synthetic samples from real datasets. The algorithm places a medical device (a catheter) in an empty cavity of an anatomical structure, for example a heart chamber, and then transforms it. To create random transformations of the catheter, the algorithm uses a coordinate system that uniquely identifies each point regardless of the bend and shape of the object. We propose taking a cylindrical coordinate system as the basis and modifying it by replacing the Z-axis with a spline along which the h-coordinate is measured. Using the proposed algorithm, we generated new images with the catheter inserted into different heart cavities while varying its location and shape. We then compared deep neural networks trained on real data only with networks trained on real and synthetic data combined. The network trained on both real and synthetic data segmented more accurately than the model trained only on real data. For instance, a modified U-Net trained on the combined datasets achieved a Dice similarity coefficient of 92.6±2.2%, whereas the same model trained only on real samples achieved 86.5±3.6%. Using the synthetic dataset reduced the spread of accuracy and improved the generalization of the model. The proposed algorithm also reduces subjectivity, minimizes the labeling routine, increases the number of samples, and improves heterogeneity.
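Illustrative note (not part of the catalogued abstract): the spline-based cylindrical coordinate system mentioned in the summary can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation. The centerline is assumed to be a single quadratic Bezier segment, h is assumed to be the curve parameter in [0, 1] rather than arc length, and the cross-section frame is built from a fixed hypothetical `up` vector. All function names are hypothetical.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)


def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def bezier_point(p0, p1, p2, h):
    """Point on the assumed quadratic Bezier centerline at parameter h."""
    a, b, c = (1 - h) ** 2, 2 * (1 - h) * h, h ** 2
    return tuple(a * x0 + b * x1 + c * x2 for x0, x1, x2 in zip(p0, p1, p2))


def bezier_tangent(p0, p1, p2, h):
    """Unit tangent of the centerline at parameter h."""
    d = tuple(2 * (1 - h) * (x1 - x0) + 2 * h * (x2 - x1)
              for x0, x1, x2 in zip(p0, p1, p2))
    return normalize(d)


def spline_cylindrical_to_cartesian(ctrl, r, phi, h, up=(0.0, 0.0, 1.0)):
    """Map (r, phi, h) to Cartesian coordinates around a curved axis.

    ctrl is a tuple of three control points. The frame (n, b) spans the
    plane perpendicular to the tangent. This simple construction breaks
    down when the tangent is parallel to `up` (an assumption of the sketch).
    """
    p0, p1, p2 = ctrl
    center = bezier_point(p0, p1, p2, h)
    t = bezier_tangent(p0, p1, p2, h)
    n = normalize(cross(up, t))   # first axis of the cross-section plane
    b = cross(t, n)               # second axis, already unit length
    return tuple(c + r * (math.cos(phi) * ni + math.sin(phi) * bi)
                 for c, ni, bi in zip(center, n, b))


# A bent centerline: the same (r, phi) follows the curve as h changes.
bent = ((0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0))
surface_point = spline_cylindrical_to_cartesian(bent, r=0.1, phi=0.0, h=0.3)
```

In this sketch, changing the control points bends the object while each point keeps the same (r, phi, h) coordinates, which is the property the summary describes.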
Language: English
Published: 2021
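Subjects: data synthesis; echocardiography; catheter segmentation; forward kinematics; spline coordinate system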
Online access: https://doi.org/10.5194/isprs-archives-XLIV-2-W1-2021-33-2021
Format: Electronic, Book chapter
KOHA link:https://koha.lib.tpu.ru/cgi-bin/koha/opac-detail.pl?biblionumber=667889

MARC

LEADER 00000naa0a2200000 4500
001 667889
005 20251127124035.0
035 |a (RuTPU)RU\TPU\network\39100 
035 |a RU\TPU\network\38533 
090 |a 667889 
100 |a 20220512d2021 k||y0rusy50 ba 
101 0 |a eng 
135 |a drcn ---uucaa 
181 0 |a i  
182 0 |a b 
200 1 |a Boosting segmentation accuracy of the deep learning models based on the synthetic data generation  |f V. V. Danilov, O. M. Gerget, D. Yu. Kolpashchikov [et al.] 
203 |a Text  |c electronic 
300 |a Title screen 
330 |a In the era of data-driven machine learning, data is the new oil. Machine learning algorithms need large, heterogeneous datasets that, crucially, are correctly labeled. However, collecting and labeling data is time-consuming and labor-intensive. The particular task we address is the segmentation of medical devices in echocardiographic images acquired during minimally invasive surgery. The lack of data for this task motivated us to develop an algorithm that generates synthetic samples from real datasets. The algorithm places a medical device (a catheter) in an empty cavity of an anatomical structure, for example a heart chamber, and then transforms it. To create random transformations of the catheter, the algorithm uses a coordinate system that uniquely identifies each point regardless of the bend and shape of the object. We propose taking a cylindrical coordinate system as the basis and modifying it by replacing the Z-axis with a spline along which the h-coordinate is measured. Using the proposed algorithm, we generated new images with the catheter inserted into different heart cavities while varying its location and shape. We then compared deep neural networks trained on real data only with networks trained on real and synthetic data combined. The network trained on both real and synthetic data segmented more accurately than the model trained only on real data. For instance, a modified U-Net trained on the combined datasets achieved a Dice similarity coefficient of 92.6±2.2%, whereas the same model trained only on real samples achieved 86.5±3.6%. Using the synthetic dataset reduced the spread of accuracy and improved the generalization of the model. The proposed algorithm also reduces subjectivity, minimizes the labeling routine, increases the number of samples, and improves heterogeneity. 
461 |t The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS Archives) 
463 |t Vol. XLIV-2/W1-2021 : 4th International Workshop on Photogrammetric and computer vision techniques for video surveillance, biometrics and biomedicine, 26–28 April 2021, Moscow, Russia  |v [P. 33-40]  |d 2021 
610 1 |a electronic resource 
610 1 |a works of TPU scientists 
610 1 |a data synthesis 
610 1 |a echocardiography 
610 1 |a catheter segmentation 
610 1 |a forward kinematics 
610 1 |a spline coordinate system 
610 1 |a echocardiography (Russian-language subject term) 
701 1 |a Danilov  |b V. V.  |c specialist in the field of informatics and computer technology  |c engineer of Tomsk Polytechnic University  |f 1989-  |g Vyacheslav Vladimirovich  |3 (RuTPU)RU\TPU\pers\37831 
701 1 |a Gerget  |b O. M.  |c Specialist in the field of informatics and computer technology  |c Professor of Tomsk Polytechnic University, Doctor of Sciences  |f 1974-  |g Olga Mikhailovna  |3 (RuTPU)RU\TPU\pers\31430  |9 15593 
701 1 |a Kolpashchikov  |b D. Yu.  |c specialist in the field of engineering  |c engineer of Tomsk Polytechnic University  |f 1992-  |g Dmitry Yurjevich  |3 (RuTPU)RU\TPU\pers\41099 
701 1 |a Laptev  |b N. V.  |c Specialist in the field of mechanical engineering  |c Engineer of Tomsk Polytechnic University  |f 1995-  |g Nikita Vitalievich  |3 (RuTPU)RU\TPU\pers\45864 
701 1 |a Manakov  |b R.  |g Roman 
701 1 |a Hernandez-Gomez  |b L. A. 
701 1 |a Alvarez  |b F.  |g Frederic 
701 1 |a Ledesma-Carbayo  |b M. J. 
712 0 2 |a National Research Tomsk Polytechnic University  |b Engineering School of Information Technologies and Robotics  |b Research and Education Laboratory for Big Data Processing and Analysis  |3 (RuTPU)RU\TPU\col\23599 
712 0 2 |a National Research Tomsk Polytechnic University  |b Engineering School of Information Technologies and Robotics  |b Division of Information Technologies  |3 (RuTPU)RU\TPU\col\23515 
801 0 |a RU  |b 63413507  |c 20220512  |g RCR 
856 4 |u https://doi.org/10.5194/isprs-archives-XLIV-2-W1-2021-33-2021 
942 |c CF