Digestive Endoscopy | Volume 52, Issue 5, P566-572, May 2020


Diagnosing chronic atrophic gastritis by gastroscopy using artificial intelligence

Open Access. Published: February 12, 2020. DOI: https://doi.org/10.1016/j.dld.2019.12.146

      Abstract

      Background

      The sensitivity of endoscopy in diagnosing chronic atrophic gastritis is only 42%, and multipoint biopsy, despite being more accurate, is not always available.

      Aims

      This study aimed to construct a convolutional neural network to improve the diagnostic rate of chronic atrophic gastritis.

      Methods

      We collected 5470 images of the gastric antrums of 1699 patients and labeled them with their pathological findings. Of these, 3042 images depicted atrophic gastritis and 2428 did not. We designed and trained a convolutional neural network-chronic atrophic gastritis model to diagnose atrophic gastritis accurately, verified by five-fold cross-validation. Moreover, the diagnoses of the deep learning model were compared with those of three experts.

      Results

      The diagnostic accuracy, sensitivity, and specificity of the convolutional neural network-chronic atrophic gastritis model in diagnosing atrophic gastritis were 0.942, 0.945, and 0.940, respectively, which were higher than those of the experts. The detection rates of mild, moderate, and severe atrophic gastritis were 93%, 95%, and 99%, respectively.

      Conclusion

      Chronic atrophic gastritis could be diagnosed by gastroscopic images using the convolutional neural network-chronic atrophic gastritis model. This may greatly reduce the burden on endoscopy physicians, simplify diagnostic routines, and reduce costs for doctors and patients.


      1. Introduction

      Stomach cancer is the fifth most common cancer and the third leading cause of cancer-related death worldwide [Bray et al.]. Gastric mucosal atrophy is a crucial stage in the progression of gastric cancer [Uemura et al.], and the extent of atrophy is an important risk factor for gastric cancer [Song et al.; Sugano et al.; Cheung]. Treatment aimed at eradicating Helicobacter pylori can significantly improve atrophy in atrophic gastritis, thus reducing the risk of gastric cancer in patients with chronic atrophic gastritis (CAG) [Hwang et al.]. Therefore, the early diagnosis of atrophic gastritis is important to prevent the occurrence and development of gastric cancer [Hwang et al.].
      In clinical endoscopy, whether a patient undergoes endoscopic biopsy depends on the experience of the endoscopist: if the doctor suspects that the patient's gastric mucosa is atrophic, he or she performs an endoscopic biopsy to obtain definitive findings; otherwise, no biopsy is taken. However, the sensitivity of endoscopy in diagnosing atrophic gastritis is only 42% [Du et al.], which leads to a high rate of missed CAG diagnoses. Furthermore, the accuracy of pathological biopsy largely depends on the doctor's experience in choosing the location and depth of the mucosa for biopsy. The updated Sydney System guidelines for the classification of chronic gastritis require at least five biopsies during gastroscopy [Dixon et al.; Bogomoletz]. Multipoint biopsy increases gastric trauma and the risk of bleeding, and it cannot be performed if the patient is taking medications such as aspirin. Biopsies are also costly and time-consuming. Recently, chromoendoscopy combined with magnifying endoscopy, as well as confocal laser endomicroscopy, have become important tools for the diagnosis and differential diagnosis of chronic gastritis [Liu et al.]. However, these advanced endoscopy techniques only provide images of the gastrointestinal mucosal surface, so their diagnostic accuracy still depends on the standardized operation of experienced endoscopists and the examination of a large volume of pathological mucosa [Imaeda].
      With advances in computing power and access to big data, the convolutional neural network (CNN) has emerged as a promising tool for medical imaging. Compared with data mining, machine learning focuses more on the design of algorithms [Erickson et al.], enabling computers to "learn" rules from data and use existing records to extrapolate to unknown data. CNNs, with their excellent performance in image recognition, are currently a hot topic in the field of machine learning. Deep learning has achieved impressive results in the identification and diagnosis of various ophthalmic diseases [Wang et al.; Zhang et al.; Zhang et al.] and in tumor detection, including gastric cancer [Hirasawa et al.], pulmonary nodules [Cheng et al.], and breast cancer [Bejnordi et al.].
      Machine learning, represented by the CNN, has demonstrated outstanding performance in image recognition, image segmentation, and image understanding. It has developed swiftly in the medical and health fields, wherein various medical imaging data can be used to train models. Generally, a CNN is composed of convolutional layers, pooling layers, and fully connected layers. The convolutional layer extracts different types of feature information by sliding convolution kernels over the image. Each convolution kernel is essentially a signal filter used to extract a specific type of feature. Through the combination of multiple convolutional layers and nonlinear activation functions, complex features (combinations of multi-view features) and descriptors containing sufficient information are obtained. The common pooling operations are maximum pooling and average pooling, which use the maximum or mean value of a region to represent that region, reducing dimensionality and facilitating downstream computation. Research shows that maximum pooling is better for capturing texture information, and average pooling is better for capturing background information [Boureau et al.].
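The pooling operations described above can be sketched in a few lines of NumPy; this toy example (not the paper's code) reduces each non-overlapping 2 × 2 window of a feature map to its maximum or mean:

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping 2-D pooling over an (H, W) feature map."""
    h, w = x.shape
    # Reshape into (H/size, size, W/size, size) windows, then reduce each window.
    windows = x[:h - h % size, :w - w % size].reshape(h // size, size, w // size, size)
    reduce = np.max if mode == "max" else np.mean
    return reduce(windows, axis=(1, 3))

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [0., 0., 1., 1.],
              [0., 4., 1., 1.]])
print(pool2d(x, mode="max"))   # [[4. 8.] [4. 1.]] - each window's maximum (texture)
print(pool2d(x, mode="avg"))   # [[2.5 6.5] [1. 1.]] - each window's mean (background)
```

Either way, the 4 × 4 input shrinks to 2 × 2, which is the dimensionality reduction the text refers to.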
      Recently, researchers have designed many new network structures and models, such as VGG, Inception, residual connections, and skip connections. The architecture of the CNN evolves by adjusting the convolution kernel size and the way convolutional layers are connected, making CNNs not only deeper but also wider. Moreover, the deeper the network, the stronger its ability to extract features. Many related techniques have been proposed concurrently to improve CNN performance, including dropout, data augmentation, batch normalization, label smoothing, and various types of activation functions.
      The skip connection has proven to be an important structure that greatly improves the training of deep learning models [Zhang et al.]. By connecting shallow layers to deeper layers with skip-layer connections, vanishing gradients during error backpropagation can be avoided; consequently, more abstract and abundant information is extracted. Feature reuse also reduces the loss of key information during forward propagation, which improves the recognition of small targets and subtle features. Skip connections mainly include the residual structures in ResNet [He et al.; Huang et al.] and the dense block structures in DenseNet [Zhang et al.].
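The two skip-connection styles can be contrasted with a minimal NumPy sketch (illustrative only, with dense layers standing in for convolutions): a residual block adds the input back to the transformed output, while a dense block concatenates new features onto the old ones so later layers can reuse them.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    # ResNet-style: transform the input, then ADD the untouched input back.
    # Even with near-zero weights the block still passes x through unchanged,
    # which keeps gradients flowing through very deep stacks.
    return relu(x + relu(x @ w1) @ w2)

def dense_block_step(x, w):
    # DenseNet-style: CONCATENATE the new features onto all earlier features,
    # so downstream layers can reuse shallow information directly.
    return np.concatenate([x, relu(x @ w)], axis=-1)

x = np.ones((1, 4))
w1 = np.zeros((4, 4)); w2 = np.zeros((4, 4))
print(residual_block(x, w1, w2))                    # zero weights: output equals input
print(dense_block_step(x, np.zeros((4, 2))).shape)  # (1, 6): 4 old + 2 new features
```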
      Gastroscopy is necessary for medical professionals to detect patients' stomach diseases. Through long-term medical practice, hospitals accumulate a large number of gastroscopic images that provide abundant data for the computer-aided diagnosis of CAG, giving physicians a chance to study gastroscopic images via deep learning. Researchers have begun to use deep learning algorithms to detect diseases such as gastric cancer, gastric polyps, and erosion. Billah et al. used capsule endoscopy to record endoscopic videos of the gastrointestinal tract and then used CNN and color wavelet features to recognize gastrointestinal polyps; the accuracy rate was 98.65%, better than that of previous methods based on color, location, and textural features such as color histograms and local binary patterns [Szczypiński et al.]. Byrne et al. used a deep learning model to diagnose colorectal polyps and adenomas from endoscopic video frames and achieved high recognition accuracy. Urban et al. used a deep neural network to identify polyps in colonoscopy videos and achieved an accuracy rate of 96.4%. Zhang et al. proposed a network (GPD Net) to detect precancerous lesions of gastric cancer. Lee et al. used a residual network combined with transfer learning to discern gastric cancer, ulcers, and normal gastric images with 90% accuracy. Compared with the morphological characteristics of polyps and tumors, those of mucosal atrophy are subtle and difficult to observe, making computer-aided diagnosis of atrophic gastritis challenging. Pathological images of atrophic gastritis have been classified and diagnosed using deep learning [Al-Omari et al.]. In 2005, Lahner et al. used clinical and biological variables with artificial neural networks and linear discriminant analysis to build a decision support system for diagnosing atrophic gastritis without endoscopy; this system achieved a high accuracy rate of 100% [Lahner et al.]. However, CAG is closely related to gene expression and eating habits; in particular, the characteristics of CAG differ greatly between East Asia and Europe [Cheung], so this method has not been widely used in East Asia. Patients with atrophic gastritis often have no specific symptoms, or even no obvious complaints, and in East Asia, gastroscopy and pathological biopsy remain the main methods for diagnosing gastric diseases. Thus, gastroscopy remains an essential diagnostic modality. However, a gap remains in the computer-aided screening of CAG using gastroscopy, so it is important to develop and apply deep learning to assist in CAG screening with gastroscopy. This study aimed to construct a CNN to improve the diagnostic rate of CAG. We expect the CNN-CAG model to drastically reduce the burden on endoscopy physicians, simplify diagnostic routines, and reduce costs for both doctors and patients.

      2. Materials and methods

      2.1 Datasets and preprocessing

      The study was approved by the Ethics Committee of Shanxi Provincial People’s Hospital, Taiyuan, Shanxi; informed consent was obtained from each patient included in the study. From April 2018 to April 2019, two endoscopists with more than 5 years’ experience in gastroscopy reviewed gastroscopic images from the Gastroscopy Image Database of Shanxi People’s Hospital and selected high-quality images for visual observation. We collected 5470 images of the gastric antrums of 1699 patients and labeled each image with its pathological findings. A total of 3042 images depicted atrophic gastritis: 1458 images of mild, 1348 of moderate, and 38 of severe atrophic gastritis, plus 198 images of gastric antrums with an unmarked degree of atrophy. Information about the gastroscopic images with different extents of atrophy is shown in Table 1. The remaining 2428 images did not show atrophic gastritis. The negative samples included images depicting chronic inflammation of the gastric mucosa, chronic non-atrophic gastritis, and active gastric mucosal inflammation.
      Table 1. Image characteristics of atrophic and non-atrophic gastritis in the training and testing datasets.

      Atrophic condition       Sex (n)         Images   Extent     Images
      -----------------------  --------------  -------  ---------  ------
      Atrophic gastritis       Female (338)    1277     Mild       654
                                                        Moderate   545
                                                        Severe     6
                                                        Unknown    72
                               Male (455)      1744     Mild       804
                                                        Moderate   803
                                                        Severe     32
                                                        Unknown    105
                               Unknown (4)     21
      Non-atrophic gastritis   Female (465)    1264     All        2428
                               Male (437)      1164
                               Unknown (0)     0
      In total, 70% and 30% of the gastritis images were randomly assigned to the training set and testing set, respectively. Five-fold cross-validation was performed on the training set to ensure the reliability of the model, and the testing set was then used to evaluate the effectiveness of the CNN-CAG. The proportions of images with atrophic gastritis of different severities were consistent between the training and testing sets.
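The split described above can be sketched with scikit-learn (a minimal sketch, not the study's code; the labels below merely stand in for the 5470 image annotations). Stratifying on the label keeps the atrophic/non-atrophic proportions consistent between the sets, and the five folds are drawn from the training portion only:

```python
import numpy as np
from sklearn.model_selection import train_test_split, StratifiedKFold

labels = np.array([1] * 3042 + [0] * 2428)   # 1 = atrophic, 0 = non-atrophic
indices = np.arange(len(labels))

# 70/30 split, stratified so both sets keep the same class proportions.
train_idx, test_idx, y_train, y_test = train_test_split(
    indices, labels, test_size=0.3, stratify=labels, random_state=42)

# Five-fold cross-validation on the training portion only.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
folds = list(skf.split(train_idx, y_train))
print(len(train_idx), len(test_idx), len(folds))   # 3829 train / 1641 test, 5 folds
```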
      In this study, three models of gastroscopes were used: Pentax EPK-i5000, Pentax EPK-i7000, and Pentax EPK-i (Pentax, Tokyo, Japan). The gastric mucosa was observed using the white-light i-Scan mode. Because the images obtained by the different gastroscopes had different formats, all images were converted to an uncompressed BMP format. The images had high resolution, each larger than 3 MB.
      As atrophic gastritis primarily occurs in the antrum, the images used in this study included the entire gastric antrum. Images of poor quality were excluded from the training set; the exclusion criteria included staining artifacts created by mucus, poor focus, insufficient contrast, motion blurring, and gastric cancer. We selected pictures capturing different visual angles of the gastric mucosa at the same position to reduce selection bias. The clinical diagnoses used for labeling were based on the degree of mucosal atrophy according to the Sydney System [Al-Omari et al.]. The histologic reports for these gastric mucosae were also collected. If the gastroscopic diagnosis of gastritis and the histologic report were inconsistent, the final diagnosis was based on the pathological results. Gastrointestinal pathologists (three attending staff members and two senior fellows) conducted the histologic assessments to determine the type of lesion.
      The INPAINT_TELEA algorithm was used to remove information, including age, sex, time, and system, from the upper left, upper right, and lower left regions of the gastroscopic images. This process is shown in Fig. 1. The aim of this step was not only to remove any sensitive patient information but also to avoid any interference of the white watermarking in the analysis of the atrophic gastritis images. Image sizes were unified (512 × 512) using bilinear interpolation. All gastroscopic images were uncompressed, and all information from the images was preserved.
      Fig. 1. INPAINT_TELEA algorithm for removing information watermarks. (a) Watermarking information, such as age, sex, time, and system, exists in the upper-left, upper-right, and lower-left regions of the gastroscopic image. Some of the watermarking lies in the black background of the image, whereas some appears directly in the target area. (b) The watermarking mask covers the upper-left, lower-left, and upper-right parts of the original image; the watermarking in these areas is consistent across gastroscopic images. (c) The region covered by the watermarking mask has been repaired, but some watermarking remains in the upper-right corner of the repaired image. This is because the watermarking content in the upper-right region is closely tied to the patient’s personal information and varies with the time of the patient’s visit; therefore, its repair boundary cannot be determined with a unified watermarking mask.

      2.2 Experimental settings

      To make full use of available computing resources, this study used different computing devices. All image preprocessing algorithms were run on a standard computer with a 64-bit Windows 7 operating system and a Python environment provided by Anaconda 2.5.0. The scikit-image (SkImage) library was used for watermark removal (the fast-marching method) and for the image extension and flip algorithms. All deep learning model training was performed on high-speed workstations with four Titan X GPUs, running 64-bit Ubuntu with PyTorch 0.4, which provided a framework for quickly implementing deep learning algorithms. These computing resources and computer-aided toolkits met the needs of the experiment.
      The deep learning models applied in the experiment were provided by the PyTorch framework, which offers various deep learning algorithms and pretrained models. Each pretrained model is a set of parameters trained on the ImageNet dataset for better results [Lahner et al.]. Low-level semantic features, such as edge shape and color distribution, are unchanged across different classification tasks, and combinations of these low-level features play a decisive role in object classification. Therefore, we could use transfer learning to fine-tune the model parameters, avoiding learning from scratch and speeding up training [Oquab et al.].

      2.3 Convolutional neural network architectures

      DenseNet was used to identify CAG lesions in gastric antrum images. The model architecture is shown in Fig. 2. The DenseNet model has made significant improvements to classical CNN through the basic DenseBlock module. Each convolutional layer used the feature map information passed from the upstream convolution layer multiple times. With the help of deeper networks and more complex model structures, the abstracting ability for image features was considerably stronger.
      Fig. 2. Architecture of the DenseNet121 model.
      Our specific parameter settings were as follows: the number of convolution kernels of the first convolutional layer was 64, and the growth rate of each Dense Block was 32. The batch size was set to 128, and 2000 iterations were performed. The initial learning rate was set to 0.1, and the attenuation coefficient of learning rate was 0.9. The loss function was a cross-entropy loss function, and the optimization method was stochastic gradient descent.
      Physicians with different qualifications, including three experts and two novices, were asked to classify the same testing set, and their results were compared with those of the CNN-CAG model. We calculated the accuracy, sensitivity, and specificity in identifying CAG, which were used to plot the ROC curve, P–R curve, and histogram in Fig. 4. The ROC and P–R curves are comprehensive indicators of specificity and sensitivity [Wang et al.].
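These indicators can be computed with scikit-learn; the toy labels and scores below are made up purely to illustrate the mechanics:

```python
import numpy as np
from sklearn.metrics import roc_curve, precision_recall_curve, auc

y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])                       # ground-truth labels
y_score = np.array([0.1, 0.3, 0.35, 0.8, 0.65, 0.9, 0.7, 0.2])    # model probabilities

fpr, tpr, _ = roc_curve(y_true, y_score)
precision, recall, _ = precision_recall_curve(y_true, y_score)

print(auc(fpr, tpr))            # area under the ROC curve
print(auc(recall, precision))   # area under the P-R curve
# Both reach 1.0 here because the toy scores separate the classes perfectly.
```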
      To explore the basis of the decisions made by DenseNet, we used class activation mapping (CAM) [Selvaraju et al.] to generate heat maps showing which pixels in the gastroscopic images determined the model’s classification decisions.
      Relevant codes and models can be freely accessed at https://www.github.com/yuanfuqiang456/CAG.
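As a rough NumPy illustration of the CAM idea (not the released code), the heat map is the classifier-weighted sum of the final convolutional feature maps for the predicted class; the shapes and values below are made up:

```python
import numpy as np

def cam(feature_maps, class_weights):
    """feature_maps: (K, H, W) activations; class_weights: (K,) head weights for one class."""
    heat = np.tensordot(class_weights, feature_maps, axes=1)   # weighted sum -> (H, W)
    heat = np.maximum(heat, 0)                                 # keep positive evidence only
    return heat / heat.max() if heat.max() > 0 else heat       # normalize to [0, 1]

fmap = np.zeros((2, 4, 4))
fmap[0, 1:3, 1:3] = 1.0          # channel 0 activates on a "lesion-like" region
weights = np.array([2.0, 0.5])   # the class head weighs channel 0 strongly
heat = cam(fmap, weights)
print(heat)   # highest response over the activated 2x2 region, zero elsewhere
```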

      3. Results

      3.1 Testing results

      In the model selection process, we used many well-known deep learning models for the initial training. The effect of different network architectures on identification of CAG is shown in Fig. 3.
      Fig. 3. Training process of different models at different epochs.
      The accuracy of the different networks on the testing set varied greatly. During training, the model applied a random cropping strategy to the original images; during testing, a center cropping strategy was applied. Our research showed that the DenseNet network achieved the highest performance in recognizing CAG. Therefore, considering both the accuracy and the complexity of the algorithm, the DenseNet121 architecture was selected to train the CNN-CAG model.
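The two cropping strategies can be contrasted with a small NumPy sketch (illustrative sizes, not the study's): random crops during training act as augmentation, while the deterministic center crop at test time makes evaluation reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop(img, size):
    """Training-time crop: a window at a random position each call."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

def center_crop(img, size):
    """Test-time crop: always the same central window."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

img = np.arange(512 * 512).reshape(512, 512)
print(random_crop(img, 448).shape)   # (448, 448), position varies per call
print(center_crop(img, 448).shape)   # (448, 448), always the same window
```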
      Fig. 4 shows that the CNN-CAG had strong classification performance for the recognition of CAG from gastric antrum images: the areas under both the P–R curve and the ROC curve approached 0.99 [Zhang et al.]. The diagnostic accuracy, sensitivity, and specificity of the CNN model for atrophic gastritis were 0.9424, 0.9458, and 0.9401, respectively, exceeding those of the human experts. The detection rates of mild, moderate, and severe atrophic gastritis were 93%, 95%, and 99%, respectively.
      Fig. 4. Receiver operating characteristic curve, precision–recall curve, and the accuracy, sensitivity, and specificity of the different diagnostic methods.

      3.2 Interpretability studies

      By comparing these heat maps with the image areas used by the doctors for diagnosis, we were able to verify whether the deep learning model had actually learned information regarding atrophic gastritis (Fig. 5). The heat maps generated not only completed the image classifications but also directly located the relative positions of the lesions.
      Fig. 5. Interpretable heat maps in the automatic diagnosis of chronic atrophic gastritis. (a) Original images. The yellow boxes indicate the randomly cropped areas used by the DenseNet model on the original image. (b) Cropped areas from the original images. The red boxes are the areas of focus labeled by the doctor. (c) Heat maps generated with class activation mapping. The orange-red regions of the heat maps are consistent with the atrophic mucosa labeled by the doctors according to the pathological results.

      4. Discussion

      Evidence has shown that the extent of gastric mucosal atrophy is an independent risk factor for gastric cancer [Kaji et al.]; therefore, we classified atrophic gastritis into mild, moderate, and severe. In our experiment, 1458 images of mild, 1348 of moderate, and 38 of severe atrophic gastritis were used to train the CNN model. The accuracy of our CNN system in diagnosing mild, moderate, and severe atrophic gastritis was 0.93, 0.95, and 0.99, respectively, indicating that the detection rates for moderate and severe atrophic gastritis were higher than that for mild atrophic gastritis. The model proposed in this paper was therefore deemed accurate and effective. This result may be attributable to the fact that changes in glands and blood vessel textures are more evident in moderate and severe atrophic gastritis than in mild atrophic gastritis, and the atrophic area is larger. Additionally, atrophic changes are regional, and some atrophic areas were small and unclear because of the capturing angle. Using multiple images from a single patient may yield more accurate diagnostic results through integrated analysis.
      In future studies, we will adopt this method of target detection to train the CNN to learn endoscopic pictures of atrophic gastritis, reduce interference in unrelated areas, and improve the diagnostic rate of mild atrophic gastritis. This will enable early intervention to slow or even reverse the progression of atrophic gastritis in patients.
      Our study had several limitations. First, atrophic gastritis primarily occurs in the gastric antrum; however, a small percentage occurs in the cardia, the gastric angle, and other locations. Our research focused mainly on the diagnosis and recognition of atrophic gastritis of the gastric antrum; future studies should use whole-stomach images for comprehensive diagnosis of atrophic lesions across the entire gastric mucosa. Second, dynamic videos simulate the process of gastroscopy more realistically, and using video data to further train the CNN-CAG model could improve its clinical applicability. Third, the training set contained only high-quality, clear gastroscopic images; actual gastroscopic images may not be as clear owing to the instability of the handheld endoscope, and such blurred images may hinder the recognition of gastric mucosal lesions. Thus, we should gradually add low-quality images to the training datasets and test the accuracy of the CNN-based diagnosis. Fourth, the etiology of atrophic gastritis includes H. pylori infection; in our experiment, we did not classify gastric mucosal features by the different etiologies of atrophic gastritis. Fifth, this study used datasets from a single health center; multicenter validation should be conducted to improve the accuracy of the CNN-based diagnosis. Sixth, as this was a retrospective study, prospective studies are needed to verify the diagnostic ability of the CNN. Finally, lesion localization in the current work is achieved by visualizing model results; in future studies, we will use sliding-window and target detection algorithms to detect CAG lesions and achieve better classification.
      In this study, we achieved a higher accuracy in CAG screening than previously reported. Furthermore, we visualized the image regions the model relied on and compared them with those identified by diagnosing physicians; the lesions detected by the model were consistent with those detected by the doctors, supporting the accuracy and validity of the trained model. In conclusion, CAG can be diagnosed from gastroscopic images using a CNN. We recommend that the CNN be used in clinical practice to help endoscopists diagnose atrophic gastritis, pending confirmation in future large-scale studies.
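      The visualization described above follows the gradient-based localization (Grad-CAM) approach cited in the references. A minimal sketch of its core computation is shown below; the feature maps and gradients are toy arrays, not outputs of the CNN-CAG model.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Core Grad-CAM step: weight each feature map by the spatially
    averaged gradient of the class score, sum over channels, apply
    ReLU, and normalize. Both inputs have shape (C, H, W)."""
    weights = gradients.mean(axis=(1, 2))              # one weight per channel
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted sum -> (H, W)
    cam = np.maximum(cam, 0)                           # keep positive evidence only
    if cam.max() > 0:
        cam = cam / cam.max()                          # scale to [0, 1]
    return cam

# Toy inputs: two 2x2 feature maps with uniform unit gradients.
fm = np.array([[[1.0, 0.0], [0.0, 0.0]],
               [[0.0, 0.0], [0.0, 2.0]]])
grads = np.ones_like(fm)
cam = grad_cam(fm, grads)
print(cam)  # heatmap peaks where weighted feature evidence is strongest
```

In practice the resulting heatmap is upsampled to the input resolution and overlaid on the gastroscopic image, which is how the model's attended regions can be compared with the lesions marked by physicians.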

      Conflict of interest

      None declared.

      Funding

      This study was funded by the Post-doctoral Research Projects of Shanxi Provincial Department of Human Society (2017-92).

      Acknowledgments

      The authors thank the endoscopists at the Shanxi Provincial People’s Hospital of Gastroenterology who helped perform the gastroscopy. We are also grateful to the engineers at Baidu Online Network Technology (Beijing) Corporation and School of Computer Science and Technology, Xidian University, who helped develop and test the convolutional neural networks. We would like to thank Editage (www.editage.cn) for English language editing.

      References

        • Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68:394-424.
        • Uemura N, Okamoto S, Yamamoto S, et al. Helicobacter pylori infection and the development of gastric cancer. N Engl J Med. 2001;345:784-789.
        • Song JH, Kim SG, Jin EH, Lim JH, Yang SY. Risk factors for gastric tumorigenesis in underlying gastric mucosal atrophy. Gut Liver. 2017;11:612-619.
        • Sugano K, Tack J, Kuipers EJ, et al. Kyoto global consensus report on Helicobacter pylori gastritis. Gut. 2015;64:1353-1367.
        • Cheung DY. Atrophic gastritis increases the risk of gastric cancer in asymptomatic population in Korea. Gut Liver. 2017;11:575-576.
        • Hwang Y-J, Kim N, Lee HS, Lee JB, Choi YJ, Yoon H, et al. Reversibility of atrophic gastritis and intestinal metaplasia after Helicobacter pylori eradication—a prospective study for up to 10 years. Aliment Pharmacol Ther. 2017;47:380-390.
        • Du Y, Bai Y, Xie P, Fang J, Wang X, Hou X, et al. Chronic gastritis in China: a national multi-center survey. BMC Gastroenterol. 2014;7:14.
        • Dixon MF, Genta RM, Yardley JH, Correa P. Classification and grading of gastritis. Am J Surg Pathol. 1996;20:1161-1181.
        • Bogomoletz WV. The “Sydney System”: a consensus approach to gastritis. Is a new “classification” necessary? Gastroenterol Clin Biol. 1991;15:925-928.
        • Liu T, Zheng H, Gong W, Chen C, Jiang B. The accuracy of confocal laser endomicroscopy, narrow band imaging, and chromoendoscopy for the detection of atrophic gastritis. J Clin Gastroenterol. 2015;49:379-386.
        • Imaeda A. Confocal laser endomicroscopy for the detection of atrophic gastritis: a new application for confocal endomicroscopy? J Clin Gastroenterol. 2015;49:355-357.
        • Erickson BJ, Korfiatis P, Akkus Z, Kline TL. Machine learning for medical imaging. Radiographics. 2017;37:505-515.
        • Wang L, Zhang K, Liu X, et al. Comparative analysis of image classification methods for automatic diagnosis of ophthalmic images. Sci Rep. 2017;7:41545.
        • Zhang K, Liu X, Jiang J, Li W, Wang S, Liu L, et al. Prediction of postoperative complications of pediatric cataract patients using data mining. J Transl Med. 2019;17:2.
        • Zhang K, Liu X, Liu F, He L, Zhang L, Yang Y, et al. An interpretable and expandable deep learning diagnostic system for multiple ocular diseases: qualitative study. J Med Internet Res. 2018;20:e11144.
        • Hirasawa T, Aoyama K, Tanimoto T, Ishihara S, Shichijo S, Ozawa T, et al. Application of artificial intelligence using a convolutional neural network for detecting gastric cancer in endoscopic images. Gastric Cancer. 2018;21:653-660.
        • Cheng JZ, Ni D, Chou YH, Qin J, Tiu CM, Chang YC, et al. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep. 2016;6.
        • Bejnordi BE, Veta M, Van Diest PJ, van Ginneken B, Karssemeijer N, Litjens G, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318:2199.
        • Boureau Y-L, Bach F, LeCun Y, Ponce J. Learning mid-level features for recognition. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE; 2010.
        • He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE; 2016.
        • Huang G, Liu Z, van der Maaten L, Weinberger KQ. Densely connected convolutional networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE; 2017.
        • Billah M, Waheed S, Rahman MM. An automatic gastrointestinal polyp detection system in video endoscopy using fusion of color wavelet and convolutional neural network features. Int J Biomed Imaging. 2017;2017:1-9.
        • Szczypiński P, Klepaczko A, Pazurek M, Daniel P. Texture and color based image segmentation and pathology detection in capsule endoscopy videos. Comput Methods Programs Biomed. 2014;113:396-411.
        • Byrne MF, Chapados N, Soudan F, Oertel C, Linares Pérez M, Kelly R, et al. Real-time differentiation of adenomatous and hyperplastic diminutive colorectal polyps during analysis of unaltered videos of standard colonoscopy using a deep learning model. Gut. 2017;68:94-100.
        • Urban G, Tripathi P, Alkayali T, Mittal M, Jalali F, Karnes W, et al. Deep learning localizes and identifies polyps in real time with 96% accuracy in screening colonoscopy. Gastroenterology. 2018;155:1069-1078.e8.
        • Zhang X, Hu W, Chen F, Liu J, Yang Y, Wang L, et al. Gastric precancerous diseases classification using CNN with a concise model. PLoS One. 2017;12:e0185508.
        • Lee JH, Kim YJ, Kim YW, Park S, Choi YI, Kim YJ, et al. Spotting malignancies from gastric endoscopic images using deep learning. Surg Endosc. 2019;33:3790-3797. https://doi.org/10.1007/s00464-019-06677-2.
        • Al-Omari FA, Matalka II, Al-Jarrah MA, Obeidat FN, Kanaan FM. An intelligent decision support system for quantitative assessment of gastric atrophy. J Clin Pathol. 2011;64:330-337.
        • Lahner E, Grossi E, Intraligi M, Buscema M, Corleto VD, Delle Fave G, et al. Possible contribution of artificial neural networks and linear discriminant analysis in recognition of patients with suspected atrophic body gastritis. World J Gastroenterol. 2005;11:5867-5873.
        • Cheung DY. Atrophic gastritis increases the risk of gastric cancer in asymptomatic population in Korea. Gut Liver. 2017;11:575.
        • Oquab M, Bottou L, Laptev I, Sivic J. Learning and transferring mid-level image representations using convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition. IEEE; 2014.
        • Selvaraju RR, Cogswell M, Das A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. In: 2017 IEEE International Conference on Computer Vision (ICCV). IEEE; 2017.
        • Kaji K, Hashiba A, Uotani C, Yamaguchi Y, Ueno T, Ohno K, et al. Grading of atrophic gastritis is useful for risk stratification in endoscopic screening for gastric cancer. Am J Gastroenterol. 2019;114:71-79.