key: cord-0035255-43igjn6x
authors: Kylberg, Gustaf; Sintorn, Ida-Maria; Borgefors, Gunilla
title: Towards Automated TEM for Virus Diagnostics: Segmentation of Grid Squares and Detection of Regions of Interest
date: 2009
journal: Image Analysis
DOI: 10.1007/978-3-642-02230-2_18
sha: 55aecc3d08ef7e06b5c5130477558a6cd96a25e0
doc_id: 35255
cord_uid: 43igjn6x

When searching for viruses in an electron microscope, the sample grid constitutes an enormous search area. Here, we present methods for automating the image acquisition process for an automatic virus diagnostic application. The methods constitute a multi-resolution approach where we first identify the grid squares and rate individual grid squares based on content in a grid overview image, and then detect regions of interest in higher resolution images of good grid squares. Our methods are designed to mimic the actions of a virus TEM expert manually navigating the microscope, and they are also compared to the expert’s performance. Integrating the proposed methods with the microscope would reduce the search area by more than 99.99 % and would also remove the need for an expert to perform the virus search at the microscope.

Ocular analysis of transmission electron microscopy (TEM) images is an essential virus diagnostic tool in infectious disease outbreaks as well as a means of detecting and identifying new or mutated viruses [1, 2]. In fact, virus taxonomy still, to a large extent, uses TEM to classify viruses based on their morphological appearance, as it has since this was first proposed in 1943 [3]. The use of TEM as a virus diagnostic tool in an infectious emergency situation was, for example, shown in both the SARS pandemic and the human monkeypox outbreak in the US in 2003 [4, 5]. The viral pathogens were identified using TEM before any other method provided any results or information.
TEM can provide an initial identification of the viral pathogen faster than the molecular diagnostic methods more commonly used today. The main problems with ocular TEM analysis are the need for an expert to perform the analysis at the microscope and that the result is highly dependent on the expert's skill and experience. To make virus diagnostics using TEM more useful, automated image acquisition combined with automatic analysis would hence be desirable. The method presented in this paper focuses on the first part, i.e., enabling automation of the image acquisition process. It is part of a project with the aim of developing a fully automatic system for virus diagnostics based on TEM in combination with automatic image analysis.

Modern transmission electron microscopes are, to a large extent, controlled via a computer interface. This opens up the possibility of adding software to automate the image acquisition procedure. For other biological sample types and applications (mainly 3D reconstructions of proteins and protein complexes), procedures for fully automated or semi-automated image acquisition already exist as commercially available software or as in-house systems in specific labs, e.g., [6, 7, 8, 9, 10]. For the application of automatically diagnosing viral pathogens, a pixel size of about 0.5 nm is necessary to capture the texture on the viral surfaces. If images with such high spatial resolution were acquired over the grid squares of a TEM grid with a diameter of 3 mm, one would end up with about 28.3 terapixels of image data, of which only a small fraction might actually contain viruses. Consequently, to be able to create a rapid and automatic detection system for viruses on TEM grids, the search area has to be narrowed down to areas where the probability of finding viruses is high.
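The figure of about 28.3 terapixels can be checked with a few lines of arithmetic. The sketch below assumes that the search area is the full circular disc of a grid with a diameter of 3 mm and that the pixels are square with a side of 0.5 nm. The function name `pixels_for_disc` is introduced here only for illustration.

```python
import math

def pixels_for_disc(diameter_m: float, pixel_size_m: float) -> float:
    """Number of square pixels needed to cover a circular disc (assumed geometry)."""
    area_m2 = math.pi * (diameter_m / 2.0) ** 2
    return area_m2 / pixel_size_m ** 2

# Values quoted in the text: 3 mm grid diameter, 0.5 nm pixel size.
n_pixels = pixels_for_disc(3e-3, 0.5e-9)
print(f"{n_pixels / 1e12:.1f} terapixels")  # prints "28.3 terapixels"
```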
In this paper we present methods for a multi-resolution approach, using low resolution images to guide the acquisition of high resolution images, mimicking the actions of an expert in virus diagnosis using TEM. This allows for efficient acquisition of high resolution images of regions of a TEM grid likely to contain viruses. The main concept in the method is to:

1. segment grid squares in overview images of a TEM grid,
2. rate the segmented grid squares in the overview images,
3. identify regions of interest in images with higher spatial resolution of single good squares.

An EM grid is a thin-foil mesh, usually 3.05 mm in diameter. Grids can be made from a number of different metals such as copper, gold or nickel. The mesh is covered with a thin film or membrane of carbon, and on top of this sits the biological material. Overview images of 400-Mesh EM grids at magnifications between 190× and 380× show a number of bright squares, which are the carbon membrane in the holes of the metal grid, see Fig. 1(a). One assumption is made about the EM grid in this paper: the grid squares are square or rectangular with parallel edges. Consequently, there should exist two main directions of the grid square edges.

Detecting Main Directions. The main directions in these overview images are detected in images that are downsampled to half the original size, simply to save computational time. The gradient magnitude of the image is calculated using the first order derivative of a Gaussian kernel. This is equivalent to computing the derivative in a pixel-wise fashion of an image smoothed with a Gaussian. This can be expressed in one dimension as

$$\frac{\partial}{\partial x}\bigl(f(x) * G(x)\bigr) = f(x) * \frac{\partial}{\partial x} G(x),$$

where $f(x)$ is the image function, $G(x)$ is a Gaussian kernel and $*$ denotes convolution. The smoothing properties make this method less noise sensitive compared to calculating derivatives with Prewitt or Sobel operators [11].
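As a minimal one-dimensional sketch of this idea, the code below convolves a synthetic image row with a sampled derivative-of-Gaussian kernel and locates the strongest edge response. It is not the Matlab implementation used in the paper. The helper names, the replicated border handling and the synthetic step signal are assumptions made for illustration; the sigma of 1 and the nine filter taps mirror the settings reported in Sec. 4.

```python
import math

def gaussian_derivative_kernel(sigma: float, radius: int) -> list[float]:
    """Sampled first derivative of a normalized Gaussian: dG/dx = -x / sigma^2 * G(x)."""
    xs = range(-radius, radius + 1)
    g = [math.exp(-x * x / (2.0 * sigma * sigma)) for x in xs]
    norm = sum(g)
    return [-x / (sigma * sigma) * v / norm for x, v in zip(xs, g)]

def convolve_same(signal: list[float], kernel: list[float]) -> list[float]:
    """1D convolution with replicated borders; the output has the input's length."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            # i - (k - radius) flips the kernel, i.e. a true convolution.
            j = min(max(i - (k - radius), 0), last)
            acc += signal[j] * w
        out.append(acc)
    return out

# Synthetic row (assumption): dark grid bar (0) followed by bright membrane (1).
row = [0.0] * 20 + [1.0] * 20
kernel = gaussian_derivative_kernel(sigma=1.0, radius=4)  # 9 taps
gradient = convolve_same(row, kernel)
magnitude = [abs(v) for v in gradient]

# Index of the strongest response; the step lies between samples 19 and 20.
edge_index = max(range(len(magnitude)), key=magnitude.__getitem__)
print(edge_index)
```

Smoothing and differentiation are done in a single pass because the kernel already is the differentiated Gaussian. In two dimensions the same kernel would be applied along rows and columns and the two responses combined into a gradient magnitude.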
The Radon transform [12], with parallel beams, is applied to the gradient magnitude image to create projections at angles from 0 to 180 degrees. In 2D, the Radon transform integrates the gray-values along straight lines in the desired directions. The Radon space is hence a parameter space of the radial distance from the image center and the angle between the image x-axis and the normal of the projection direction. To prevent the image proportions from biasing the Radon transform, only a circular disc in the center of the gradient magnitude image is used. Figure 2(a) shows the Radon transform for the example overview image in Fig. 1(a). A distinct pattern of local maxima can be seen at two different angles. These two angles correspond to the two main directions of the grid square edges. The two main directions can be separated from other angles by analyzing the variance of the integrated gray-values for each angle. Figure 2(b) shows the variance in the Radon image for each angle. The two local maxima correspond to the angles of the main directions of the grid square borders. These angles can be identified even more reliably by finding the two lowest minima in the second derivative, also shown in Fig. 2(b). If there are several broken grid squares with edges in the same direction, analyzing the second derivative of the variance is necessary.

To find the straight lines connecting the edges in the gradient magnitude image, the Radon transform is applied once more, but now only in the two main directions. Figure 3(a) shows the Radon transform for one of the main directions. These functions are fairly periodic, corresponding to the repetitive pattern of grid square edges. The periodicity can be calculated using autocorrelation. The highest correlation occurs when the function is aligned with itself, the second highest peak in the correlation occurs when the function is shifted one period, etc., see Fig. 3(b). In Fig.
3(c), the function is split into its periods and stacked (cumulatively summed). These summed periods have one high and one low plateau, separated by two local maxima which we want to detect. The plateaux are detected using Otsu's method for binary thresholding [13]. Thereafter, the two local maxima surrounding the low plateau are found. The high and low plateaux correspond to the inside and outside of the squares, respectively. Knowing the distance between the peaks (the length of the high plateau) and the period length, the peak positions can be propagated in the Radon transform. This enables filling in lines that are missing due to damaged grid square edges. The distance between the lines representing the square edges may vary by a few units throughout the function. Therefore, the peak positions are fine-tuned by finding the local maximum in a small region around each peak position, shown as red circles and crosses in Fig. 3(a). This step completes the grid square segmentation.

The segmented grid squares are rated on a five-level scale from 'good' to 'bad'. The rating system mimics the performance of an expert operator. The rating is based on whether a square is broken, empty or too cluttered with biological material. Statistical properties of the gray level histogram, such as the mean and the central moments variance, skewness and kurtosis, are used to differentiate between squares with broken membranes, cluttered squares and squares suitable for further analysis. To get comparable mean gray values of the overview images, their intensities are normalized to [0, 1]. A randomly selected set of 53 grid squares rated by a virologist was used to train a naive Bayes classifier with a quadratic discriminant function. The rest of the segmented grid squares were rated with this classifier and compared with the rating done by the virologist, see Sec. 4.
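The gray level features used for the rating can be sketched as follows. This is an illustration under stated assumptions and not the implementation used in the paper: skewness and kurtosis are taken to be the standardized third and fourth central moments, the normalization is a linear min-max rescaling, and the function names and pixel values are hypothetical.

```python
def normalize(values: list[float]) -> list[float]:
    """Rescale intensities linearly to [0, 1] (assumes a non-constant image)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def gray_level_features(pixels: list[float]) -> tuple[float, float, float, float]:
    """Mean, variance, skewness and kurtosis of the gray values of one grid square."""
    n = len(pixels)
    mean = sum(pixels) / n
    m2 = sum((p - mean) ** 2 for p in pixels) / n  # variance
    m3 = sum((p - mean) ** 3 for p in pixels) / n  # third central moment
    m4 = sum((p - mean) ** 4 for p in pixels) / n  # fourth central moment
    # Standardized moments (assumed definition); a flat square has no shape information.
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0
    return mean, m2, skewness, kurtosis

# Hypothetical gray values of a tiny square, normalized before feature extraction.
square = normalize([10.0, 20.0, 30.0])
mean, variance, skewness, kurtosis = gray_level_features(square)
print(mean, variance, skewness, kurtosis)
```

The four values form the feature vector of one grid square. A classifier such as the naive Bayes classifier described above would be trained on such vectors together with the expert's ratings.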
In order to narrow down the search area further, only the top-rated grid squares should be imaged at higher resolution, at an approximate magnification of 2000×, to allow detection of areas more likely to contain viruses. We want to find regions with small clusters of viruses. When large clusters have formed, it can be too difficult to detect single viral particles. In areas cluttered with biological material or too much staining, the chances of finding separate virus particles are small. In fecal samples, areas cluttered with biological material are common. The sizes of the clusters or objects of interest are roughly in the range of 100 to 500 nm in diameter. In our test images, with a pixel size of 36.85 nm, these objects will be about 2.5 to 14 pixels wide. This means that the clusters can be detected at this resolution.

To detect spots or clusters of the right size we use the difference of Gaussians, which enhances edges of objects of a certain width [14]. The difference of Gaussians image is thresholded at the level corresponding to 50 % of the highest intensity value. The objects are slightly enlarged by morphological dilation in order to merge objects close to each other. Elongated objects, such as objects along cracks in the gray level image, can be excluded by calculating the roundness of the objects. The roundness measure used is defined as

$$\text{roundness} = \frac{4\pi \cdot \text{area}}{\text{perimeter}^2},$$

where the area is the number of pixels in the object and the perimeter is the sum of the local distances between neighbouring pixels on the eight-connected border of the object. The remaining objects correspond to regions with a higher probability of containing small clusters of viruses.

Human fecal samples and domestic dog oral samples were used, as well as cell-cultured viruses. A standard sample preparation protocol for biological material with negative staining was used.
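Returning briefly to the detection of regions of interest, the roundness criterion used to exclude elongated objects can be illustrated with a short sketch. It assumes the common definition of roundness as 4π · area / perimeter², which equals 1 for an ideal disc and decreases for elongated shapes. The function names are hypothetical, the limit of 0.8 is the value reported in Sec. 4, and idealized continuous shapes are used instead of pixel-based area and perimeter measurements.

```python
import math

def roundness(area: float, perimeter: float) -> float:
    """Assumed definition: 4 * pi * area / perimeter^2 (1.0 for an ideal disc)."""
    return 4.0 * math.pi * area / perimeter ** 2

def is_round_enough(area: float, perimeter: float, limit: float = 0.8) -> bool:
    """Keep objects whose roundness reaches the limit (>= is an assumption)."""
    return roundness(area, perimeter) >= limit

# Idealized disc of radius 5 and an elongated 20 x 2 rectangle (e.g. along a crack).
disc = roundness(area=math.pi * 5.0 ** 2, perimeter=2.0 * math.pi * 5.0)
crack = roundness(area=20.0 * 2.0, perimeter=2.0 * (20.0 + 2.0))
print(round(disc, 3), round(crack, 3))
```

On a pixel grid the measured perimeter depends on how border distances are summed, so roundness values of digitized discs deviate somewhat from 1.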
The samples were diluted in 10% phosphate buffered saline (PBS) before being applied to carbon-coated 400-Mesh TEM grids and left to adhere for 60 seconds before excess sample was blotted off with filter paper. Next, the samples were stained with the negative stain phosphotungstic acid (PTA). To avoid PTA crystallization, the grids were tilted 45°. Excess PTA was blotted off with filter paper, and the grids were left to air dry. The different samples contained adenovirus, rotavirus, papillomavirus and Semliki Forest virus. These are all viruses with icosahedral capsids.

A Tecnai 10 electron microscope was used, controlled via Olympus AnalySIS software. The TEM camera used was a CCD-based side-mounted Olympus MegaView III camera. The images were acquired in 16-bit gray scale TIFF format with a size of 1376×1032 pixels.

For grid square segmentation, overview images at magnifications between 190× and 380× were acquired. To decide the size of the sigmas used for the Gaussian kernels in the difference of Gaussians in Sec. 2.3, image series with decreasing magnification of manually detected regions with virus were acquired. To verify the method, image series with increasing magnification of manually picked regions were taken. Magnification steps in the image series used were between 650× and 73000×.

The methods described in Sec. 2 were implemented in Matlab [15]. The computer used was an HP xw6600 Workstation running the Red Hat Linux distribution with the GNOME desktop environment.

Segmenting and Rating Grid Squares. The method described in Sec. 2.1 was applied to 24 overview images. One example is shown in Fig. 1. The sigma for the Gaussian used in the calculation of the gradient magnitude was set to 1 and the filter size was 9×9. The Radon transform was used with an angular resolution of 0.25 degrees. The fine tuning of peaks was done within ten units of the radial distance.
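The period estimation, the propagation of edge positions and the fine tuning within ten units can be sketched on a synthetic projection profile. This is a simplified illustration and not the implementation used in the paper. The edge offsets are found by folding the profile by its period and picking the two strongest, well-separated positions, which stands in for the Otsu-based plateau detection of Sec. 2.1. All function names, the minimum separation and the synthetic profile (one missing edge at 85, one displaced edge at 192) are assumptions.

```python
def autocorrelation(values):
    """Unnormalized autocorrelation of the mean-subtracted profile for all lags."""
    n = len(values)
    mean = sum(values) / n
    c = [v - mean for v in values]
    return [sum(c[i] * c[i + lag] for i in range(n - lag)) for lag in range(n)]

def estimate_period(values, min_lag=2):
    """Lag of the highest autocorrelation peak after lag 0."""
    ac = autocorrelation(values)
    best_lag, best_value = None, float("-inf")
    for lag in range(min_lag, len(ac) // 2):
        is_peak = ac[lag] > ac[lag - 1] and ac[lag] >= ac[lag + 1]
        if is_peak and ac[lag] > best_value:
            best_lag, best_value = lag, ac[lag]
    return best_lag

def edge_offsets(values, period, min_separation=5):
    """Two strongest positions in the period-folded profile (simplified stand-in)."""
    folded = [0.0] * period
    for i, v in enumerate(values):
        folded[i % period] += v
    first = max(range(period), key=folded.__getitem__)
    def distance(a, b):  # circular distance within one period
        return min(abs(a - b), period - abs(a - b))
    others = [i for i in range(period) if distance(i, first) >= min_separation]
    second = max(others, key=folded.__getitem__)
    return sorted((first, second))

def propagate(offsets, period, length):
    """Expected edge positions over the whole profile, including missing edges."""
    return sorted(p for off in offsets for p in range(off, length, period))

def fine_tune(values, predicted, window=10):
    """Move each predicted position to the local maximum within +/- window units."""
    tuned = []
    for pos in predicted:
        lo, hi = max(0, pos - window), min(len(values), pos + window + 1)
        best = max(range(lo, hi), key=values.__getitem__)
        # Keep the prediction where there is no response at all (damaged edge).
        tuned.append(best if values[best] > 0 else pos)
    return tuned

# Synthetic profile: period 40, edges at offsets 5 and 30, edge 85 missing, 190 moved to 192.
profile = [0.0] * 200
for position in (5, 30, 45, 70, 110, 125, 150, 165, 192):
    profile[position] = 1.0

period = estimate_period(profile)
offsets = edge_offsets(profile, period)
edges = fine_tune(profile, propagate(offsets, period, len(profile)))
print(period, offsets, edges)
```

In this example the missing edge is filled in at its predicted position, while the displaced edge is moved to the nearby maximum.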
All 159 grid squares lying completely within the borders of the 24 overview images were correctly segmented. The segmentation of the example overview image is shown in Fig. 1(a). The segmented grid squares were classified according to the method in Sec. 2.2. One third, 53 squares, of the manually classified squares were randomly picked as training data and the other two thirds, 106 squares, were automatically classified. This procedure was repeated twenty times. The resulting average confusion matrix is shown in Table 1. On average, 73.1 % of the grid squares were correctly classified according to the rating done by the virologist. When the classification was allowed to deviate ±1 from the true rating, 97.2 % of the grid squares were correctly classified. The best performing classifier in these twenty training runs was selected as the classifier of choice.

Detecting Regions of Interest. Eight resolution series of images with decreasing resolutions of regions with manually detected virus clusters were used to choose suitable sigmas for the Gaussian kernels in the method in Sec. 2.3. The sigmas were set to 2 and 3.2 for images with a pixel size of 36.85 nm and scaled accordingly for images with other pixel sizes. The method was tested on the eight available resolution series with increasing magnification. The limit for roundness of objects was set to 0.8. Figure 4 shows a section of one of the resolution series for one detected virus cluster at three different resolutions.

In this paper we have presented a method that enables reducing the search area considerably when looking for viruses on TEM grids. The segmentation of grid squares, followed by rating of individual squares, resembles how a virologist operates the microscope to find regions with a high probability of virus content. The segmentation method utilizes information from several squares and their regular patterns to be able to detect damaged squares.
If overview images are acquired with very low contrast between the grid and the membrane, or if all squares in the image are lacking the same edges, the segmentation method might be less successful. This is, however, an unlikely event. By decreasing the magnification, more squares can be fit in a single image and the probability that all squares have the same defects will decrease. Another solution is to use information from adjacent images from the same grid. This grid-square segmentation method can be used in other TEM applications using the same kind of grids.

The classification result when rating grid squares shows that the size of the training data is adequate. Results when using different sets of 53 manually rated grid squares to train the naive Bayes classifier indicate that a randomly chosen training set is sufficient as long as each class is represented in it.

The detection of regions of interest narrows down the search area within good grid squares. For the images at a magnification of 1850×, showing a large part of one grid square, the decrease in search area was calculated to be on average a factor of 137. In other terms, on average 99.3 % of the area of each analyzed grid square was discarded. The remaining regions have a higher probability of containing small clusters of viruses. By combining the segmentation and rating of grid squares with the detection of regions of interest in the ten highest rated grid squares (more than ten good grid squares are usually never visually analyzed by an expert), the search area can be decreased by a factor of about 400,000, assuming a standard 400 mesh TEM grid is used. This means that about 99.99975 % of the original search area can be discarded.

Parallel to this work we are developing automatic segmentation and classification methods for viruses in TEM images.
Future work includes integration of these methods and those presented in this paper with software for controlling electron microscopes.

1. Electron microscopy for rapid diagnosis of infectious agents in emergent situations
2. Rapid viral diagnosis: role of electron microscopy
3. Helmut Ruska and the visualisation of viruses
4. The detection of monkeypox in humans in the western hemisphere
5. A novel coronavirus associated with severe acute respiratory syndrome
6. Automated molecular microscopy: The new Leginon system
7. Automated acquisition of cryo-electron micrographs for single particle reconstruction on an FEI Tecnai electron microscope
8. Automated 100-position specimen loader and image acquisition system for transmission electron microscopy
9. Automated data collection with a Tecnai 12 electron microscope: Applications for molecular imaging by cryomicroscopy
10. Automatic particle selection: results of a comparative study
11. Ch. 10.2.6. In: Digital Image Processing
12. Ch. 5.11.3. In: Digital Image Processing
13. A threshold selection method from gray-level histograms
14. Ch. 5.3.3. In: Image Processing, Analysis, and Machine Vision
15. Matlab: system for numerical computation and visualization

We would like to thank Dr. Kjell-Olof Hedlund at the Swedish Institute for Infectious Disease Control for providing the samples and being our model expert, and Dr. Tobias Bergroth and Dr. Lars Haag at Vironova AB for acquiring the images. The work presented in this paper is part of a project funded by the Swedish Agency for Innovative Systems (VINNOVA), the Swedish Defence Materiel Administration (FMV), and the Swedish Civil Contingencies Agency (MSB). The project aims to combine TEM and automated image analysis to develop a rapid diagnostic system for screening and identification of viral pathogens in humans and animals.