Domain-Specific Software Engineering Design for Diabetes Mellitus Study Through Gene and Retinopathy Analysis

Software engineering designs and practices differ widely among application domains. This chapter concentrates on high-performance software engineering design for bioinformatics, and more specifically for the study of diabetes mellitus through gene and retinopathy analysis. The study of complex gene interactions supports effective control of blood glucose, blood pressure, and lipids. Early detection of retinopathy is effective in minimizing the risk of irreversible vision loss and other long-term consequences associated with diabetes mellitus. Type 2 diabetes mellitus is a disorder of glucose homeostasis involving complex gene and environmental interactions that are incompletely understood. Mammalian homologs of nematode sex determination genes have recently been implicated in glucose homeostasis and type 2 diabetes mellitus. Fem1b knockout (Fem1b-KO) mice have been developed, with targeted inactivation of Fem1b, a homolog of the nematode fem-1 sex determination gene. Fem1b-KO mice display abnormal glucose tolerance, and this is due predominantly to defective glucose-stimulated insulin secretion. Arginine-stimulated insulin secretion is also affected. These data implicate Fem1b in pancreatic islet function and insulin secretion, strengthening evidence that a genetic pathway homologous to nematode sex determination may be involved in glucose homeostasis and suggesting novel genes and processes as potential candidates in the pathogenesis of diabetes mellitus. In addition, this chapter introduces basic gene analysis approaches that can be applied to the study of diabetes mellitus. These approaches include searching the GenBank online database using BLAST, mapping DNA, locating genes, aligning different DNA or protein sequences, determining genotypes, and comparing nucleotide or amino acid sequences using global and local alignment algorithms. The Fem1b gene is discussed as an example of these basic gene analysis approaches.
Diabetic retinopathy is the leading cause of new cases of blindness among Americans aged 20 to 64 in both predominantly white and black populations [1]. Despite the recommendation for yearly eye examinations and efforts to achieve this, of the approximately 17 million Americans with diabetes, about 6 million nationwide remain undiagnosed and untreated or do not receive annual eye examinations, which can lead to diabetic retinopathy [2]. Early signs of retinal blood vessel abnormalities and complications are important indicators for the timely clinical diagnosis and treatment of diabetes mellitus and eye disorders. The software engineering design tool facilitates increasing the number of annual diabetic


Diabetes mellitus type 2 and Fem1b gene

Diabetes mellitus type 2 occurrence and its diagnosis
Diabetes mellitus type 2 is a metabolic disorder characterized by high blood glucose in the context of insulin resistance and relative insulin deficiency [3]. The pathophysiology of type 2 diabetes mellitus involves impaired insulin secretion and impaired insulin action in regulating glucose and fatty acid metabolism in the liver, skeletal muscle, and adipose tissue. Many individuals with type 2 diabetes mellitus have hypertension and perturbations of lipoprotein metabolism, as well as other manifestations of the insulin resistance syndrome. In addition to the risk of developing diabetes-specific complications such as retinopathy, type 2 diabetes mellitus is recognized as a substantial risk factor for cardiovascular disease [4]. The National Diabetes Data Group recommends that the diagnosis of diabetes mellitus be based on [5]:
1. Two fasting plasma glucose levels of 126 mg/dL (7.0 mmol/L) or higher;
2. Two two-hour postprandial plasma glucose (2hrPPG) readings of 200 mg/dL (11.1 mmol/L) or higher after a glucose load of 75 g;
3. Two casual glucose readings of 200 mg/dL (11.1 mmol/L) or higher.
Fasting plasma glucose was selected as the primary diagnostic test because it predicts adverse outcomes (e.g., retinopathy) and is easy to perform in a clinical setting.
A mammalian Fem1 gene family, encoding homologs of fem-1, has been characterized and consists of at least three members in the mouse, designated Fem1a, Fem1b, and Fem1c; these have highly conserved homologs in humans, designated FEM1A, FEM1B, and FEM1C, respectively. Mammalian homologs of two other nematode sex determination genes, tra-2 and tra-3, have recently been implicated in glucose homeostasis and type 2 diabetes mellitus. In producing susceptibility to type 2 diabetes mellitus, NIDDM1 is known to interact with a gene, whose identity is unknown, on human chromosome 15 near the CYP19 locus at 15q21.3 [6]. This is near 15q22, where FEM1B, the human homolog of mouse Fem1b, localizes [7].

www.intechopen.com

Glucose and insulin measurements
Blood glucose from the tail vein was measured using a OneTouch FastTake Glucometer. Insulin was measured from plasma, tissue extracts, or cell supernatants using the Rat (Mouse) Sensitive Insulin radioimmunoassay (RIA) kit according to the manufacturer's instructions. For the intraperitoneal glucose tolerance test (iP-GTT), intraperitoneal insulin tolerance test (iP-ITT), and acute-phase glucose-stimulated insulin secretion (A-GSIS) test, there were 12 animals in each group (12 male homozygous Fem1b-KO, 12 male wild-type controls, 12 female homozygous Fem1b-KO, and 12 female wild-type controls), aged 3 to 4 months. The arginine-stimulated insulin secretion test compared eight Fem1b-KO homozygous males with eight wild-type males, aged 6 months. D-Glucose (200 mg/ml) was administered at 2 mg/g body weight by intraperitoneal injection [8]. Tail vein blood was sampled for blood glucose determination from nonsedated animals before and at 15, 30, 60, and 120 min after glucose administration.

Fem1b-KO mice development
In this study, gene targeting by homologous recombination was used to generate Fem1b knockout (Fem1b-KO) mice with inactivation of the Fem1b gene. Inactivation was achieved by deleting Fem1b coding exon 1, which contains the translation initiation codon and the first two ankyrin repeats [9]. The results show that these mice display abnormal glucose homeostasis, with abnormal glucose tolerance tests and defective glucose-stimulated insulin secretion. These findings indicate that Fem1b is involved in pancreatic islet β-cell function and provide further evidence for the involvement of a pathway resembling nematode sex determination in mammalian glucose homeostasis. The approach used standard methodology (Figure 1), including the basic elements of the targeting vector and a screening strategy based on Southern blot and PCR genotyping. Figure 2, generated with Zeiss AxioVision, shows an immunohistochemical analysis demonstrating a loss of specific Fem1b staining in islets of Fem1b-KO homozygotes.

Glucose homeostasis in Fem1b-KO mice
As noted above, mammalian homologs of nematode sex determination genes have recently been shown to be involved in glucose homeostasis and type 2 diabetes mellitus. On this basis, glucose homeostasis was evaluated in the Fem1b-KO mice using established experimental methods. For the first-line screen, the mice were between 3 and 4 months of age. The iP-ITT showed minimally abnormal results (Figure 3), suggesting that insulin resistance is not the primary defect in homozygotes, although it could be contributing. To evaluate whether the defective acute-phase insulin secretion reflects a defect in secretion per se rather than a defect in insulin production, the insulin content was measured in these mice (Figure 4). The measurement demonstrates that Fem1b-KO homozygotes have increased insulin content compared with wild-type controls.

Immunostaining of Fem1b in pancreatic islets
In humans, FEM1B has been shown to be expressed within whole pancreas [10], but the cell type distribution within this organ was unknown. Immunostaining of wild-type pancreas with immunoaffinity-purified antibody shows that Fem1b protein is expressed in pancreatic islets (Figure 5). Immunostaining of the pancreas with a commercially available goat polyclonal antibody against Fem1b demonstrates the same islet staining pattern, with an absence of specific staining in the Fem1b-KO homozygotes. Coimmunostaining with antibodies to insulin, a β-cell marker, demonstrates that Fem1b is expressed in virtually all β cells (Figure 6).
Fig. 6. Immunofluorescence staining of insulin (green), Fem1b (red), and merged image demonstrating that Fem1b is expressed not only in insulin-positive β cells but also in insulin-negative non-β cells.
Coimmunostaining with antibodies to glucagon and somatostatin, markers for α cells and δ cells, respectively, demonstrates that the Fem1b protein is also expressed in these non-β cells (Figure 7).

BLAST search
Basic Local Alignment Search Tool (BLAST) is an algorithm for comparing amino acid or nucleotide sequences. A BLAST search compares an unknown sequence with a library or database of known sequences and identifies library sequences that resemble the unknown sequence above a certain score percentage [11] (usually 40%). The example in this chapter follows the discovery of a previously unknown Fem1b gene in the mouse and performs a BLAST search of the human genome to see whether humans carry a similar gene. BLAST identifies sequences in the human genome that resemble the mouse Fem1b gene based on sequence similarity. Given the sequence of one fragment of the mouse gene (Figure 8), BLAST searches the human gene databases and finds similar genes. The official BLAST website is http://blast.ncbi.nlm.nih.gov/Blast.cgi. When the results page appears, click the identifier with the highest score to see the detailed information. Here the highest score is 481. The score is calculated from the match quality and the length of the most similar segments shared by the unknown mouse gene and the target human FEM1B gene.
When you scroll down the page, you reach a long listing of the human FEM1B nucleotide sequence starting with

Sequence statistics analysis
Sections of a nucleotide sequence with a certain percentage of A+T or C+G usually indicate intergenic parts of the sequence. Figure 9 is a plot of monomer densities and combined monomer densities. Such a statistical plot can be used to determine whether the sequence has the characteristics of a protein-coding region. Figure 10 visualizes the nucleotide distribution. Figure 11 is the codon distribution, which shows a high count of GAA, GAT, and AAC. The amino acids encoded by GAA, GAT, and AAC are glutamate, aspartate, and asparagine, respectively. The corresponding bar chart of the amino acid distribution is displayed in Figure 12. It contains a high proportion of leucine, alanine, and valine and a low proportion of tryptophan, methionine, and proline.
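The statistics described above (monomer densities, combined A+T or C+G densities, and codon counts) can be sketched in a few lines of Python. This is a minimal illustration, not the tool used to produce Figures 9 to 12: the function names, the sliding-window size, and the toy sequence are assumptions made for the example.

```python
from collections import Counter


def base_composition(seq):
    """Return the fraction of each nucleotide (A, C, G, T) in seq."""
    seq = seq.upper()
    counts = Counter(base for base in seq if base in "ACGT")
    total = sum(counts.values())
    return {base: counts[base] / total for base in "ACGT"}


def windowed_density(seq, bases, window):
    """Fraction of positions in each sliding window that belong to `bases`.

    Passing bases="GC" gives the combined C+G density; "AT" gives A+T.
    """
    seq = seq.upper()
    return [
        sum(ch in bases for ch in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def codon_counts(seq, frame=0):
    """Count complete codons read from the given frame offset (0, 1 or 2)."""
    seq = seq.upper()
    return Counter(seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))


# Toy sequence chosen for illustration only; it is not the Fem1b sequence.
toy = "ATGGAAGATAACGAAGATTAA"
comp = base_composition(toy)
gc_profile = windowed_density(toy, "GC", window=6)
codons = codon_counts(toy)
```

Plotting `gc_profile` against position gives a density curve of the kind described for Figure 9, and `codons.most_common()` lists the codons in order of frequency.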

Open reading frame of Fem1b gene from both human and mouse
An open reading frame (ORF) is a stretch of nucleotide sequence that contains no stop codon in a given reading frame. ORFs can be identified by examining each of the three possible reading frames on each strand. A protein-coding sequence must contain a translation start codon, which is usually "ATG". The possible stop codons are "TAA", "TAG", and "TGA" [11]. Identifying the start and stop codons for translation determines the ORF in a given nucleotide sequence.
Once an ORF is located for a gene or mRNA, the nucleotide sequence can be translated into its corresponding amino acid sequence. Figures 13-15 display the three reading frames for the human and mouse Fem1b gene sequences. Both genes show the longest ORF in the first reading frame. Dot plots are one of the easiest ways to look for similarity between two sequences. The diagonal line shown in Figure 16 indicates a good alignment between the human and mouse Fem1b genes.
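The ORF search described above can be sketched as a scan of each reading frame for an ATG followed by the first in-frame stop codon. The code below is an illustrative sketch, not the software used to produce Figures 13-15; the function names, the `min_codons` parameter, and the toy sequence are assumptions. Applying `find_orfs` to the reverse complement covers the three frames of the opposite strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """Return (frame, start, end) for each ATG...stop ORF on the given strand.

    start is the index of the A in ATG; end is the index just past the stop
    codon. min_codons counts codons from ATG up to, but excluding, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


def reverse_complement(seq):
    """Return the reverse complement, used to scan the opposite strand."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Toy sequence for illustration only; it is not the Fem1b sequence.
toy = "CCATGGAAGATAACTAAGG"
forward_orfs = find_orfs(toy)
reverse_orfs = find_orfs(reverse_complement(toy))
longest = max(forward_orfs, key=lambda orf: orf[2] - orf[1])
```

In this toy example the only forward-strand ORF lies in the third frame (offset 2), from the ATG at index 2 to the TAA ending at index 17.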

Sequence alignment

Global alignment
The Needleman-Wunsch algorithm, first published by Saul Needleman and Christian Wunsch in 1970 [12], performs a global alignment of two amino acid or nucleotide sequences. This algorithm was the first application of dynamic programming to molecular sequence comparison. The following output was produced by applying the Needleman-Wunsch algorithm to the mouse and human nucleotide sequences
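Independently of the chapter's own alignment output, the dynamic-programming recurrence behind global alignment can be illustrated with a short Python sketch. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the short test sequences are assumptions chosen for readability; they are not the scoring scheme used for the Fem1b alignment.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


nw_score, nw_a, nw_b = needleman_wunsch("ACGT", "AGT")
```

Because the whole of both sequences must be aligned, the unmatched C in the first sequence is paired with a gap, giving three matches and one gap.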

Local alignment
The Smith-Waterman algorithm was first published by Temple Smith and Michael Waterman in 1981 [13]. It is a well-known dynamic programming algorithm for local alignment of amino acid or nucleotide sequences. Unlike global alignment, the Smith-Waterman algorithm compares segments of all possible lengths and optimizes the similarity measure. It is guaranteed to find the optimal local alignment with respect to the scoring method. However, the Smith-Waterman algorithm requires O(mn) time, where m and n are the lengths of the two input sequences. In practical use, it has largely been replaced by the heuristic BLAST algorithm, which is much more efficient although not guaranteed to find the optimal alignments. The following output was produced by local alignment of the mouse and human amino acid sequences using the Smith-Waterman algorithm.
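As an illustration separate from the chapter's own output, the local-alignment recurrence can be sketched in Python. It differs from global alignment in two ways: every cell is floored at zero, and the traceback starts from the highest-scoring cell and stops at the first zero. The scoring values (match +2, mismatch -1, gap -1) and the toy sequences are assumptions for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Local alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                      # start a new local alignment here
                H[i - 1][j - 1] + sub,  # extend along the diagonal
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a zero cell is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


sw_score, sw_a, sw_b = smith_waterman("TTTACGTAAA", "GGACGTCC")
```

The two nested loops over the (m+1) by (n+1) matrix are the source of the O(mn) cost mentioned above; the sketch also stores the full matrix, so its memory use is O(mn) as well.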

Diabetic retinopathy study through retinal image fusion
Hypoxia of the retina is believed to be a factor in the development of diabetic retinopathy, the leading cause of blindness worldwide. Retinal image fusion provides a practical way to determine the oxygenation status of the ocular fundus. Such a method would be a valuable medical diagnostic tool for diabetic retinopathy [14], age-related macular degeneration, glaucoma [15], retinopathy of prematurity, and central retinal vein occlusion [16].

Acquisition of retinal images
The retinal images presented in this chapter were taken with a modified Topcon TRC-50EX fundus camera, with a lens and a C-mount through the vertical path of the camera. Hyperspectral images were taken through the vertical viewing port by the hyperspectral imaging system (Figure 17). The subjects of the retinal images were Cynomolgus monkeys of 4 to 4.5 years of age and 2.5 to 3 kg body weight with normal eyes [17]. The use of animals for taking retinal images was approved by the Louisiana State University Health Sciences Center Institutional Animal Care and Use Committee [18]. This animal usage also conforms to the ARVO Statement for the Use of Animals in Ophthalmic and Vision Research. The monkeys were housed in an air-conditioned room with normal temperature and humidity and a 12-hour light-dark diurnal cycle.
Fig. 17. Hyperspectral imaging system in relation to the fundus camera. The image is redirected upward by a mirror. The imaging system is translated over the camera port by a linear actuator mounted below the imaging spectrograph and CCD camera [17].

Retinal image fusion
There are five major steps involved in image fusion (Diagram 1):
1. The first step is image segmentation. Segmentation subdivides an input image into its constituent regions or objects so that salient features and structures can be extracted or detected by the automated procedure.
2. The second step, feature extraction, detects the salient structures in the target images for the feature-based approach.
3. The third step is feature matching. The purpose of feature matching is to bring together the information that represents the same features detected in different images. The first three steps provide the initial guess of the features for the fusion algorithm.
4. The fourth step is the optimization of the initial guess. The previously detected features are adjusted in this step according to a certain objective function.
5. The final step transforms the images, from a single modality or from different modalities, into spatial alignment [19] through a certain mathematical model and then displays a combined view of the images involved.

Retinal vasculature extraction using Canny edge detector
The Canny operator [20] is less likely than other edge detectors to be "fooled" by noise and more likely to detect true weak edges. It also preserves more edges than the other edge detectors. Therefore, the Canny edge detector is employed in this research to extract the retinal vasculature edges. Two criteria are used in the Canny operator to locate pixels whose intensity changes rapidly. In these criteria, n is the direction of the gradient of the image, G is the edge signal, and I is the image intensity.
The zero-crossings of Canny's method correspond to the maxima and minima of the first directional derivative in the direction of the gradient, and edges are identified as the maxima in magnitude. Each pixel's edge gradient is computed and compared with the gradients of its neighbors along the gradient direction. If the central pixel's gradient is smaller than that of a neighbor, the current edge intensity is set to 0; if it is the largest among its neighbors, the original intensity is kept. Within the nine-pixel neighborhood, the normal to the edge direction has two components, $u_x$ and $u_y$. To estimate the gradient on the discrete sampling grid, the two pixels closest to u are selected. The gradient magnitudes of three pixels define a plane, and this plane is used to estimate locally the gradient magnitude and the intensity at each pixel on the line. The gradient magnitudes at $P_{x+1,y+1}$ and $P_{x-1,y-1}$ (Figure 18) are estimated in this way. If the gradient at $P_{x,y}$ is greater than both $G(P_{x+1,y+1})$ and $G(P_{x-1,y-1})$, $P_{x,y}$ is identified as a maximum.
To localize the magnitude maxima accurately, Canny defined a filter by optimizing a performance index that favors true positives and true negatives. The filter minimizes the probability of missed edge points and of false detections. In the signal-to-noise ratio (SNR) criterion, f is the filter and the denominator is the root-mean-square response to the noise n(x). A second criterion defines the localization of the real edge.
Two adaptive thresholds, a high threshold and a low threshold, are used in Canny's method. The high threshold is used to find the starting points of strong edges: any point that meets the high threshold is selected as an edge point. Edges then grow from these starting points in all directions for as long as the edge strength does not fall below the low threshold.
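The two-threshold edge tracking described above can be sketched as a breadth-first search over a grid of edge strengths. This is a simplified illustration, not the implementation used in this research: the thresholds are passed in as fixed values rather than computed adaptively, the grid is a plain list of lists, and 8-connectivity between neighboring pixels is an assumption.

```python
from collections import deque


def hysteresis(strength, low, high):
    """Keep pixels >= high, plus pixels >= low that are 8-connected to them."""
    rows, cols = len(strength), len(strength[0])
    edge = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every strong pixel (start points of strong edges).
    for r in range(rows):
        for c in range(cols):
            if strength[r][c] >= high:
                edge[r][c] = True
                queue.append((r, c))
    # Grow each edge in all directions while the strength stays >= low.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and not edge[rr][cc] and strength[rr][cc] >= low):
                    edge[rr][cc] = True
                    queue.append((rr, cc))
    return edge


# Toy gradient-magnitude grid for illustration only.
grid = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.9, 0.5, 0.4, 0.1, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
edges = hysteresis(grid, low=0.3, high=0.8)
```

In the toy grid, the weak pixels 0.5 and 0.4 are kept because they are connected to the strong pixel 0.9, while the isolated 0.5 at the right is discarded even though it exceeds the low threshold.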
Figure 19 shows the 3D shaded surface plot of the original retinal angiogram image. The X and Y axes correspond to the original image size. The height (Z axis) is a single-valued function defined over a geometrically rectangular grid. Z specifies the color data as well as the surface height, so the color of each pixel is proportional to the surface height, within a range of [0, 1]. All the salient retinal features are preserved in the Canny edges. Those salient features are the retinal vessel bifurcations, from which the control points are selected using the Adaptive Exploratory Algorithm. Figures 20 and 21 show the retinal vessel edges detected by the Canny operator.

Control point detection
A good initial guess for the control points ensures that the fused image is generated in an efficient computation time. Poor control point selection significantly increases the computation cost and may even cause the image fusion to fail. Vessels or particular abnormalities mean that images do not necessarily match the retinal structures. Even when structure and function correspond, mismatches can still occur if the structural and functional changes are inconsistent. Furthermore, angiogram images usually have higher resolution and are rich in information, whereas fundus images have lower resolution, are more abstract in their details, and may even miss some small vessels. In practice, these situations are unavoidable and make it difficult to extract the control points, because the delineation of the vein boundaries may not be precise. In this study, control points are detected using the adaptive exploratory algorithm (Figures 20 and 21) [21].

Heuristic optimization algorithm
An optimization procedure is required to adjust the initial good-guess control points in order to achieve the optimal result. The process can be formulated as a heuristic problem of optimizing an objective function that maximizes the Mutual-Pixel-Count (MPC) between the reference and input images. The algorithm finds the optimal solution by refining the transformation parameters in an ordered way. When the objective function is maximized, the vessels of one image should be well overlaid onto those of the other image (Figure 23). MPC measures the overlap of the optic nerve head vasculature for corresponding pixels in both images. It is assumed that the retinal vessels are represented by 0 (black pixels) and the background by 1 (white pixels) in the binary 2D map. When the transformed (u, v) coordinates of a vasculature pixel in the input image correspond to the coordinates of a vasculature pixel in the reference image, the MPC is incremented by 1 (Figure 22). MPC is assumed to be maximized when the image pair is perfectly geometrically aligned by the transformation. After pre-processing, the binary versions of the reference and input images are obtained, i.e. $I_{ref}$ and $I_{input}$. Only black pixels from both images contribute to MPC. The ideal case is that all zero pixels of the input image are mapped onto zero pixels of the reference image. The problem is formulated mathematically as the maximization of the objective function $f_{MPC}$. The coordinate adjustment is iterated until one of the following convergence criteria is reached: (1) a predefined maximum number of loops is reached; or (2) the update to $f_{MPC}$ is smaller than $\varepsilon$,
where $\varepsilon$ is a very small non-negative threshold.
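The Mutual-Pixel-Count and the two stopping rules can be illustrated with a small Python sketch. It is not the chapter's optimization algorithm: the greedy search over one-pixel translations, the parameter layout `(a, b, tx, c, d, ty)` with u = a·x + b·y + tx and v = c·x + d·y + ty, the function names, and the toy images are all assumptions made for the example.

```python
def mutual_pixel_count(ref, inp, params):
    """Count vessel (0) pixels of inp that land on vessel (0) pixels of ref.

    params = (a, b, tx, c, d, ty) is an assumed affine layout:
    u = a*x + b*y + tx, v = c*x + d*y + ty.
    """
    a, b, tx, c, d, ty = params
    rows, cols = len(ref), len(ref[0])
    count = 0
    for y, row in enumerate(inp):
        for x, value in enumerate(row):
            if value != 0:
                continue  # only black (vessel) pixels contribute
            u = round(a * x + b * y + tx)
            v = round(c * x + d * y + ty)
            if 0 <= v < rows and 0 <= u < cols and ref[v][u] == 0:
                count += 1
    return count


def refine_translation(ref, inp, params, max_loops=50, eps=1e-6):
    """Greedy refinement of the translation terms only (illustrative).

    Stops when the loop limit is reached or the gain in MPC is below eps.
    """
    best = mutual_pixel_count(ref, inp, params)
    for _ in range(max_loops):
        a, b, tx, c, d, ty = params
        candidates = [(a, b, tx + dx, c, d, ty + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        cand = max(candidates, key=lambda p: mutual_pixel_count(ref, inp, p))
        score = mutual_pixel_count(ref, inp, cand)
        if score - best < eps:
            break  # convergence: no meaningful improvement
        params, best = cand, score
    return params, best


# Toy binary maps: a vertical vessel in column 2 (reference) and column 1 (input).
ref_img = [[1, 1, 0, 1, 1] for _ in range(5)]
inp_img = [[1, 0, 1, 1, 1] for _ in range(5)]
identity = (1, 0, 0, 0, 1, 0)
fitted, mpc = refine_translation(ref_img, inp_img, identity)
```

A greedy search of this kind stalls when no single step improves the count, which is one reason a good initial guess for the control points matters.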

Affine transformation model
A mathematical model is the tool for transforming the target images and fusing them into one single volume. The affine model is a basic geometrical transformation in image processing and is defined by Eq. 8 and 9. The affine model has six degrees of freedom (DOF) because it has six parameters, i.e. $a_1$, $a_2$, $b_1$, $b_2$, $a_3$, and $a_4$.
The advantage of the affine model is that it can recover lost information such as skew, translation, rotation, shearing, and scaling, and that it maps finite points to finite points and parallel lines to parallel lines (Figure 24 and Table 1). Its drawback lies in the strict requirement that at least six pairs of control points are needed [19].

| Transformation | Description |
|---|---|
| Rotation/Skew | Points are rotated by an angle θ. |
| Translation | A linear shift in the position of the vertical and horizontal coordinates of the image in one plane to another set in the same spatial domain. |
| Scaling | A transformation of the horizontal and vertical coordinate points characterized by a certain scale factor. |
| Shearing | A transformation in which all points along a line remain fixed while other points are shifted parallel to the line by a certain distance proportional to their perpendicular distance from the line [22]. |
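The four component transformations listed above can be written as 3×3 homogeneous matrices and composed into a single affine map whose top two rows hold the six parameters. The sketch below is illustrative only: the function names and the horizontal-shear convention are assumptions, and no particular correspondence between the matrix entries and the chapter's parameter names is implied.

```python
import math


def rotation(theta):
    """Rotate points counter-clockwise by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def translation(tx, ty):
    """Shift points by (tx, ty)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def scaling(sx, sy):
    """Scale the horizontal and vertical coordinates by sx and sy."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]


def shearing(k):
    """Horizontal shear: points on the line y = 0 stay fixed, x' = x + k*y."""
    return [[1.0, k, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """3x3 matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def compose(*mats):
    """Combine transforms so that the first argument is applied first."""
    result = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for m in mats:
        result = matmul(m, result)
    return result


def apply(m, point):
    """Apply the affine map m to a 2D point (x, y)."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


# Scale by 2, then shift one unit to the right.
combined = compose(scaling(2, 2), translation(1, 0))
mapped = apply(combined, (1, 1))
```

Because every component matrix keeps the last row (0, 0, 1), any composition is again an affine map and therefore still sends parallel lines to parallel lines.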

Conclusion and future directions
Fem1b functions in vivo to regulate insulin secretion and plasma glucose levels. Fem1b-KO mice do not have fasting hyperglycemia but rather have defective acute-phase GSIS. Such defective acute-phase GSIS is the earliest detectable defect in humans destined to develop diabetes and may represent the primary genetic risk factor predisposing to diabetes [23]. With aging and superimposed insulin resistance, fasting hyperglycemia and overt diabetes later develop. The Fem1b-KO mouse therefore models a key component of the complex pathogenesis of type 2 diabetes mellitus. Both male and female homozygotes display abnormal glucose tolerance. The role of Fem1b in pancreatic islet insulin secretion strengthens evidence that a genetic pathway homologous to nematode sex determination may be involved in mammalian glucose homeostasis. This novel pathway could be involved in the β-cell dysfunction seen in type 2 diabetes mellitus. Since Calpain-10/NIDDM1 is known to interact with a gene located near where human FEM1B localizes to increase susceptibility to type 2 diabetes [24], whether FEM1B could be the responsible interacting gene becomes a pertinent question. Although the mechanism of this regulation by Fem1b remains to be established, the finding promises to offer insight into novel genes and processes as potential candidates in the pathogenesis of diabetes mellitus.
Multi-modality analysis has been emerging as a major trend in remote sensing, computer visualization, and biomedical image fusion. Fusing biomedical images is a very challenging problem because of the possibly vast content changes and the non-uniformly distributed intensities of the images involved. The new algorithm presented in this chapter, which consists of the Adaptive Exploratory Algorithm for control point detection and heuristic optimization for fusion, is reliable and time efficient. The new approach achieved an excellent result by visualizing the fundus image with a complete angiogram overlay. By locking the multi-sensor images in one place, the algorithm allows ophthalmologists to match the same eye over time to get a sense of disease progress and to pinpoint surgical tools, increasing the accuracy and speed of surgery. The new algorithm can be easily extended to 3D feature extraction, registration, and fusion of human or animal eye, brain, or body images. Many biomedical registration and fusion methods are still primarily used for research activity [19], and very few of them have been developed into integrated, user-friendly computer software. The eventual aim is to develop and distribute advanced and easy-to-use software that is suitable for various clinical environments. This plan requires intensive user interface development work so that users can adjust a few threshold parameters if necessary. The user interface must be stable, simple, and informative for everyday clinical routine, because most end users do not have much knowledge of algorithms and computer programs. Working closely with clinicians is extremely important when applying the new methods to practical clinical applications. As computation speed has increased dramatically, real-time live ophthalmic image processing [25] will be used to handle larger and larger volumes of data in short periods. The data transmission rate, image size, higher pixel resolution, and many other issues will inevitably stress such live image fusion systems. The algorithms presented in this chapter have the potential to handle those challenges.
The presented method is a promising step towards useful clinical tools for retinopathy diagnosis and thus forms a good foundation for further development.