Lake Issyk-Kul in Kyrgyzstan is the second largest mountain lake in the world (after Lake Titicaca in South America). It is situated in a basin surrounded by high mountains. While its water level is at 1608 m altitude, the mountain ranges of Kungei Ala-Too in the north reach 4711 meters, and those of Terskei Ala-Too in the south 5216 m. These mountains represent the major part of the Issyk-Kul catchment of 22,080 km2 and provide most of the water to the lake.
Issyk-Kul Lake is 180 km long and 60 km wide; its surface area is 6240 km2 and its shoreline is 670 km long (Figure 1). The mean depth is 280 m and the maximum 702 m, making it the fifth deepest lake in the world. The zone with depths of 0-100 m represents 38% of the total area and is the major production zone of the lake [1].
Native fish stocks in this high-altitude saline lake have been subjected to predatory pressure from a large number of introduced alien fish species. Earlier papers and local fishers hold that these predators are the most destructive factor for fish biodiversity in the lake, but this study also raises other causes that could explain at least part of the loss of fish stocks. The alternative factors this chapter focuses on are the rapid growth in human activities, including the development of the tourism industry, irrigation, eutrophication and water pollution, and the impacts of climate change. This chapter also reviews the fish stocks and the fishery management measures taken to increase fish yields in the lake.
Measures taken to protect the diminished fish stocks and endemic fish species include a moratorium on artisanal and commercial fishing for a period of 10 years (2003-2013). Despite the moratorium, at least 500 people continue to fish illegally. The impacts of illegal fishing and overfishing are evaluated as anthropogenic activities. It is also noted that the disintegration of the Soviet Union had profound economic and social effects on many of the newly independent transition economies, such as Kyrgyzstan.
Lake Issyk-Kul is as large as a sea; the opposite shore is often not visible. Photo: Azat Alamanov
Although located at a high altitude, Issyk-Kul Lake never freezes over. The water temperature does not fall below the temperature of maximum density (2.75 °C at the mineral concentration of 6‰) except in the shallow Rybachinsky and Tyup bays. The climate is continental with a short hot summer and a long cold winter. In summer the surface water temperature in the central part reaches 18 °C; in winter it is seldom above 4 °C [2]. The temperature may drop by 12-14 °C down to 50 m depth and by a further 1.5-2 °C down to 100 m. The water layer at 100-200 m depth maintains an almost stable temperature, with changes kept within 0.1-0.3 °C [3].
The chemical composition of the lake water is as follows, in mg/l: calcium 121, magnesium 287, sodium+potassium 1544, chloride 1596, bicarbonate+carbonate 318, sulfate 2102. Total cations are 1952 and total anions 4016 [4]. The mineral content is thus chloride/sulfate/sodium/magnesium based. A drop in water level is accompanied by a certain increase in salinity. Data from 1932 show a salinity of 5.8‰, and by 1984 it had increased to 6.0‰; over this period the water level dropped by 2.5 m. Current measurements show that between October 2008 and November 2009 the salinity in Bosteri varied between 2 and 9‰ with an average of 5.9‰, which indicates that it is going down (Mikkola, unpublished). Since 1986 the decline in water level has stopped and the lake level has started to rise again. The low salinity (less than 20% that of seawater) indicates that in historical terms Issyk-Kul has only relatively recently become a closed lake. Hydrologists have suggested that, deep underground, the lake water filters into the Chui River. It looks as though the Chui River never “found” its way to the lake: the river bends within a mere 4 km of the lake to the west before disappearing into the desert of Kazakhstan. During very high water cycles the lake’s water may overflow into the river through a natural depression, the Kutemaldu channel.
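The ion totals and the comparison with seawater can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, the concentrations are those quoted from [4], and the seawater salinity of 35‰ is an assumed typical value that is not given in the text.

```python
# Ion concentrations in mg/l as quoted in the text from [4].
cations = {"calcium": 121, "magnesium": 287, "sodium_potassium": 1544}
anions = {"chloride": 1596, "bicarbonate_carbonate": 318, "sulfate": 2102}

total_cations = sum(cations.values())   # mg/l
total_anions = sum(anions.values())     # mg/l
total_dissolved = total_cations + total_anions

# 1000 mg/l is treated as 1 per mille (assumes a water density near 1 kg/l).
salinity_permille = total_dissolved / 1000

# Assumption: typical ocean salinity of about 35 per mille.
SEAWATER_PERMILLE = 35.0
share_of_seawater = salinity_permille / SEAWATER_PERMILLE

print(total_cations, total_anions, round(salinity_permille, 2))
print(f"{share_of_seawater:.0%} of seawater salinity")
```

Under these assumptions the summed ions come out close to the 6‰ salinity reported for the lake, and well below one fifth of seawater salinity.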
Currently the pH ranges between 7.7 and 7.9. The waters of Issyk-Kul are rich in oxygen as a result of aeration and movement of the lake water, above all because the water is regularly mixed by strong winds. During the warm period of the year the surface water moves from the central part of the lake towards the shores and is replaced by deeper cool water. In the middle of the lake the water is stratified down to 5-10 m, whereas near the shores the thermal discontinuity lies at a depth of 20-30 m owing to the warm water inflow. Apart from the central upwelling there is also lateral upwelling, caused by the wind driving surface water from the shore to the open parts of the lake [5]. Two major currents, driven by two wind streams, can almost always be observed: one follows the northern shore in a westerly direction, and the other flows east along the southern shore. The transparency of Issyk-Kul waters approaches that of seawater: in the open part of the lake Secchi disc readings range from 30 to 47 m, but they are reduced to as little as 50 cm at the mouths of the inflowing rivers.
The flora of Lake Issyk-Kul contains emergent macrophytes such as Phragmites australis, Typha latifolia and Scirpus tabernaemontani down to a depth of 1.5 m. Submerged macrophytes such as Potamogeton pectinatus, Myriophyllum spicatum and Najas marina, as well as attached algae, can go down to 30-40 m. Mean annual macrophyte production is about 277 g/m2 [6]. Characeae are the most common macrophytes, representing 96% of the total annual macrophyte production, and are present in almost all plant associations. Four species of Chara grow in shallow water, and three benthic species exist further down. Dense growth of Charophyte green algae extends to 40 m depth. Issyk-Kul Lake water is rich in phytoplankton, with 299 identified species. Blue-green algae (Cyanophyceae) dominate, but their standing crop is low [7]. Phytoplankton production is at the level of 49 g/m3 [6].
Zooplankton includes 117 species and is dominated by rotifers (84%), followed by cladocerans (9%) and copepods (7%). Zooplankton production is 91 g/m3 [6]. Zooplankton and phytoplankton distribution in the lake is uneven, with bays and shallows being richer than open water. Arctodiaptomus salinus is present in all parts of the lake, and over the year it may represent 75-95% of the total number of zooplankton and 95-99% of the biomass. This species migrates during the night into the surface water, where its concentration reaches up to 35,000 ind/m3 [8], thus representing an important food source for all plankton-eating fish such as Issyk-Kul Dace Leuciscus bergi.
Zoobenthos comprises 224 species. Most benthos occurs between the shoreline and 40 m depth, which comprises the Charophyte zone. According to [6] the mean annual production of zoobenthos is 10 g/m3. It has been calculated that the average biomass of zoobenthos in the gulfs with open zones is 93.6 kg/ha [9]. Chironomids, mollusks, gammarids and Mysis comprise 75-80 per cent of the total. In the deeper zone beyond the zone of Characeae and down to about 70 m, the biomass is 2.5-3.5 g/m2 and is dominated by chironomids and the mollusk Radix auricularia. Three Mysis species introduced into Issyk-Kul from Lake Balkhash, Kazakhstan, in 1965-1968 are now permanently established in shallows, mostly in 1.5-1.8 m depth, but reaching down to 10 m. Their mean biomass in such waters has been measured to range from 1.5 to 2.5 g/m2 [10].
The original fish fauna comprised twelve indigenous species and two subspecies particular to this lake (Table 1). The long historical and geographical isolation of the lake favoured the formation of endemic forms. This fauna is a typical example of the local Central Asian fish complex originating from Central Asian Mountain fauna (a term used by Berg, 1949), which is characterized by the presence of the loaches and cyprinids, with a small addition of leuciscins of Siberian origin. In the native fish fauna of the lake there were no predators although large Naked Osmans Gymnodiptychus dybowskii are said to feed partly on small fish [11].
The strictly endemic Schmidt’s Dace Leuciscus schmidti is present throughout the shallow littoral zone but goes down to 35-40 m during winter. It appears in two forms, a common fast-growing lake form and a slow-growing bay form. The fast-growing form reaches 31 cm, a weight of 650 g, and an age of 11 years. It spawns on stony beds at depths of 0.5-10 m between the end of March (water temperature 5 °C) and mid-May. Fecundity is 6,000-65,000 eggs per year. It feeds largely on Characeae, but also on mollusks. The slow-growing form is present throughout the shallows. It reaches a length of 23 cm, a weight of 220 g, and a maximum age of 13 years. It has a similar fecundity and spawns on the same substrate as the other form, but later.
Issyk-Kul Dace was the dominant fish until 1997, when Schmidt’s Dace became for the first time the most numerous commercial fish in the lake. Issyk-Kul Dace inhabits the whole littoral zone, but is more pelagic than Schmidt’s Dace. During the winter it is found down to depths of 120-150 m. It reaches a maximum body length of 17.5 cm and a weight of 60 g. It spawns in shallow waters at depths of 1-8 m and feeds mostly on plankton. In recent years the number and distribution of this species have sharply declined.
There are two endemic species distributed in mountain waters of Middle and Central Asia. The Scaly Osman Diptychus maculatus inhabits high-mountain streams, but enters also into Lake Issyk-Kul. It can grow 50 cm long and weighs up to 1 kg. It feeds on vegetation and invertebrates. The fish spawns in the spring and summer. It has a dwarf form, which lives mainly in the incoming small rivers, and may not live in the lake. It does not exceed 25 cm in length and weighs less than 200 g.
The Naked Osman is found in mountain rivers and lakes (Figure 2). It also appears in two forms: one inhabits rivers and the other lakes. Lake-living fish appear to have two ecological morphs: a winter lake morph and a summer migratory morph which spawns in rivers with a sandy bottom. The existence of these forms and eco-morphs indicates that taxonomic studies are needed. The winter morph spawns from February to April and its fecundity is 13,000-14,500 eggs per year. The summer morph is smaller, has a lower fecundity of 5,500-12,000, and spawns from April until September [12,13]. Both morphs are omnivorous. In the lake the species feeds mostly on mollusks over muddy and loamy bottoms at depths of 15-30 m. The largest Naked Osmans in the lake attain an age of 20 years and can grow up to 60 cm long and 3 kg in weight. It was once an important commercial fish in Issyk-Kul Lake, but there are indications that it is close to extinction [14].
The first Naked Osman captured alive in 2009 in the UNDP/GEF Project. Photo: Azat Alamanov
Scientific name | Common name | Indigenous | Introduced |
Oncorhynchus mykiss | Rainbow Trout | | + |
Salmo ischchan | Sevan Trout | | + |
Coregonus lavaretus | Common Whitefish | | + |
Coregonus widegreni | Valaam Whitefish | | + |
Coregonus autumnalis | Baikal Omul | | + |
Leuciscus schmidti | Schmidt’s Dace | e | |
Leuciscus bergi | Issyk-Kul Dace | e | |
Phoxinus issykkulensis | Issyk-Kul Minnow | e | |
Tinca tinca | Tench | | + |
Gobio gobio latus | Issyk-Kul Gudgeon | e | |
Schizothorax pseudoaksaiensis issykkuli | Issyk-Kul Marinka | e | |
Diptychus maculatus | Scaly Osman | e | |
Gymnodiptychus dybowskii | Naked Osman | e | |
Alburnoides taeniatus | Striped Bystranka | | + |
Abramis brama orientalis | Oriental Bream | | + |
Cyprinus carpio | Common Carp | o | |
Ctenopharyngodon idella | Grass Carp | | + |
Hypophthalmichthys molitrix | Silver Carp | | + |
Carassius auratus auratus | Goldfish | | + |
Pseudorasbora parva | Stone Moroko | | + |
Capoeta capoeta capoeta | Transcaucasian Barb | | +? |
Triplophysa stoliczkai | Tibetan Stone Loach | e | |
Triplophysa stoliczkai elegans | Tyanschan Loach | e | |
Triplophysa dorsalis | Grey Loach | e | |
Triplophysa strauchii strauchii | Spotted Thicklip Loach | e | |
Triplophysa labiata | Plain Thicklip Loach | | + |
Triplophysa ulacholicus, including T.u. dorsaloides | Issyk-Kul Naked Loach | e | |
Sander lucioperca | Pike-perch | | + |
Micropercops cinctus | Eleotris or Odontobutid | | + |
Glyptosternum reticulatum | Turkestan Catfish | e | |
Aspius aspius | Asp | | +? |
Coregonus albula | Vendace (Ryapushka) | | +? |
Coregonus peled | Peled | | +? |
e= Indigenous, += Introduced, o= not known if indigenous, and +?=not known if the introduction failed
List of fish species in the Issyk-Kul Lake [15].
Issyk-Kul Lake and incoming rivers have five indigenous and one alien loach species which are common in littoral underwater meadows, but are also found down to 100 m depth. They feed on benthos, plankton and eggs of other fish [13]. They have never been recorded by name in the catch of the lake except maybe in the “others” component. Subspecies would urgently require taxonomic revision, especially Triplophysa ulacholicus versus Triplophysa u. dorsaloides which are here synonymized.
The Issyk-Kul Gudgeon Gobio gobio latus spawns in June-July in shallows and feeds on benthos, detritus and fish eggs. It is preyed upon by Spotted Thicklip Loach Triplophysa s. strauchii, Sevan Trout Salmo ischchan and Pike-perch Sander lucioperca [13]. Again this fish has no commercial value and falls into “others” category in fish statistics.
Issyk-Kul Minnow Phoxinus issykkulensis is one of the strictly endemic fish species of Issyk-Kul Lake, but unfortunately there are no data on its biology or abundance, as it has never been important in the commercial fishery.
Common Carp Cyprinus carpio is a widespread freshwater fish which has been introduced from Asia to every part of the world, and it is included in the list of the world’s worst invasive species. Many people in Kyrgyzstan, however, see Common Carp as indigenous, calling it the ‘wild form’ of Common Carp (Sazan). Most likely it was also introduced into the lake, but probably in ancient times. It is known to have been cultured in Kyrgyzstan at least since 1852 [15]. If the ‘wild origin’ is accepted, then the Issyk-Kul populations can be considered vulnerable to extinction.
Issyk-Kul Marinka Schizothorax pseudoaksaiensis issykkuli is an endemic species which reaches 70 cm and a weight of 8 kg, and spawns from May until mid-July on rocky substratum in shallows near aquatic plants (Figure 3). Its fecundity is 25,000 eggs per year. It is omnivorous. Between 1985 and 1989 it formed 6% of the fish catch, but after 1992 it disappeared completely (Table 2).
At least 19 species have been introduced to the lake by humans, either on purpose or accidentally. The introductions of Vendace Coregonus albula and Peled Coregonus peled failed, and the survival of Asp Aspius aspius and Transcaucasian Barb Capoeta c. capoeta is also doubtful, as these species have not been reported recently. Formerly, the small Issyk-Kul Dace was the major item in fish catches, representing about 90% of total biomass. It was, however, considered to have a low value, and this led to a proposal to introduce new fish species into the lake [16]. The introduction of Sevan Trout from Armenia was recommended and, in 1930, 755,000 fertilised eggs were released, followed in 1936 by a further 800,000. Until 1964 Sevan Trout remained rare in the lake due to the shortage of suitable spawning grounds (Figure 4). At its best, in 1976 and 1979, 51.6 and 53.8 tonnes of Sevan Trout were captured from Issyk-Kul Lake. This was mainly due to state-owned hatcheries, which released 79 million fry into the lake during the 1970s [17].
One of the few Issyk-Kul Marinka captured alive during the study 2008-2011. Photo: Azat Alamanov
Fish species | 1955-1964 | 1965-1969 | 1970-1974 | 1975-1979 | 1980-1984 | 1985-1989 | 1990-1994 | 1995-1999 | 2000-2003 |
Schmidt’s Dace | 482 | 225 | 544 | 496 | 292 | 241 | 223 | 105 | 44 |
Issyk-Kul Dace | 9586 | 10741 | 9147 | 5736 | 1123 | 1064 | 790 | 94 | 12 |
Issyk-Kul Marinka | 263 | 39 | 16 | 3 | 34 | 138 | 1 | - | - |
Osmans* | 114 | 10 | 13 | 17 | 10 | 10 | 19 | - | - |
Common Carp | 75 | 85 | 29 | 5 | 7 | 32 | 22 | - | - |
Whitefish | - | - | 1 | 35 | 106 | 248 | 163 | 57 | 11 |
Sevan Trout | 0.3 | 30 | 123 | 457 | 244 | 206 | 91 | 29 | 1 |
Pike-perch | - | 287 | 1364 | 895 | 340 | 320 | 227 | 25 | 1 |
Others | 46 | 51 | 51 | 15 | 15 | 48 | 98 | 74 | 12 |
Total | 10566 | 11468 | 11288 | 7659 | 2171 | 2307 | 1634 | 384 | 81 |
Original data from Fisheries Department/Mairam Sarieva. Note that the data are in Soviet centners. One centner is 100 kg, not one tonne as so often misquoted in previous publications. *Osmans = Scaly and Naked Osman
Fish catch from the Issyk-Kul Lake in five year averages from 1955-2003.
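Because the catches are recorded in centners, converting them to tonnes is a matter of dividing by ten. The following minimal Python sketch applies this to the Total row of the table above; the names `CENTNER_KG`, `totals_centners` and `centners_to_tonnes` are illustrative and do not come from the source.

```python
# Total catch per period in Soviet centners, copied from the "Total" row above.
totals_centners = {
    "1955-1964": 10566, "1965-1969": 11468, "1970-1974": 11288,
    "1975-1979": 7659, "1980-1984": 2171, "1985-1989": 2307,
    "1990-1994": 1634, "1995-1999": 384, "2000-2003": 81,
}

CENTNER_KG = 100   # one centner is 100 kg, as stated in the table note
TONNE_KG = 1000


def centners_to_tonnes(centners):
    """Convert a catch given in centners to metric tonnes."""
    return centners * CENTNER_KG / TONNE_KG


totals_tonnes = {p: centners_to_tonnes(c) for p, c in totals_centners.items()}
for period, tonnes in totals_tonnes.items():
    print(f"{period}: {tonnes:.1f} t")
```

Reading the same figures as tonnes instead of centners would overstate every catch tenfold, which is the misquotation the table note warns about.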
Sevan Trout is a colourful fish which grows well in the Issyk-Kul Lake. Photo: Azat Alamanov
Following its introduction, Sevan Trout became an active predator of other fish in the lake and developed several special features. Its growth rate was 4 to 6 times that in Sevan Lake in Armenia. In Issyk-Kul it grows to a bigger size. It matures earlier, and its fecundity has increased five-fold to 3,300-17,300 eggs per fish. The limiting factors for this species in Issyk-Kul are food resources and habitat for reproduction.
In the 1950s there were further introductions of fish species in order to establish diverse stocks of piscivorous fish, introduce species feeding on phytoplankton and aquatic macrophytes, and increase the number of benthos- and plankton-feeding fish [18]. Pike-perch, Tench Tinca tinca and Oriental Bream Abramis brama orientalis were introduced in 1954-1956 [19]. They became established predominantly in the eastern part of the lake but soon started spreading all over the lake. The introduction of Grass Carp Ctenopharyngodon idella and Silver Carp Hypophthalmichthys molitrix, and with them the inadvertent introductions of Goldfish Carassius a. auratus, Stone Moroko Pseudorasbora parva and Eleotris Micropercops cinctus, were successful but proved disastrous for the ‘wild carps’. Grass and Silver Carp brought infectious ascites of carps into the lake, and the numbers of ‘wild carps’ started to decrease due to the disease.
In the early 1970s a decision was taken to transform the lake into a trout-whitefish water body at the expense of the local Issyk-Kul Dace population. Whitefish Coregonus lavaretus, however, is mainly a plankton and benthos feeder, although large individuals can occasionally take other fish. Common Whitefish was introduced from Lake Sevan, Valaam Whitefish or Ludoga Coregonus widegreni from Lake Ladoga, and Arctic Cisco or Baikal Omul Coregonus autumnalis and Peled from Lake Baikal [20]. Eggs of Whitefish from Lake Sevan were transferred to the Ton hatchery, from which four-day-old fry were released into Lake Issyk-Kul. During 1966-1973, 87 million fry were released. In 1974 the first 500 kg of Whitefish were harvested from Tyup Bay.
There were also proposals to replace the Issyk-Kul Dace with the Peled and Vendace (Ryapushka), more nutritious food fish species. However, Peled and Vendace soon disappeared from the lake most likely due to lack of suitable reproduction conditions. Baikal Omul was still observed in the lake as of late 1970s. After that only Whitefish established itself as commercial fish and the highest catch recorded was 35.3 tonnes in 1989. After that the catch started to go down mainly due to reproduction problems and hatchery failure (Table 4).
The most harmful introduction took place accidentally from the cage culture of Rainbow Trout Oncorhynchus mykiss (Figure 5). Since the 1980s many small and some large fish have escaped from the culture operations, and Rainbow Trout is now very common all over the lake, especially near the eight cage culture farms. Rainbow Trout moves to a fish diet at a size of 35-40 cm [21]. It is not clear whether Rainbow Trout is able to reproduce in some of the incoming rivers, although nowadays the lake seems to hold Rainbow Trout of all ages and sizes.
The fish food base was successfully enriched by the introduction of mysids, which became a target of the introduced coregonids. In Issyk-Kul Lake the introduced Whitefish benefited most from mollusks and mysids. Their growth rate was faster than that of the original stock in Lake Sevan, and they also started maturing at an earlier age. However, there was a high mortality of coregonid fry, which found insufficient food in Issyk-Kul Lake and were heavily preyed on by the endemic fish. It was also believed that the higher salinity of Issyk-Kul compared with Sevan and Baikal could have had a negative impact on the fry [20]. The decision was made to stock advanced fry and fingerlings, of which millions were stocked in the following years (at least 10 million from 1977 to 1988).
Amateur fishing in the lake started in the 1870s. At first it was unorganized and no statistics were collected. During the 1890s the fish catch ranged between 17 and 105 tonnes [2]. At that time, and for many years after, the major commercial fish were Naked Osman, Common Carp, Issyk-Kul Marinka, Issyk-Kul Dace and Schmidt’s Dace. According to available information, in 1941-1945 harvests of Issyk-Kul Marinka, Naked Osman and Common Carp reached 61 tonnes per year. During the same period the Issyk-Kul Dace catch varied from 551 to 900 tonnes per year. It is important to note that during this period the lake already had one alien predator, Sevan Trout, although catching it had not yet started. It is also interesting to note that the ‘native’ species did not start to decline after the introduction of Sevan Trout [11].
More detailed catch statistics are available from 1955 until the moratorium in 2003 (Table 2). As Table 2 clearly illustrates, fish landings shrank sharply after the Issyk-Kul Dace population collapsed. Increased fishing activity could be one explanation, but by the end of the Soviet period the state fishing industry was at its peak and encompassed only 300 fishers, 122 boats and 8640 nets [11]. This number of fishers is not large for a lake of that size, and overfishing alone hardly explains the decline of fish catches. However, by 1990 Issyk-Kul landings were barely twenty percent of the levels recorded a quarter of a century earlier.
The introduction of one more predator, Pike-perch, into the lake in 1954-56 had no immediate or dramatic impact on either the lake landings or their species composition. The production of the lake seems to have gone down clearly only from 1975 onwards, because the transition from one feeding level (plankton and benthos feeders) to the next (predatory fish) entails an approximately tenfold loss of production. Pike-perch production reached its peak in 1974, after which the production of Issyk-Kul Dace started to fall very fast, as shown in Table 2. The role of Sevan Trout is less clear, as it reached its highest population only in 1979, after Issyk-Kul Dace had already diminished to one third of its starting level.
Whitefish is often listed as a predatory fish, but any harmful effect does not show clearly in this material. Whitefish reached its peak population in 1989, when all other species had started to decline very rapidly. Interestingly, the peak populations of Osmans occurred as early as 1961-62, when all predatory species recorded zero catch. Issyk-Kul Marinka had its peak population in 1988. Osmans disappeared totally from the catch data in 1988 and Issyk-Kul Marinka in 1993. Common Carp disappeared in the same year (1993) as Issyk-Kul Marinka. Predators alone cannot explain these losses, and it is difficult to see any direct relation with fishing either.
Issyk-Kul Dace made up 94 per cent of the catch in 1965-69 and less than 15 per cent in 2000-2003; at the same time the total catch of the lake went down from 1147 to 8 tonnes. Issyk-Kul Marinka, Naked Osman and Sazan Carp are the commercial indigenous species which are most seriously endangered.
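The percentages quoted for Issyk-Kul Dace follow directly from the five-year averages in Table 2. A short check is sketched below, assuming the table values in centners; the variable and function names are illustrative only.

```python
# (Issyk-Kul Dace catch, total catch) in centners, taken from Table 2.
catch = {
    "1965-1969": (10741, 11468),
    "2000-2003": (12, 81),
}


def share_percent(part, total):
    """Percentage of the total catch made up by one species."""
    return 100 * part / total


shares = {period: share_percent(*values) for period, values in catch.items()}
for period, pct in shares.items():
    print(f"{period}: Issyk-Kul Dace = {pct:.1f}% of the catch")
```

The same function can be applied to any other species row of Table 2 to follow the change in species composition over time.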
The contribution of fishing to annual average income of Issyk-Kul district families is from 5 to 10 per cent and only for some small groups up to 30%. The monthly income of fishermen is not more than 40 USD and that of women processing the fish 54 USD. Women’s income is little higher than men’s [22]. Although income from fish is small, it allows the families to have cash on daily basis and facilitates implementation of other cash requiring activities (purchase of seeds, forage for animal-breeding etc.). For more details see [15].
In historical terms the water level of Issyk-Kul Lake has clearly fluctuated. Some changes have been gradual, others sudden and disastrous, caused by earthquakes and torrents of water rushing from the mountains. A large ancient city has been located at depths of 5 to 10 meters near the north coast of the lake; it was destroyed perhaps some 2500 years ago by one of the many local floods, which are known to occur every 500 to 700 years [23]. Between 600 and 1200 AD the Issyk-Kul shoreline was again some 500 m lower, and after that, in the fifteenth century, the water level of the lake was more than 10 m higher than it is today.
In the Issyk-Kul basin 118 rivers and streams flow toward the lake, but only 42% of them actually drain into it, and 25% have discharges of less than 1 m3/s. Only 9% of these are rivers with catchment areas of more than 300 km2. The rivers are fed predominantly by melt water from glaciers and snow above 3300 m. The river system also reflects the distribution of rainfall in the basin, with low precipitation in the west, where the river system is poorly developed. In the east, where precipitation is heavier, the hydro-network is denser and the rivers fuller. The greatest volume of flow comes through rivers on the basin’s eastern side. Water from most of the rivers is completely diverted for irrigation before it enters the lake. Therefore, bays on the northern and western coasts suffer from increased mineralization. The rivers supply the lake with 3720 million m3 of water per year [24], but they are not the only water supply the lake receives. The annual surface water discharge, precipitation and groundwater discharge to the lake are 21, 29 and 33 cm respectively; the evaporation from the lake is 82 cm. For more details see [7].
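The water-balance terms quoted at the end of the paragraph can be added up directly. The sketch below does only that; it assumes that all four figures are annual averages expressed as lake-level change in cm, and the variable names are hypothetical.

```python
# Annual water-balance terms as lake-level change in cm, from the text.
inflows_cm = {"surface_discharge": 21, "precipitation": 29, "groundwater": 33}
evaporation_cm = 82

total_inflow_cm = sum(inflows_cm.values())
net_change_cm = total_inflow_cm - evaporation_cm

print(f"inflow {total_inflow_cm} cm, evaporation {evaporation_cm} cm, "
      f"net {net_change_cm:+d} cm per year")
```

With these figures inflow and evaporation nearly cancel, so in a closed lake even small shifts in any single term would be enough to move the water level.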
Until 1985 water level in Issyk-Kul was falling. Between 1876 and 1972 the decrease was 9 m [25]. During 1960-1979 when the fish catches started to decrease clearly the total decline of water level was 140 cm, at an average rate of seven cm annually. That loss of water level has been one important factor affecting the fish stocks and fisheries.
Uptake of water for irrigation is one of the factors seen to be responsible for the present changes in the water level. Irrigation also hinders the river spawning of many species, as it prevents small rivers and streams from reaching the lake. Irrigation has led to drying and silting up of spawning grounds and death of the fry themselves as they are poured out with the river water to irrigated fields. During 1960-1979 irreversible uptakes of water from rivers for irrigation reduced the volume of river water entering the lake by an estimated 23 per cent [6]. While in 1930 there were 50,000 ha of irrigated area in the Lake Issyk-Kul catchment, by 1980 the irrigated area reached 154,000 ha [26]. However, even without this irrigation loss, the lake would still have declined at the rate of five cm annually between 1960 and 1979 [27]. This would indicate that climatic factors have also been involved in the fall of the water level.
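The figures for 1960-1979 allow a rough split of the observed decline between irrigation and other causes. The sketch below assumes that the two effects are simply additive, which the source does not state, and uses the 140 cm total decline and 7 cm annual rate quoted earlier for the same period; all names are illustrative.

```python
total_decline_cm = 140           # total fall in lake level, 1960-1979
observed_rate_cm = 7             # average observed decline per year
rate_without_irrigation_cm = 5   # estimated decline per year without irrigation [27]

# Length of the period implied by the total decline and the average rate.
years = total_decline_cm / observed_rate_cm

# Assumption: irrigation and other (climatic) effects add up linearly.
irrigation_rate_cm = observed_rate_cm - rate_without_irrigation_cm
irrigation_share = irrigation_rate_cm / observed_rate_cm

print(f"period: {years:.0f} years")
print(f"irrigation: {irrigation_rate_cm} cm/year ({irrigation_share:.0%} of the decline)")
```

On this simple reading, most of the decline remains once the irrigation term is removed, which is the basis for the remark about climatic factors.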
This chapter is based on my field work and data collection in Kyrgyzstan when working in the UNDP/GEF Project: “Strengthening policy and regulatory framework for mainstreaming biodiversity into fisheries sector” as International Fishery Policy Adviser 2008 to 2009 and in the FAO Project GCP/KYR/003/FIN: “Support to Fishery and Aquaculture Management in the Kyrgyz Republic” 2009-2010 as Technical Advisor. A lot of generally unknown ‘grey literature’ in Russian has been translated and used in this text. More information about interviews and field experimentation is documented in [15].
It is obvious that the Sevan Trout and Pike-perch introductions can be blamed for the reduction in catch. There is a clear positive correlation between the catches of both Sevan Trout and Pike-perch and the catch of Schmidt’s Dace (Table 3). Interestingly, the Sevan Trout catch also correlates with the Pike-perch catch. There are positive correlations among prey species, such as between Marinka and Osmans and between Issyk-Kul Dace, Schmidt’s Dace and Sazan Carp, but only Whitefish has a strong negative correlation with Issyk-Kul Dace.
Species | Pike-perch | Sevan Trout | Whitefish | Marinka | Osmans | Issyk-Kul Dace | Schmidt’s Dace | Sazan Carp |
Pike-perch | 1 | 0.535 | -0.056 | -0.176 | 0.088 | 0.268 | 0.559 | -0.012 |
Sevan Trout | 0.535 | 1 | 0.311 | 0.013 | 0.036 | -0.136 | 0.339 | -0.193 |
Whitefish | -0.056 | 0.311 | 1 | 0.143 | -0.316 | -0.574 | -0.209 | -0.149 |
Marinka | -0.176 | 0.013 | 0.143 | 1 | 0.492 | 0.116 | 0.124 | -0.288 |
Osmans | 0.088 | 0.036 | -0.316 | 0.492 | 1 | 0.480 | 0.424 | 0.261 |
Issyk-Kul Dace | 0.268 | -0.136 | -0.574 | 0.116 | 0.480 | 1 | 0.622 | 0.468 |
Schmidt’s Dace | 0.559 | 0.339 | -0.209 | 0.124 | 0.424 | 0.622 | 1 | 0.065 |
Sazan Carp | -0.012 | -0.193 | -0.149 | 0.288 | 0.261 | 0.468 | 0.065 | 1 |
Correlations marked in red are significant at the 0.01 level (2-tailed); those marked in green are significant at the 0.05 level.
Correlations of main fish species in the catch between 1955 and 2003.
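For readers who wish to see how such a coefficient is obtained, the sketch below computes a Pearson correlation in plain Python. It is an illustration only: it uses the five-year averages of Table 2 with missing catches ("-") entered as 0, both of which are assumptions, and the text does not say which data or software produced the table above, so the result should not be expected to reproduce the tabulated value exactly.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Five-year average catches in centners from Table 2
# (assumption: "-" entries are treated as zero catch).
issyk_kul_dace = [9586, 10741, 9147, 5736, 1123, 1064, 790, 94, 12]
whitefish = [0, 0, 1, 35, 106, 248, 163, 57, 11]

r = pearson(issyk_kul_dace, whitefish)
print(f"r(Issyk-Kul Dace, Whitefish) = {r:.3f}")
```

Even on this coarse data the coefficient is negative, in line with the sign reported for the Whitefish and Issyk-Kul Dace pair.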
But correlation does not necessarily mean causation. Thus far the introduction of predatory fish species has been seen as the major, if not the only, reason why native fish stocks collapsed [11,15,28]. The same conclusion was drawn in Africa, where the introduction of the Nile Perch Lates niloticus was believed to have caused the greatest vertebrate mass extinction in recorded history [29,30]. Approximately 150 different species of Haplochromis cichlids have become extinct in recent times in Lake Victoria. Re-evaluated data now show, however, that Nile Perch did not really succeed until after its prey (the haplochromines) had disappeared. The increased eutrophication of the lake and the resulting oxygen problems may explain the diversity changes better than predation by a single species or fisheries exploitation [31,32].
This does not mean that one should support alien introductions; precautionary principles are necessary. The precautionary principle states that one has to expect newly introduced species, even when kept in enclosures or cages, to escape into nature for one reason or another [33]. Any new voluntary alien introduction should be considered against that background, and the rule should be clear: new alien species introduced by humans are not allowed to enter the country. It remains inevitable that some invasive species will arrive without any human help.
As shown in Table 4 there are seven clear reasons which could explain the loss of fish stocks and biodiversity and another five reasons which have had at least some negative impact. The reviews and field work highlights that all these twelve negative factors have been present more or less at the same time, so it is not possible to single out any one of them as the most important. Surely they have rather caused the loss of fish stocks and biodiversity together. Some of these factors have been listed already before, like over fishing, disintegration of the Soviet Union, irrigation and water level. Over fishing of Issyk-Kul Dace stocks by the Soviet fleet based at Issyk-Kul Lake was presented in [11]. The disintegration of the Soviet Union had profound economic and social effects, especially in the fisheries sector of the newly independent transition economies [28]. Nowhere were these production shortfalls bigger than in Kyrgyzstan where Lake Issyk-Kul fish landings were in 2003 less than 7 per cent of the catch level recorded in 1989. The major consequence for the fisheries sector was the spectra of uncertainty which included the uncertainty of, how the sector was to be managed, how access to water bodies was to be regulated, how to maintain the backward and forwards supply chains which underpinned pond aquaculture, and livelihoods – as the Soviet guarantee of job security was rescinded. Many experts and professional fishers left the sector to find employment in other sectors in Kyrgyzstan or abroad. Intensive irrigation led to reduced water levels in the lake and more importantly heavy water abstraction caused drying of many incoming streams that the endemic fish species previously used for feeding and/or spawning [34,35]. It has been shown that biological productivity of Lake Issyk-Kul decreased from 1973 to 1981 when the water level was declining at a rate of 7-10 cm per year [36].
Reproduction of many alien and endemic fish species was severely constrained by the limited number of suitable spawning rivers. As a consequence, the state established hatcheries on the Ton (1964) and Karakol (1969) rivers, with the brief to capture spawning fish, extract the eggs, raise the resulting fry and fingerlings, and then restock the lake. According to [7], the minimum Sevan Trout return in landings is 2% of releases, which means that at least 750,000 fry, each weighing 1 g, must be produced and released annually. Assuming an egg mortality of 50%, hatcheries should produce 1.5 million eggs per year. Ton Hatchery produced 9 million fry annually in 1989-91. After the breakup of the Soviet Union, state hatchery production fell sharply. Over the period 2004-8 Ton Hatchery continued to restock the lake with Sevan Trout at a much reduced rate of 446,000 fingerlings annually. At present (2010) Ton Hatchery is able to release some 900,000 fry with 40% egg survival. No endemic fish fry have been produced despite the available capacity, but Rainbow Trout fingerlings have been produced on a contractual basis for the cage farmers.
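The hatchery arithmetic above can be checked with a short calculation. The following is only a sketch: the function name and variables are illustrative assumptions, and the figure for fish returned to landings is simply 2% of the stated minimum release, not a number given in [7].

```python
def eggs_required(fry_released: float, egg_survival: float) -> float:
    """Eggs a hatchery must incubate to release `fry_released` fry,
    given the fraction of eggs that survive to the fry stage."""
    if not 0 < egg_survival <= 1:
        raise ValueError("egg_survival must be in the interval (0, 1]")
    return fry_released / egg_survival


# Figures taken from the text.
MIN_FRY_RELEASED = 750_000   # minimum annual release of 1 g fry
RETURN_RATE = 0.02           # minimum return in landings (2% of releases)
EGG_MORTALITY = 0.50         # assumed egg mortality

# Derived values (illustrative; not stated in the source beyond the egg total).
fish_returned = MIN_FRY_RELEASED * RETURN_RATE
eggs_needed = eggs_required(MIN_FRY_RELEASED, 1 - EGG_MORTALITY)

print(f"Fish returned to landings: about {fish_returned:,.0f}")
print(f"Eggs needed per year: about {eggs_needed:,.0f}")
```

Under these assumptions the egg requirement comes out at 1.5 million per year, in agreement with the estimate in the text.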
Estimated IMPACT | Strong negative impact | Some negative impacts | Not visible impact | Some positive impacts |
Introduction of alien fish species | Yes | |||
Introduction of alien food species | Yes | |||
Over fishing | Yes | |||
Illegal fishing | Yes | |||
Disintegration of the Soviet Union 1991 | Yes | |||
Cage culture | Yes | |||
Moratorium | Yes | |||
Hatchery failure | Yes | |||
Tourism | Yes | |||
Water level | Yes | |||
Irrigation | Yes | |||
Water pollution | Yes | |||
Climate change | Yes | |||
Radioactive leakage | Yes | |||
Military activities | Yes | |||
Mining activities | Yes |
Impact evaluation of different natural and anthropogenic factors on fish stocks and biodiversity
During 1966-1973 over 12 million Whitefish fry were released annually from the state-owned Karakol Hatchery, but during 1977-1988 fingerling production fell to 1 million per year. Since privatization, Karakol Hatchery has produced fewer than 2.5 million Whitefish fry annually, which explains why the collapsed Whitefish stocks are not recovering, as evidently very little or no natural reproduction takes place in the lake.
These drastic declines in restocking have undoubtedly been one contributor to the decrease in recorded fish landings at Issyk-Kul Lake.
Cage farming of Rainbow Trout was started in 1988 by Alfa Laval Avose, but it was not economically viable due to the high cost of feed. Evidently a large number of fingerlings escaped into the lake when a storm overturned the experimental cages. In 1989-90 the company was able to produce 20 tonnes of Rainbow Trout. After the collapse of the USSR there was no development of this activity until 2006, when Ecos International commenced cage farming at Issyk-Kul Lake. Since then, exponential growth in trout culture has taken place.
The eight existing cage farms and their 26 cages (as of April 2009) are producing well over 300 tonnes of Rainbow Trout per year [37]. This production causes pollution in the form of medicines used for the treatment and prevention of diseases, pathogenic bacteria and parasites. By authorizing lake-based cage culture of Rainbow Trout, the authorities are allowing inevitable eutrophication. Extra nitrogen and phosphorus from unused feed will add to the primary production of algae and lower the oxygen level. A second problem is the excess feed that sinks through the net cages to the bottom of the lake. At the bottom, the sunken feed, together with the faeces and urine of the fish, will cause the formation of hydrogen sulfide gas, harming the other users and the fauna of the lake. The worst problem, however, is the continuous unwanted introduction of this predatory fish into the lake, because large specimens in particular can and will escape from the cages. After that they move around the lake and eat large numbers of endemic fish. According to fishermen (personal interviews in 2009), Rainbow Trout is the main predator in the lake, even more predacious than Pike-perch, because it comes to prey in shallow waters near the shoreline, while Pike-perch often remains in deeper waters (Figure 5).
In order to protect the decreased fish stocks in the lake, the President of the Kyrgyz Republic declared a Moratorium for Artisanal and Commercial Fish catching for a period of 10 years (2003-2013). The stated justification for the total ban was illegal fishing and over-fishing, which were seen as the only reasons for the loss of fish resources and endemic species. However, the moratorium can become an effective measure for restoring fish resources in the lake only if a mechanism for its implementation and enforcement (fish inspections etc.) is developed as well. Otherwise the moratorium will not work. Despite the total ban, at least 500 people continue their activity as illegal fishermen. On average they catch 5 to 20 kg per night, but every fourth night is stormy, making artisanal fishing with small boats impossible. If they fish 100 nights per year, they catch between 250 and 1000 tonnes per year. Should this fairly conservative estimate be true, the lake is fished at a level of 0.4-1.6 kg/ha. This can hardly be seen as over-fishing in a lake whose theoretical production capacity is estimated at 4.5 kg/ha [34]. This assumes, of course, that the fish stocks have recovered from the 2003 level during the last ten years under the total ban on commercial fishing.
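The estimate of the illegal catch can be reproduced with a few lines of arithmetic. The sketch below uses the figures quoted in the text (500 fishers, 5 to 20 kg per night, 100 fishing nights per year) together with the lake surface area of 6240 km2 given earlier in the chapter; the function and variable names are illustrative assumptions.

```python
LAKE_AREA_KM2 = 6240   # surface area of Issyk-Kul Lake, from the chapter
HA_PER_KM2 = 100       # 1 km2 = 100 ha


def annual_catch_tonnes(fishers: int, kg_per_night: float, nights: int) -> float:
    """Total annual catch in tonnes for a group of fishers."""
    return fishers * kg_per_night * nights / 1000


def yield_kg_per_ha(catch_tonnes: float, area_km2: float = LAKE_AREA_KM2) -> float:
    """Catch expressed per hectare of lake surface."""
    return catch_tonnes * 1000 / (area_km2 * HA_PER_KM2)


low_catch = annual_catch_tonnes(fishers=500, kg_per_night=5, nights=100)
high_catch = annual_catch_tonnes(fishers=500, kg_per_night=20, nights=100)

print(f"Annual catch: {low_catch:.0f} to {high_catch:.0f} tonnes")
print(f"Yield: {yield_kg_per_ha(low_catch):.1f} to "
      f"{yield_kg_per_ha(high_catch):.1f} kg/ha")
```

Both ends of the range stay well below the theoretical production capacity of 4.5 kg/ha cited from [34].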
It is far too easy to blame over-fishing for the fact that some species became nearly extinct and that fewer fish are caught. A more important problem is the absence of any fisheries management and the lack of controlled protection of fish resources. Removal of the fishing ban is necessary, since it cannot be controlled and monitored. Exploring co-management arrangements is a better option than command and control, as the resources are not available for such a policy measure [15]. If more than 500 people continue fishing despite the moratorium, the policy needs to be evaluated for better stewardship outcomes. In fact, it is far more important to continue to fish large predatory fish if there is any concern for the survival of native non-predatory species.
This kind of 7 kg Rainbow Trout eats a lot of small indigenous fish species. Photo: Azat Alamanov
However, commercial fishing needs to be reconsidered after the moratorium ends, in 2014, as recreational and food fisheries may be far more sustainable. Given the growing importance of Lake Issyk-Kul for recreation, fishery management might move towards producing valuable sport and recreational fish to satisfy the demand of tourists and visitors. Such recreational fishing will mainly target the large predatory species, Rainbow Trout, Sevan Trout and Pike-perch, which are the favourite species for sport fishing. Recreational fishing of large fish will make Issyk-Kul Lake more attractive to tourists. It will also help fishery managers to shift the proportions of predator and prey fish species and to diminish the negative effect of alien predators on vulnerable stocks of endemic fish species.
Stocks of rare indigenous species will not improve without artificial propagation in local hatcheries. This production of fry has started through a UNDP/GEF project, but stocking the lake with small indigenous species is not viable before considerable harvesting of large predators. The number of Sevan Trout is easy to regulate, as it depends mostly on the stocking rate of fingerlings, which are produced artificially in the hatcheries. Rainbow Trout and Pike-perch are more difficult to remove if they are able to multiply in the lake (Figure 6). Improved stocks of small endemics could help to control these predators by eating their eggs and small fry, as small fish did in Lake Victoria by preying on the eggs and fry of the Nile Perch [32].
Widespread mining operations are disrupting soils, terrain and water tables, but more serious water pollution comes from illegal dumping or storing of toxic chemicals currently in use at the Kumtor gold mine in the Tian Shan Mountains. It is the largest gold mine, as well as a major source of government revenue, and it routinely ignores national environmental legislation. The Kumtor mine reportedly uses up to 10 tonnes of cyanide per day in its mining operations, and a number of chemical constituents are released into the environment [38]. This is surely affecting fish populations downstream and the health of local people using the contaminated water or fish (Table 5).
One of the worst regional environmental disasters in recent history occurred on 20 May 1998, when a truck hauling toxic chemicals crashed just upstream from the mouth of the Barkuum River, which empties into Lake Issyk-Kul. As a result, 1762 kg of sodium cyanide, a chemical used in the processing of gold ore at Kumtor, was spilled into the basin waters [6].
The lack of both adequate infrastructure and financial means to support public utilities (let alone any resort or tourism industry) has made it impossible to improve wastewater treatment plants. This in turn leads to further pollution and unwise use of lake waters. The gradual increase in settlements and industries around the lake has led to an increase in pollution. Although most enterprises have wastewater treatment facilities, these are not efficient and some effluents still reach the lake.
Artificial nests are used to remove the eggs of Pike-perch from the lake to reduce the numbers of that alien predatory fish. Photo: Azat Alamanov
Agriculture, through the use of fertilizers and pesticides, also contributes to lake pollution, although the level of fertilizer application on crop fields is known to be moderate. However, the Issyk-Kul area produces 12% of the total national cereal crop and over 40% of the potato crop. Of the total area of orchards nationwide, 20% is in Issyk-Kul. The number of domestic animals in the catchment area is very high: cattle 163,500; sheep 1,944,400; swine 32,700; poultry 623,400 and horses 48,500 [39]. Dairy product processing covers 50% of the national dairy product supply. Animal breeding is growing, with an average annual sheep and cattle surplus of 5-6%. Grazing land is overloaded by 1.5 times its capacity (Figure 7). Grazing practices have changed so that all livestock owned by small proprietors now graze near the lake, as the farmers have neither transport nor money to drive their animals to outlying upland pastures. This could cause social conflict (grazing on beaches and in resort areas) and eutrophication of the lake, but fortunately people commonly collect the manure for fuel or fertilizer. While the large volume of water in the lake (1738 km3) may at present have considerable diluting capacity, and with good water mixing is also able to quickly oxygenate organic matter inputs, sheltered shallows are subject to eutrophication. As the shallows are also important spawning and feeding areas for a number of fish, such eutrophication may especially affect those cold-water fish species which require pristine waters, like Whitefish.
These camels are the only reminder of the Silk Road at Issyk-Kul Lake, the coast of which is heavily overgrazed by domestic animals. Photo: Azat Alamanov
Eutrophication caused by birds is not often considered a problem, but at Issyk-Kul Lake the number of migratory birds is large enough to affect the lake. Between 44,000 and 68,500 birds belonging to 30 to 35 species winter on the lake, and even more birds use it as a stopover and feeding place during spring and autumn migration [40].
Within the Issyk-Kul basin there are 834 glaciers of various sizes, ranging from less than 0.1 to 11 km2. For example, a long-term study of Karabatkak, a typical Issyk-Kul glacier, showed that between 1957 and 1997 ice loss exceeded snow mass gain by 18 m.
This thinning of ice is due to climate change: summers have been 0.6 degrees Celsius warmer, although the annual average temperature has risen by only 0.2 degrees Celsius. Based on this, it was calculated that before 2005 the overall glaciated area near the lake would decline by 32% on the northern slopes and by as much as 77% on south-facing slopes [41].
The continuing retreat of glaciers in the Issyk-Kul catchment, the melt water from which is one of the major contributors to the lake, seems to be proceeding in parallel with the declining lake water level [42]. Without glacial runoff, the overall drop in the lake’s water level would have been much greater.
The Kyrgyz Republic lies within an area of high seismic activity. Issyk-Kul is a tectonic lake, and the lake bottom is believed to have numerous warm water springs. These partly explain why the lake never freezes over, except in the shallow Rybachinsky and Tyup bays. The water stays warmer than the air for seven months of the year [39].
Hot springs at the lake and on its bottom change the water quality and may facilitate winter spawning of some introduced species like Pike-perch. During test fishing in early April 2009 we opened a 40 cm Pike-perch and found that it had preyed on another (12 cm), and that even this small prey had eaten a few juvenile Pike-perch, not more than 5 cm long. It was estimated that these third-level victims of cannibalism must have hatched in early February. This kind of winter spawning has not been reported before, but it could evidently take place because of the hot springs.
There are recent reports on radioactive waste contamination in Central Asia [43] showing that the situation is critical, especially in Kyrgyzstan, with 36 tailing sites and 25 uranium dumps in the country. Kadzhi-Say, the country’s largest tailing site (containing 150,000 m3 of radioactive waste), is located barely 1.5 km from Lake Issyk-Kul. Although some information is available on the impact of radioactivity on humans, the direct impact of current radioactivity levels on aquatic biodiversity in Kyrgyzstan is not well studied or understood. Water bodies are not monitored consistently for radioactivity, and to date no assessment has been made of the uranium contamination of indigenous fish populations and its consequences for fish stocks in Kyrgyzstan. It is not clear whether, and how much, radioactive waste has already entered the lake or still enters it from incorrectly closed tailing sites and uranium dumps.
During the Soviet period, the USSR Navy operated an extensive facility at the lake’s eastern end, where submarine technology was evaluated. The Navy also tested torpedoes built in Tashkent. It is not known whether torpedoes exploded in these experiments; if so, this must have killed a lot of fish. In 2008 Kyrgyz newspapers reported that the Russian Navy was planning to establish a new naval testing facility around the Karabulan peninsula on the lake. This may affect the fish stocks in the future, depending on the tests undertaken.
During the Soviet era, the lake became a popular vacation resort, with numerous sanatoria, boarding houses and vacation homes along its northern shore. These fell on hard times after the break-up of the USSR, but from 2005 onwards hotel complexes have been refurbished and simple private bed-and-breakfast pensions established for a new generation of health and leisure visitors (Table 5).
Tourism has become one of the most dynamically developing sectors of the economy of the Kyrgyz Republic. The number of foreign tourist arrivals is expected to exceed 2 million persons per year. International tourists are primarily from Kazakhstan, Russia, and Uzbekistan. If half of these tourists visit the lake, an additional 1 million persons per year will need to be accommodated in hotels at the lake, using natural beaches for recreation. At present the lake has 343 tourist enterprises, including cafes and restaurants.
Years | Population in ‘000 | Visitors in hotels | Visitors at homes | Total in ‘000 |
2006 | 430 | 198 | 296 | 924 |
2007 | 433 | 199 | 245 | 877 |
2008 | 435 | 194 | 349 | 978 |
2009 | 438 | 169 | 318 | 925 |
2010 | 441 | 181 | 227 | 849 |
2011 | 445 | 185 | 231 | 861 |
Permanent population and annual visitors in the Lake Issyk-Kul area [44].
In addition, large hospitals have been built along the coasts to use medicinal mud and hot springs for treatment. Regulations exist for the water supply system and require fully purified sewerage systems. Recycled waste water could be used for irrigation. Unfortunately, some entrepreneurs disregard these regulations and continue to pollute the lake, as no corruption-free control exists.
An Asian Development Bank study [45] concluded that the available water, sanitation and waste disposal infrastructure in the Issyk-Kul area is decrepit, dysfunctional and poorly managed, and that it negatively impacts the surrounding environment. The planned tourist influx, equivalent to four times the resident population, places excessive pressure on the existing infrastructure, which results in pollution of the lake. The proposed Issyk-Kul Sustainable Development Project, initiated by the Asian Development Bank (ADB), would address the environmental and institutional issues around Lake Issyk-Kul. The Japanese International Cooperation Agency will also develop the sewerage system and sewage treatment plant in Cholpon-Ata through parallel financing with the ADB.
Issyk-Kul Lake is the second largest high-altitude lake in the world providing recreational and small-scale fishing activities as well as cage culture of Rainbow Trout in the Kyrgyz Republic. The original fish fauna comprised twelve indigenous species and two subspecies particular for this lake. At least 19 species have been introduced to the lake by humans, either on purpose or accidentally. The populations of several indigenous fish are seriously threatened, because many of the introduced fish species are potential predators. Issyk-Kul Marinka, Naked Osman and Sazan Carp are those commercial and indigenous species which are most seriously endangered. In 1986 a total ban was declared for catching Naked Osman, but it did not lead to positive results, indicating that anthropogenic activities were not the only reasons for the suffering of the endemic fish species.
Fishers and most of the previous papers are convinced that the predatory fish species have been the most destructive to biodiversity. With regard to introductions, the basic rule should be that new alien species introduced by humans are not allowed to enter the lake. At the very least, any further fish introductions into the lake should be carefully evaluated to prevent unwanted changes in fish stocks. Issyk-Kul, as an oligotrophic lake of low productivity, has a low carrying capacity for fish; hence it will never become a water body that sustains high levels of fish stocks.
The dissolution of the Soviet Union explains to some extent the collapse of the fisheries sector (including the hatchery operations) in Kyrgyzstan, but perhaps not the loss of biodiversity. The rapid growth in human activities that came with the development of the tourism industry, together with irrigation, water eutrophication and pollution, and climate change impacts, seem to be important root causes of the loss of fish stocks and the degradation of biodiversity.
Uptake of water for irrigation is one of the factors considered responsible for the present changes in the water level, as water from most of the rivers has been completely diverted for irrigation before reaching the lake. Irrigation also hinders the river reproduction of many species, as it prevents spawning fish from entering the rivers and fry from returning to the lake. During 1960-1979, when fish catches started to decrease, the total decline of the water level was 140 cm, at an average rate of seven cm annually. That fall in water level has been one important factor affecting the fish stocks and fisheries.
There are recent reports on radioactive waste contamination in Kyrgyzstan, where the country’s largest uranium tailing site is located barely 1.5 km from Lake Issyk-Kul. It is not clear whether, and how much, radioactive waste has already entered the lake or still enters it from incorrectly closed tailing sites and uranium dumps. Perhaps even more serious water pollution comes from illegal dumping or storing of toxic chemicals currently in use at a gold mine in the Tian Shan Mountains. It is the largest gold mine, as well as a major source of government revenue, and it routinely ignores national environmental legislation. This mine uses up to 10 tonnes of cyanide daily in its operations, and many toxic chemicals are released into the environment. This is surely affecting fish populations downstream and the health of local people using the contaminated water or fish for drink and food.
Existing water and sanitation and waste disposal infrastructure in the Issyk-Kul area is decrepit, dysfunctional, poorly managed and has negative impacts on the environment. The planned tourist influx equivalent to four times the resident population will apply excessive pressure on the existing infrastructure, which will result in further pollution of the lake.
An important problem is the total absence of any fisheries management and the lack of controlled protection of fish stocks and diversity. The fishing ban is not helpful, as it cannot be controlled and monitored. Exploring co-management arrangements is a better option than command and control, as the resources are not available for such a policy measure. Rare and endangered indigenous fish species will not increase without artificial propagation in local hatcheries. Stocking the lake with small indigenous species is not viable unless the large predators are harvested first. Otherwise Pike-perch would simply eat the small indigenous species, grow bigger and spawn more.
Over-fishing of introduced species, like Rainbow Trout and Pike-perch, could be a good thing. As popular food fish and recreational catch, they could be severely over-fished. This could reduce their populations, and several populations of endemic fish species should soon show signs of increasing numbers. The authorities should therefore allow the local fishing communities to capture as many large introduced fish as they can, rather than restricting them through the moratorium.
Any new development initiatives must be consultative and participatory in order to be more consistent with local habits and cultural values. Inherited customs provide an important element for the development of locally based resource management system. Consequently, allocation and sustainable management of natural resources is one of the key issues for the local population, whose daily cash economy is directly dependent on availability of fish resources.
Before being able to define the best management ways, one needs further research in taxonomy and fish biology. The knowledge of fish stock parameters is essential for the determination of appropriate fisheries management and definition of sustainable fish yield. Impact of mining toxins and radioactive waste is important to study and to know for control measures. Water pollution is a continuous risk for this important lake, and has to be halted in the future. The Kyrgyz public must be engaged in the future through environmental education in conservation and preservation of natural and cultural riches of the Issyk-Kul area.
My deepest gratitude goes to Mrs. Burul Nazarmatova, who translated all the grey Russian literature into English and collected a lot of relevant Soviet-era data from the internet. Mrs. Mairam Sarieva from the Fisheries Department of the Kyrgyz Republic assisted me in collecting the fish catch statistics of Issyk-Kul Lake. Some unpublished data on fish stocks were also provided by Dr Muchtar Alpiev from the Kyrgyz Academy of Science. Last but not least, I want to give my special thanks to Dr Ahmed Khan, Memorial University of Newfoundland, Canada, whose useful and critical suggestions improved this chapter a lot.
In recent years, researchers in the software engineering (SE) field have turned their interest to data mining (DM) and machine learning (ML)-based studies since collected SE data can be helpful in obtaining new and significant information. Software engineering presents many subjects for research, and data mining can give further insight to support decision-making related to these subjects.
Figure 1 shows the intersection of three main areas: data mining, software engineering, and statistics/math. A large amount of data is collected from organizations during software development and maintenance activities, such as requirement specifications, design diagrams, source codes, bug reports, program versions, and so on. Data mining enables the discovery of useful knowledge and hidden patterns from SE data. Math provides the elementary functions, and statistics determines probability, relationships, and correlation within collected data. Data science, in the center of the diagram, covers different disciplines such as DM, SE, and statistics.
The intersection of data mining and software engineering with other areas of the field.
This study presents a comprehensive literature review of existing research and offers an overview of how to approach SE problems using different mining techniques. Up to now, review studies have either introduced SE data descriptions [1], explained the tools and techniques most used by researchers for SE data analysis [2], discussed the role of software engineers [3], or focused on only one specific problem in SE such as defect prediction [4], design patterns [5], or effort estimation [6]. Some existing review articles with the same aim [7] are dated, and some are not comprehensive. In contrast to the previous studies, this article provides a systematic review of several SE tasks, gives a comprehensive list of available studies in the field, clearly states the advantages of mining SE data, and answers “how” and “why” questions in the research area.
The novelties and main contributions of this review paper are fivefold.
First, it provides a general overview of several SE tasks that have been the focus of studies using DM and ML, namely, defect prediction, effort estimation, vulnerability analysis, refactoring, and design pattern mining.
Second, it comprehensively discusses existing data mining solutions in software engineering according to various aspects, including methods (clustering, classification, association rule mining, etc.), algorithms (k-nearest neighbor (KNN), neural network (NN), etc.), and performance metrics (accuracy, mean absolute error, etc.).
Third, it points to several significant research questions that are unanswered in the recent literature as a whole or the answers to which have changed with the technological developments in the field.
Fourth, some statistics related to the studies between the years of 2010 and 2019 are given from different perspectives: according to their subjects and according to their methods.
Fifth, it focuses on different machine learning types: supervised and unsupervised learning, with particular attention to ensemble learning and deep learning.
This paper addresses the following research questions:
RQ1. What kinds of SE problems can ML and DM techniques help to solve?
RQ2. What are the advantages of using DM techniques in SE?
RQ3. Which DM methods and algorithms are commonly used to handle SE tasks?
RQ4. Which performance metrics are generally used to evaluate DM models constructed in SE studies?
RQ5. Which types of machine learning techniques (e.g., ensemble learning, deep learning) are generally preferred for SE problems?
RQ6. Which SE datasets are popular in DM studies?
The remainder of this paper is organized as follows. Section 2 explains the knowledge discovery process that aims to extract interesting, potentially useful, and nontrivial information from software engineering data. Section 3 provides an overview of current work on data mining for software engineering grouped under five tasks: defect prediction, effort estimation, vulnerability analysis, refactoring, and design pattern mining. In addition, some machine learning studies are divided into subgroups, including ensemble learning- and deep learning-based studies. Section 4 gives statistical information about the number of highly validated studies conducted in the last decade. Related works regarded as fundamental by highly reputable journals are listed, and the specific methods they used, together with their categories and purposes, are clearly described. In addition, widely used datasets related to SE are given. Finally, Section 5 offers concluding remarks and suggests future scientific and practical efforts that might improve the efficiency of SE actions.
This section explains the consecutive critical steps that should be followed to discover useful knowledge from software engineering data. It outlines the order of the necessary operations in this process and explains how the related data flow among them.
The software development life cycle (SDLC) describes a process for improving the quality of a product in project management. The main phases of the SDLC are planning, requirement analysis, designing, coding, testing, and maintenance of a project. In every phase of software development, some software problems (e.g., software bugs, security, or design problems) may occur. Correcting these problems in the early phases leads to more accurate and timely delivery of the project. Therefore, software engineers broadly apply data mining techniques to different SE tasks to solve SE problems and to enhance programming efficiency and quality.
Figure 2 presents the data mining and knowledge discovery process for SE tasks, which includes data collection, data preprocessing, data mining, and evaluation. In the data collection phase, data are obtained from software project sources such as bug reports, historical data, version control data, and mailing lists, which include various information about the project’s versions, status, or improvement. In the data preprocessing phase, the collected data are preprocessed using different methods such as feature selection (dimensionality reduction), feature extraction, missing data elimination, class imbalance analysis, normalization, discretization, and so on. In the next phase, DM techniques such as classification, clustering, and association rule mining are applied to discover useful patterns and relationships in software engineering data and thereby to address a software engineering problem such as defective or vulnerable systems, reused patterns, or parts of code changes. Mining and obtaining valuable knowledge from such data helps to prevent errors and allows software engineers to deliver the project on time. Finally, in the evaluation phase, validation techniques such as k-fold cross validation for classification are used to assess the data mining results. The commonly used evaluation measures are accuracy, precision, recall, F-score, and area under the curve (AUC) for classification, and the sum of squared errors (SSE) for clustering.
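The evaluation phase described above can be illustrated with a small dependency-free sketch. It is not taken from any of the reviewed studies: the function names, the contiguous (unshuffled) fold layout, and the toy labels are assumptions made only for illustration, with label 1 standing for a defective module.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs over contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    splits = []
    start = 0
    for fold in range(k):
        # Spread the remainder over the first folds so sizes differ by at most 1.
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        splits.append((train, test))
        start += size
    return splits


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F-score for binary labels (1 = positive)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label sequences must be non-empty and of equal length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Undefined ratios are reported as 0.0 (a convention assumed here).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Toy usage: two folds over five samples, and metrics for made-up predictions.
splits = k_fold_indices(n_samples=5, k=2)
metrics = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(splits)
print(metrics)
```

In practice a model would be trained on each training split and scored on the matching test split, with the metrics averaged over the k folds; shuffling or stratifying the folds is usually advisable when class labels are imbalanced.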
KDD process for software engineering.
In this review, we examine data mining studies in various SE tasks and evaluate commonly used algorithms and datasets.
A defect is an error, failure, flaw, or bug that causes incorrect or unexpected results in a system [8]. A software system is expected to be free of defects, since software quality reflects the defect-free proportion of the product [9]. However, software projects often do not have enough time or people to remove errors before a product is released. In such a situation, defect prediction methods can help to detect and remove defects in the initial stages of the SDLC and to improve the quality of the software product. In other words, the goal of defect prediction is to produce robust and effective software systems. Hence, software defect prediction (SDP) is an important topic for software engineering, because early prediction of software defects could help to reduce development costs and produce more stable software systems.
Various studies have been conducted on defect prediction using different metrics such as code complexity, history-based metrics, object-oriented metrics, and process metrics to construct prediction models [10, 11]. These models can be considered on a cross-project or within-project basis. In within-project defect prediction (WPDP), a model is constructed and applied on the same project [12]. The within-project strategy requires a large amount of historical defect data. Hence, in new projects that do not have enough data for training, the cross-project strategy may be preferred [13]. Cross-project defect prediction (CPDP) applies a prediction model from one project to another, meaning that models are built from the historical data of other projects [14, 15]. Studies in the field of CPDP have increased in recent years [10, 16]. However, prior studies are difficult to compare because they often cannot be replicated, owing to differences in the evaluation metrics used or in the way the training data were prepared. Therefore, Herbold et al. [16] tried to replicate previously proposed CPDP methods and find which approach performed best in terms of metrics such as F-score, area under the curve (AUC), and Matthews correlation coefficient (MCC). Results showed that 7- or 8-year-old approaches may perform better. Another study [17] replicated prior work to examine whether the choice of classification technique is important. Both noisy and cleaned datasets were used, and the same results were obtained from the two datasets. However, a new dataset gave better results for some classification algorithms. For this reason, the authors claimed that the selection of classification techniques affects the performance of the model.
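The difference between the two strategies lies mainly in how the training and test data are chosen. The following minimal sketch makes this explicit; the record layout `(project, metrics, defective)`, the function names, and the sample values are hypothetical, and the within-project split assumes that the modules are listed in chronological order.

```python
from typing import List, Sequence, Tuple

# (project name, metric values, label) with label 1 = defective, 0 = clean
Module = Tuple[str, List[float], int]


def within_project_split(
    modules: Sequence[Module], project: str, train_fraction: float = 0.7
) -> Tuple[List[Module], List[Module]]:
    """WPDP: train and test on modules of the same project (older part for training)."""
    own = [m for m in modules if m[0] == project]
    cut = int(len(own) * train_fraction)
    return own[:cut], own[cut:]


def cross_project_split(
    modules: Sequence[Module], target_project: str
) -> Tuple[List[Module], List[Module]]:
    """CPDP: train on all other projects and test on the target project."""
    train = [m for m in modules if m[0] != target_project]
    test = [m for m in modules if m[0] == target_project]
    return train, test


# Made-up records for illustration; the metric values are not real measurements.
modules = [
    ("ant", [120.0, 4.0], 0),
    ("ant", [560.0, 19.0], 1),
    ("ant", [75.0, 2.0], 0),
    ("ant", [310.0, 11.0], 1),
    ("camel", [90.0, 3.0], 0),
    ("camel", [430.0, 15.0], 1),
    ("camel", [60.0, 1.0], 0),
]

wp_train, wp_test = within_project_split(modules, "ant", train_fraction=0.5)
cp_train, cp_test = cross_project_split(modules, "ant")
print(len(wp_train), len(wp_test), len(cp_train), len(cp_test))
```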
Numerous defect prediction studies have been conducted using DM techniques. In the following subsections, we will explain these studies in terms of whether they apply ensemble learning or not. Some defect prediction studies in SE are compared in Table 1. The objective of the studies, the year they were conducted, algorithms, ensemble learning techniques and datasets in the studies, and the type of data mining tasks are shown in this table. The bold entries in Table 1 have better performance than other algorithms in that study.
Ref. | Year | Task | Objective | Algorithms | Ensemble learning | Dataset | Evaluation metrics and results |
---|---|---|---|---|---|---|---|
[18] | 2011 | Classification | Comparative study of various ensemble methods to find the most effective one | NB | Bagging, boosting, RT, RF, RS, AdaBoost, Stacking, and Voting | NASA datasets: CM1 JM1 KC1 KC2 KC3 KC4 MC1 MC2 MW1 PC1 PC2 PC3 PC4 PC5 | 10-fold CV, ACC, and AUC Vote 88.48% random forest 87.90% |
[19] | 2013 | Classification | Comparative study of class imbalance learning methods and proposed dynamic version of AdaBoost.NC | NB, RUS, RUS-bal, THM, SMB, BNC | RF, SMB, BNC, AdaBoost.NC | NASA and PROMISE repository: MC2, KC2, JM1, KC1, PC4, PC3, CM1, KC3, MW1, PC1 | 10-fold CV Balance, G-mean and AUC, PD, PF |
[20] | 2014 | Classification | Comparative study to deal with imbalanced data | Base Classifiers: C4.5, NB Sampling: ROS, RUS, SMOTE | AdaBoost, Bagging, boosting, RF | NASA datasets: CM1, JM1, KC1, KC2, KC3, MC1, MC2, MW1, PC1, PC2, PC3, PC4, PC5 | 5 × 5 CV, MCC, ROC, results change according to characteristics of datasets |
[17] | 2015 | Clustering/classification | To show that the selection of classification technique has an impact on the performance of software defect prediction models | Statistical: NB, Simple Logistic Clustering: KM, EM Rule based: Ripper, Ridor NNs: RBF Nearest neighbor: KNN DTs: J48, LMT | Bagging, AdaBoost, rotation forest, random subspace | NASA: CM1, JM1, KC1, KC3, KC4, MW1, PC1, PC2, PC3, PC4 PROMISE: Ant 1.7, Camel 1.6, Ivy 1.4, Jedit 4, Log4j 1, Lucene 2.4, Poi 3, Tomcat 6, Xalan 2.6, Xerces 1.3 | 10 × 10-fold CV AUC > 0.5 Scott-Knott test α = 0.05, simple logistic, LMT, and RF + base learner outperforms KNN and RBF |
[21] | 2015 | Classification | Average probability ensemble (APE) learning module is proposed by combining feature selection and ensemble learning | APE system combines seven classifiers: SGD, weighted SVMs (W-SVMs), LR, MNB and Bernoulli naive Bayes (BNB) | RF, GB | NASA: CM1, JM1, KC1, KC3, KC4, MW1, PC1, PC2, PC3, PC4 PROMISE (RQ2): Ant 1.7, Camel 1.6, Ivy 1.4, Jedit 4, Log4j 1, Lucene 2.4, Poi 3, Tomcat 6, Xalan 2.6, Xerces 1.3 | 10 × 10-fold CV, AUC > 0.5 Scott-Knott test α = 0.05, simple logistic, LMT, and RF + base learner outperforms KNN and RBF |
[22, 23] | 2016 | Classification | Comparative study of 18 ML techniques using OO metrics on six releases of Android operating system | LR, NB, BN, MLP, RBF SVM, VP, CART, J48, ADT, Nnge, DTNB | Bagging, random forest, Logistic model trees, Logit Boost, Ada Boost | 6 releases of Android app: Android 2.3.2, Android 2.3.7, Android 4.0.4, Android 4.1.2, Android 4.2.2, Android 4.3.1 | 10-fold, inter-release validation AUC for NB, LB, MLP is >0.7 |
[24] | 2016 | Classification | Caret has been applied whether parameter settings can have a large impact on the performance of defect prediction models | NB, KNN, LR, partial least squares, NN, LDA, rule based, DT, SVM | Bagging, boosting | Cleaned NASA JM1, PC5 Proprietary from Prop-1 to Prop-5 Apache Camel 1.2, Xalan 2.5–2.6 Eclipse Platform 2.0–2.1–3.0, Debug 3.4, SWT 3.4, JDT, Mylyn, PDE | Out-of-sample bootstrap validation technique, AUC Caret AUC performance up to 40 percentage points |
[25] | 2017 | Regression | Aim is to validate the source code metrics and identify a suitable set of source code metrics | 5 training algorithms: GD, GDM, GDX, NM, LM | Heterogeneous linear and nonlinear ensemble methods | 56 open-source Java projects from PROMISE Repository | 10-fold CV, t-test, ULR analysis Neural network with Levenberg Marquardt (LM) is the best |
[16] | 2017 | Classification | Replicate 24 CPDP approaches, and compare on 5 different datasets | DT, LR, NB, SVM | LE, RF, BAG-DT, BAG-NB, BOOST-DT, BOOST-NB | 5 available datasets: JURECZKO, NASA MDP, AEEEM, NETGENE, RELINK | Recall, PR, ACC, G-measure, F-score, MCC, AUC |
[26] | 2017 | Classification | Just-in-time defect prediction (TLEL) | NB, SVM, DT, LDA, NN | Bagging, stacking | Bugzilla, Columba, JDT, Platform, Mozilla, and PostgreSQL | 10-fold CV, F-score |
[13] | 2017 | Classification | Adaptive Selection of Classifiers in bug prediction (ASCI) method is proposed. | Base classifiers: LOG (binary logistic regression), NB, RBF, MLP, DT | Voting | Ginger Bread (2.3.2 and 2.3.7), Ice Cream Sandwich (4.0.2 and 4.0.4), and JellyBean (4.1.2, 4.2.2 and 4.3.1) | 10-fold, inter-release validation AUC for NB, LB, MLP is >0.7 |
[27] | 2018 | Classification | MULTI method for JIT-SDP (just-in-time software defect prediction) | EALR, SL, RBFNet Unsupervised: LT, AGE | Bagging, AdaBoost, Rotation Forest, RS | Bugzilla, Columba, Eclipse JDT, Eclipse Platform, Mozilla, PostgreSQL | CV, timewise-CV, ACC, and POPT MULTI performs significantly better than all the baselines |
[28] | 2007 | Classification | To find pre- and post-release defects for every package and file | LR | — | Eclipse 2.0, 2.1, 3.0 | PR, recall, ACC |
[8] | 2014 | Clustering | Cluster ensemble with PSO for clustering the software modules (fault-prone or not fault-prone) | PSO clustering algorithm | KM-E, KM-M, PSO-E, PSO-M and EM | Nasa MDP, PROMISE | |
[29] | 2015 | Classification | Defect identification by applying DM algorithms | NB, J48, MLP | — | PROMISE, NASA MDP dataset: CM1, JM1, KC1, KC3, MC1, MC2, MW1, PC1, PC2, PC3 | 10-fold CV, ACC, PR, F-score MLP is the best |
[30] | 2015 | Classification | To show the attributes that predict the defective state of software modules | NB, NN, association rules, DT | Weighted voting rule of the four algorithms | NASA datasets: CM1, JM1, KC1, KC2, PC1 | PR, recall, ACC, F-score NB > NN > DT |
[31] | 2016 | Classification | Authors proposed a model that finds fault-proneness | NB, LR, LibSVM, MLP, SGD, SMO, VP, LR Logit Boost, Decision Stump, RT, REP Tree | RF | Camel1.6, Tomcat 6.0, Ant 1.7, jEdit4.3, Ivy 2.0, arc, e-learning, berek, forrest 0.8, zuzel, Intercafe, and Nieruchomosci | 10-fold CV, AUC AUC = 0.661 |
[32] | 2016 | Classification | GA to select suitable source code metrics | LR, ELM, SVML, SVMR, SVMP | — | 30 open-source software projects from PROMISE repository from DS1 to DS30 | 5-fold CV, F-score, ACC, pairwise t-test |
[33] | 2016 | — | Weighted least-squares twin support vector machine (WLSTSVM) to find misclassification cost of DP | SVM, NB, RF, LR, KNN, BN, cost-sensitive neural network | — | PROMISE repository: CM1, KC1, PC1, PC3, PC4, MC2, KC2, KC3 | 10-fold CV, PR, recall, F-score, G-mean Wilcoxon signed rank test |
[34] | 2016 | — | A multi-objective naive Bayes learning techniques MONB, MOBNN | NB, LR, DT, MODT, MOLR, MONB | — | Jureczko datasets obtained from PROMISE repository | AUC, Wilcoxon rank test CP MO NB (0.72) produces the highest value |
[35] | 2016 | Classification | A software defect prediction model to find faulty components of a software | Hybrid filter approaches FISHER, MR, ANNIGMA. | — | KC1, KC2, JM1, PC1, PC2, PC3, and PC4 datasets | ACC, ent filters, ACC 90% |
[36] | 2017 | Classification | Propose a hybrid method called TSC-RUS + S | A random undersampling based on two-step cluster (TSC) | Stacking: DT, LR, kNN, NB | NASA MDP: i.e., CM1, KC1, KC3, MC2, MW1, PC1, PC2, PC3, PC4 | 10-fold CV, AUC, (TSC-RUS + S) is the best |
[37] | 2017 | Classification | Analyze five popular ML algorithms for software defect prediction | ANN, PSO, DT, NB, LC | — | Nasa and PROMISE datasets: CM1, JM1, KC1, KC2, PC1, KC1-LC | 10-fold CV ANN < DT |
[38] | 2018 | Classification | Three well-known ML techniques are compared. | NB, DT, ANN | — | Three different datasets DS1, DS2, DS3 | ACC, PR, recall, F, ROC ACC 97% DT > ANN > NB |
[10] | 2018 | Classification | ML algorithms are compared with CODEP | LR, BN, RBF, MLP, alternating decision tree (ADTree), and DT | Max, CODEP, Bagging J48, Bagging NB, Boosting J48, Boosting NB, RF | PROMISE: Ant, Camel, ivy, Jedit, Log4j, Lucene, Poi, Prop, Tomcat, Xalan | F-score, PR, AUC ROC Max performs better than CODEP |
Data mining and machine learning studies on the subject “defect prediction.”
Ensemble learning combines several base learning models to obtain better performance than individual models. These base learners can be acquired with:
Different learning algorithms
Different parameters of the same algorithm
Different training sets
The commonly used ensemble techniques bagging, boosting, and stacking are shown in Figure 3 and briefly explained in this part. Bagging (which stands for bootstrap aggregating) is a kind of parallel ensemble. In this method, each model is built independently on its own training dataset, and the multiple training datasets are generated from the original dataset by randomly sampling instances with replacement (bootstrap sampling); thus, it aims to decrease variance. It combines the outputs of the ensemble members by a voting mechanism. Boosting can be described as a sequential ensemble. First, the same weights are assigned to all data instances; after each training round, the weights of the misclassified instances are increased, and this process is repeated as many times as the ensemble size. Finally, it uses a weighted voting scheme, and in this way, it aims to decrease bias. Stacking is a technique that combines the predictions of multiple models via a meta-classifier.
Common ensemble learning methods: (a) Bagging, (b) boosting, (c) stacking.
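To make the two mechanisms concrete, the sketch below shows bagging with majority voting and a single AdaBoost-style reweighting round. It is only an illustration under simplifying assumptions: each module is described by one metric value, the base learner is a one-threshold decision stump, and all names (`train_stump`, `bagging`, `vote`, `boosting_reweight`) and data values are hypothetical.

```python
import math
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]  # (single metric value, label: 1 = defective, 0 = clean)


def train_stump(data: Sequence[Sample]) -> Callable[[float], int]:
    """Choose the threshold on the single feature that misclassifies the fewest samples."""
    best_threshold, best_errors = 0.0, len(data) + 1
    for threshold, _ in data:
        errors = sum(1 for x, y in data if (1 if x >= threshold else 0) != y)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return lambda x, t=best_threshold: 1 if x >= t else 0


def bagging(data: Sequence[Sample], n_models: int = 11, seed: int = 0) -> List[Callable[[float], int]]:
    """Train each stump independently on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(data) for _ in data]
        models.append(train_stump(bootstrap))
    return models


def vote(models: Sequence[Callable[[float], int]], x: float) -> int:
    """Combine the ensemble members by majority voting."""
    return Counter(model(x) for model in models).most_common(1)[0][0]


def boosting_reweight(
    weights: Sequence[float], y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[List[float], float]:
    """One AdaBoost-style round: raise the weights of misclassified instances."""
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    error = min(max(error, 1e-10), 1 - 1e-10)  # avoid division by zero and log(0)
    alpha = 0.5 * math.log((1 - error) / error)  # weight of this learner in the final vote
    updated = [w * math.exp(alpha if t != p else -alpha) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return [w / total for w in updated], alpha


# Made-up data: small metric values are clean modules, large values are defective.
data = [(1.0, 0), (2.0, 0), (3.0, 0), (8.0, 1), (9.0, 1), (10.0, 1)]
models = bagging(data)
print(vote(models, 12.0), vote(models, 0.5))

new_weights, alpha = boosting_reweight([0.25] * 4, [1, 0, 1, 0], [1, 0, 0, 0])
print(new_weights, alpha)
```

In the reweighting example, the single misclassified instance ends up carrying half of the total weight, so the next learner in the sequence is pushed to classify it correctly.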
Some software defect prediction studies have compared ensemble techniques to determine the best performing one [10, 18, 21, 39, 40]. In a study conducted by Wang et al. [18], different ensemble techniques such as bagging, boosting, random tree, random forest, random subspace, stacking, and voting were compared to each other and a single classifier (NB). According to the results, voting and random forest clearly exhibited better performance than others. In a different study [39], ensemble methods were compared with more than one base learner (NB, BN, SMO, PART, J48, RF, random tree, IB1, VFI, DT, NB tree). For boosted SMO, bagging J48, and boosting and bagging RT, performance of base classifiers was lower than that of ensemble learner classifiers.
In study [21], a new method was proposed that combines feature selection and ensemble learning for defect classification. Results showed that random forests and the proposed algorithm are not affected by poor features, and the proposed algorithm outperforms existing single and ensemble classifiers in terms of classification performance. Another comparative study [10] used seven composite algorithms (Ave, Max, Bagging C4.5, Bagging naive Bayes (NB), Boosting J48, Boosting naive Bayes, and RF) and one state-of-the-art composite approach (CODEP) for cross-project defect prediction. The Max algorithm yielded the best classification performance in terms of F-score.
Bowes et al. [40] compared RF, NB, Rpart, and SVM algorithms to determine whether these classifiers obtained the same results. The results demonstrated that a unique subset of defects can be discovered by specific classifiers. However, whereas some classifiers are steady in the predictions they make, other classifiers change in their predictions. As a result, ensembles with decision-making without majority voting can perform best.
One of the main problems of SDP is the imbalance between the defect and non-defect classes of the dataset. Generally, the number of defective instances is much smaller than the number of non-defective instances in the collected data. This situation causes machine learning algorithms to perform poorly on the defective (minority) class. Wang and Yao [19] compared five class-imbalanced learning methods (RUS, RUS-bal, THM, BNC, SMB) and NB and RF algorithms and proposed the dynamic version of AdaBoost.NC. They utilized balance, G-mean, and AUC measures for comparison. Results showed that AdaBoost.NC and naive Bayes are better than the other seven algorithms in terms of evaluation measures. Dynamic AdaBoost.NC showed better defect detection rate and overall performance than the original AdaBoost.NC. To handle the class imbalance problem, another study [20] compared different methods (sampling, cost sensitive, hybrid, and ensemble) by taking into account evaluation metrics such as MCC and receiver operating characteristic (ROC).
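The measures and the simplest sampling method mentioned here can be sketched as follows. The sketch assumes the commonly used definitions PD = TP/(TP+FN), PF = FP/(FP+TN), balance = 1 − sqrt((0 − PF)² + (1 − PD)²)/sqrt(2), and G-mean = sqrt(PD · (1 − PF)); the function names are hypothetical, and `random_undersample` is a plain version of random undersampling (RUS) that discards instances of the larger class until both classes have the same size.

```python
import math
import random
from typing import List, Sequence, Tuple


def pd_pf(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Probability of detection (recall) and probability of false alarm."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pd = tp / (tp + fn) if tp + fn else 0.0
    pf = fp / (fp + tn) if fp + tn else 0.0
    return pd, pf


def balance(pd: float, pf: float) -> float:
    """Normalized distance from the ideal point (PF = 0, PD = 1); 1.0 is best."""
    return 1 - math.sqrt((0 - pf) ** 2 + (1 - pd) ** 2) / math.sqrt(2)


def g_mean(pd: float, pf: float) -> float:
    """Geometric mean of the accuracy on the two classes."""
    return math.sqrt(pd * (1 - pf))


def random_undersample(
    samples: Sequence[object], labels: Sequence[int], seed: int = 0
) -> Tuple[List[object], List[int]]:
    """Randomly drop instances of the larger class until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Illustrative labels only: 2 defective and 8 clean modules.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
pd, pf = pd_pf(y_true, y_pred)
print(pd, pf, balance(pd, pf), g_mean(pd, pf))

balanced_x, balanced_y = random_undersample(list(range(10)), y_true)
print(balanced_y)
```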
As shown in Table 1, the most common datasets used in the defect prediction studies [17, 18, 19, 39] are the NASA MDP dataset and PROMISE repository datasets. In addition, some studies utilized open-source projects such as Bugzilla, Columba, and Eclipse JDT [26, 27], and other studies used Android application data [22, 23].
Although use of ensemble learning techniques has dramatically increased recently, studies that do not use ensemble learning are still conducted and successful. For example, in study [32], prediction models were created using source code metrics as in ensemble studies but by using different feature selection techniques such as genetic algorithm (GA).
To overcome the class imbalance problem, Tomar and Agarwal [33] proposed a prediction system that assigns lower cost to non-defective data samples and higher cost to defective samples to balance data distribution. When a project lacks sufficient data of its own, the required data can be obtained from other projects; however, this may cause class imbalance. To solve this problem, Ryu and Baik [34] proposed multi-objective naive Bayes learning for cross-project environments. To obtain significant software metrics on cloud computing environments, Ali et al. used a combination of filter and wrapper approaches [35]. Other studies compared different machine learning algorithms such as NB, DT, and MLP [29, 37, 38, 41].
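The idea of unequal misclassification costs can be illustrated independently of any particular learner. The sketch below is not the WLSTSVM method of [33]; it only assumes that missing a defective module (false negative) costs more than a false alarm (false positive) and picks the decision threshold with the lowest total cost. The cost values, scores, and function names are hypothetical.

```python
from typing import Sequence


def misclassification_cost(
    y_true: Sequence[int], y_pred: Sequence[int], cost_fn: float = 5.0, cost_fp: float = 1.0
) -> float:
    """Total cost when false negatives and false positives are priced differently."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return cost_fn * fn + cost_fp * fp


def cheapest_threshold(
    scores: Sequence[float],
    y_true: Sequence[int],
    thresholds: Sequence[float],
    cost_fn: float = 5.0,
    cost_fp: float = 1.0,
) -> float:
    """Return the threshold on the predicted defect score that minimizes total cost."""
    def cost_at(threshold: float) -> float:
        y_pred = [1 if s >= threshold else 0 for s in scores]
        return misclassification_cost(y_true, y_pred, cost_fn, cost_fp)

    return min(thresholds, key=cost_at)


# Made-up defect scores for four modules (label 1 = defective).
scores = [0.9, 0.4, 0.6, 0.1]
y_true = [1, 1, 0, 0]
print(cheapest_threshold(scores, y_true, [0.3, 0.5, 0.7]))
```

With a high false-negative cost the lowest threshold wins, because flagging an extra clean module is cheaper than missing a defective one; reversing the costs shifts the choice to the highest threshold.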
Software effort estimation (SEE) is critical for a company because hiring more employees than required will cause loss of revenue, while hiring fewer employees than necessary will result in delays in software project delivery. The estimation analysis helps to predict the amount of effort (in person hours) needed to develop a software product. Basic steps of software estimation can be itemized as follows:
Determine project objectives and requirements.
Design the activities.
Estimate product size and complexity.
Compare and repeat estimates.
SEE covers requirements and testing in addition to predicting effort [42]. Many research and review studies have been conducted in the field of SEE. Recently, a survey [43] analyzed effort estimation studies that concentrated on ML techniques and compared them with studies focused on non-ML techniques. According to the survey, case-based reasoning (CBR) and artificial neural network (ANN) were the most widely used techniques. In 2014, Dave and Dutta [44] examined existing studies that focus only on neural networks.
The current effort estimation studies using DM and ML techniques are available in Table 2. This table summarizes the prominent studies in terms of aspects such as year, data mining task, aim, datasets, and metrics. Table 2 indicates that neural network is the most widely used technique for the effort estimation task.
Ref. | Year | Task | Objective | Algorithms | Ensemble learning | Dataset | Evaluation metrics and results |
---|---|---|---|---|---|---|---|
[45] | 2008 | Regression | Ensemble of neural networks with associative memory (ENNA) | NN, MLP, KNN | Bagging | NASA, NASA 93, USC, SDR, Desharnais | MMRE, MdMRE and PRED(L) For ENNA PRED(25) = 36.4 For neural network PRED(25) = 8 |
[46] | 2009 | Regression | Authors proposed the ensemble of neural networks with associative memory (ENNA) | NN, MLP, KNN | Bagging | NASA, NASA 93, USC, SDR, Desharnais | Random subsampling, t-test MMRE, MdMRE, and PRED(L) ENNA is the best |
[47] | 2010 | Regression | To show the effectiveness of SVR for SEE | SVR, RBF | — | Tukutuku | LOOCV, MMRE, Pred(25), MEMRE, MdEMRE SVR outperforms others |
[48] | 2011 | Regression | To evaluate whether readily available ensemble methods enhance SEE | MLP, RBF, RT | Bagging | 5 datasets from PROMISE: cocomo81, nasa93, nasa, sdr, and Desharnais 8 datasets from ISBSG repository | MMRE, MdMRE, PRED(25) RTs and Bagging with MLPs perform similarly |
[49] | 2012 | Regression | To show how the measures behave in SEE and to create good ensembles | MLP, RBF, REPTree | Bagging | cocomo81, nasa93, nasa, cocomo2, desharnais, ISBSG repository | MMRE, PRED(25), LSD, MdMRE, MAE, MdAE Pareto ensemble for all measures, except LSD. |
[50] | 2012 | Regression | To use cross-company models to create diverse ensembles able to dynamically adapt to changes | WC RTs, CC-DWM | WC-DWM | 3 datasets from ISBSG repository (ISBSG2000, ISBSG2001, ISBSG) 2 datasets from PROMISE (CocNasaCoc81 and CocNasaCoc81Nasa93) | MAE, Friedman test Only DCL could improve upon RT CC data potentially beneficial for improving SEE |
[51] | 2012 | Regression | To generate estimates from ensembles of multiple prediction methods | CART, NN, LR, PCR, PLSR, SWR, ABE0-1NN, ABE0-5NN | Combining top M solo methods | PROMISE | MAR, MMRE, MdMRE, MMER, MBRE, MIBRE. Combinations perform better than 83% |
[52] | 2012 | Classification/regression | DM techniques to estimate software effort. | M5, CART, LR, MARS, MLPNN, RBFNN, SVM | — | Coc81, CSC, Desharnais, Cocnasa, Maxwell, USP05 | MdMRE, Pred(25), Friedman test Log + OLS > LMS, BC + OLS, MARS, LS-SVM |
[53] | 2013 | Clustering/classification | Estimation of software development effort | NN, ABE, C-means | — | Maxwell | 3-fold CV and LOOCV, RE, MRE, MMRE, PRED |
[54] | 2014 | Regression | ANNs are examined using COCOMO model | MLP, RBFNN, SVM, PSO-SVM Extreme learning Machines | — | COCOMO II Data | MMRE, PRED PSO-SVM is the best |
[55] | 2014 | — | A hybrid model based on GA and ACO for optimization | GA, ACO | — | NASA datasets | MMRE, the proposed method is the best |
[56] | 2015 | Regression | To display the effect of data preprocessing techniques on ML methods in SEE | CBR, ANN, CART Preprocessing tech: MDT, LD, MI, FS, CS, FSS, BSS | — | ISBSG, Desharnais, Kitchenham, USPFT | CV, MBRE, PRED (0.25), MdBRE |
[57] | 2016 | Regression | Four neural network models are compared with each other. | MLP, RBFNN, GRNN, CCNN | — | ISBSG repository | 10-fold CV, MAR The CCNN outperforms the other three models |
[58] | 2016 | Regression | To propose a model based on Bayesian network | GA and PSO | — | COCOMO NASA Dataset | DIR, DRM The proposed model is best |
[59] | 2016 | Classification/regression | A hybrid model using SVM and RBNN compared against previous models | SVM, RBNN | — | Dataset1 = 45 industrial projects Dataset2 = 65 educational projects | LOOCV, MAE, MBRE, MIBRE, SA The proposed approach is the best |
[60] | 2017 | Classification | To estimate software effort by using ML techniques | SVM, KNN | Boosting: kNN and SVM | Desharnais, Maxwell | LOOCV, k-fold CV ACC = 91.35% for Desharnais ACC = 85.48% for Maxwell |
Data mining and machine learning studies on the subject “effort estimation.”
Several studies have compared ensemble learning methods with single learning algorithms [45, 46, 48, 49, 51, 60] and examined them on cross-company (CC) and within-company (WC) datasets [50]. The authors observed that ensemble methods obtained by a proper combination of estimation methods achieved better results than single methods. Various ML techniques such as neural network, support vector machine (SVM), and k-nearest neighbor are commonly used as base classifiers for ensemble methods such as bagging and boosting in software effort estimation. Moreover, their results indicate that CC data can increase performance over WC data for estimation techniques [50].
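One simple way to combine estimation methods, in the spirit of the "top M solo methods" entry in Table 2, is to rank the single methods by their error on held-out projects and average the estimates of the best M. The sketch below is only an assumption about how such a combination could look; the function name, method labels, error values, and effort values are made up for illustration.

```python
from typing import Dict


def combine_top_m(validation_errors: Dict[str, float], estimates: Dict[str, float], m: int) -> float:
    """Average the effort estimates of the m methods with the lowest validation error."""
    ranked = sorted(validation_errors, key=validation_errors.get)[:m]
    return sum(estimates[name] for name in ranked) / len(ranked)


# Hypothetical validation errors and effort estimates (person-hours) for one new project.
validation_errors = {"CART": 0.30, "NN": 0.45, "LR": 0.25, "SWR": 0.60}
estimates = {"CART": 1200.0, "NN": 900.0, "LR": 1000.0, "SWR": 2000.0}
print(combine_top_m(validation_errors, estimates, m=2))
```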
In addition to the abovementioned studies, researchers have conducted studies without using ensemble techniques. The general approach is to investigate which DM technique has the best effect on performance in software effort estimation. For instance, Subitsha and Rajan [54] compared five different algorithms—MLP, RBFNN, SVM, ELM, and PSO-SVM—and Nassif et al. [57] investigated four neural network algorithms—MLP, RBFNN, GRNN, and CCNN. Although neural networks are widely used in this field, missing values and outliers frequently encountered in the training set adversely affect neural network results and cause inaccurate estimations. To overcome this problem, Khatibi et al. [53] split software projects into several groups based on their similarities. In their studies, the C-means clustering algorithm was used to determine the most similar projects and to decrease the impact of unrelated projects, and then analogy-based estimation (ABE) and NN were applied. Another clustering study by Azzeh and Nassif [59] combined SVM and bisecting k-medoids clustering algorithms; an estimation model was then built using RBFNN. The proposed method was trained on historical use case points (UCP).
Zare et al. [58] and Maleki et al. [55] utilized optimization methods for accurate cost estimation. In the former study, a model was proposed based on Bayesian network with genetic algorithm and particle swarm optimization (PSO). The latter study used GA to optimize the weights of the effective factors, and the model was then trained by ant colony optimization (ACO). Besides conventional effort estimation studies, researchers have utilized machine learning techniques for web applications. Since web-based software projects are different from traditional projects, the effort estimation process for these projects is more complex.
It is observed that PRED(25) and MMRE are the most popular evaluation metrics in effort estimation. MMRE stands for the mean magnitude of relative error, and PRED(25) measures prediction accuracy as the percentage of predictions that fall within 25% of the actual values.
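Both metrics are built on the magnitude of relative error of a single project, MRE = |actual − predicted| / actual. A minimal sketch of their computation is given below; the function names and the effort values are hypothetical, and actual effort is assumed to be strictly positive.

```python
from statistics import mean, median
from typing import List, Sequence


def mre_values(actuals: Sequence[float], predictions: Sequence[float]) -> List[float]:
    """Magnitude of relative error for each project (actual effort must be > 0)."""
    return [abs(a - p) / a for a, p in zip(actuals, predictions)]


def mmre(actuals: Sequence[float], predictions: Sequence[float]) -> float:
    """Mean magnitude of relative error; lower is better."""
    return mean(mre_values(actuals, predictions))


def mdmre(actuals: Sequence[float], predictions: Sequence[float]) -> float:
    """Median magnitude of relative error; less sensitive to outliers than MMRE."""
    return median(mre_values(actuals, predictions))


def pred(actuals: Sequence[float], predictions: Sequence[float], level: float = 25.0) -> float:
    """Percentage of predictions whose MRE is at most level percent; higher is better."""
    errors = mre_values(actuals, predictions)
    return 100.0 * sum(1 for e in errors if e <= level / 100.0) / len(errors)


# Made-up actual and predicted efforts (person-hours) for four projects.
actuals = [100.0, 200.0, 400.0, 50.0]
predictions = [110.0, 150.0, 400.0, 80.0]
print(mmre(actuals, predictions), mdmre(actuals, predictions), pred(actuals, predictions))
```

For these four projects the individual MRE values are 0.10, 0.25, 0.00, and 0.60, so three of the four predictions count toward PRED(25).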
Vulnerability analysis is becoming the focal point of system security to prevent weaknesses in the software system that can be exploited by an attacker. Description of software vulnerability is given in many different resources in different ways [61]. The most popular and widely utilized definition appears in the Common Vulnerabilities and Exposures (CVE) 2017 report as follows:
Vulnerability is a weakness in the computational logic found in software and some hardware components that, when exploited, results in a negative impact to confidentiality, integrity or availability.
Vulnerability analysis may require many different operations to identify defects and vulnerabilities in a software system. Vulnerabilities, which are a special kind of defect, are more critical than other defects because attackers exploit system vulnerabilities to perform unauthorized actions. A defect is a normal problem that can be encountered frequently in the system, easily found by users or developers and fixed promptly, whereas vulnerabilities are subtle mistakes in large codes [62, 63]. Wijayasekara et al. claim that some bugs have been identified as vulnerabilities after being publicly announced in bug databases [64]. These bugs are called “hidden impact vulnerabilities” or “hidden impact bugs.” Therefore, the authors proposed a hidden impact vulnerability identification methodology that utilizes text mining techniques to determine which bugs in bug databases are vulnerabilities. In the proposed method, a bug report is taken as input, and a feature vector is produced by applying text mining. Then, a classifier is applied to determine whether the report describes an ordinary bug or a vulnerability. The results given in [64] demonstrate that a large proportion of discovered vulnerabilities were first described as hidden impact bugs in public bug databases. While bug reports were taken as input in that study, in many other studies, source code is taken as input. Text mining is a highly preferred technique for obtaining features directly from source code, as in the studies [65, 66, 67, 68, 69]. Several studies [63, 70] have compared text mining-based models and software metrics-based models.
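The step from a bug report to a feature vector can be sketched with a plain bag-of-words representation. This is a generic illustration rather than the exact procedure of [64]; the tokenizer, the function names, and the sample reports are assumptions made for the example.

```python
import re
from collections import Counter
from typing import List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(reports: Sequence[str]) -> List[str]:
    """Sorted list of all distinct tokens seen in the training reports."""
    return sorted({token for report in reports for token in tokenize(report)})


def to_feature_vector(report: str, vocabulary: Sequence[str]) -> List[int]:
    """Term-frequency vector of a report over a fixed vocabulary."""
    counts = Counter(tokenize(report))
    return [counts[term] for term in vocabulary]


# Made-up bug report summaries.
reports = ["Buffer overflow in parser", "Crash when parser reads empty file"]
vocabulary = build_vocabulary(reports)
print(vocabulary)
print(to_feature_vector("Parser overflow: parser rejects input", vocabulary))
```

Tokens that do not occur in the vocabulary (here "rejects" and "input") are ignored; the resulting vectors could then be passed to any classifier that separates ordinary bugs from vulnerabilities.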
In the security area of software systems, several studies have been conducted related to DM and ML. Some of these studies are compared in Table 3, which shows the data mining task and explanation of the studies, the year they were performed, the algorithms that were used, the type of vulnerability analysis, evaluation metrics, and results. In this table, the best performing algorithms according to the evaluation criteria are shown in bold.
Ref. | Year | Task | Objective | Algorithms | Type | Dataset description | Evaluation metrics and results |
---|---|---|---|---|---|---|---|
[71] | 2011 | Clustering | Obtaining software vulnerabilities based on RDBC | RDBC | Static | Database is built by RD-Entropy | FNR, FPR |
[42] | 2011 | Classification/regression | To predict the time to next vulnerability | LR, LMS, MLP, RBF, SMO | Static | NVD, CPE, CVSS | CC, RMSE, RRSE |
[65] | 2012 | Text mining | Analysis of source code as text | RBF, SVM | Static | K9 email client for the Android platform | ACC, PR, recall ACC = 0.87, PR = 0.85, recall = 0.88 |
[64] | 2012 | Classification/text mining | To identify vulnerabilities in bug databases | — | Static | Linux kernel MITRE CVE and MySQL bug databases | BDR, TPR, FPR 32% (Linux) and 62% (MySQL) of vulnerabilities |
[72] | 2014 | Classification/regression | Combine taint analysis and data mining to obtain vulnerabilities | ID3, C4.5/J48, RF, RT, KNN, NB, Bayes Net, MLP, SVM, LR | Hybrid | A version of WAP to collect the data | 10-fold CV, TPD, ACC, PR, KAPPA ACC = 90.8%, PR = 92%, KAPPA = 81% |
[73] | 2014 | Clustering | Identify vulnerabilities from source codes using CPG | — | Static | Neo4J and InfiniteGraph databases | — |
[63] | 2014 | Classification | Comparison of software metrics with text mining | RF | Static | Vulnerabilities from open-source web apps (Drupal, Moodle, PHPMyAdmin) | 3-fold CV, recall, IR, PR, FPR, ACC. Text mining provides benefits overall |
[69] | 2014 | Classification | To create model in the form of a binary classifier using text mining | NB, RF | Static | Applications from the F-Droid repository and Android | 10-fold CV, PR, recall PR and recall ≥ 80% |
[74] | 2015 | Classification | A new approach (VCCFinder) to obtain potentially dangerous codes | SVM-based detection model | — | The database contains 66 GitHub projects | k-fold CV, false alarms <99% at the same level of recall |
[70] | 2015 | Ranking/classification | Comparison of text mining and software metrics models | RF | — | Vulnerabilities from open-source web apps (Drupal, Moodle, PHPMyAdmin) | 10-fold CV Metrics: ER-BCE, ERBPP, ER-AVG |
[75] | 2015 | Clustering | Search patterns for taint-style vulnerabilities in C code | Hierarchical clustering (complete-linkage) | Static | 5 open-source projects: Linux, OpenSSL, Pidgin, VLC, Poppler (Xpdf) | Correct source, correct sanitization, number of traversals, generation time, execution time, reduction, amount of code review <95% |
[76] | 2016 | Classification | Static and dynamic features for classification | LR, MLP, RF | Hybrid | Dataset was created by analyzing 1039 test cases from the Debian Bug Tracker | FPR, FNR Detect 55% of vulnerable programs |
[77] | 2017 | Classification | 1. Employ a deep neural network 2. Combine N-gram analysis and feature selection | Deep neural network | — | Feature extraction from 4 applications (BoardGameGeek, Connectbot, CoolReader, AnkiDroid) | 10 times using 5-fold CV ACC = 92.87%, PR = 94.71%, recall = 90.17% |
[67] | 2017 | Text mining | To analyze characteristics of software vulnerability from source files | — | — | CVE, CWE, NVD databases | PR = 70%, recall = 60% |
[68] | 2017 | Text mining | Deep learning (LSTM) is used to learn semantic and syntactic features in code | RNN, LSTM, DBN | — | Experiments on 18 Java applications from the Android OS platform | 10-fold CV, PR, recall, and F-score Deep Belief Network PR, recall, and F-score > 80% |
[66] | 2018 | Classification | Identify bugs by extracting text features from C source code | NB, KNN, K-means, NN, SVM, DT, RF | Static | NVD, Cat, Cp, Du, Echo, Head, Kill, Mkdir, Nl, Paste, Rm, Seq, Shuf, Sleep, Sort, Tail, Touch, Tr, Uniq, Wc, Whoami | 5-fold CV ACC, TP, TN ACC = 74% |
[78] | 2018 | Regression | A deep learning-based vulnerability detection system (VulDeePecker) | BLSTM NN | Static | NIST: NVD and SAR project | 10-fold CV, PR, recall, F-score F-score = 80.8% |
[79] | 2018 | Classification | A mapping between existing requirements and vulnerabilities | LR, SVM, NB | — | Data is gathered from Apache Tomcat, CVE, requirements from Bugzilla, and source code is collected from Github | PR, recall, F-score LSI > SVM |
Data mining and machine learning studies on the subject “vulnerability analysis.”
Vulnerability analysis can be categorized into three types: static vulnerability analysis, dynamic vulnerability analysis, and hybrid analysis [61, 80]. Many studies have applied the static analysis approach, which detects vulnerabilities from source code without executing software, since it is cost-effective. Few studies have performed the dynamic analysis approach, in which one must execute software and check program behavior. The hybrid analysis approach [72, 76] combines these two approaches.
As revealed in Table 3, in addition to classification and text mining, clustering techniques are also frequently seen in software vulnerability analysis studies. To detect vulnerabilities in an unknown software data repository, entropy-based density clustering [71] and complete-linkage clustering [75] were proposed. Yamaguchi et al. [73] introduced a model that represents a large amount of source code as a graph called a code property graph (CPG), a combination of the abstract syntax tree (AST), control flow graph (CFG), and program dependency graph (PDG). This model enabled the discovery of previously unknown (zero-day) vulnerabilities.
To learn the time to next vulnerability, a prediction model was proposed in the study [42]. The result could be a number that refers to days or a bin representing values in a range. The authors used regression and classification techniques for the former and latter cases, respectively.
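The two target formulations can be sketched as follows: the regression target is the number of days between consecutive vulnerability disclosures, and the classification target is the bin into which that number falls. The disclosure dates, the bin edges (30, 90, and 365 days), and the function names are hypothetical choices for illustration and are not taken from [42].

```python
from bisect import bisect_left
from datetime import date
from typing import List, Sequence


def days_to_next(disclosure_dates: Sequence[date]) -> List[int]:
    """Regression target: days from each disclosure to the next one."""
    ordered = sorted(disclosure_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def to_bin(days: int, edges: Sequence[int] = (30, 90, 365)) -> int:
    """Classification target: index of the range that contains the day count.

    With the default edges: 0 = up to 30 days, 1 = 31 to 90, 2 = 91 to 365, 3 = longer.
    """
    return bisect_left(edges, days)


# Made-up disclosure dates for one product.
dates = [date(2011, 1, 1), date(2011, 1, 31), date(2011, 6, 30)]
gaps = days_to_next(dates)
print(gaps, [to_bin(g) for g in gaps])
```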
In vulnerability studies, issue tracking systems like Bugzilla, code repositories like Github, and vulnerability databases such as NVD, CVE, and CWE have been utilized [79]. In addition to these datasets, some studies have used Android [65, 68, 69] or web [63, 70, 72] (PHP source code) datasets. In recent years, researchers have concentrated on deep learning for building binary classifiers [77], obtaining vulnerability patterns [78], and learning long-term dependencies in sequential data [68] and features directly from the source code [81].
Li et al. [78] note two difficulties of vulnerability studies: demanding, intense manual labor and high false-negative rates. Thus, the widely used evaluation metrics in vulnerability analysis are false-positive rate and false-negative rate.
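As a small illustration of these two metrics, the following sketch computes them from binary labels, assuming that 1 marks a vulnerable sample and 0 a non-vulnerable one. The function name and the sample data are illustrative only.

```python
def error_rates(y_true, y_pred):
    """Return (false-positive rate, false-negative rate) for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    # FPR = FP / (FP + TN): share of clean samples flagged as vulnerable.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # FNR = FN / (FN + TP): share of vulnerable samples that were missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Illustrative labels: 4 vulnerable and 4 non-vulnerable samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
fpr, fnr = error_rates(y_true, y_pred)  # one false alarm, one miss
```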
During the past years, software developers have used design patterns to create complex software systems. Thus, researchers have investigated the field of design patterns in many ways [82, 83]. Fowler defines a pattern as follows:
“A pattern is an idea that has been useful in one practical context and will probably be useful in others.” [84]
Patterns display relationships and interactions between classes or objects. Well-designed object-oriented systems have various design patterns integrated into them. Design patterns can be highly useful for developers when they are used in the right manner and place, because developers then avoid recreating solutions previously refined by others. The pattern approach was initially presented in 1994 by four authors, namely Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, known as the Gang of Four (GoF) [85]. According to the authors, there are three types of design patterns:
Creational patterns provide an object creation mechanism to create the necessary objects based on predetermined conditions. They allow the system to call the appropriate object and add flexibility to the system when objects are created. Some creational design patterns are factory method, abstract factory, builder, and singleton.
Structural patterns focus on the composition of classes and objects to allow the establishment of larger software groups. Some of the structural design patterns are adapter, bridge, composite, and decorator.
Behavioral patterns determine common communication patterns between objects and how multiple classes behave when performing a task. Some behavioral design patterns are command, interpreter, iterator, observer, and visitor.
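To make the three categories concrete, the sketch below shows one pattern from each: factory method (creational), adapter (structural), and observer (behavioral). All class and method names are invented for illustration and do not come from the studies discussed here.

```python
from abc import ABC, abstractmethod


# Creational (factory method): subclasses decide which object gets created.
class Exporter(ABC):
    @abstractmethod
    def export(self, data: dict) -> str: ...


class CsvExporter(Exporter):
    def export(self, data: dict) -> str:
        return ",".join(f"{k}={v}" for k, v in data.items())


class ReportCreator(ABC):
    @abstractmethod
    def make_exporter(self) -> Exporter:  # the factory method
        ...

    def render(self, data: dict) -> str:
        return self.make_exporter().export(data)


class CsvReportCreator(ReportCreator):
    def make_exporter(self) -> Exporter:
        return CsvExporter()


# Structural (adapter): wrap an incompatible interface behind the expected one.
class LegacyFormatter:
    def format_pairs(self, pairs: list) -> str:
        return ";".join(f"{k}:{v}" for k, v in pairs)


class LegacyExporterAdapter(Exporter):
    def __init__(self, legacy: LegacyFormatter) -> None:
        self._legacy = legacy

    def export(self, data: dict) -> str:
        return self._legacy.format_pairs(list(data.items()))


# Behavioral (observer): a subject notifies registered observers of events.
class Subject:
    def __init__(self) -> None:
        self._observers = []

    def attach(self, callback) -> None:
        self._observers.append(callback)

    def notify(self, event: str) -> None:
        for callback in self._observers:
            callback(event)
```

Pattern detection tools of the kind surveyed below try to recognize such role structures (creator and product, adapter and adaptee, subject and observer) in existing code.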
Many design pattern studies exist in the literature. Table 4 shows some design pattern mining studies related to machine learning and data mining. For each study, the table lists the year, mining task, objective, algorithms, selected design patterns, input data, dataset, and results.
Ref. | Year | Task | Objective | Algorithms | EL | Selected design patterns | Input data | Dataset | Evaluation metrics and results |
---|---|---|---|---|---|---|---|---|---|
[86] | 2012 | Text classification | Two-phase method: 1—text classification to 2—learning design patterns | NB, KNN, DT, SVM | — | 46 security patterns, 34 Douglass patterns, 23 GoF patterns | Documents | Security, Douglass, GoF | PR, recall, EWM PR = 0.62, recall = 0.75 |
[87] | 2013 | Regression | An approach to decide whether a candidate is a valid instance of a design pattern | ANN | — | Adapter, command, composite, decorator, observer, and proxy | Set of candidate classes | JHotDraw 5.1 open-source application | 10-fold CV, PR, recall |
[88] | 2014 | Graph mining | Sub-graph mining-based approach | CloseGraph | — | — | Java source code | Open-source projects: YARI, Zest, JUnit, JFreeChart, ArgoUML | No empirical comparison |
[89] | 2015 | Classification/clustering | MARPLE-DPD is developed to classify whether a candidate is a good or bad pattern instance | SVM, DT, RF, K-means, ZeroR, OneR, NB, JRip, CLOPE | — | Classification for singleton and adapter; classification and clustering for composite, decorator, and factory method | — | 10 open-source software systems: DPExample, QuickUML 2001, Lexi v0.1.1 alpha, JRefactory v2.6.24, Netbeans v1.0.x, JUnit v3.7, JHotDraw v5.1, MapperXML v1.9.7, Nutch v0.4, PMD v1.8 | 10-fold CV, ACC, F-score, AUC; ACC ≥ 85% |
[90] | 2015 | Regression | A new method (SVM-PHGS) is proposed | Simple Logistic, C4.5, KNN, SVM, SVM-PHGS | — | Adapter, builder, composite, factory method, iterator, observer | Source code | P-mart repository | PR, recall, F-score, FP PR = 0.81, recall =0.81, F-score = 0.81, FP = 0.038 |
[91] | 2016 | Classification | Design pattern recognition using ML algorithms. | LRNN, DT | — | Abstract factory, adapter patterns | Source code | Dataset with 67 OO metrics, extracted by JBuilder tool | 5-fold CV, ACC, PR, recall, F-score ACC = 100% by LRNN |
[92] | 2016 | Classification | Three aspects: design patterns, software metrics, and supervised learning methods | Layer Recurrent Neural Network (LRNN) | RF | Abstract factory, adapter, bridge, singleton, and template method | Source code | Dataset with 67 OO metrics, extracted by JBuilder tool | PR, recall, F-score F-score = 100% by LRNN and RF ACC = 100% by RF |
[93] | 2017 | Classification | 1. Creation of metrics-oriented dataset 2. Detection of software design patterns | ANN, SVM | RF | Abstract factory, adapter, bridge, composite, and Template | Source code | Metrics extracted from source codes (JHotDraw, QuickUML, and Junit) | 5-fold and 10-fold CV, PR, recall, F-score ANN, SVM, and RF yielded to 100% PR for JHotDraw |
[94] | 2017 | Classification | Detection of design motifs based on a set of directed semantic graphs | Strong graph simulation, graph matching | — | All three groups: creational, structural, behavioral | UML class diagrams | — | PR, recall High accuracy by the proposed method |
[95] | 2017 | Text categorization | Selection of more appropriate design patterns | Fuzzy c-means | Ensemble-IG | Various design patterns | Problem definitions of design patterns | DP, GoF, Douglass, Security | F-score |
[96] | 2018 | Classification | Finding design pattern and smell pairs which coexist in the code | J48 | — | Used patterns: adapter, bridge, Template, singleton | Source code | Eclipse plugin Web of Patterns The tool selected for code smell detection is iPlasma | PR, recall, F-score, PRC, ROC Singleton pattern shows no presence of bad smells |
Table 4. Data mining and machine learning studies on the subject “design pattern mining.”
In design pattern mining, detecting the design pattern is a frequent study objective. To do so, studies have used machine learning algorithms [87, 89, 90, 91], ensemble learning [95], deep learning [97], graph theory [94], and text mining [86, 95].
In study [91], the training dataset consists of 67 object-oriented (OO) metrics extracted by using the JBuilder tool. The authors used LRNN and decision tree techniques for pattern detection. Alhusain et al. [87] generated training datasets from existing pattern detection tools. The ANN algorithm was selected for pattern instances. Chihada et al. [90] created training data from pattern instances using 45 OO metrics. The authors utilized SVM for classifying patterns accurately. Another metrics-oriented dataset was developed by Dwivedi et al. [93]. To evaluate the results, the authors benefited from three open-source software systems (JHotDraw, QuickUML, and JUnit) and applied three classifiers, SVM, ANN, and RF. The advantage of using random forest is that it does not require linear features and can manage high-dimensional spaces.
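Several of the studies in Table 4 report 5-fold or 10-fold cross-validation (CV) when evaluating such classifiers. As a minimal sketch of how the folds are formed, the helper below (a hypothetical function, not taken from any of the cited studies) splits sample indices so that every sample is used for testing exactly once. In practice the indices are usually shuffled first, which is omitted here to keep the result deterministic.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Example: 10 pattern candidates evaluated with 5-fold CV.
folds = list(k_fold_indices(10, 5))
```

A classifier would be trained on `train_idx` and scored on `test_idx` for each fold, and the reported metric is typically the average over the folds.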
To evaluate methods and to find patterns, open-source software projects such as JHotDraw, Junit, and MapperXML have been generally preferred by researchers. For example, Zanoni et al. [89] developed a tool called MARPLE-DPD by combining graph matching and machine learning techniques. Then, to obtain five design patterns, instances were collected from 10 open-source software projects, as shown in Table 4.
Design patterns and code smells are related issues: a code smell is a symptom of a deeper problem in the code, and the presence of code smells in software suggests that its design patterns are not well constructed. Therefore, Kaur and Singh [96] used the J48 decision tree to check whether design pattern and smell pairs appear together in code. Their results showed that the singleton pattern had no presence of bad smells.
According to the studies summarized in the table, the most frequently used patterns are abstract factory and adapter. It has recently been observed that studies on ensemble learning in this field are increasing.
One of the SE tasks most often used to improve the quality of a software system is refactoring, which Martin Fowler has described as “a technique for restructuring an existing body of code, altering its internal structure without changing its external behavior” [98]. It improves readability and maintainability of the source code and decreases complexity of a software system. Some of the refactoring types are: Add Parameter, Replace Parameter, Extract method, and Inline method [99].
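A minimal sketch of one of these refactoring types, Extract Method, is given below. The function names and the invoice scenario are invented for illustration; what matters is that the internal structure changes while the external behavior stays the same.

```python
# Before: one function mixes the total computation with the formatting.
def invoice_before(prices, tax_rate):
    total = 0.0
    for price in prices:
        total += price
    total_with_tax = total * (1 + tax_rate)
    return f"Total due: {total_with_tax:.2f}"


# After Extract Method: the computation lives in its own named function.
def total_with_tax(prices, tax_rate):
    return sum(prices) * (1 + tax_rate)


def invoice_after(prices, tax_rate):
    return f"Total due: {total_with_tax(prices, tax_rate):.2f}"
```

Both versions return the same string for the same input, which is the behavior-preservation property that the definition above requires.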
Code smells and refactoring are closely related: code smells represent problems due to bad design and can be fixed during refactoring. The main challenge is to determine which part of the code needs refactoring.
Some data mining studies related to software refactoring are presented in Table 5. Some studies focus on historical data to predict refactoring [100] or to obtain both refactoring and software defects [101] using different data mining algorithms such as LMT, Rip, and J48. Results suggest that when refactoring increases, the number of software defects decreases, and thus refactoring has a positive effect on software quality.
Ref. | Year | Task | Objective | Algorithms | EL | Dataset | Evaluation metrics and results |
---|---|---|---|---|---|---|---|
[100] | 2007 | Regression | Stages: (1) data understanding, (2) preprocessing, (3) ML, (4) post-processing, (5) analysis of the results | J48, LMT, Rip, NNge | — | ArgoUML, Spring Framework | 10-fold CV, PR, recall, F-score PR and recall are 0.8 for ArgoUML |
[101] | 2008 | Classification | Finding the relationship between refactoring and defects | C4.5, LMT, Rip, NNge | — | ArgoUML, JBoss Cache, Liferay Portal, Spring Framework, XDoclet | PR, recall, F-score |
[102] | 2014 | Regression | Propose GA-based learning for software refactoring based on ANN | GA, ANN | — | Xerces-J, JFreeChart, GanttProject, AntApache, JHotDraw, and Rhino. | Wilcoxon test with a 99% confidence level (α = 0.01) |
[103] | 2015 | Regression | Removing defects with time series in a multi-objective approach | Multi-objective algorithm, based on NSGA-II, ARIMA | — | FindBugs, JFreeChart, Hibernate, Pixelitor, and JDI-Ford | Wilcoxon rank sum test with a 99% confidence level (α < 1%) |
[104] | 2016 | Web mining/clustering | Unsupervised learning approach to detect refactoring opportunities in service-oriented applications | PAM, K-means, COBWEB, X-Means | — | Two datasets of WSDL documents | COBWEB and K-means max. 83.33% and 0%, inter-cluster COBWEB and K-means min. 33.33% and 66.66% intra-cluster |
[105] | 2017 | Clustering | A novel algorithm (HASP) for software refactoring at the package level | Hierarchical clustering algorithm | — | Three open-source case studies | Modularization Quality and Evaluation Metric Function |
[99] | 2017 | Classification | A technique to predict refactoring at class level | PCA, SMOTE, LS-SVM, RBF | — | Seven open-source software systems from the tera-PROMISE repository | 10-fold CV, AUC, and ROC curves; RBF kernel outperforms linear and polynomial kernels; the mean AUC for the LS-SVM RBF kernel is 0.96 |
[106] | 2017 | Classification | Exploring the impact of clone refactoring (CR) on the test code size | LR, KNN, NB | RF | data collected from an open-source Java software system (ANT) | PR, recall, accuracy, F-score kNN and RF outperform NB ACC (fitting (98%), LOOCV (95%), and 10 FCV (95%)) |
[107] | 2017 | — | Finding refactoring opportunities in source code | J48, BayesNet, SVM, LR | RF | Ant, ArgoUML, jEdit, jFreeChart, Mylyn | 10-fold CV, PR, recall 86–97% PR and 71–98% recall for proposed tech |
[108] | 2018 | Classification | A learning-based approach (CREC) to extract refactored and non-refactored clone groups from repositories | C4.5, SMO, NB. | RF, Adaboost | Axis2, Eclipse.jdt.core, Elastic Search, JFreeChart, JRuby, and Lucene | PR, recall, F-score F-score = 83% in the within-project F-score = 76% in the cross-project |
[109] | 2018 | Clustering | Combination of the use of multi-objective and unsupervised learning to decrease developer’s effort | GMM, EM | — | ArgoUML, JHotDraw, GanttProject, UTest, Apache Ant, Azureus | One-way ANOVA with a 95% confidence level (α = 5%) |
Table 5. Data mining and machine learning studies on the subject “refactoring.”
While automated refactoring does not always give the desired result, manual refactoring is time-consuming. Therefore, one study [109] proposed a clustering-based recommendation tool that combines multi-objective search with an unsupervised learning algorithm to reduce the number of refactoring options. In addition, the developer’s feedback further decreases the number of refactorings that have to be selected.
Since many SE studies that apply data mining approaches exist in the literature, this article presents only a few of them. However, Figure 4 shows the number of papers obtained from the Scopus search engine for each year from 2010 to 2019 by using queries in the title/abstract/keywords field. We excluded publications from 2020 since that year was not yet complete. Queries included (“data mining” OR “machine learning”) with (“defect prediction” OR “defect detection” OR “bug prediction” OR “bug detection”) for defect prediction, (“effort estimation” OR “effort prediction” OR “cost estimation”) for effort estimation, (“vulnerab*” AND “software” OR “vulnerability analysis”) for vulnerability analysis, and (“software” AND “refactoring”) for refactoring. As seen in the figure, the number of studies using data mining in SE tasks, especially defect prediction and vulnerability analysis, has increased rapidly. The most stable area in the studies is design pattern mining.
Figure 4. Number of publications of data mining studies for SE tasks from Scopus search by their years.
Figure 5 shows the publications studied in classification, clustering, text mining, and association rule mining as a percentage of the total number of papers obtained by a Scopus query for each SE task. For example, in defect prediction, the number of studies is 339 in the field of classification, 64 in clustering, 8 in text mining, and 25 in the field of association rule mining. As can be seen from the pie charts, while clustering is a popular DM technique in refactoring, no study related to text mining is found in this field. In other SE tasks, the preferred technique is classification, and the second is clustering.
Figure 5. Number of publications of data mining studies for SE tasks from Scopus search by their topics.
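The defect prediction counts quoted above can be converted into pie chart shares with a few lines of arithmetic. This sketch assumes that the four technique counts add up to the total used for the percentages, which the text does not state explicitly.

```python
# Counts for defect prediction as quoted in the text.
counts = {
    "classification": 339,
    "clustering": 64,
    "text mining": 8,
    "association rule mining": 25,
}


def shares(technique_counts):
    """Return each technique's share of the summed counts, in percent."""
    total = sum(technique_counts.values())
    return {name: round(100 * n / total, 1) for name, n in technique_counts.items()}


defect_prediction_shares = shares(counts)
# classification ~77.8%, clustering ~14.7%, text mining ~1.8%,
# association rule mining ~5.7% (out of 436 papers under this assumption)
```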
Defect prediction generally compares learning algorithms in terms of whether they find defects correctly using classification algorithms. Besides this approach, in some studies, clustering algorithms were used to select features [110] or to compare supervised and unsupervised methods [27]. In the text mining area, to extract features from scripts, TF-IDF techniques were generally used [111, 112]. Although many different algorithms have been used in defect prediction, the most popular ones are NB, MLP, and RBF.
Figure 6 shows the number of publications by document type (conference paper, book chapter, article, book) between 2010 and 2019. Conference papers and articles are the most common document types. The figure also shows that there is no review article about data mining studies in design pattern mining.
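For readers unfamiliar with TF-IDF, the sketch below computes the plain, unsmoothed variant (term frequency multiplied by the logarithm of the inverse document frequency) on already tokenized text. The cited studies may use a different weighting or a library implementation with smoothing, and the sample tokens are invented.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


# Invented bug-report tokens: "leak" occurs in one document, "null" in two.
reports = [
    ["null", "pointer", "crash"],
    ["null", "pointer", "leak"],
    ["ui", "crash"],
]
features = tf_idf(reports)
```

Terms that are rare across the collection, such as `leak` here, receive higher weights than terms shared by many documents, and a term present in every document receives a weight of zero.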
Figure 6. The number of publications in terms of document type between 2010 and 2019.
Table 6 shows popular repositories that contain various datasets and their descriptions, which tasks they are used for, and hyperlinks to download. For example, the PMART repository includes source files of Java projects, and the PROMISE repository has different datasets with software metrics such as cyclomatic complexity, design complexity, and lines of code. Since these repositories contain many datasets, no detailed information about them has been provided in this article.
Repository | Topic | Description | Web link |
---|---|---|---|
NASA MDP | Defect Pred. | NASA’s Metrics Data Program | |
Android Git | Defect Pred. | Android version bug reports | |
PROMISE | Defect Pred. Effort Est. | It includes 20 datasets for defect prediction and cost estimation | |
Software Defect Pred. Data | Defect Pred. | It includes software metrics, # of defects, etc. Eclipse JDT: Eclipse PDE: | |
PMART | Design pattern mining | It has 22 patterns, 9 projects, 139 ins. Format: XML. Manually detected and validated | |
Table 6. Description of popular repositories used in studies.
Refactoring can be applied at different levels: study [105] predicted refactoring at the package level using hierarchical clustering, and another study [99] predicted class-level refactoring using LS-SVM as the learning algorithm, SMOTE for handling class imbalance, and PCA for feature extraction.
Data mining techniques have been applied successfully in many different domains. In software engineering, to improve the quality of a product, it is highly critical to find existing deficits such as bugs, defects, code smells, and vulnerabilities in the early phases of SDLC. Therefore, many data mining studies in the past decade have aimed to deal with such problems. The present paper aims to provide information about previous studies in the field of software engineering. This survey shows how classification, clustering, text mining, and association rule mining can be applied in five SE tasks: defect prediction, effort estimation, vulnerability analysis, design pattern mining, and refactoring. It clearly shows that classification is the most used DM technique. Therefore, new studies can focus on applying clustering to SE tasks.
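The core idea of SMOTE, creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbors, can be sketched as follows. This is a simplified variant that picks the base sample at random, not the exact procedure of the cited study, and the function name, parameters, and data are illustrative.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample, excluding itself.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment between base and neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Illustrative minority class, e.g. classes that were refactored.
refactored = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (3.0, 3.0)]
extra = smote(refactored, n_new=5)
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class grows without duplicating samples exactly.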
LMT | logistic model trees |
Rip | repeated incremental pruning |
NNge | nearest neighbor generalization |
PCA | principal component analysis |
PAM | partitioning around medoids |
LS-SVM | least-squares support vector machines |
MAE | mean absolute error |
RBF | radial basis function |
RUS | random undersampling |
SMO | sequential minimal optimization |
GMM | Gaussian mixture model |
EM | expectation maximization |
LR | logistic regression |
SMB | SMOTEBoost |
RUS-bal | balanced version of random undersampling |
THM | threshold-moving |
BNC | AdaBoost.NC |
RF | random forest |
CC | correlation coefficient |
ROC | receiver operating characteristic |
BayesNet | Bayesian network |
SMOTE | synthetic minority over-sampling technique |
Sun Certified Programmer for the Java 2 Platform 1.4 (SCJP), Microsoft Certified Professional Developer, Web Developer (MCPD), Microsoft Certified Technology Specialist, .NET Framework 2.0 Web (MCTS). I am a Reviewer for several refereed journals and international conferences, such as IEEE Transactions on Biomedical Engineering, IEEE Transactions on Industrial Electronics, Optic Letters, Measurement Science Review, and also a member of the International Advisory Committee for 2012 IEEE Business Engineering and Industrial Applications and 2012 IEEE Symposium on Business, Engineering and Industrial Applications.",institutionString:null,institution:{name:"Joseph Fourier University",country:{name:"France"}}},{id:"55578",title:"Dr.",name:"Antonio",middleName:null,surname:"Jurado-Navas",slug:"antonio-jurado-navas",fullName:"Antonio Jurado-Navas",position:null,profilePictureURL:"https://mts.intechopen.com/storage/users/55578/images/4574_n.png",biography:"Antonio Jurado-Navas received the M.S. degree (2002) and the Ph.D. degree (2009) in Telecommunication Engineering, both from the University of Málaga (Spain). He first worked as a consultant at Vodafone-Spain. From 2004 to 2011, he was a Research Assistant with the Communications Engineering Department at the University of Málaga. In 2011, he became an Assistant Professor in the same department. From 2012 to 2015, he was with Ericsson Spain, where he was working on geo-location\ntools for third generation mobile networks. Since 2015, he is a Marie-Curie fellow at the Denmark Technical University. 
His current research interests include the areas of mobile communication systems and channel modeling in addition to atmospheric optical communications, adaptive optics and statistics",institutionString:null,institution:{name:"University of Malaga",country:{name:"Spain"}}}],filtersByRegion:[{group:"region",caption:"North America",value:1,count:5698},{group:"region",caption:"Middle and South America",value:2,count:5172},{group:"region",caption:"Africa",value:3,count:1689},{group:"region",caption:"Asia",value:4,count:10243},{group:"region",caption:"Australia and Oceania",value:5,count:888},{group:"region",caption:"Europe",value:6,count:15647}],offset:12,limit:12,total:117315},chapterEmbeded:{data:{}},editorApplication:{success:null,errors:{}},ofsBooks:{filterParams:{hasNoEditors:"0",sort:"dateEndThirdStepPublish",topicId:"10,11,12,14,24,5"},books:[{type:"book",id:"8485",title:"Weather Forecasting",subtitle:null,isOpenForSubmission:!0,hash:"eadbd6f9c26be844062ce5cd3b3eb573",slug:null,bookSignature:"Associate Prof. Muhammad Saifullah",coverURL:"https://cdn.intechopen.com/books/images_new/8485.jpg",editedByType:null,editors:[{id:"320968",title:"Associate Prof.",name:"Muhammad",surname:"Saifullah",slug:"muhammad-saifullah",fullName:"Muhammad Saifullah"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"10013",title:"Geothermal Energy",subtitle:null,isOpenForSubmission:!0,hash:"a5f5277a1c0616ce6b35f4b44a4cac7a",slug:null,bookSignature:"Dr. Basel I. Ismail",coverURL:"https://cdn.intechopen.com/books/images_new/10013.jpg",editedByType:null,editors:[{id:"62122",title:"Dr.",name:"Basel",surname:"Ismail",slug:"basel-ismail",fullName:"Basel Ismail"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"10404",title:"Evapotranspiration - Recent Advances and Applications",subtitle:null,isOpenForSubmission:!0,hash:"babca2dea1c80719111734cc57a21a4c",slug:null,bookSignature:"Dr. 
Amin Talei",coverURL:"https://cdn.intechopen.com/books/images_new/10404.jpg",editedByType:null,editors:[{id:"335732",title:"Dr.",name:"Amin",surname:"Talei",slug:"amin-talei",fullName:"Amin Talei"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"7724",title:"Climate Issues in Asia and Africa - Examining Climate, Its Flux, the Consequences, and Society's Responses",subtitle:null,isOpenForSubmission:!0,hash:"c1bd1a5a4dba07b95a5ae5ef0ecf9f74",slug:null,bookSignature:" John P. Tiefenbacher",coverURL:"https://cdn.intechopen.com/books/images_new/7724.jpg",editedByType:null,editors:[{id:"73876",title:"Dr.",name:"John P.",surname:"Tiefenbacher",slug:"john-p.-tiefenbacher",fullName:"John P. Tiefenbacher"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"10238",title:"Food Packaging",subtitle:null,isOpenForSubmission:!0,hash:"891ee7ffd87b72cf155fcdf9c8ae5d1a",slug:null,bookSignature:"Dr. Norizah Mhd Sarbon",coverURL:"https://cdn.intechopen.com/books/images_new/10238.jpg",editedByType:null,editors:[{id:"246000",title:"Dr.",name:"Norizah",surname:"Mhd Sarbon",slug:"norizah-mhd-sarbon",fullName:"Norizah Mhd Sarbon"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"10374",title:"Advances in Micro- and Nanofluidics",subtitle:null,isOpenForSubmission:!0,hash:"b7ba9cab862a9bca2fc9f9ee72ba5eec",slug:null,bookSignature:"Prof. S. M. Sohel Murshed",coverURL:"https://cdn.intechopen.com/books/images_new/10374.jpg",editedByType:null,editors:[{id:"24904",title:"Prof.",name:"S. M. Sohel",surname:"Murshed",slug:"s.-m.-sohel-murshed",fullName:"S. M. Sohel Murshed"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"10358",title:"Silage - Recent Advances and New Perspectives",subtitle:null,isOpenForSubmission:!0,hash:"1e33f63e9311af352daf51d49f0a3aef",slug:null,bookSignature:"Dr. Juliana Oliveira and Dr. 
Edson Mauro Santos",coverURL:"https://cdn.intechopen.com/books/images_new/10358.jpg",editedByType:null,editors:[{id:"180036",title:"Dr.",name:"Juliana",surname:"Oliveira",slug:"juliana-oliveira",fullName:"Juliana Oliveira"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"10491",title:"Anaerobic Digestion in Natural and Built Environments",subtitle:null,isOpenForSubmission:!0,hash:"082ec753a05d6c7ed8cc5559e7dac432",slug:null,bookSignature:"Dr. Anna Sikora and Dr. Anna Detman",coverURL:"https://cdn.intechopen.com/books/images_new/10491.jpg",editedByType:null,editors:[{id:"146985",title:"Dr.",name:"Anna",surname:"Sikora",slug:"anna-sikora",fullName:"Anna Sikora"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"10597",title:"Electric Grid Modernization",subtitle:null,isOpenForSubmission:!0,hash:"62f0e391662f7e8ae35a6bea2e77accf",slug:null,bookSignature:"Dr. Mahmoud Ghofrani",coverURL:"https://cdn.intechopen.com/books/images_new/10597.jpg",editedByType:null,editors:[{id:"183482",title:"Dr.",name:"Mahmoud",surname:"Ghofrani",slug:"mahmoud-ghofrani",fullName:"Mahmoud Ghofrani"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"10412",title:"Transition Metals",subtitle:null,isOpenForSubmission:!0,hash:"bd7287b801dc0ac77e01f66842dc1d99",slug:null,bookSignature:"Dr. Sajjad Haider and Dr. Adnan Haider",coverURL:"https://cdn.intechopen.com/books/images_new/10412.jpg",editedByType:null,editors:[{id:"110708",title:"Dr.",name:"Sajjad",surname:"Haider",slug:"sajjad-haider",fullName:"Sajjad Haider"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"10216",title:"Paraffin - Thermal Energy Storage Applications",subtitle:null,isOpenForSubmission:!0,hash:"456090b63f5ba2290e24e655abd119bf",slug:null,bookSignature:"Dr. Elsayed Zaki and Dr. Abdelghaffar S. 
Dhmees",coverURL:"https://cdn.intechopen.com/books/images_new/10216.jpg",editedByType:null,editors:[{id:"220156",title:"Dr.",name:"Elsayed",surname:"Zaki",slug:"elsayed-zaki",fullName:"Elsayed Zaki"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"10573",title:"Fluid-Structure Interaction",subtitle:null,isOpenForSubmission:!0,hash:"3950d1f9c82160d23bc594d00ec2ffbb",slug:null,bookSignature:"Dr. Khaled Ghaedi",coverURL:"https://cdn.intechopen.com/books/images_new/10573.jpg",editedByType:null,editors:[{id:"190572",title:"Dr.",name:"Khaled",surname:"Ghaedi",slug:"khaled-ghaedi",fullName:"Khaled Ghaedi"}],productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}}],filtersByTopic:[{group:"topic",caption:"Agricultural and Biological Sciences",value:5,count:9},{group:"topic",caption:"Biochemistry, Genetics and Molecular Biology",value:6,count:18},{group:"topic",caption:"Business, Management and Economics",value:7,count:2},{group:"topic",caption:"Chemistry",value:8,count:7},{group:"topic",caption:"Computer and Information Science",value:9,count:11},{group:"topic",caption:"Earth and Planetary Sciences",value:10,count:5},{group:"topic",caption:"Engineering",value:11,count:15},{group:"topic",caption:"Environmental Sciences",value:12,count:2},{group:"topic",caption:"Immunology and Microbiology",value:13,count:5},{group:"topic",caption:"Materials Science",value:14,count:4},{group:"topic",caption:"Mathematics",value:15,count:1},{group:"topic",caption:"Medicine",value:16,count:61},{group:"topic",caption:"Nanotechnology and Nanomaterials",value:17,count:1},{group:"topic",caption:"Neuroscience",value:18,count:1},{group:"topic",caption:"Pharmacology, Toxicology and Pharmaceutical Science",value:19,count:6},{group:"topic",caption:"Physics",value:20,count:2},{group:"topic",caption:"Psychology",value:21,count:3},{group:"topic",caption:"Robotics",value:22,count:1},{group:"topic",caption:"Social 
Sciences",value:23,count:3},{group:"topic",caption:"Technology",value:24,count:1},{group:"topic",caption:"Veterinary Medicine and Science",value:25,count:2}],offset:12,limit:12,total:36},popularBooks:{featuredBooks:[{type:"book",id:"7802",title:"Modern Slavery and Human Trafficking",subtitle:null,isOpenForSubmission:!1,hash:"587a0b7fb765f31cc98de33c6c07c2e0",slug:"modern-slavery-and-human-trafficking",bookSignature:"Jane Reeves",coverURL:"https://cdn.intechopen.com/books/images_new/7802.jpg",editors:[{id:"211328",title:"Prof.",name:"Jane",middleName:null,surname:"Reeves",slug:"jane-reeves",fullName:"Jane Reeves"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"8545",title:"Animal Reproduction in Veterinary Medicine",subtitle:null,isOpenForSubmission:!1,hash:"13aaddf5fdbbc78387e77a7da2388bf6",slug:"animal-reproduction-in-veterinary-medicine",bookSignature:"Faruk Aral, Rita Payan-Carreira and Miguel Quaresma",coverURL:"https://cdn.intechopen.com/books/images_new/8545.jpg",editors:[{id:"25600",title:"Prof.",name:"Faruk",middleName:null,surname:"Aral",slug:"faruk-aral",fullName:"Faruk Aral"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"9961",title:"Data Mining",subtitle:"Methods, Applications and Systems",isOpenForSubmission:!1,hash:"ed79fb6364f2caf464079f94a0387146",slug:"data-mining-methods-applications-and-systems",bookSignature:"Derya Birant",coverURL:"https://cdn.intechopen.com/books/images_new/9961.jpg",editors:[{id:"15609",title:"Dr.",name:"Derya",middleName:null,surname:"Birant",slug:"derya-birant",fullName:"Derya Birant"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"9157",title:"Neurodegenerative Diseases",subtitle:"Molecular Mechanisms and Current Therapeutic 
Approaches",isOpenForSubmission:!1,hash:"bc8be577966ef88735677d7e1e92ed28",slug:"neurodegenerative-diseases-molecular-mechanisms-and-current-therapeutic-approaches",bookSignature:"Nagehan Ersoy Tunalı",coverURL:"https://cdn.intechopen.com/books/images_new/9157.jpg",editors:[{id:"82778",title:"Ph.D.",name:"Nagehan",middleName:null,surname:"Ersoy Tunalı",slug:"nagehan-ersoy-tunali",fullName:"Nagehan Ersoy Tunalı"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"8686",title:"Direct Torque Control Strategies of Electrical Machines",subtitle:null,isOpenForSubmission:!1,hash:"b6ad22b14db2b8450228545d3d4f6b1a",slug:"direct-torque-control-strategies-of-electrical-machines",bookSignature:"Fatma Ben Salem",coverURL:"https://cdn.intechopen.com/books/images_new/8686.jpg",editors:[{id:"295623",title:"Associate Prof.",name:"Fatma",middleName:null,surname:"Ben Salem",slug:"fatma-ben-salem",fullName:"Fatma Ben Salem"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"7434",title:"Molecular Biotechnology",subtitle:null,isOpenForSubmission:!1,hash:"eceede809920e1ec7ecadd4691ede2ec",slug:"molecular-biotechnology",bookSignature:"Sergey Sedykh",coverURL:"https://cdn.intechopen.com/books/images_new/7434.jpg",editors:[{id:"178316",title:"Ph.D.",name:"Sergey",middleName:null,surname:"Sedykh",slug:"sergey-sedykh",fullName:"Sergey Sedykh"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"9839",title:"Outdoor Recreation",subtitle:"Physiological and Psychological Effects on Health",isOpenForSubmission:!1,hash:"5f5a0d64267e32567daffa5b0c6a6972",slug:"outdoor-recreation-physiological-and-psychological-effects-on-health",bookSignature:"Hilde G. 
Nielsen",coverURL:"https://cdn.intechopen.com/books/images_new/9839.jpg",editors:[{id:"158692",title:"Ph.D.",name:"Hilde G.",middleName:null,surname:"Nielsen",slug:"hilde-g.-nielsen",fullName:"Hilde G. Nielsen"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"9208",title:"Welding",subtitle:"Modern Topics",isOpenForSubmission:!1,hash:"7d6be076ccf3a3f8bd2ca52d86d4506b",slug:"welding-modern-topics",bookSignature:"Sadek Crisóstomo Absi Alfaro, Wojciech Borek and Błażej Tomiczek",coverURL:"https://cdn.intechopen.com/books/images_new/9208.jpg",editors:[{id:"65292",title:"Prof.",name:"Sadek Crisostomo Absi",middleName:"C. Absi",surname:"Alfaro",slug:"sadek-crisostomo-absi-alfaro",fullName:"Sadek Crisostomo Absi Alfaro"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"9139",title:"Topics in Primary Care Medicine",subtitle:null,isOpenForSubmission:!1,hash:"ea774a4d4c1179da92a782e0ae9cde92",slug:"topics-in-primary-care-medicine",bookSignature:"Thomas F. Heston",coverURL:"https://cdn.intechopen.com/books/images_new/9139.jpg",editors:[{id:"217926",title:"Dr.",name:"Thomas F.",middleName:null,surname:"Heston",slug:"thomas-f.-heston",fullName:"Thomas F. 
Heston"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"9343",title:"Trace Metals in the Environment",subtitle:"New Approaches and Recent Advances",isOpenForSubmission:!1,hash:"ae07e345bc2ce1ebbda9f70c5cd12141",slug:"trace-metals-in-the-environment-new-approaches-and-recent-advances",bookSignature:"Mario Alfonso Murillo-Tovar, Hugo Saldarriaga-Noreña and Agnieszka Saeid",coverURL:"https://cdn.intechopen.com/books/images_new/9343.jpg",editors:[{id:"255959",title:"Dr.",name:"Mario Alfonso",middleName:null,surname:"Murillo-Tovar",slug:"mario-alfonso-murillo-tovar",fullName:"Mario Alfonso Murillo-Tovar"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"8697",title:"Virtual Reality and Its Application in Education",subtitle:null,isOpenForSubmission:!1,hash:"ee01b5e387ba0062c6b0d1e9227bda05",slug:"virtual-reality-and-its-application-in-education",bookSignature:"Dragan Cvetković",coverURL:"https://cdn.intechopen.com/books/images_new/8697.jpg",editors:[{id:"101330",title:"Dr.",name:"Dragan",middleName:"Mladen",surname:"Cvetković",slug:"dragan-cvetkovic",fullName:"Dragan Cvetković"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"7831",title:"Sustainability in Urban Planning and Design",subtitle:null,isOpenForSubmission:!1,hash:"c924420492c8c2c9751e178d025f4066",slug:"sustainability-in-urban-planning-and-design",bookSignature:"Amjad Almusaed, Asaad Almssad and Linh Truong - Hong",coverURL:"https://cdn.intechopen.com/books/images_new/7831.jpg",editors:[{id:"110471",title:"Dr.",name:"Amjad",middleName:"Zaki",surname:"Almusaed",slug:"amjad-almusaed",fullName:"Amjad 
Almusaed"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}}],offset:12,limit:12,total:5141},hotBookTopics:{hotBooks:[],offset:0,limit:12,total:null},publish:{},publishingProposal:{success:null,errors:{}},books:{featuredBooks:[{type:"book",id:"9208",title:"Welding",subtitle:"Modern Topics",isOpenForSubmission:!1,hash:"7d6be076ccf3a3f8bd2ca52d86d4506b",slug:"welding-modern-topics",bookSignature:"Sadek Crisóstomo Absi Alfaro, Wojciech Borek and Błażej Tomiczek",coverURL:"https://cdn.intechopen.com/books/images_new/9208.jpg",editors:[{id:"65292",title:"Prof.",name:"Sadek Crisostomo Absi",middleName:"C. Absi",surname:"Alfaro",slug:"sadek-crisostomo-absi-alfaro",fullName:"Sadek Crisostomo Absi Alfaro"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"9139",title:"Topics in Primary Care Medicine",subtitle:null,isOpenForSubmission:!1,hash:"ea774a4d4c1179da92a782e0ae9cde92",slug:"topics-in-primary-care-medicine",bookSignature:"Thomas F. Heston",coverURL:"https://cdn.intechopen.com/books/images_new/9139.jpg",editors:[{id:"217926",title:"Dr.",name:"Thomas F.",middleName:null,surname:"Heston",slug:"thomas-f.-heston",fullName:"Thomas F. 
Heston"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"8697",title:"Virtual Reality and Its Application in Education",subtitle:null,isOpenForSubmission:!1,hash:"ee01b5e387ba0062c6b0d1e9227bda05",slug:"virtual-reality-and-its-application-in-education",bookSignature:"Dragan Cvetković",coverURL:"https://cdn.intechopen.com/books/images_new/8697.jpg",editors:[{id:"101330",title:"Dr.",name:"Dragan",middleName:"Mladen",surname:"Cvetković",slug:"dragan-cvetkovic",fullName:"Dragan Cvetković"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"9343",title:"Trace Metals in the Environment",subtitle:"New Approaches and Recent Advances",isOpenForSubmission:!1,hash:"ae07e345bc2ce1ebbda9f70c5cd12141",slug:"trace-metals-in-the-environment-new-approaches-and-recent-advances",bookSignature:"Mario Alfonso Murillo-Tovar, Hugo Saldarriaga-Noreña and Agnieszka Saeid",coverURL:"https://cdn.intechopen.com/books/images_new/9343.jpg",editors:[{id:"255959",title:"Dr.",name:"Mario Alfonso",middleName:null,surname:"Murillo-Tovar",slug:"mario-alfonso-murillo-tovar",fullName:"Mario Alfonso Murillo-Tovar"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"9785",title:"Endometriosis",subtitle:null,isOpenForSubmission:!1,hash:"f457ca61f29cf7e8bc191732c50bb0ce",slug:"endometriosis",bookSignature:"Courtney Marsh",coverURL:"https://cdn.intechopen.com/books/images_new/9785.jpg",editors:[{id:"255491",title:"Dr.",name:"Courtney",middleName:null,surname:"Marsh",slug:"courtney-marsh",fullName:"Courtney Marsh"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"7831",title:"Sustainability in Urban Planning and 
Design",subtitle:null,isOpenForSubmission:!1,hash:"c924420492c8c2c9751e178d025f4066",slug:"sustainability-in-urban-planning-and-design",bookSignature:"Amjad Almusaed, Asaad Almssad and Linh Truong - Hong",coverURL:"https://cdn.intechopen.com/books/images_new/7831.jpg",editors:[{id:"110471",title:"Dr.",name:"Amjad",middleName:"Zaki",surname:"Almusaed",slug:"amjad-almusaed",fullName:"Amjad Almusaed"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"9376",title:"Contemporary Developments and Perspectives in International Health Security",subtitle:"Volume 1",isOpenForSubmission:!1,hash:"b9a00b84cd04aae458fb1d6c65795601",slug:"contemporary-developments-and-perspectives-in-international-health-security-volume-1",bookSignature:"Stanislaw P. Stawicki, Michael S. Firstenberg, Sagar C. Galwankar, Ricardo Izurieta and Thomas Papadimos",coverURL:"https://cdn.intechopen.com/books/images_new/9376.jpg",editors:[{id:"181694",title:"Dr.",name:"Stanislaw P.",middleName:null,surname:"Stawicki",slug:"stanislaw-p.-stawicki",fullName:"Stanislaw P. 
Stawicki"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"7769",title:"Medical Isotopes",subtitle:null,isOpenForSubmission:!1,hash:"f8d3c5a6c9a42398e56b4e82264753f7",slug:"medical-isotopes",bookSignature:"Syed Ali Raza Naqvi and Muhammad Babar Imrani",coverURL:"https://cdn.intechopen.com/books/images_new/7769.jpg",editors:[{id:"259190",title:"Dr.",name:"Syed Ali Raza",middleName:null,surname:"Naqvi",slug:"syed-ali-raza-naqvi",fullName:"Syed Ali Raza Naqvi"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"9279",title:"Concepts, Applications and Emerging Opportunities in Industrial Engineering",subtitle:null,isOpenForSubmission:!1,hash:"9bfa87f9b627a5468b7c1e30b0eea07a",slug:"concepts-applications-and-emerging-opportunities-in-industrial-engineering",bookSignature:"Gary Moynihan",coverURL:"https://cdn.intechopen.com/books/images_new/9279.jpg",editors:[{id:"16974",title:"Dr.",name:"Gary",middleName:null,surname:"Moynihan",slug:"gary-moynihan",fullName:"Gary Moynihan"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}},{type:"book",id:"7807",title:"A Closer Look at Organizational Culture in Action",subtitle:null,isOpenForSubmission:!1,hash:"05c608b9271cc2bc711f4b28748b247b",slug:"a-closer-look-at-organizational-culture-in-action",bookSignature:"Süleyman Davut Göker",coverURL:"https://cdn.intechopen.com/books/images_new/7807.jpg",editors:[{id:"190035",title:"Associate Prof.",name:"Süleyman Davut",middleName:null,surname:"Göker",slug:"suleyman-davut-goker",fullName:"Süleyman Davut Göker"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter"}}],latestBooks:[{type:"book",id:"7434",title:"Molecular 
Biotechnology",subtitle:null,isOpenForSubmission:!1,hash:"eceede809920e1ec7ecadd4691ede2ec",slug:"molecular-biotechnology",bookSignature:"Sergey Sedykh",coverURL:"https://cdn.intechopen.com/books/images_new/7434.jpg",editedByType:"Edited by",editors:[{id:"178316",title:"Ph.D.",name:"Sergey",middleName:null,surname:"Sedykh",slug:"sergey-sedykh",fullName:"Sergey Sedykh"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"8545",title:"Animal Reproduction in Veterinary Medicine",subtitle:null,isOpenForSubmission:!1,hash:"13aaddf5fdbbc78387e77a7da2388bf6",slug:"animal-reproduction-in-veterinary-medicine",bookSignature:"Faruk Aral, Rita Payan-Carreira and Miguel Quaresma",coverURL:"https://cdn.intechopen.com/books/images_new/8545.jpg",editedByType:"Edited by",editors:[{id:"25600",title:"Prof.",name:"Faruk",middleName:null,surname:"Aral",slug:"faruk-aral",fullName:"Faruk Aral"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"9569",title:"Methods in Molecular Medicine",subtitle:null,isOpenForSubmission:!1,hash:"691d3f3c4ac25a8093414e9b270d2843",slug:"methods-in-molecular-medicine",bookSignature:"Yusuf Tutar",coverURL:"https://cdn.intechopen.com/books/images_new/9569.jpg",editedByType:"Edited by",editors:[{id:"158492",title:"Prof.",name:"Yusuf",middleName:null,surname:"Tutar",slug:"yusuf-tutar",fullName:"Yusuf Tutar"}],equalEditorOne:null,equalEditorTwo:null,equalEditorThree:null,productType:{id:"1",chapterContentType:"chapter",authoredCaption:"Edited by"}},{type:"book",id:"9839",title:"Outdoor Recreation",subtitle:"Physiological and Psychological Effects on Health",isOpenForSubmission:!1,hash:"5f5a0d64267e32567daffa5b0c6a6972",slug:"outdoor-recreation-physiological-and-psychological-effects-on-health",bookSignature:"Hilde G. 