Friday, April 5, 2019

Methods to Assess Groundwater Potential by Spring Locations

Abstract

Given the ever-increasing problem of water scarcity in different countries, the current study applies support vector machine (SVM), random forest (RF), and genetic algorithm optimized random forest (RFGA) methods to assess groundwater potential from spring locations. To this end, 14 effective variables including DEM-derived, river-based, fault-based, land use, and lithology factors were provided. Of the 842 spring locations found, 70% (589) were used for model training, and the rest were used to evaluate the models. The models were run and groundwater potential maps (GPMs) were produced. Finally, the receiver operating characteristics (ROC) curve was plotted to evaluate the efficiency of the methods. The results of the current study indicated that the RFGA and RF methods performed better than the different kernels of the SVM model. The area under the curve (AUC) of ROC for RF and RFGA was estimated as 84.6 and 85.6%, respectively. The AUC of ROC was computed as SVM-linear (78.6%), SVM-polynomial (76.8%), SVM-sigmoid (77.1%), and SVM-radial basis function (77%). Furthermore, the results showed the higher importance of altitude, TWI, and slope angle for groundwater potential. The methodology produced in the current study could be transferred to other places with water scarcity issues for groundwater potential assessment and management.

Keywords: Geographic information system, Ardebil, Iran, Support vector machine, Random forest, Genetic algorithm

Introduction

Water scarcity is regarded as one of the most substantial socio-environmental challenges in different countries.
The demand on groundwater is increasing, and the overutilization of this valuable resource is threatening future generations (Todd and Mays 2005; Rekha and Thomas 2007). Thus, its management is believed to be vital. A better water resources management plan is possible when there is enough knowledge about the resources (i.e. high potential and susceptible zones).

In recent years, researchers have used a variety of models to map groundwater potential, such as frequency ratio (FR), weight of evidence (WofE), logistic regression (LR), index of entropy, and evidential belief function (Oh et al. 2011; Ozdemir 2011a, b; Pourtaghi and Pourghasemi 2014; Davoodi Moghaddam et al. 2015; Naghibi and Pourghasemi 2015; Naghibi et al. 2015). Some researchers have also used machine learning methods including boosted regression tree (BRT), classification and regression tree (CART), general linear model (GLM), and RF algorithms in this field of study (Naghibi and Pourghasemi 2015; Rahmati et al. 2016). Lee et al. (2012) employed an artificial neural network (ANN) to assess groundwater productivity, and their results showed satisfactory performance of the ANN. Recently, Naghibi et al. (2017) used four recently developed data mining models, namely AdaBoost, Bagging, generalized additive model, and naive Bayes, for groundwater potential mapping. They also introduced a novel ensemble method that combines the mentioned models with FR. In addition, Magaji et al. (2016) used a geographical information system and the evidential belief function model to produce a map of groundwater recharge potential zones. Theodossiou (2004) investigated how climate change influences the sustainability of groundwater at watershed scale in Greece. Furthermore, Thivya et al. (2016) conducted a study to identify recharge mechanisms of groundwater in hard rock aquifers using stable isotopes.

The support vector machine (SVM) algorithm has been employed in different fields of study, such as flood susceptibility assessment (Tehrany et al. 2014; Tehrany et al. 2015) and landslide susceptibility investigation (Brenning 2005; Kavzoglu et al. 2014; Tien Bui et al. 2012; Yao et al. 2008; Yilmaz 2010; Tien Bui et al. 2015; Chen et al. 2017), with competent efficacy. The genetic algorithm is one of the most advanced and widely used heuristic search techniques in artificial intelligence, and it has been applied in many fields of study including urban planning, ecological and climatic modelling, and remote sensing studies (Hasegawa et al. 2013; Termansen et al. 2006; Chang et al. 2006; Chen et al. 2009).

In the current study, we aim to investigate the performance of a novel method for optimisation of random forest, and its results are compared with RF and SVM models in groundwater potential mapping. Based on the literature review, the application of different SVM kernels and of RFGA in groundwater potential mapping are the two main novelties of this study. The importance of different effective factors in groundwater potential is also discussed. The results of the current study could identify high potential and susceptible groundwater zones and be used by water resource managers.

Material and Methods

Figure 1 shows the methods and the flow chart implemented in the current study.

Study Area

The study area lies from 48° 18′ 26″ to 48° 53′ 16″ eastern longitude and from 37° 41′ 23″ to 37° 09′ 26″ northern latitude in Ardebil Province, Iran (Fig. 2). It covers an area of 1,524 km². The elevation in the study area ranges from 840 to 3,320 m above sea level with an average of 1,930 m. The mean annual precipitation of the Khalkhal region is 345 mm. The mean annual temperature of the Khalkhal region is 12 degrees Centigrade.
With respect to land use, 89.69% of the Khalkhal region is covered by rangeland, and the other land use classes are forest, agriculture, orchard, and residential areas. With respect to lithology, the Khalkhal region comprises 14 lithological categories. The Eav class (andesitic volcanic) covers most of the study area. The Khalkhal region is located in Ardebil province of Iran which includes 14 hydrological watersheds. These watersheds are located in three main parts: the central part, Khoresh Rostam, and Shahrood areas. In this area people exploit water resources by wells (42%), springs (47%), and qanats (11%); therefore, a high percentage of the water requirement is obtained from springs.

Data preparation

Spring characteristics

The spring location map was prepared for the study area using national reports (Iranian Department of Water Resources Management) and extensive field surveys at 1:50,000 scale. Of the 842 springs identified in the study area, 70% (589 springs) were selected for training, and 30% (253 springs) were used as the validation dataset (Fig. 2). Approximately ninety percent of the springs are permanent and ten percent are seasonal. The discharge of the springs in the Khalkhal region varies between 0.1 and 100 liters per second with an average of 1 liter per second. It can be seen that there are different kinds of spring in the study area such as contrast, drainage, and separate springs with 5.34%, 29.81%, 58.08%, and 6.77% of the springs, respectively. The average pH of the springs is 6.68. The average electrical conductivity (EC) of the springs is 470.

Groundwater effective factors

In this study, based on the literature review (Ozdemir 2011a, b; Oh et al. 2011; Naghibi et al. 2017), fourteen groundwater effective factors, namely altitude, slope angle, slope aspect, plan curvature, profile curvature, slope length (LS), stream power index (SPI), topographic wetness index (TWI), distance from rivers, river density, distance from faults, fault density, land use, and lithology, were provided and mapped.

The digital elevation model (DEM) of the Khalkhal region was created from the 1:50,000-scale topographic maps at 20 m resolution. Altitude, slope angle, and slope aspect were derived from the DEM in ArcGIS 9.3 and are shown in Fig. 3a-c.

Plan curvature describes the divergence and convergence of flow and discriminates among basins (Fig. 3d). Profile curvature shows the rate at which the slope gradient changes in the direction of maximum slope (Catani et al. 2013) (Fig. 3e). Slope length (LS) combines slope length and slope steepness and indicates the soil loss potential of the combined slope features (Fig. 3f). SPI is a measure of the erosive power of flowing water based on the assumption that discharge is proportional to specific catchment area (Moore et al. 1991) (Fig. 3g). The TWI reflects the accumulation and movement of surface runoff over the land surface (Elmahdy and Mostafa Mohamed 2014) (Fig. 3h).

Distance from rivers and river density were created using the topographical map of the Khalkhal region (Fig. 3i, j). Distance from faults and fault density layers were produced using the geological map (Fig. 3k, l).

The land use map was created using Landsat images (Fig. 3m). There are five land use classes in the study area: agriculture, forest, orchard, rangeland, and residential area. Most of the study area is covered by the rangeland class. The lithology map was acquired from a 1:100,000-scale geological map and the lithological units were grouped into fourteen classes (GSI 1997, Fig. 3n, Table 1).

Support vector machines (SVM)

SVM is a supervised machine learning technique based on the structural risk minimization (SRM) principle and statistical learning theory (Tien Bui et al. 2012). SVM transforms the original input space into a higher-dimensional feature space to find an optimum separating hyperplane. Marjanovic et al. (2011) affirmed that the separating hyperplane is built in the original space of n coordinates between the points of two distinct classes. If a point is situated over the hyperplane it will be classified as +1; if not, it will be classified as -1. The penalty parameter (C) controls the trade-off between margin and training errors, which helps to prevent over-fitting of the model (Marjanovic et al. 2011). The kernel width (γ) controls the degree of nonlinearity of the model (Tien Bui et al. 2012). Parameter (d) is the polynomial degree in the polynomial (PL) kernel function, and (r) is the bias term in the kernel function for two kernels of SVM, the PL and sigmoid (SIG) kernels (Tehrany et al. 2014). In the current study, 10-fold cross-validation was used to select the optimal kernel parameters of SVM (Pradhan 2013; Zhuang and Dai 2006).

Random forest (RF) model

Random forests (RFs) are very flexible and powerful ensemble classifiers based on decision trees, first developed by Breiman (2001). RF constructs multiple trees based on random bootstrapped samples of the training dataset (Breiman 2001). The algorithm grows random binary trees, each on a subset of the observations obtained by bootstrapping: a random selection of the initial training data is used to create the model, and the data that are not included are described as out of bag (OOB) (Catani et al. 2013).
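The bootstrap and out-of-bag split described above can be illustrated with a minimal sketch. It is written in Python with the standard library only, whereas the study used R, and the function name `bootstrap_split` and the seed are assumptions made for illustration. It only shows how one tree's in-bag and OOB sets arise, not how a forest is trained.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample; return (in_bag, out_of_bag) index lists."""
    # Sampling with replacement: some indices repeat, others are never drawn.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag

rng = random.Random(42)  # fixed seed, an arbitrary choice for repeatability
n_training = 589         # number of training springs reported in the text
in_bag, oob = bootstrap_split(n_training, rng)
oob_fraction = len(oob) / n_training
print(f"in-bag draws: {len(in_bag)}, OOB springs: {len(oob)} ({oob_fraction:.1%})")
```

For large samples the expected OOB share approaches 1/e, about 36.8%, so roughly a third of the training springs are available to test each tree without a separate hold-out set.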
The RF estimates the importance of a variable by looking at how much the prediction error increases when the out-of-bag data for that variable are permuted while all others are left fixed (Liaw and Wiener 2002; Catani et al. 2013). Random forests need two parameters to be tuned: the number of trees (ntree) and the number of variables tried at each split (mtry).

Genetic algorithm (GA) model

A genetic algorithm (GA) is a search heuristic in the field of artificial intelligence which mimics the process of natural selection. A GA begins with a population of random candidate solutions encoded in some string structure. Then, a number of operators are applied repeatedly until convergence is obtained. The optimization strategy in a GA can be described as a global optimization procedure with the benefit of not depending on the initial value to reach convergence. Crossover and mutation are applied to produce newer and better chromosome populations (Yetilmezsoy and Demirel 2008).

Random forest optimization methods

In this study, we used two different methods for RF parameter optimization: the caret package and a genetic algorithm. Both were applied in the R software.

First, we present a hybrid RFGA model to predict groundwater potential; this model was first introduced by Hasegawa et al. (2013) in the field of commute mode choice analysis. A simple tuning method is trial and error, but there are many combinations of parameters, and many iterations are needed to evaluate the options. Another method for optimizing these parameters is to use the caret package. We therefore proposed a practical method for optimizing the parameters of RF by meta-heuristic optimization using GAs. The rgenoud package (Mebane and Sekhon 2011) of the R program (R Core Team 2012) was used to implement the optimization of the RF parameters ntree and mtry. The input parameters of the RFGA model are subject to the GA-based parameter optimization process.
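The GA-based search over ntree and mtry can be sketched as follows. This is a simplified illustration in standard-library Python, not the rgenoud procedure used in the study. The objective `toy_oob_error` is a hypothetical stand-in for the OOB error of a fitted forest, and the population size, generation count, and mutation rate are reduced, assumed values chosen so the example runs quickly.

```python
import random

BOUNDS = {"ntree": (1, 2000), "mtry": (1, 14)}  # search domain quoted in the text
KEYS = ("ntree", "mtry")

def toy_oob_error(ntree, mtry):
    """HYPOTHETICAL stand-in for the OOB error of a fitted random forest."""
    return 0.30 + 0.05 * abs(mtry - 3) / 13 + 0.10 / (1 + ntree / 100)

def random_individual(rng):
    return tuple(rng.randint(*BOUNDS[k]) for k in KEYS)

def crossover(a, b, rng):
    # Uniform crossover: each gene is inherited from one parent at random.
    return tuple(rng.choice(pair) for pair in zip(a, b))

def mutate(ind, rng, rate=0.2):
    # With probability `rate`, redraw a gene uniformly inside its bounds.
    genes = []
    for gene, key in zip(ind, KEYS):
        if rng.random() < rate:
            gene = rng.randint(*BOUNDS[key])
        genes.append(gene)
    return tuple(genes)

def run_ga(objective, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: objective(*ind))
        parents = population[: pop_size // 2]  # truncation selection
        elite = population[:2]                 # keep the two best unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    return min(population, key=lambda ind: objective(*ind))

best_ntree, best_mtry = run_ga(toy_oob_error)
print(best_ntree, best_mtry, round(toy_oob_error(best_ntree, best_mtry), 4))
```

In a real application the objective would fit a random forest for each candidate pair and return its OOB error, which is why a full run with a large population can take hours.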
Only the pair of parameters that minimizes the OOB error rate in this step is used as input to the RFGA model. For running RFGA, the maximum number of generations was set to 100, the population size to 300, and the domain of allowable values for each optimized parameter was bounded (mtry values between 1 and 14, ntree values between 1 and 2000). The run time of this process until the calculation was complete was approximately 2 h 20 min.

Validation of groundwater potential maps (GPM)

In the current study, the receiver operating characteristics (ROC) curve was used to determine the performance of the GPMs produced by the implemented models. The area under the ROC curve (AUC) shows the quality of a forecast system by representing the ability of the system to correctly predict the occurrence or nonoccurrence of specific events (Negnevitsky 2002). The AUC ranges from 0 to 1. The qualitative relationship between AUC and prediction accuracy can be classified as excellent (0.9-1), very good (0.8-0.9), good (0.7-0.8), average (0.6-0.7), and poor (0.5-0.6). Following a reviewer comment, and in order to consider the discharge values of the springs, two weights were assigned to the springs so that their discharge is taken into account in the evaluation process. To do this, the median of the spring discharge values was calculated. Then, a weight of 2 was assigned to the springs with discharge greater than the median value, while the other springs were assigned a weight of 1. Finally, when calculating ROC values, the springs with weight 2 were counted twice in the analysis, while the other springs were counted once. This procedure increases the influence of the springs with higher discharge in the evaluation process.

Results

Support vector machine

In the current study, four kernels of the SVM model were optimized by cross-validation and GPMs were plotted in ArcGIS 9.3.
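The discharge-weighted AUC described in the validation subsection can be sketched as below. This is an illustration in standard-library Python under stated assumptions: the function names are hypothetical, all numbers are made up, and the use of non-spring locations as the negative class is an assumption, since the text does not describe how negatives were chosen. Giving a spring weight 2 is equivalent to counting it twice.

```python
from statistics import median

def discharge_weights(discharges):
    """Weight 2 for springs above the median discharge, weight 1 otherwise."""
    threshold = median(discharges)
    return [2 if q > threshold else 1 for q in discharges]

def weighted_auc(pos_scores, pos_weights, neg_scores):
    """Weighted chance that a spring outscores a non-spring cell (ties count 0.5)."""
    total = 0.0
    for score, weight in zip(pos_scores, pos_weights):
        wins = sum(1.0 if score > n else 0.5 if score == n else 0.0
                   for n in neg_scores)
        total += weight * wins
    return total / (sum(pos_weights) * len(neg_scores))

# Made-up illustration values, not data from the study.
spring_discharge = [0.2, 0.5, 1.0, 4.0, 30.0]      # litres per second
spring_potential = [0.35, 0.80, 0.55, 0.90, 0.60]  # model output at spring cells
non_spring_potential = [0.10, 0.40, 0.30, 0.70]    # model output at assumed non-spring cells

weights = discharge_weights(spring_discharge)
auc = weighted_auc(spring_potential, weights, non_spring_potential)
print(weights, round(auc, 4))
```

With these toy numbers the two highest-discharge springs receive weight 2, so a model that ranks them well is rewarded more than one that ranks only the low-discharge springs well.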
Based on the results, the best SVM with the linear (LN) kernel had a cost value of 0.001. For the PL kernel, gamma = 0.5, cost = 0.1, and degree = 2 gave the best performance. In the case of SVM-SIG, the best performance was obtained with gamma = 1 and cost = 0.01. For SVM with the radial basis function (RBF) kernel, gamma = 0.5 and cost = 10 gave the best performance.

The resultant GPMs produced using the different SVM kernels are shown in Fig. 5 and Table 2. According to the results, the low, moderate, high, and very high classes in the GPM produced by SVM-LN occupy 15.88, 36.05, 33.75, and 14.32% of the study area, respectively. The low, moderate, high, and very high classes in SVM-PL cover 3.38, 22.12, 47.52, and 26.98% of the study area, respectively. In the case of SVM-SIG, 22.87, 32.98, 30.50, and 13.64% of the study area were assigned to the low, moderate, high, and very high classes, respectively. The results of SVM-RBF showed that the low, moderate, high, and very high classes cover 22.01, 45.85, 22.39, and 9.74% of the study area, respectively.

Random forest (RF), and genetic algorithm optimized random forest (RFGA)

As mentioned in the methods section, two methods were used to optimize the RF model: caret and the genetic algorithm. The final model by RF-caret had ntree = 1600 and mtry = 2, while the final model by RFGA had ntree = 1744 and mtry = 2. The results showed that the out-of-bag error for RFGA (0.316) was lower than its value for RF-caret (0.35%). Also, the results of the ROC analysis showed better performance of RFGA than RF-caret, with area under the ROC curve values of 86.5 and 85.6, respectively. Considering the better performance of the RFGA model, its results on the importance of effective factors and its final GPM are presented, and the results of RF-caret are not considered further.

Figure 4 represents the mean decrease accuracy and mean decrease Gini obtained by RFGA.
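The idea behind the mean decrease accuracy measure, permuting one factor and observing how much accuracy drops, can be sketched as below. This is a simplified illustration in standard-library Python: `toy_model` is a hypothetical classifier standing in for a fitted forest, the data are synthetic, and the permutation is applied to the whole dataset rather than per tree on OOB samples as a random forest does.

```python
import random

def accuracy(predict, rows, labels):
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, column, rng):
    """Drop in accuracy after shuffling one feature column, others left fixed."""
    baseline = accuracy(predict, rows, labels)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + (v,) + r[column + 1:] for r, v in zip(rows, shuffled)]
    return baseline - accuracy(predict, permuted, labels)

# HYPOTHETICAL classifier standing in for a fitted forest: it looks only at
# feature 0 ("altitude", scaled to 0-1) and ignores feature 1 ("plan curvature").
def toy_model(row):
    return 1 if row[0] > 0.5 else 0

rng = random.Random(1)
rows = [(rng.random(), rng.random()) for _ in range(200)]  # synthetic cells
labels = [toy_model(r) for r in rows]                      # labels follow the rule exactly

imp_altitude = permutation_importance(toy_model, rows, labels, 0, rng)
imp_curvature = permutation_importance(toy_model, rows, labels, 1, rng)
print(round(imp_altitude, 3), imp_curvature)
```

A factor the model relies on shows a large accuracy drop when shuffled, while a factor the model ignores shows none, which is the pattern used to rank factors in Fig. 4.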
According to the mean decrease accuracy, altitude had the highest importance, followed by TWI, slope angle, and aspect, while profile curvature and plan curvature had the lowest importance. On the other hand, the mean decrease Gini indicated that land use and lithology were the least important factors in groundwater potential mapping. The GPM produced using RFGA is shown in Fig. 5. According to the results, the low, moderate, high, and very high classes in the GPM produced by RFGA occupy 27.2, 32.4, 25.5, and 14.8% of the study area, respectively.

Validation of the GPMs

The ROC was calculated for all GPMs with the spring validation dataset. The AUC-ROC results are shown in Fig. 6. The AUC-ROC for GPMs produced by the methods implemented in the current study ranges from 76.9 to 85.5%. AUC-ROC values for RF and RFGA were estimated as 84.6 and 85.5%, respectively. AUC-ROC values for SVM-LN, SVM-PL, SVM-SIG, and SVM-RBF were estimated as 79.3, 77, 77.7, and 76.9%, respectively.

Discussion

In this section, the results are discussed in three parts: (i) the performance of the models, (ii) the importance of the effective factors, and (iii) the precision of the GPMs.

The performance of the models

The results showed that RFGA performed better than RF-caret. One of the advantages of GA is its capability to solve any optimization problem based on the chromosome approach; another important characteristic is its capability to handle multiple solution search spaces and solve the problem in such an environment (Tabassum and Mathew 2014). These advantages may have caused RFGA's better performance in the current study.

Also, it can be seen that both RFs (i.e. RF-caret and RFGA) performed better than the different kernels of the SVM model. Among the SVM kernels, SVM-LN had the best performance, followed by SVM-SIG, SVM-RBF, and SVM-PL. However, their performance was similar.
Based on the results, it is evident that SVM could be used as an efficient machine learning model in groundwater potential mapping. One of the drawbacks of the SVM relates to the time needed for the analysis. In addition, several criteria should be tested in order to find the optimum values for the modeling process (Tehrany et al. 2015). However, the efficiency of the SVM could be increased by building ensemble models. Tehrany et al. (2015) used an ensemble of weights of evidence and SVM models in flood mapping. Their results proved the efficiency and strength of the ensemble method over the individual methods. There are several potential sources of error in the datasets used for groundwater modeling, including measurement errors, limitations in field data collection, sampling bias, etc. These errors could affect the overall accuracy of the SVM models (Moisen et al. 2006).

The importance of effective factors in groundwater potential mapping

The importance of effective factors was determined using RFGA as the best model in the current study. Based on the results, altitude, TWI, slope angle, and slope aspect were overall the most effective factors for groundwater potential. On the other hand, plan curvature, profile curvature, land use, and lithology were the least effective factors. A growing body of literature investigates the importance of different effective factors in groundwater potential mapping (Naghibi and Pourghasemi 2015; Rahmati et al. 2016). The results of Naghibi and Pourghasemi (2015) showed that altitude, distance from faults, SPI, and fault density had the highest importance in groundwater potential mapping. In another study, Rahmati et al. (2016) reported that altitude, drainage density, lithology, and land use were the most influential factors for groundwater potential.
Comparing the results of the current study with those of the two mentioned studies shows that the importance of effective factors in groundwater potential mapping depends on the indicator, the methods, and the hydrological, geological, and climatic conditions of the target area.

The precision of the GPMs

Assuming that a better model is the one which delineates the high and very high classes more precisely, a model with a lower percentage of area in the high and very high classes could be more helpful in water resources planning and management. A more precise GPM could help water resources managers to make better and more accurate decisions about areas for growing and even water conservation techniques. According to the results, the SVM-RBF and RFGA models had the lowest percentage of the high and very high classes with 32.1 and 40.3% of the study area, respectively.

Conclusion

In general, the water crisis in the 21st century is much more related to management and planning than to a real crisis of scarcity and drought stress. Lack of knowledge of water resources and inappropriate water resources management plans and strategies have made the water crisis worse in arid and semi-arid regions. Therefore, the first step in appropriate planning of water resources is to gain knowledge of these vital resources. Groundwater is one of the most important water supplies, especially in arid and semi-arid countries with an extreme lack of water, growing population, and resultant droughts. Considering these problems, in the current study we evaluated the performance of different kernels of the SVM model and two strategies for optimization of RF (i.e. caret and GA). The results of the current study showed that RFGA had the best performance, followed by SVM-LN, SVM-SIG, SVM-RBF, and SVM-PL. The RFGA was successfully implemented in the current study. Also, different kernels of the SVM were used for producing GPMs with acceptable performance.
However, their results were not as good as the performance of the RFs. Furthermore, altitude, TWI, slope angle, and slope aspect were the most effective factors in groundwater potential assessment. The methodology produced in the current study could be transferred to and tested in other areas for producing GPMs. As a final conclusion, GPMs could significantly help water resources managers and planners to better understand water resources conditions, exploitation, and conservation plans.
