Decision makers are increasingly demanding climate information at national to local scales in order to address the risks posed by projected climate changes and their anticipated impacts. Readily available climate change projections are provided at global and continental spatial scales for the 21st century (IPCC, 2007). These projections, however, do not fit the needs of sub-national adaptation planning, which requires regional and/or local projections of likely conditions five to ten years from now. Global Circulation Models (GCMs) simulate the Earth’s climate using physically based mathematical equations that describe atmospheric, oceanic, and biotic processes, interactions, and feedbacks. Coulibaly et al. (2005) defined GCMs as the mathematical models used to simulate the present climate and project future climate with forcing by greenhouse gases and aerosols. They are the primary tools that provide reasonably accurate global, hemispheric, and continental scale climate information and are used to understand present climate and future climate scenarios under increased greenhouse gas concentrations (Caffrey and Farmer, 2014). However, the spatial resolution of GCMs remains quite coarse, on the order of 300 km x 300 km, and at that scale the regional and local details of the climate that are influenced by spatial heterogeneities in the regional physiography are lost. GCMs are therefore inherently unable to represent local sub-grid scale features and dynamics, such as local topographical features and convective cloud processes (Wigley et al. 1990; Carter et al. 1994). As a result, GCM simulations of local climate at individual grid points are often poor, especially where the area has complex topography (Schubert 1998).
There is no theoretical level of spatial aggregation at which GCMs can be considered skillful, although there is evidence of skill at several grid lengths (Widmann and Bretherton 2000). However, most climate change impact studies, such as those on agriculture, health, and hydrology, rely on impact models that simulate sub-grid scale phenomena and therefore require input data (such as precipitation and temperature) at a similar sub-grid scale. GCM outputs must therefore be converted into, at a minimum, reliable daily rainfall and temperature time series at a scale useful for climate change impact studies. In order to derive climate projections at scales that decision makers desire, a process termed downscaling has been developed. The methods used to convert GCM outputs into the local meteorological variables required for reliable hydrological modeling are usually referred to as “downscaling” techniques (Coulibaly et al., 2005). There are various methods of downscaling, each with its own merits and limitations. Interest in nonlinear regression methods, namely artificial neural networks (ANNs), is increasing because of their high potential for complex, nonlinear, and time-varying input–output mapping. Although the weights of an ANN are similar to nonlinear regression coefficients, the unique structure of the network and the nonlinear transfer function associated with each hidden and output node allow ANNs to approximate highly nonlinear relationships. Moreover, while other regression techniques assume a functional form, ANNs allow the data to define the functional form. ANNs are therefore generally believed to be more powerful than the other regression-based downscaling techniques (von Storch et al. 2000). The simplest form of ANN (i.e., the multilayer perceptron) is, however, reported to give results similar to those of multiple regression downscaling methods (Schoof and Pryor 2001).
Weichert and Burger (1998) reported that an ANN model can account for some heavy rainfall events that were not identified by a linear regression downscaling technique. Cannon and Whitfield (2002) also found that an ensemble ANN downscaling model was capable of predicting changes in streamflows using only large-scale atmospheric conditions as model input. Nevertheless, some studies have shown that the standard ANN method commonly used for modeling hydrologic variables is not well suited to processing temporal sequences and often yields suboptimal solutions (Coulibaly et al. 2001a). There are, however, other categories of neural networks that have a memory structure to account for temporal relationships in the input–output mappings, and they appear more suitable for complex nonlinear system modeling (Gautam and Holz 2000; Coulibaly et al. 2001b). More recently, Tatli et al. (2004) proposed a Jordan-type recurrent neural network that uses not only large-scale predictors but also the previous states of the relevant local-scale variables. The purpose of this study is to identify optimal temporal neural networks that can capture the complex relationship between selected large-scale predictors and locally observed meteorological variables (or predictands). The paper therefore aims to highlight the applicability of temporal neural networks as downscaling methods for improving daily precipitation and temperature estimates at some locations. The downscaling models are developed and validated using large-scale predictor variables derived from the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis dataset. The paper focuses specifically on time-lagged feed-forward neural networks (TLFNs), which have temporal processing capabilities without resorting to complex and costly training methods.
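The time-lagged idea can be illustrated with a minimal sketch of how an input window might be assembled before it is passed to a feed-forward network. This is not the implementation used in this study: the function name `make_lagged_inputs`, the parameter `n_lags`, and the toy predictor values are all hypothetical, and the sketch covers only the construction of the lagged inputs, not the network or its training.

```python
from typing import List, Sequence


def make_lagged_inputs(
    predictors: Sequence[Sequence[float]], n_lags: int
) -> List[List[float]]:
    """Stack each day's predictors with those of the previous n_lags days.

    predictors[t] holds the large-scale predictor values for day t.
    One row is returned for every day t >= n_lags, ordered as
    [x(t), x(t-1), ..., x(t-n_lags)], so the first n_lags days are dropped.
    """
    if n_lags < 0:
        raise ValueError("n_lags must be non-negative")
    rows = []
    for t in range(n_lags, len(predictors)):
        row: List[float] = []
        for lag in range(n_lags + 1):
            row.extend(predictors[t - lag])
        rows.append(row)
    return rows


# Hypothetical toy data: 5 days, 2 large-scale predictors per day.
daily = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]

# With n_lags=2 each row carries the present state and the two previous states.
lagged = make_lagged_inputs(daily, n_lags=2)
for row in lagged:
    print(row)
```

Each row would then serve as one input vector for the network, with the observed local predictand on the same day as the target. The number of lags is a modelling choice that has to be tuned for each site and variable.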
A major assumption in using TLFNs is that the local weather is conditioned not only by the present large-scale atmospheric state but also by past states. In addition, the optimal TLFN models are applied to downscale the outputs of the Hadley Centre Coupled Model, version 3 (HadCM3) GCM, forced with the Intergovernmental Panel on Climate Change (IPCC) IS92a scenario. The downscaling results are then compared with bias-correction and spatial disaggregation (BCSD) downscaled data. The remainder of the paper is organized as follows. Section 2 provides an overview of the downscaling methods. Section 3 provides a brief description of the study area and the data used in this study. Section 4 introduces temporal neural networks, and comparative results from the downscaling experiments are reported and discussed in section 5. Conclusions are drawn in section 6.

2. Downscaling methods: An overview

Downscaling bridges the gap between the scale of GCMs and the resolution required for practical applications at regional and local scales. It derives local- to regional-scale information (a point value to 100 km) from larger-scale models or data analyses (Bhuvandas et al., 2014). There are two major approaches to downscaling: dynamical downscaling and empirical/statistical downscaling. The dynamical approach uses the output of a GCM as input to a high-resolution regional model in order to produce high-resolution data for assessment. It is usually based on regional climate models (RCMs), which generate finer resolution output based on atmospheric physics over a region using GCM fields as boundary conditions (Giorgi and Mearns, 1991, 1999). The physical consistency between GCMs and RCMs is controlled by the agreement of their large-scale circulations (von Storch et al., 2000). The choice of domain size controls the divergence between the RCMs and their driving GCMs (Jones et al., 1997). As a consequence of their higher spatial resolution, RCMs provide a better description of topographic phenomena such as orographic effects (Christensen et al., 2007). Moreover, the finer dynamical processes in RCMs produce more realistic mesoscale circulation patterns (Buonomo et al., 2007). However, RCMs are not expected to capture the observed spatial precipitation extremes at a fine cell scale (Fowler et al., 2007). Many studies (e.g., Rauscher et al., 2010) have found that the skill improvement of RCMs depends not only on the RCM resolution but also on the region and the season. Although RCMs may give feedback to their driving GCMs, many dynamical downscaling approaches are based on one-way nesting and have no feedback from the RCM to the driving GCM (Maraun et al., 2010).
The main problem with RCMs is that significant biases in the simulation of mean precipitation on large scales can be inherited from the driving GCM (Durman et al., 2001). In addition, because the boundary conditions are derived from a specific GCM, the use of different GCMs will result in different projections (Mujumdar and Kumar, 2012). Inter-model differences are related to model biases. Moreover, Christensen et al. (2001) suggest that GCM biases may not be linear and may not cancel out when differences are simply taken between the control and future scenarios, as many studies have done. Despite their rapid development, RCMs are still beset by problems related to parameterisation schemes, because physical processes are modelled at a scale on which they cannot be explicitly resolved (Maraun et al., 2010).
The other approach is empirical/statistical downscaling, which relies on statistical relationships that link large-scale atmospheric variables with local/regional climate variables. Because it is based on statistical relationships between coarse GCM output and fine-scale observed data, statistical downscaling is a straightforward means of obtaining high-resolution climate projections (Wilby et al., 2004). It may be used for all kinds of locations whenever impact models require small-scale data, provided that suitable observed data are available to derive the statistical relationships. The output is generally small-scale information on future climate or climate change (maps, data, etc.). The key inputs are appropriate observed data to calibrate and validate the statistical model(s) and GCM data for future climate to drive the model(s) (Wilby et al., 1998). Taking the relationship with RCMs into consideration, Maraun et al. (2010) divided statistical downscaling approaches into perfect prognosis (PP), model output statistics (MOS), and weather generators.
In PP, the statistical downscaling relationships are established from observations (local scale). In MOS, gridded RCM simulations and observations (local scale) are used together to develop the downscaling relationship. Weather generators are hybrid downscaling methods that use PP, MOS, or both. With respect to the type of statistical method, downscaling can be categorical, continuous-valued, or hybrid (Fowler et al., 2007; Wilby and Wigley, 1997). In categorical downscaling, classification and clustering are the common statistical techniques used to assign data to different groups according to large-scale circulation patterns and data attributes (Zorita and von Storch, 1999). For continuous-valued downscaling, regression relationships are widely used to map large-scale predictors onto local-scale predictands (Chandler and Wheater, 2002). When the GCM-simulated variables are large in number, nonparametric stepwise predictor identification analysis may be performed based on partial mutual information (Mehrotra and Sharma, 2010). In hybrid downscaling, different statistical approaches are combined (Wilby et al., 2002); such methods are sometimes referred to as weather generators and are based on algorithms of conceptual processes (Chandler, 2000; Kilsby et al., 2007). For modeling daily precipitation occurrence, the spell-length approach (Wilks, 1999), a type of weather generator, is also used: instead of simulating rainfall occurrence day by day, the model fits probability distributions to the observed relative frequencies of wet and dry spell lengths. In all cases, the quality of the downscaled product depends on the quality of the driving model (Solomon et al., 2007).
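The spell-length idea described above can be sketched in a few lines. The example below is an illustration under stated assumptions, not the formulation of Wilks (1999): the 1.0 mm wet-day threshold, the choice of a simple geometric distribution for spell lengths, the function names, and the rainfall values are all hypothetical.

```python
from itertools import groupby
from typing import Dict, List, Sequence


def spell_lengths(
    rain_mm: Sequence[float], wet_threshold: float = 1.0
) -> Dict[str, List[int]]:
    """Split a daily rainfall series into consecutive wet and dry spells.

    A day counts as wet when rainfall >= wet_threshold (an assumed value).
    Returns the length in days of every wet spell and every dry spell.
    """
    spells: Dict[str, List[int]] = {"wet": [], "dry": []}
    for is_wet, run in groupby(rain_mm, key=lambda r: r >= wet_threshold):
        spells["wet" if is_wet else "dry"].append(sum(1 for _ in run))
    return spells


def fit_geometric(lengths: Sequence[int]) -> float:
    """Maximum-likelihood estimate of p for a geometric distribution on 1, 2, ...

    Under this assumed model P(length = k) = p * (1 - p) ** (k - 1),
    and the estimate is the reciprocal of the mean spell length.
    """
    if not lengths:
        raise ValueError("at least one spell is required")
    return len(lengths) / sum(lengths)


# Hypothetical ten-day rainfall record in mm.
rain = [0.0, 0.0, 5.2, 3.1, 0.0, 12.0, 0.4, 0.0, 0.0, 2.5]

spells = spell_lengths(rain)
p_wet = fit_geometric(spells["wet"])  # parameter for wet-spell lengths
p_dry = fit_geometric(spells["dry"])  # parameter for dry-spell lengths
print(spells, p_wet, p_dry)
```

A generator built on this approach would alternate between drawing a wet-spell length and a dry-spell length from the fitted distributions, rather than deciding wet or dry one day at a time. Other distributions could be substituted for the geometric one where it fits the observed spell frequencies poorly.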