A Methodology to identify an Optimum Rail Network for Colombo Metropolitan Region

Locating rail stations and defining the route structure for new railways in a country or region is a complicated planning process which involves the consideration and analysis of various data sets and other relevant information. It includes the evaluation of socio-economic and technical parameters in order to minimize the cost and environmental impact of the different alternatives and to develop alternative stations and corridors for the rail network. Using modern tools such as Geographic Information Systems (GIS) to identify suitable locations for rail stations and to select optimum routes involves managing a variety of data sets from different sources at different scales. This work investigates and demonstrates the capabilities of GIS in rail network planning, taking the proposed Colombo Metropolitan Region (CMR) Mass Rapid Transit (MRT) system as a case study. The study identifies the information needed for planning and the evaluation criteria for locating stations and an optimum network using spatial multi-criteria decision making (MCDM) processes with the help of GIS. The network data model provides an open, generic data model with many common GIS analysis capabilities which are applied to rail network analysis.


Introduction
GIS is now widely used in many different applications. Experience shows that GIS is an efficient tool for solving optimization tasks involving spatially distributed linear objects such as railways, roads and pipelines.
Planning rail routes requires a detailed evaluation of physiographic factors, landscape, engineering-geological and other requirements for the investigation area. It also includes determining the length of the route, calculating intersections with rivers, roads and railways, and taking account of the complexities of mountain relief. Consideration of the railroad construction costs, which depend on geological structure and land cover (rocks, marshland etc.), and of many other factors requires detailed spatial multi-criteria analysis.
The spatial analysis capabilities of existing GIS make operative evaluation possible. Modern GIS software allows complex operations to be automated, such as intersection with different linear and polygonal objects, estimation of transport costs during construction and operation of the route, and calculation of an integrated construction cost.
The need for GIS applications in planning route variants and locating stations arose while planning a railway for CMR. The proposed rail route and stations should satisfy all technical and construction requirements, including those for operation of the service.
Locating rail stations and planning a railway network involve specialized resource allocation and route laying. These are complex problems that depend on multiple factors. Solving them in order to make decisions requires a sequence of processes in which the factors and criteria are processed to obtain relevant information.

Objective
The general problem is to find suitable locations for rail stations and to plan rail routes using GIS technology, considering the influence of various economic, ecological and technical factors. After the rail stations have been located, it is necessary to identify new rail links that connect all new stations and any existing railway lines.
The main objective of this work is to locate stations and plan rail routes for the CMR by exploring the potential of GIS-based spatial multi-criteria decision making methods. After all possible rail routes and station locations have been obtained, the optimum network which satisfies all requirements is identified.

Overview of Methodology
The methodology of this work has two parts: data collection and data analysis.
In the data collection process, after visiting site organizations/GIS agencies and Internet sources, topography maps, geology maps, reports and paper maps are scanned, geo-referenced and digitized. Geo-referenced maps are edited using statistical data. Available data are categorized as attributes in order to form a vector-raster map. Resampled Shuttle Radar Topography Mission (SRTM) data can be used to obtain the slope and a Digital Elevation Model (DEM).
The data analysis part is carried out using SMCDM (Spatial Multi-Criteria Decision Making), which is used to develop a model for the selection of nodes and routes. Suitable locations for railway nodes and suitable arrangements for laying routes are then chosen. An optimum network is then selected using the MST (Minimum Spanning Tree) method. The suitability of the network obtained by MST is checked, and the network is modified and re-analysed to obtain the optimum network which satisfies all the requirements.

Spatial Multi Criteria Decision Making
SMCDM involves the evaluation of geographical events based on criterion values and the decision maker's preferences with respect to a set of evaluation criteria.
In this work the set of criteria was selected on the basis of the properties of the attributes. The attributes used were comprehensive and measurable [1].
Once a set of criteria for evaluating decisions has been established, each criterion is represented as a map layer in the GIS database. Such a layer is called a criterion map. The term criterion map indicates the generic nature of the criterion concept and is used to emphasize the attribute-objective relationship [1].
MCDM problems involve determination of the relative importance of the criteria. This is usually achieved by assigning a weight to each criterion. A weight can be defined as a value assigned to an evaluation criterion that indicates its importance relative to other criteria under consideration. The larger the weight, the more important is the criterion.
The assignment of weights of importance to evaluation criteria accounts for changes in the range of variation of each evaluation criterion. The weights are usually normalized to sum to 1. In the case of n criteria, a set of weights is defined as follows:
$w = (w_1, w_2, \ldots, w_j, \ldots, w_n)$ (1)
$\sum_{j=1}^{n} w_j = 1$ (2)
A multi-criteria decision rule is a procedure that allows alternatives to be ordered, so that it can be decided which alternative is preferred to another. It integrates the data, the information on alternatives and the decision maker's preferences into an overall assessment of the alternatives [2].
Any spatial decision making problem rule involves a set of attributes and a set of objectives. Spatial multi-criteria decision rules can be categorized into multi-attribute decision making (MADM) and multi-objective decision making (MODM) decision rules [2].
The objectives of this work are implemented with multi-attribute decision rules, which are based on the assumption that the attributes serve as both decision variables and objectives. The aim of MADM analysis is to choose the most preferred alternative or to rank the alternatives in descending order of preference. There are numerous decision rules that can be used for MADM problems. The following MADM methods for GIS-based decision making have been used [3]:
• Weighted linear combination (WLC)
• Analytical hierarchy process (AHP)
In the WLC method the decision maker directly assigns weights of "relative importance" to each attribute. A total score is obtained for each alternative by multiplying the importance weight assigned to each attribute by the scaled value given to the alternative on that attribute, and summing the products over all attributes:
$A_i = \sum_{j} w_j x_{ij}$
where $x_{ij}$ is the score of the ith alternative with respect to the jth attribute, and i identifies the particular cell located in the map. The weights $w_j$ are normalized so that $\sum_{j} w_j = 1$. The most preferred alternative is selected by identifying the maximum value of $A_i$ (i = 1, 2, ..., m).
This method can be processed using any GIS system having overlay capabilities that allow the evaluation criterion map layers to be aggregated to determine the composite map layer.
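As a minimal sketch of the weighted linear combination described above, the following Python fragment aggregates standardized criterion layers cell by cell. The function names, the 2 x 2 layers and the weights are illustrative assumptions and are not values from this study; a GIS overlay tool would perform the same arithmetic on full raster layers.

```python
def weighted_linear_combination(layers, weights):
    """Combine standardized criterion layers cell by cell: A_i = sum_j w_j * x_ij.

    layers  -- list of equally sized 2-D lists (one standardized criterion map each)
    weights -- one weight per layer; normalized here so that they sum to 1
    """
    if len(layers) != len(weights):
        raise ValueError("one weight is needed per criterion layer")
    total = float(sum(weights))
    norm = [w / total for w in weights]
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [sum(w * layer[r][c] for w, layer in zip(norm, layers)) for c in range(cols)]
        for r in range(rows)
    ]


def best_cell(scores):
    """Return (row, col) of the cell with the highest composite score."""
    return max(
        ((r, c) for r in range(len(scores)) for c in range(len(scores[0]))),
        key=lambda rc: scores[rc[0]][rc[1]],
    )


# Hypothetical standardized layers (0 = worst, 1 = best) on a 2 x 2 grid.
slope = [[0.9, 0.2], [0.6, 0.4]]
land_use = [[0.5, 0.8], [0.7, 0.1]]
scores = weighted_linear_combination([slope, land_use], [0.6, 0.4])
preferred = best_cell(scores)
```

Normalizing the weights inside the function keeps the composite score on the same scale as the input layers even if the supplied weights do not already sum to 1.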
The AHP method, developed by Saaty (1980), is used for analyzing complex decisions with multiple criteria. The method uses hierarchical structures to represent a decision problem, and then develops priorities for the alternatives based on the decision maker's judgments throughout the system [3].

AHP in Rail Route Planning
AHP has been used to determine the weights of relative importance of the factors/criteria in rail route planning. In this case the following seven factors/criteria were selected for developing the methodology for CMR: Landuse/Land cover (Ln), Slope (Sl), Settlement (St), Forest (Fr), Road (Rd), Hydrology (Hd) and Geology (Gl).
The first step in the AHP procedure is to decompose the decision problem into a hierarchy that consists of the most important elements of the decision problem. The objective to identify suitability for laying an optimum rail route is at the highest level.
The next level represents the factors/criteria that need to be considered for the suitability assessment and the attributes associated with respective criterion. The alternatives are represented in the GIS database so that each layer contains the attribute values assigned to the alternatives, and each alternative (e.g. cell or polygon) is related to the higher level elements (i.e., attributes).
The suitability will be evaluated on the relative importance of the elements at each level of the decision hierarchy, that is, objectives, factors/criteria and attributes. The procedure is as follows [3]:
1. Define the set of evaluation criteria (map layers);
2. Standardize each criterion map layer;
3. Define the criterion weights; that is, the weight of "relative importance" is directly assigned to each criterion map;
4. Create the weighted standardized map layers; that is, multiply the standardized map layers by the corresponding weights;
5. Generate the overall score for each alternative using the add overlay operation on the weighted standardized map layers;
6. Rank the alternatives according to overall performance score; that is, the alternative with the highest score is the best alternative or cost surface.
Once the hierarchy is formed, the second step applies the principle of comparative judgment, i.e. the pair-wise comparison method, in order to obtain the criterion weights. This procedure greatly reduces the conceptual complexity of decision making since only two components are considered at any given time [4].
As mentioned above a suitability assessment for laying rail route was applied on the basis of seven factors/criteria to determine the weight of relative importance through pair-wise comparison. The procedure of pair-wise comparison (Saaty 1980) consists of three steps: generation of the pair-wise comparison matrix, the computation of criterion weights, and the estimation of consistency ratio [4].
The AHP method employs an underlying scale with values from 1 to 9 to rate the relative preferences for two criteria (Table 1). The 9-point continuous scale allows an individual to compare two criteria at a time and to rank them consistently. The comparison matrix is reciprocal.
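The pair-wise comparison step can be sketched as follows. This is an illustrative example only: the three criteria and their judgments are hypothetical, and the weights are approximated by averaging the normalized columns of the matrix, which is a common shortcut assumed here rather than the procedure documented for this study.

```python
def reciprocal_matrix(n, judgments):
    """Build an n x n pair-wise comparison matrix.

    judgments maps (i, j) with i < j to a value on the 1-9 scale; the
    reciprocal 1/value is filled in for (j, i) and the diagonal is 1.
    """
    m = [[1.0] * n for _ in range(n)]
    for (i, j), value in judgments.items():
        m[i][j] = float(value)
        m[j][i] = 1.0 / value
    return m


def ahp_weights(matrix):
    """Approximate the priority vector by averaging the normalized columns."""
    n = len(matrix)
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [sum(matrix[r][c] / col_sums[c] for c in range(n)) / n for r in range(n)]


def lambda_max(matrix, weights):
    """Estimate the principal eigenvalue as the mean of (A w)_i / w_i."""
    n = len(matrix)
    aw = [sum(matrix[r][c] * weights[c] for c in range(n)) for r in range(n)]
    return sum(aw[r] / weights[r] for r in range(n)) / n


# Hypothetical judgments for three criteria: criterion 0 vs 1 = 3,
# criterion 0 vs 2 = 5, criterion 1 vs 2 = 2.
A = reciprocal_matrix(3, {(0, 1): 3, (0, 2): 5, (1, 2): 2})
w = ahp_weights(A)
lam = lambda_max(A, w)
```

The resulting weights sum to 1, and the eigenvalue estimate can be passed on to a consistency check of the judgments.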
According to the Urban Transport Development study for CMR [5], the relative importance of criteria such as land use and settlement with reference to each other can be identified. The comparison is mainly based on railway construction cost, desire level, feasibility and other obligations. The process has to be carried out under the supervision of railroad specialists. The values can be altered at the analysis stage, and the results can then be used to compare the optimum network models obtained for each variation.
Table 1: Scale of relative preference for pair-wise comparison
Intensity of importance | Description
1 | Two requirements are of equal value
3 | Experience slightly favours one requirement over another
5 | Experience strongly favours one requirement over another
7 | A requirement is strongly favoured and its dominance is demonstrated
9 | The evidence favouring one over another is of the highest possible order of affirmation
2, 4, 6, 8 | When compromise is needed
Reciprocals | If requirement one has one of the above numbers assigned to it when compared with requirement two, then requirement two has the reciprocal value when compared with requirement one
Each new rail station should be located at an optimal distance from the existing stations, and the set of stations should include those needed for the future development of a rail network among them.

Overview of Multi-criteria Decision Making
For CMR the nodes are chosen mainly on the basis of future development rather than by multi-criteria evaluation (MCE), since CMR is a complex, highly urbanized area. The objective is therefore to find the best rail routes within CMR, that is, the shortest and most convenient routes from the proposed stations to the existing railway stations.
GIS has been used to find a Least Cost Path (LCP), that is, the shortest and optimal path for railways, in order to minimize time and fuel costs. Solutions to this problem were obtained by implementing a model in a GIS package, ArcGIS Model Builder, to automatically perform all processes necessary to calculate cost distances and paths between the proposed stations within CMR [6].
The values of the cost surfaces, expressed in terms of a particular measure of cost, were calculated. These values often have an actual economic meaning and are equal to the cost of moving across the landscape. Geographic problems often require the analysis of many different factors for laying a new rail route, such as land cost. The cost values were calculated relative to a fixed base amount and were given values in the range 1 to 10, plus a fixed value of 1000. These values are assumed to anticipate the cost of the railway crossing geographical features with certain attributes, where 1 is a low cost (equal to the base cost), 10 is a high cost and 1000 is the highest cost (virtually a barrier). The selection of an evaluation scale is important for obtaining a suitable cost surface. In line with the cost values in the range 1 to 10, an evaluation scale of 1 to 10 in increments of 1 has been assigned, where 1 is least suitable and 10 most suitable. This evaluation scale is one of the requirements of the Weighted Overlay tool. The cost surfaces were created in order to find an optimal rail route among the suitably located new stations and the existing stations. The resulting route, called a least cost path (LCP), is a short optimum path for the proposed rail route and was derived using Model Builder [6].
According to the given cost values, the LCP passes through low-cost cells that minimize the construction costs of routes [6]. After creating cost surfaces it is necessary to calculate how suitable each area, or cell, is to travel through, or how much it will cost to travel through each cell. In this study a GIS analysis was performed using ArcGIS Model Builder. Models are represented as sets of spatial processes, such as reclassification to determine the categories for each input factor; cost values are then assigned to derive cost surfaces, which are combined with overlay techniques to derive a suitability cost surface.
In the weighted overlay technique, each of the input factors was assigned a weight of influence based on its importance, as derived in section 5.1, and each cost surface was multiplied by its weight. The GIS weighted overlay process then summed the weighted cost surfaces to produce a suitability cost surface:
$\text{suitability cost surface} = \sum_{n} w_n c_n$
where $c_n$ is the raster cell value of the nth cost surface and $w_n$ is the weight derived from the AHP pair-wise comparison.
Once the suitability cost surface had been created in Model Builder, the cost distance and cost path processes were used to determine the least-cost routes. These processes took the suitability cost surface and calculated, separately for each station, the accumulated cost of travelling from any location back to the starting point. The last process calculated paths through the landscape from the destination stations along the least cost path back to the starting stations.
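The cost distance and cost path steps can be illustrated with a small sketch. It uses a Dijkstra-style search over the eight neighbours of each cell, with a link cost equal to the mean of the two cell costs multiplied by the distance between cell centres. This is an assumption about how such a tool could work, not a description of the ArcGIS implementation, and the 3 x 3 cost surface is hypothetical.

```python
import heapq
import math


def cost_distance(cost, source, cell_size=25.0):
    """Accumulated least cost from `source` to every cell of a cost raster.

    cost      -- 2-D list of per-unit-distance cell costs
    source    -- (row, col) of the starting station
    cell_size -- raster resolution in metres (25 m in this study)
    Returns (accumulated cost grid, back-link dict for path tracing).
    """
    rows, cols = len(cost), len(cost[0])
    acc = [[math.inf] * cols for _ in range(rows)]
    back = {}
    acc[source[0]][source[1]] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > acc[r][c]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr == 0 and dc == 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                # Link impedance: mean of the two cell costs times the distance
                # between cell centres (sqrt(2) longer for diagonal moves).
                step = cell_size * (math.sqrt(2) if dr and dc else 1.0)
                nd = d + step * (cost[r][c] + cost[nr][nc]) / 2.0
                if nd < acc[nr][nc]:
                    acc[nr][nc] = nd
                    back[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    return acc, back


def cost_path(back, source, destination):
    """Trace the least-cost path from `destination` back to `source`."""
    path = [destination]
    while path[-1] != source:
        path.append(back[path[-1]])
    return path[::-1]


# Hypothetical 3 x 3 cost surface; 1000 acts as a virtual barrier.
surface = [
    [1, 1, 1],
    [1, 1000, 1],
    [1, 1, 1],
]
acc, back = cost_distance(surface, (0, 0))
route = cost_path(back, (0, 0), (2, 2))
```

In this example the traced route goes around the barrier cell in the centre rather than through it, because the accumulated cost of crossing it is far higher than the detour.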

Cost Distance and Path Analysis
The cost values assigned to each cell are per-unit distance measures for the cell. If the cell size is expressed in meters, the cost assigned to the cell is the cost necessary to travel one meter within the cell. The data resolution is 25 m for this study, so the total cost to travel either horizontally or vertically through a cell is the cost assigned to the cell multiplied by the resolution [total cost = cost * 25]. To travel diagonally through the cell, the total cost is 1.414214 multiplied by the cost of the cell and by the resolution [total diagonal cost = 1.414214 * (cost * 25)] [6].
To determine the cost for a path to pass through cells to reach a source, the cost is based on the node/link cell representation. In the node/link cell representation, each centre of a cell is considered a node and each node is connected by multiple links. Every link has impedance associated with it.
The impedance is derived from the costs associated with the cells at each end of the link and the direction of movement through the cells. If the movement is from a cell to one of its four directly connected neighbours, the cost to move across the link to the neighbouring node is 1 times the cost of cell 1 plus the cost of cell 2, divided by 2:
$a_1 = (cost_1 + cost_2) / 2$
where $cost_1$ is the cost of cell 1, $cost_2$ is the cost of cell 2, and $a_1$ is the cost to move from cell 1 to cell 2 [6].

Creating a Profile
The plan and the longitudinal profile characterize the route of the railway.
The plan of the route is its projection on the horizontal plane and consists of straight lines connected by curvilinear paths. The longitudinal profile represents the route on a vertical plane with intersections at different places. The profile is usually created for different purposes, such as:
• Analysis of the fluency and regularity of train movement on the proposed railway;
• Analysis of the construction works of the proposed railway in a scheduled period;
• Analysis of elevation where the route passes through tunnels;
• Analysis of slopes in the landscape;
• Flood prevention for the proposed railway at intersections with water bodies;
• Analysis of the construction costs of the proposed railway;
• Analysis of the protection of rail routes against landslides, seismic activity and the geological structure along the proposed railway.
A profile is helpful for analysis and the subsequent comparison of several variants of the railway and for successive improvements of railway project decisions.
ArcGIS 3D analysis has been used to create profiles for this study. The profiles created for elevation and slope show the change of the route along a line on a surface [6].
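To make the idea of an elevation and slope profile concrete, the sketch below samples a DEM along an ordered list of route cells and reports chainage, elevation and gradient. The function name, the tiny DEM and the route are hypothetical, and consecutive route cells are assumed to be distinct; it is not the ArcGIS profiling procedure itself.

```python
import math


def longitudinal_profile(dem, route, cell_size=25.0):
    """Return [(chainage_m, elevation_m, slope_percent), ...] along a route.

    dem   -- 2-D list of elevations in metres
    route -- ordered list of distinct, consecutive (row, col) cells
    Slope is the gradient from the previous cell; it is 0.0 at the start.
    """
    profile = []
    chainage = 0.0
    for k, (r, c) in enumerate(route):
        slope = 0.0
        if k > 0:
            pr, pc = route[k - 1]
            # Distance between cell centres along the route.
            step = cell_size * math.hypot(r - pr, c - pc)
            chainage += step
            slope = 100.0 * (dem[r][c] - dem[pr][pc]) / step
        profile.append((chainage, dem[r][c], slope))
    return profile


# Hypothetical elevations (m) and a three-cell route.
dem = [[10.0, 12.0], [11.0, 15.0]]
profile = longitudinal_profile(dem, [(0, 0), (0, 1), (1, 1)])
```

A list of this kind can be plotted as chainage against elevation to compare route variants, or scanned for gradients that exceed a design limit.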

Network Data Model Editor
A programming-based interface needs to provide a complete set of indexes, operators and functions for managing information based on spatial proximity. Currently there are many GIS vendors offering network solutions; however, their solutions may have the following issues:
• Their data model is stored in proprietary file formats and cannot be integrated with a database.
• The data model and analysis capabilities cannot be extended.
• Application information cannot be separated from connectivity information.
• Spatial information management and hierarchical relationships are not directly supported.
To address these issues, a GIS-based spatial network data model should do the following:
• Provide an open and persistent network data model. The network data model is stored as relational tables in the database.
• Separate connectivity and application information in the data model. Connectivity information can be separated from application information. Both application information and connectivity information are managed in the database. However, only connectivity information is required for network analysis.
• Allow the extension of the data model and analysis capabilities. Users can define their own network elements by extending these interfaces. As a result, users can implement their own user-defined representations and analysis functions.
• Integrate with ArcGIS Spatial technology for spatial information management.
• Support all spatial data types.
The network data model consists of two parts: a network schema and network APIs. The network schema is the persistent data storage used to store network information.
A network contains network metadata, a node table and a link table. The schema represents the information necessary for network management and analysis. Application attributes can be added to these tables or referenced from other application tables (through foreign keys). Note that the network data model is also capable of handling geometry information. That is, the network data model can represent both logical and spatial network applications. Adding geometric data to a logical network will allow the logical network to be displayed.
The following analyses are supported in the network data model:
• Feasibility Analysis: Is node A accessible from node B?
• Along the AB link, how much settlement is covered and which land cover/land use classes are passed through?
• Along the AB link, how does the slope vary?
• Along the AB link, how does the elevation vary?
• Minimum-Cost Spanning Tree: What is the minimum-cost tree that connects all nodes?
• Within Cost Analysis: What nodes are within a given cost from (to) a given node?
• Nearest Neighbours: What are the N nearest neighbours of a given node?
• K Shortest Paths: What are the K shortest paths from node A to node B?
• Alternative options for laying rail routes, such as underground or flyover sections, can likewise be analyzed by defining a new cost surface or replacing redefined attributes.
• Compare the constructability of any proposed network with the judgment one has made.
The network data model takes a generic approach to solving network problems, by separating connectivity information from application-specific information.
First the network connectivity information (node connections and link cost) is extracted and separated from the application-specific information. Application-specific attributes are stored, if needed, with the connectivity information or separately. Once the connectivity information is extracted, network analysis is conducted on the generic model. Additional network constraints can also be considered. The final result is then mapped to application-related attributes, and displayed. This approach avoids customized network solutions and simplifies the data management of connectivity and application-specific information.
The network data model introduces the concept of network constraints, which provides a mechanism to guide network analysis. For example, one may want to compute the minimum spanning tree network that passes through all possible paths with network constraints; applications can easily incorporate application-specific logic into the network data model analysis engine.
Other constraints, such as path cost, can also be included in the analysis.
The network data model editor is a standalone application that helps create, edit and visualize networks. The editor supports viewing operations such as pan, zoom and auto-fit. It also provides functions to navigate between network elements. All analysis functions are supported in the editor. With the editor, users can create a network from scratch on the client side and save it to the database.
GIS network analysis may include network tracing, network routing, and network allocation.
Tracing applications deal with queries such as "Is node A reachable from node B?" or "Which nodes are reachable from, or can reach, a given node?" Such queries are common in water or utility networks. Another type of tracing analysis is to find out how much settlement is covered by a network link and how the slope and elevation vary along the link.
Routing analysis or path computation, probably the most studied topic in network applications, is divided into the following categories:
• Shortest Path or Fastest Path (transitive closure problem)
• K Shortest Paths: Find K shortest paths from a start node to a destination node.
Allocation analysis deals with designating destination points within a network. It provides information on a service area or coverage for points of interest. The network data model supports the following allocation analyses:
• Within Cost: Find all points of interest within a certain distance from a designated location in CMR.
• Minimum-Cost Spanning Tree: Find the cheapest way to connect all nodes.
• When alternative options such as underground rail routes are introduced, re-analyze the network as a minimum-cost spanning tree problem.
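The minimum-cost spanning tree analysis listed above can be sketched with Kruskal's algorithm. The choice of Kruskal's algorithm, the station names and the link costs are assumptions for illustration; the source does not state which MST algorithm the network data model uses.

```python
def minimum_spanning_tree(nodes, links):
    """Kruskal's algorithm. `links` is a list of (cost, node_a, node_b)."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the component root, shortening the chain.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for cost, a, b in sorted(links):
        ra, rb = find(a), find(b)
        if ra != rb:  # the link joins two separate components
            parent[ra] = rb
            tree.append((cost, a, b))
    return tree


# Hypothetical stations and least-cost-path costs between them.
stations = ["A", "B", "C", "D"]
candidate_links = [
    (4, "A", "B"),
    (1, "B", "C"),
    (3, "A", "C"),
    (2, "C", "D"),
    (5, "B", "D"),
]
tree = minimum_spanning_tree(stations, candidate_links)
total_cost = sum(cost for cost, _, _ in tree)
```

Re-running the function with link costs taken from a new cost surface, for example one that allows underground sections, gives the re-analysed network for that alternative.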
Constraints are conditions to be satisfied during analysis. The network data model supports network constraints so that applications can impose application-specific conditions on the network during analysis.
The following are some examples of network constraints:
• Depth (number of links) and cost constraints: Network analysis can be limited based on the depth of the search path or a cost limit. These constraints can be used to specify a preferred subset of possible solutions. The network data model provides a SystemConstraint class (which implements the NetworkConstraint class) for these common network constraints. Users can create an instance of SystemConstraint and use it in analysis.
• Temporarily inactivated nodes or links: Sometimes nodes or links must be temporarily turned off before analysis begins; for example, some rail segments (links) may need to be avoided temporarily for the purpose of analysis in a rail network. One can make a node or link inactive by setting its state so that it is assigned a cost impedance value of infinity. Network elements that are inactive will not be considered during analysis. Note that changing the state of nodes and links does not affect the persistent data model.
• Routing with specific types of links and nodes: Sometimes network analysis must only be conducted through nodes and links of specific types or with specific requirements.
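Two of the constraints above, a cost limit and temporarily inactive links, can be illustrated with a small within-cost search. This is a hedged sketch in plain Python with hypothetical node names and costs; it does not use the SystemConstraint class mentioned in the text, and the depth constraint is deliberately left out.

```python
import heapq
import math


def within_cost(links, start, cost_limit, inactive=frozenset()):
    """Nodes reachable from `start` at a total cost <= cost_limit.

    links    -- dict mapping (node_a, node_b) to the cost of an undirected link
    inactive -- links treated as having infinite impedance for this analysis
    The `links` dict itself is never modified.
    """
    adjacency = {}
    for (a, b), cost in links.items():
        if (a, b) in inactive or (b, a) in inactive:
            cost = math.inf  # inactive link: never usable in this run
        adjacency.setdefault(a, []).append((b, cost))
        adjacency.setdefault(b, []).append((a, cost))

    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > best[node]:
            continue  # stale queue entry
        for neighbour, cost in adjacency.get(node, []):
            nd = d + cost
            if nd <= cost_limit and nd < best.get(neighbour, math.inf):
                best[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return best


# Hypothetical rail links and costs.
links = {("A", "B"): 2, ("B", "C"): 2, ("A", "C"): 5, ("C", "D"): 4}
all_active = within_cost(links, "A", 5)
bc_closed = within_cost(links, "A", 5, inactive={("B", "C")})
```

Because the inactive set is passed as an argument, the stored link table is left unchanged, which mirrors the statement that changing the state of nodes and links does not affect the persistent data model.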

Data Collection
Data collection was the main task and it typically consumed the majority of the available resources. Data collection remains a time-consuming, tedious and expensive process. Most of the data was obtained from the RDA, the Department of Census, the Department of Railways and the Urban Development Authority, and some from reliable internet sources. In Sri Lanka digitized geographical information is rarely available, so some of the data had to be prepared from the basic level for effective use in GIS software [5].

Spatial Data
Cartographic maps have long served as the main source of data for GIS. As a source of information, maps have two types of functions:
• Positional, i.e. they give information about the exact location of objects;
• Informational, i.e. they give information about the data type, shape and class of objects, including topological properties and relationships between objects.
Cartographic information is usually input by scanning and digitizing. The principal difference between these two methods is that digitizing creates vector data, whereas scanning creates raster data.

Non-Spatial Data
All geographic objects have attributes.
Attributes of geographical objects have been collected at the same time as the vector geometry, e.g. population in cities, the category of roads, rivers and so on. These attributes have been manually entered into the geo-database.

Software and Tools
ArcGIS-ArcView is well known worldwide and is among the most widely used GIS software packages. It is used here for the following purposes: editing, data management and storage; geo-referencing data from different sources; performing spatial multi-criteria analysis and visualizing output data; implementing geo-processing functions for different tasks; generating and aggregating criteria maps; defining a cost distance and least cost path; and creating profiles of routes [6].
ArcGIS Model Builder is a graphical environment for building and executing multistep models, with facilities for batch processing and dynamic modelling. The model is the description of a decision situation to generate a solution to the problem. Model Builder can be used for a given decision problem which is well structured so that all decision problem-solving activities can be automated. The main characteristics of using a Model are the possibility to structure the decision problem and use well established procedures for solving the spatial problems [6].
Raster data obtained by scanning maps usually does not contain information about its location on the surface of the earth and needs to be geo-referenced. The geo-referencing process includes assigning a coordinate system that associates the data with a specific location on the earth in a real-world coordinate system. The coordinates of control points are used to build a polynomial transform from one coordinate space to another. The control points are selected in the input raster dataset and the output locations are specified by typing in the known output coordinates [6].
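As an illustration of building a polynomial transform from control points, the sketch below fits a first-order (affine) polynomial by least squares. The order of the polynomial, the function names and the control point coordinates are assumptions for this example; the source does not state which order was used.

```python
def solve3(m, v):
    """Solve a 3 x 3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(i + 1, 3):
            f = a[r][i] / a[i][i]
            for c in range(i, 4):
                a[r][c] -= f * a[i][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (a[i][3] - sum(a[i][c] * x[c] for c in range(i + 1, 3))) / a[i][i]
    return x


def fit_affine(control_points):
    """Least-squares first-order polynomial from [(col, row, X, Y), ...].

    Returns ((a0, a1, a2), (b0, b1, b2)) with
    X = a0 + a1*col + a2*row and Y = b0 + b1*col + b2*row.
    """
    basis = [(1.0, col, row) for col, row, _, _ in control_points]
    normal = [[sum(b[i] * b[j] for b in basis) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in (2, 3):  # fit map X first, then map Y
        rhs = [sum(b[i] * p[k] for b, p in zip(basis, control_points)) for i in range(3)]
        coeffs.append(tuple(solve3(normal, rhs)))
    return tuple(coeffs)


def to_map(coeffs, col, row):
    """Apply the fitted transform to a pixel position."""
    (a0, a1, a2), (b0, b1, b2) = coeffs
    return a0 + a1 * col + a2 * row, b0 + b1 * col + b2 * row


# Hypothetical control points: (pixel col, pixel row, map X, map Y).
gcps = [
    (0, 0, 100000.0, 200000.0),
    (100, 0, 102500.0, 200000.0),
    (0, 100, 100000.0, 197500.0),
    (100, 100, 102500.0, 197500.0),
]
coeffs = fit_affine(gcps)
x_map, y_map = to_map(coeffs, 50, 50)
```

At least three control points that are not on one line are needed for a first-order fit; using more points, as here, lets the least-squares solution average out digitizing errors.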
A geo-database is a collection of geographic data based on a well-defined model for geographic data types. It contains layers of vector data representing features and raster data representing images and grid surfaces. The purpose of a geo-database is to hold the features in GIS datasets and to define relationships among the features that are displayed on maps as layers.
Each layer represents a particular type of feature which has been used for spatial analysis [6].

Analysis
Using pair-wise comparison, the following matrix was completed for the present case study, applying the relative importance of the criteria/factors. The calculation of the consistency index (CI) is based on the observation that $\lambda_{max}$, the principal eigenvalue of the comparison matrix, is always greater than or equal to the number of criteria under consideration (n) for positive, reciprocal matrices, and $\lambda_{max} = n$ if the pair-wise comparison matrix is consistent. Accordingly, $\lambda_{max} - n$ can be considered a measure of the degree of inconsistency and can be normalized as:
$CI = (\lambda_{max} - n)/(n - 1) = (7.684 - 7)/(7 - 1) = 0.114$ (7)
CI provides a measure of departure from consistency and is used to calculate the consistency ratio (CR), which is defined as follows:
$CR = CI / RI = 0.114 / 1.32 = 0.086$
where RI is the random index, the consistency index of a randomly generated pair-wise comparison matrix. The RI depends on the number of elements being compared. Here CR < 0.10, which indicates a reasonable level of consistency in the pair-wise comparisons [4].
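The consistency check above can be reproduced with a few lines of Python. The values 7.684, 7 and 1.32 come from the text; the function name and the lookup table layout are assumptions for illustration, with random index values as commonly tabulated for Saaty's method.

```python
# Random index (RI) values commonly tabulated for Saaty's AHP,
# keyed by the number of criteria n.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def consistency_ratio(lam_max, n):
    """Return (CI, CR) for a pair-wise comparison matrix of size n."""
    ci = (lam_max - n) / (n - 1)
    return ci, ci / RANDOM_INDEX[n]


# Values reported in this study: lambda_max = 7.684 for n = 7 criteria.
ci, cr = consistency_ratio(7.684, 7)
acceptable = cr < 0.10
```

With these inputs the check reproduces CI = 0.114 and CR = 0.086 after rounding to three decimal places, so the judgments pass the 0.10 threshold.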

Calculations for creating cost surfaces
Tables for creating cost surfaces for each geographical layer's attributes are shown below.

Results
With reference to the tables, maps were derived for the geo-database from the collected data on the criteria relevant to rail network analysis. Using ArcGIS Model Builder these vector maps were converted into continuous grid maps for the weighted overlay procedure. Here a grid map with a 25 m x 25 m cell size was obtained. Each cell value needs to be multiplied by the weight factor mentioned in section 5.1. The weighted overlay map obtained is shown in Figure 1 with the defined rail stations.
This map represents:
• Cost map of CMR
• Scale of preference
• Ideal model for finding rail routes
• All possible rail routes
All the defined nodes are then joined by links so that they lie along cells which give the highest scale values. In this case the rail routes need to be traced along the darker area. All these possible least-cost paths can then be used to obtain the optimum network.

Conclusions
In this work the capabilities of GIS are explored for locating a rail station and planning new railway routes in CMR. The study attempts to resolve some of the problems in the current railway system in the region and describes in detail the implementation and capabilities of GIS for the choice of better route alignment and station locations.
Developing an SMCDM process based on GIS techniques and capabilities was the main task of this work. This SMCDM process is useful as it allows multi-criteria analysis and evaluation. Railway planners can design alternative routes and evaluate the ecological, social and economic impact of each route.
The development of suitable locations for the railway stations and the laying of the railroad were based on the physical terrain factors. For the rail stations these factors were analyzed for suitability with the MCE method. Here they were standardized as factors for overlaying. For the railroad these factors were converted into cost values which were based on the basic cost parameters of existing constructions, bridges, tunnels and on the purchasing of land, which were the cost factors. The physical factors and cost factors were then each assigned a relative importance that reflected where they lay in the overall scope of the study.
The AHP has been successfully applied in this study for defining the relative importance of different parameters to the cases of locating rail stations and planning rail routes in CMR. The AHP has been integrated into a decision making process with GIS technology for station location and route selection.
Finally, not all of the data that could have been used was available, such as proximity maps for node selection, a Digital Elevation Model for the slope criteria, and landslide and earthquake data. Nevertheless, one can remain confident in the result so long as its limitations are understood. Moreover, as the methodology is sound, the model could easily be modified to make use of this missing data if it becomes available in the future.
This methodology can be used not only for CMR but also for any possible network in any area.