DATA-150

The Power of a Geographic Information System (GIS) Framework

My previous data science method insight analyzed remote sensing, a method that uses satellites to capture data and images of a given area. That data is only useful if there is a way to visualize it, rather than staring at an endless list of numbers in search of relationships. That is where a geographic information system (GIS) framework comes into play. A GIS can take input data of many types and render it as an interactive map that displays the spatial relationships among the variables. It is a powerful tool that data scientists use to better understand relationships among variables ranging from population demographics to land use characteristics, all in a visual medium.

A GIS is a system that takes in data and displays it in relation to geographic locations. The framework uses location data as its reference. That location does not have to be expressed as conventional longitude and latitude; it can also be given as an address or ZIP code. As mentioned, the variables a GIS can handle are wide ranging: population demographics, landscape characteristics, and specific site locations are all viable. This means that a GIS can show an area's population size, average income, diversity, vegetation density, soil type, and proximity to roads all at the same time, visualizing the relationships among the variables [1]. Because of this, GIS frameworks can show how susceptible an area is to a hazard; one critical example, which I examine later, is flooding.

Data capture is the term used for putting data into a GIS framework for visualization. Variety is again key, as the framework can accept a wide range of data types, including cartographic, photographic, digital, remote sensing, and spreadsheet data. Data that is already digital, such as a spreadsheet of numbers, can be uploaded directly, while paper maps must first be scanned and converted to digital form. The framework can then combine these data sets and compile them onto one map. Because the data and maps may be at different scales, the framework must adjust them to a common scale. It must also make sure that every layer uses the same projection (the method used to transfer Earth's curved surface onto a flat map, which distorts the result differently depending on the method chosen) to create a coherent and accurate image [1].
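To make the projection step concrete, here is a minimal sketch of putting a longitude/latitude point onto a flat map. It assumes spherical Web Mercator as the shared target projection, which is my choice for illustration and not something the source specifies. The function name `to_web_mercator` and the sample point are hypothetical.

```python
import math

# Sphere radius in metres used by spherical Web Mercator.
# The projection itself is an assumption made for illustration.
EARTH_RADIUS_M = 6378137.0


def to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator x/y in metres."""
    if abs(lat_deg) >= 90.0:
        # The formula diverges at the poles, so reject those inputs.
        raise ValueError("latitude must be strictly between -90 and 90")
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = EARTH_RADIUS_M * lon
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


# Illustrative point only; it is not taken from any data set in this post.
sample_lon, sample_lat = 106.7, 10.8
sample_x, sample_y = to_web_mercator(sample_lon, sample_lat)
print(f"degrees: ({sample_lon}, {sample_lat})")
print(f"metres:  ({sample_x:.0f}, {sample_y:.0f})")
```

The same location has very different coordinates in degrees and in projected metres, which is why layers stored in different projections cannot simply be stacked until they are converted to a common one.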

The layers of a GIS framework come in two varieties: vector and raster. Vector data is built from points, otherwise known as nodes, and the lines that connect them, which can close to form polygons. Nodes and lines are plotted on an x-y coordinate plane to illustrate spatial relations. Because the data consists of independent nodes and lines, vectors can show the centers and edges of features [2]. For this reason, vectors are best used with data that has firm borders, such as school districts or streets [1]. In contrast, raster layers are grids of equally sized square cells, or pixels, each of which holds a value. The accuracy of a raster image therefore depends on the size of its cells. If they are too large, the description will be too generalized. Because raster images are made of identical cells rather than independent points, they cannot show strict boundaries. They specialize in describing how a variable changes across the interior of an area rather than where its outer boundary lies. This means that raster layers are best used when the data varies continuously rather than falling into fixed features. Examples of such variables include elevation, temperature, and soil pH [2].
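The contrast between the two layer types can be sketched in a few lines. The example below is a simplified illustration, not how any particular GIS package stores its data. The district polygon, the elevation grid, and the function names `point_in_polygon` and `raster_value` are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the ring of nodes."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def raster_value(grid, x, y, origin_x, origin_y, cell_size):
    """Return the value of the cell containing (x, y).

    grid[0] is the top row, and (origin_x, origin_y) is the upper-left corner.
    """
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("point falls outside the raster")
    return grid[row][col]


# Hypothetical vector layer: a square school district defined by four nodes.
district = [(0, 0), (4, 0), (4, 4), (0, 4)]

# Hypothetical raster layer: elevation in metres, 2-unit cells,
# upper-left corner at (0, 4).
elevation = [
    [12.0, 15.0],
    [9.0, 11.0],
]

print(point_in_polygon(1, 1, district))         # crisp yes/no boundary test
print(raster_value(elevation, 1, 3, 0, 4, 2))   # one value per whole cell
```

The vector layer answers a boundary question exactly, while the raster layer returns the same value for every point inside a cell. That is the generalization that grows as cells get larger.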

Researchers have used GIS frameworks to study flood risk in Vietnam, specifically in Ho Chi Minh City. For example, one group used QuickBird satellite imagery in conjunction with ArcGIS to account for variables associated with an increased risk of flooding. They classified the imagery into categories such as water bodies, traffic routes, construction land, bare land, and green areas. They also used secondary data from a TR-55 model that took into account this classification, river flow length, and channel slope. With these variables combined in ArcGIS, the researchers were able to visually identify the areas most at risk of flooding and to show which areas would see the highest flood levels [3]. For these reasons, a GIS framework is a powerful visual data science tool that turns data into an easy-to-understand representation.
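As a rough sketch of how a land-cover classification can feed a runoff estimate, the example below applies the curve number runoff equation on which TR-55 is built to a tiny raster of land-cover classes. This is not the cited study's workflow. The curve numbers, the 2x2 grid, the rainfall depth, and the function name `runoff_depth_in` are hypothetical values chosen for illustration.

```python
def runoff_depth_in(rainfall_in, curve_number):
    """Curve number runoff depth in inches for a rainfall depth in inches."""
    # Potential maximum retention S and initial abstraction Ia = 0.2 * S.
    s = 1000.0 / curve_number - 10.0
    ia = 0.2 * s
    if rainfall_in <= ia:
        return 0.0  # all rainfall is absorbed before any runoff begins
    return (rainfall_in - ia) ** 2 / (rainfall_in - ia + s)


# Hypothetical curve numbers per land-cover class (not from the cited study).
# Higher numbers mean less infiltration and therefore more runoff.
ILLUSTRATIVE_CN = {
    "water": 100,
    "traffic": 98,
    "construction": 90,
    "bare": 86,
    "green": 70,
}

# Hypothetical 2x2 land-cover raster and a made-up 4-inch storm.
land_cover = [
    ["green", "bare"],
    ["traffic", "water"],
]
storm_rainfall_in = 4.0

# Apply the equation cell by cell to produce a runoff raster.
runoff_grid = [
    [runoff_depth_in(storm_rainfall_in, ILLUSTRATIVE_CN[cls]) for cls in row]
    for row in land_cover
]

for row in runoff_grid:
    print([round(depth, 2) for depth in row])
```

Cells covered by roads or water shed almost all of the rainfall, while green areas absorb much of it. Mapping the resulting grid is one simple way a GIS layer can highlight where runoff concentrates.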

Resources

  1. Caryl, S. (2012, October 09). GIS (Geographic Information System) (J. Evers, Ed.). Retrieved October 14, 2020, from https://www.nationalgeographic.org/encyclopedia/geographic-information-system-gis/
  2. Dempsey, C. (2020, September 26). Types of GIS Data Explored: Vector and Raster. Retrieved October 14, 2020, from https://www.gislounge.com/geodatabases-explored-vector-and-raster-data/
  3. Dang, A., & Kumar, L. (2017, November 03). Application of remote sensing and GIS-based hydrological modelling for flood risk analysis: A case study of District 8, Ho Chi Minh city, Vietnam. Retrieved September 17, 2020, from https://www.tandfonline.com/doi/full/10.1080/19475705.2017.1388853?src=recsys