Data Mining-Spatial Data Mining

Published on January 2017



Spatial Database
 Stores a large amount of space-related data
 Maps
 Remote Sensing
 Medical Imaging
 VLSI chip layout
 Has topological and distance information
 Requires spatial indexing, data access, reasoning, and representation techniques
Spatial Data Mining
 Extraction of knowledge and spatial relationships from spatial databases
 Can be used for understanding spatial data and spatial relationships
 Applications:
 GIS, Geomarketing, Remote Sensing, Image
database exploration, medical imaging, Navigation
 Challenges
 Complexity of spatial data types and spatial access methods
 Large amounts of data
 Non-spatial Information
 Same as data in traditional data mining
 Numerical, categorical, ordinal, boolean, etc., e.g. city name, city population
 Spatial Information
 Spatial attribute: geographically referenced
 Neighborhood and extent
 Location, e.g. longitude, latitude, elevation
 Spatial data representations
 Raster: gridded space
 Vector: point, line, polygon
 Graph: node, edge, path
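The three representations can be illustrated with a minimal Python sketch. All coordinates, cell values, and node names below are made up for illustration, and `polygon_area` and `find_path` are hypothetical helper names, not part of any particular GIS library.

```python
from collections import deque

# Raster: space is a grid; each cell holds one value (here, elevation in metres).
raster = [
    [10, 12, 15],
    [11, 14, 18],
    [13, 17, 21],
]

# Vector: geometry is stored as coordinates.
point = (2.0, 3.0)                                           # a single location
line = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]                  # an ordered polyline
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]   # a closed ring

# Graph: nodes connected by edges, e.g. a tiny road network.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}


def polygon_area(ring):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def find_path(adjacency, start, goal):
    """Breadth-first search; returns one shortest path as a list of nodes."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adjacency[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(raster[1][2])                # value of one grid cell
print(polygon_area(polygon))       # extent of a vector polygon
print(find_path(graph, "A", "C"))  # a path in the graph
```

Each representation answers a different kind of question cheaply: a raster gives the value at a cell, a vector polygon gives extent, and a graph gives connectivity.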

Spatial Data Statistical techniques
 Popular approach to analyze spatial data
 Often assumes statistical independence among spatially distributed data, although nearby spatial objects tend to be interrelated
 Can be performed only by experts
 Does not work well with symbolic values
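Whether the independence assumption is reasonable for a given data set can be checked with a spatial autocorrelation statistic. The sketch below uses Moran's I, a standard statistic that the slides do not mention, with binary neighbor weights on a made-up chain of four cells; `morans_i` and the sample values are assumptions for illustration.

```python
def morans_i(values, neighbors):
    """Moran's I with binary weights: w_ij = 1 if j is listed as a neighbor of i."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = sum(len(js) for js in neighbors.values())
    cross = sum(dev[i] * dev[j] for i, js in neighbors.items() for j in js)
    variance_sum = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_sum)


# Four cells in a row; each cell neighbors the cells directly beside it.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

gradient = morans_i([1, 2, 3, 4], chain)      # similar values sit side by side
alternating = morans_i([1, 4, 1, 4], chain)   # dissimilar values sit side by side

print(gradient, alternating)
```

A value well above zero suggests that neighboring observations are positively correlated, in which case methods that assume independent samples should be used with care.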
Spatial Data Warehousing
 Spatial data warehouse: Integrated, subject-oriented,
time-variant, and nonvolatile spatial data repository.
 It consists of both spatial and nonspatial data in support of spatial data mining and spatial-data-related decision-making processes.
 Spatial data cube: multidimensional spatial database
 Both dimensions and measures may contain spatial components
 Challenging issues:
 Spatial data integration: a big issue
 Structure-specific formats (raster- vs. vector-based, OO vs. relational models, different storage and indexing, etc.)
 Vendor-specific formats (ESRI, MapInfo,
Intergraph, IDRISI, etc.)
 Realization of fast and flexible OLAP in spatial data warehouses

Dimensions and Measures in Spatial Data Warehouse
 Dimensions
 non-spatial
 e.g. “25-30 degrees” generalizes to “hot” (both are strings)
 spatial-to-non spatial
 e.g. Seattle generalizes to description “Pacific
Northwest” (as a string)
 spatial-to-spatial
 e.g. Seattle generalizes to Pacific Northwest (as
a spatial region)
 Measures
 numerical (e.g. monthly revenue of a region)
 distributive (e.g. count, sum)
 algebraic (e.g. average)
 holistic (e.g. median, rank)
 spatial
 collection of spatial pointers (e.g. pointers to all regions with a temperature of 25-30 degrees)
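The difference between the measure categories shows up when a result for the whole must be assembled from results for the parts. The following is a small sketch with made-up revenue figures and region ids; the variable names are hypothetical.

```python
from statistics import median

# Hypothetical monthly revenue, split into two partitions (e.g. two sub-regions).
part_a = [10, 20, 30]
part_b = [40, 100]

# Distributive: the result for the whole is computed from the results of the parts.
total = sum([sum(part_a), sum(part_b)])
count = sum([len(part_a), len(part_b)])

# Algebraic: derived from a fixed number of distributive results (sum and count).
average = total / count

# Holistic: no constant-size summary of the parts suffices; the full data is needed.
overall_median = median(part_a + part_b)
median_of_medians = median([median(part_a), median(part_b)])  # not the true median

# Spatial measure: a collection of pointers (here, region ids) rather than a number.
region_temp = {"r1": 27, "r2": 18, "r3": 29}
warm_regions = {rid for rid, t in region_temp.items() if 25 <= t <= 30}

print(total, count, average)
print(overall_median, median_of_medians)
print(sorted(warm_regions))
```

Combining the per-partition medians gives a different answer from the median of all values, which is why holistic measures are expensive to maintain in a data cube.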
Example: British Columbia Weather Pattern Analysis
 Input
 A map with about 3,000 weather probes scattered in B.C.
 Probes record daily data for temperature, precipitation, wind velocity, etc. for a designated small area and transmit signals to a provincial weather station
 Data warehouse using star schema
 Output
 A map that reveals patterns: merged (similar) regions
 Goals
 Interactive analysis (drill-down, slice, dice, pivot, roll-up)
 Fast response time
 Minimizing storage space used
 Challenge
 A merged region may contain hundreds of “primitive” regions (polygons)

Star Schema of the BC Weather Warehouse
 Spatial data warehouse
 Dimensions
 region_name
 time
 temperature
 precipitation
 Measurements
 region_map
 area
 count
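A roll-up over this schema can be sketched as follows. The fact rows, the generalized values such as "hot" and "dry", and the polygon ids are invented for illustration; `roll_up` is a hypothetical helper, and summing `area` assumes the primitive regions do not overlap.

```python
from collections import defaultdict

# Hypothetical fact rows: four dimension values plus the three measures.
# region_map is kept as a set of ids pointing at primitive polygons.
facts = [
    {"region_name": "R1", "time": "day1", "temperature": "hot", "precipitation": "dry",
     "region_map": {"p1"}, "area": 120.0, "count": 4},
    {"region_name": "R2", "time": "day1", "temperature": "hot", "precipitation": "wet",
     "region_map": {"p2", "p3"}, "area": 80.0, "count": 2},
    {"region_name": "R3", "time": "day1", "temperature": "cold", "precipitation": "wet",
     "region_map": {"p4"}, "area": 200.0, "count": 5},
]


def roll_up(rows, dims):
    """Aggregate the measures over every dimension not listed in `dims`."""
    cells = defaultdict(lambda: {"region_map": set(), "area": 0.0, "count": 0})
    for row in rows:
        key = tuple(row[d] for d in dims)
        cell = cells[key]
        cell["region_map"] |= row["region_map"]   # spatial measure: merge pointers
        cell["area"] += row["area"]               # numerical, distributive
        cell["count"] += row["count"]             # numerical, distributive
    return dict(cells)


by_temperature = roll_up(facts, ["temperature"])
print(by_temperature[("hot",)])
```

Here the spatial measure is only a set of pointers; actually merging the referenced polygons into one region is the costly step.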
Can we precompute all of the possible spatial merges and
store them in the corresponding cuboid cells of a
spatial data cube?

 Probably not.
 A single merged region can require multiple megabytes of storage, so storing every possible merge is impractical.
 Computing the merges on-line instead is slow and expensive.

Dynamic Merging of Spatial Objects
Methods for Computing Spatial Data Cubes
 On-line aggregation: collect and store pointers to
spatial objects in a spatial data cube
 expensive and slow; needs efficient aggregation techniques
 Precompute and store all the possible combinations
 huge space overhead
 Precompute and store rough approximations in a
spatial data cube
 trades accuracy for space, e.g. by storing minimum bounding rectangles (MBRs)
 Selective computation: only materialize those which
will be accessed frequently
 a reasonable choice
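Two of these options can be sketched in a few lines: storing an MBR as a rough approximation of a merged region, and choosing which cells to materialize from access frequency. The polygons, access counts, and helper names (`mbr`, `merge_mbrs`, `mbr_area`, `choose_cells`) are assumptions made for illustration.

```python
def mbr(points):
    """Minimum bounding rectangle as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def merge_mbrs(boxes):
    """MBR that encloses all the given MBRs."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def mbr_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def choose_cells(access_counts, budget):
    """Selective materialization: keep the `budget` most frequently accessed cells."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    return ranked[:budget]


triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]           # exact area 2
square = [(3.0, 1.0), (5.0, 1.0), (5.0, 3.0), (3.0, 3.0)]  # exact area 4

merged_box = merge_mbrs([mbr(triangle), mbr(square)])
print(merged_box, mbr_area(merged_box))

access_counts = {("hot",): 50, ("cold",): 5, ("hot", "dry"): 20}
print(choose_cells(access_counts, 2))
```

The merged MBR needs only four numbers regardless of how many polygons it covers, but its area (15) overstates the exact merged area (6), which is the accuracy trade-off.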
Mining Spatial Association and Co-location Patterns
 Spatial association rule: A ⇒ B [s%, c%]
 A and B are sets of spatial or non-spatial predicates
 Topological relations: intersects, overlaps,
disjoint, etc.
 Spatial orientations: left_of, west_of, under, etc.
 Distance information: close_to, within_distance, etc.
 s% is the support and c% is the confidence of the rule
 Examples
close_to(x, “Park”)
[7%, 85%]
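Support and confidence for a spatial rule are computed the same way as for ordinary association rules, except that the predicates are spatial tests. The sketch below evaluates a hypothetical rule "x is a school ⇒ close_to(x, park)" on invented coordinates; the 2.0 distance threshold and the helper names are assumptions.

```python
from math import hypot

# Hypothetical objects with a non-spatial attribute (kind) and a location.
objects = [
    {"id": 1, "kind": "school", "xy": (0.0, 0.0)},
    {"id": 2, "kind": "school", "xy": (10.0, 10.0)},
    {"id": 3, "kind": "shop", "xy": (0.5, 0.5)},
    {"id": 4, "kind": "school", "xy": (1.0, 1.0)},
]
parks = [(0.0, 1.0)]


def close_to(xy, targets, max_dist=2.0):
    """Distance predicate: true if any target lies within max_dist of xy."""
    return any(hypot(xy[0] - tx, xy[1] - ty) <= max_dist for tx, ty in targets)


def rule_stats(items, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent."""
    n_a = sum(1 for o in items if antecedent(o))
    n_ab = sum(1 for o in items if antecedent(o) and consequent(o))
    support = n_ab / len(items)
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence


support, confidence = rule_stats(
    objects,
    antecedent=lambda o: o["kind"] == "school",
    consequent=lambda o: close_to(o["xy"], parks),
)
print(f"[{support:.0%}, {confidence:.0%}]")
```

Two of the four objects are schools near a park, and two of the three schools are near a park, so the rule holds with 50% support and about 67% confidence on this toy data.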

Progressive Refinement
 Progressive Refinement:
 Spatial association mining needs to evaluate multiple spatial relationships among a large number of spatial objects, which is expensive.
 Hierarchy of spatial relationship:
 First search for rough relationship and then
refine it
 Superset coverage property: all the potential answers should be preserved (i.e. false positives are allowed, false negatives are not)
 Two-step mining of spatial association:
 Step 1: Rough spatial computation (as a filter)
 Using MBR for rough estimation
 Step 2: Detailed spatial algorithm (as refinement)
 Apply only to those objects which have passed the rough spatial association test (no less than the minimum support threshold)
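The filter-and-refine idea can be sketched for a single predicate, close_to. In this simplified version each object is a small set of points, the rough step compares MBRs, and the detailed step computes the exact minimum point-to-point distance. The object names, coordinates, and the 1.5 threshold are invented; the minimum support check from the slides is left out to keep the sketch short.

```python
from math import hypot


def mbr(points):
    """Minimum bounding rectangle as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbr_distance(a, b):
    """Gap between two MBRs; never larger than the true distance between contents."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return hypot(dx, dy)


def exact_distance(p, q):
    """Exact minimum distance between two point sets (the expensive test)."""
    return min(hypot(x1 - x2, y1 - y2) for x1, y1 in p for x2, y2 in q)


def close_pairs(objects, max_dist):
    names = sorted(objects)
    boxes = {n: mbr(objects[n]) for n in names}
    # Step 1: rough filter on MBRs. May keep false positives, never drops a true pair.
    candidates = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
                  if mbr_distance(boxes[a], boxes[b]) <= max_dist]
    # Step 2: detailed test, applied only to the surviving candidates.
    refined = [(a, b) for a, b in candidates
               if exact_distance(objects[a], objects[b]) <= max_dist]
    return candidates, refined


objects = {
    "A": [(0.0, 0.0), (1.0, 1.0)],
    "B": [(2.0, 1.0), (3.0, 0.0)],
    "C": [(0.0, 5.0), (5.0, 0.0)],   # large, mostly empty bounding box
}
candidates, refined = close_pairs(objects, max_dist=1.5)
print(candidates)
print(refined)
```

Object C passes the rough test against both A and B because its bounding box overlaps theirs, but the detailed test rejects it. Since the MBR gap is a lower bound on the true distance, the filter cannot discard a pair that is actually close, which is the superset coverage property.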
Spatial co-locations
 Just what one really wants to explore.
 Based on the property of spatial autocorrelation, interesting features likely coexist in closely located regions
 Efficient methods: Apriori, progressive refinement
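One way to quantify how strongly two feature types co-locate is the participation index from the co-location mining literature, which the slides do not name. The sketch below computes it for two invented feature types; the coordinates, the 1.5 distance threshold, and the function names are assumptions.

```python
from math import hypot


def participation_ratio(sources, targets, max_dist):
    """Fraction of source instances with at least one target within max_dist."""
    if not sources:
        return 0.0
    hits = sum(
        1 for sx, sy in sources
        if any(hypot(sx - tx, sy - ty) <= max_dist for tx, ty in targets)
    )
    return hits / len(sources)


def participation_index(a, b, max_dist):
    """Minimum of the two participation ratios for the pattern {A, B}."""
    return min(participation_ratio(a, b, max_dist),
               participation_ratio(b, a, max_dist))


restaurants = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
cafes = [(0.0, 1.0), (5.0, 6.0)]

index = participation_index(restaurants, cafes, max_dist=1.5)
print(round(index, 3))
```

A pattern is usually reported only if its index exceeds a chosen threshold. Because the index cannot increase when a feature type is added to a pattern, candidate patterns can be pruned level by level in the Apriori style.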
Spatial Cluster Analysis & Spatial Classification
 Analyze spatial objects to derive classification schemes, such as decision trees, in relation to certain spatial properties (district, highway, river, etc.)
 Classifying medium-size families according to income, region, and infant mortality rates
 Mining for volcanoes on Venus
Employ methods such as:
 Decision-tree classifier + boosting, neural network, genetic programming, etc.

Spatial Trend Analysis
 Function
 Detect changes and trends along a spatial dimension
 Study the trend of non-spatial or spatial data
changing with space
 Application examples
 Observe the trend of changes of the climate or
vegetation with increasing distance from an ocean
 Crime rate or unemployment rate change with
regard to city geo-distribution.
 Traffic flows in highways and in cities.
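A very simple form of trend detection is to regress a non-spatial attribute on distance from a reference location. The sketch below fits a least-squares line to invented rainfall readings at increasing distance from a coastal point; the data, the reference point, and `fit_trend` are assumptions, and real trend analysis would typically also consider direction and neighborhood paths.

```python
from math import hypot


def fit_trend(distances, values):
    """Ordinary least-squares line: returns (slope, intercept)."""
    n = len(distances)
    mean_d = sum(distances) / n
    mean_v = sum(values) / n
    sxy = sum((d - mean_d) * (v - mean_v) for d, v in zip(distances, values))
    sxx = sum((d - mean_d) ** 2 for d in distances)
    slope = sxy / sxx
    return slope, mean_v - slope * mean_d


coast = (0.0, 0.0)
# Hypothetical stations: (x, y, annual rainfall).
stations = [(1.0, 0.0, 100.0), (2.0, 0.0, 90.0), (3.0, 0.0, 80.0), (4.0, 0.0, 70.0)]

distances = [hypot(x - coast[0], y - coast[1]) for x, y, _ in stations]
rainfall = [r for _, _, r in stations]

slope, intercept = fit_trend(distances, rainfall)
print(slope, intercept)
```

A clearly negative slope indicates that the attribute decreases with distance from the reference point, which is the kind of trend described in the climate example.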
Mining Raster Databases
 Vector data Mining
 Maps
 Graphs
 Molecular chains
 Raster data mining
 Satellite Images
Other Applications
 Spatial data mining is used in
 NASA Earth Observing System (EOS): Earth science data
 National Inst. of Justice: crime mapping
 Census Bureau, Dept. of Commerce: census data
 Dept. of Transportation (DOT): traffic data
 National Inst. of Health (NIH): cancer clusters
 Commerce, e.g. retail analysis
