2. Presents...!
Data Mining an
Expansive Groundwater
System
4. Advanced Data Mining (ADMi) has
developed unique Data Mining technology
for modeling natural systems. This video
demonstrates its application to an
expansive groundwater system.
Data Mining extracts valuable knowledge
from large amounts of data. It employs
advanced methods from several scientific
disciplines.
6. This system is approximately 100 x 120
miles with a maximum surface elevation
of 220 feet.
The following illustration shows its
topography. Land elevation is indicated
by the key at left. The path of the
Suwannee River can be readily seen
near the center.
9. This groundwater resource is
managed by the Suwannee River
Management District in Live Oak,
Florida.
They maintain a network of several
hundred wells that provide data
about the behavior of the aquifer.
10. The following shows the locations of
wells for which there are significant
amounts of data.
Note that some areas have several
wells clustered together and that
others have few or none.
12. Histories for a few wells go back to the
1940s; however, the record prior to
1982 is sparse.
The vertical blue streaks in the
following 3D image show the historical
range of individual wells. Together they
show the dynamic range of the aquifer.
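As a rough illustration of what "historical range" means here, the sketch below computes the lowest and highest recorded level for each well, plus the overall envelope across wells. The record layout, well names, and readings are hypothetical and are not taken from the district's data.

```python
def well_ranges(records):
    """Return {well_id: (lowest, highest)} from (well_id, date, level) records."""
    ranges = {}
    for well_id, _date, level in records:
        low, high = ranges.get(well_id, (level, level))
        ranges[well_id] = (min(low, level), max(high, level))
    return ranges


def overall_range(ranges):
    """Return the (lowest, highest) level seen across all wells."""
    lows = [low for low, _ in ranges.values()]
    highs = [high for _, high in ranges.values()]
    return min(lows), max(highs)


# Hypothetical readings in feet above sea level (illustration only).
records = [
    ("W1", "1982-04", 41.0),
    ("W1", "1990-07", 47.5),
    ("W1", "1998-10", 44.2),
    ("W2", "1982-04", 12.3),
    ("W2", "1998-10", 18.9),
]

ranges = well_ranges(records)
envelope = overall_range(ranges)
```

Each per-well pair corresponds to one vertical streak in the image; the envelope is only the crudest summary of the aquifer's dynamic range.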
14. Collectively, these data constitute a
vast but unwieldy source of
potentially valuable knowledge.
We researched how Data Mining
could be used to extract knowledge
about this complex system and
others like it.
15. Computer models of groundwater
systems are important tools for learning
how these invaluable resources are
affected by weather, pumping and land
development.
Our goal was to use Data Mining to
create an accurate model of the
aquifer’s water level.
16. The following is a 25 x 30 mile
detail from near the center of the
system. It shows the positions of 22
wells and their histories since 1982.
Note that the two groups of circled
wells clearly behave differently from
each other.
17. [Map: 25 x 30 mile detail showing the Suwannee River. Axes are map coordinates, 2360000 to 2500000 horizontally and 350000 to 490000 vertically.]
18. Because the wells exhibited so many
different behaviors, it was necessary
to group them into “classes”. Wells
assigned to a particular class behave
similarly.
Data Mining optimally determined the
number of classes and how the wells
would be assigned.
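The presentation does not say which algorithm grouped the wells. As one possible sketch of the idea, the code below clusters equal-length, normalized well histories with plain k-means, so that wells with similar behavior end up in the same class. The function names, the four short histories, and the fixed choice of two classes are assumptions for illustration; choosing the number of classes automatically would need an additional criterion that is not shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_series(series_list):
    """Point-by-point mean of several equal-length series."""
    n = len(series_list)
    return [sum(values) / n for values in zip(*series_list)]


def kmeans(series, k, iterations=50):
    """Group series into k classes; returns one class index per series."""
    # Deterministic farthest-point start: begin with the first series, then
    # keep adding the series farthest from every centre chosen so far.
    centres = [list(series[0])]
    while len(centres) < k:
        farthest = max(series, key=lambda s: min(sq_dist(s, c) for c in centres))
        centres.append(list(farthest))

    labels = None
    for _ in range(iterations):
        # Assign each series to its nearest centre.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(s, centres[j])) for s in series
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centre to the mean of its members.
        for j in range(k):
            members = [s for s, lab in zip(series, labels) if lab == j]
            if members:
                centres[j] = mean_series(members)
    return labels


# Hypothetical normalized histories for four wells (illustration only).
histories = [
    [0.1, 0.5, 0.9, 0.5],
    [0.2, 0.6, 1.0, 0.4],
    [0.9, 0.5, 0.1, 0.5],
    [1.0, 0.4, 0.0, 0.6],
]
labels = kmeans(histories, k=2)
```

Here the first two wells rise and fall together and land in one class, while the last two move the opposite way and land in the other.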
19. The following shows that 12 classes
were used and how the wells were
assigned. The classes are numbered
1 to 12.
It was surprising how some classes
are distributed over a broad area and
are intermingled with other classes.
21. Closer inspection showed that Data
Mining did indeed optimally assign
the wells.
The following shows the “normalized”
histories of wells for two of the
classes.
Note the seasonal variability.
23. The next Data Mining task was to assign
aquifer locations to the 12 classes.
Locations were optimally assigned
based on their topographical
characteristics and proximity to wells
whose classes were known.
Results are shown in the following.
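The assignment method is not described in detail. One simple way to combine location characteristics with proximity to classified wells is a nearest-neighbor vote, sketched below. The feature choice (x, y, elevation), the `elevation_weight` scaling factor, and the sample wells are all hypothetical.

```python
import math
from collections import Counter


def assign_class(location, wells, k=3, elevation_weight=0.1):
    """Assign a location to the majority class of its k nearest wells.

    location: (x_miles, y_miles, elevation_ft)
    wells:    list of (x_miles, y_miles, elevation_ft, class_id)
    elevation_weight: assumed factor that puts feet of elevation on a
                      scale comparable to miles of horizontal distance.
    """
    def distance(well):
        dx = location[0] - well[0]
        dy = location[1] - well[1]
        dz = elevation_weight * (location[2] - well[2])
        return math.sqrt(dx * dx + dy * dy + dz * dz)

    nearest = sorted(wells, key=distance)[:k]
    votes = Counter(well[3] for well in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical wells with known classes (illustration only).
wells = [
    (0.0, 0.0, 50.0, 1), (1.0, 0.0, 52.0, 1), (0.0, 1.0, 49.0, 1),
    (10.0, 10.0, 120.0, 2), (11.0, 10.0, 118.0, 2), (10.0, 11.0, 121.0, 2),
]

low_site = assign_class((0.5, 0.5, 51.0), wells)
high_site = assign_class((10.5, 10.5, 119.0), wells)
```

Because elevation enters the distance, two sites that are close on the map but differ in elevation can fall into different classes, which is consistent with classes being intermingled over a broad area.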
25. The next Data Mining task was to
create a water level model for each
class. Every location was assigned to
a class, and therefore, a model.
Inputs to each model were the
characteristics of a location and water
levels of selected wells. The output
was the predicted water level of the
location.
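To make the "one model per class" structure concrete, the sketch below routes each prediction to the model for the location's class. The linear form, the coefficients, and the names `make_linear_model` and `predict_level` are assumptions chosen for brevity; the presentation does not state what kind of model was fitted.

```python
def make_linear_model(intercept, elevation_coef, well_coefs):
    """Build a toy model: level = intercept + a*elevation + sum(b_i * well_i)."""
    def predict(elevation_ft, well_levels_ft):
        if len(well_levels_ft) != len(well_coefs):
            raise ValueError("expected %d well levels" % len(well_coefs))
        well_term = sum(c * lvl for c, lvl in zip(well_coefs, well_levels_ft))
        return intercept + elevation_coef * elevation_ft + well_term
    return predict


# Hypothetical coefficients; a real model would be fitted to well histories.
models = {
    1: make_linear_model(2.0, 0.125, [0.5, 0.25]),
    3: make_linear_model(-1.0, 0.25, [1.0]),
}


def predict_level(class_id, elevation_ft, well_levels_ft):
    """Predict the water level at a location using its class's model."""
    return models[class_id](elevation_ft, well_levels_ft)


level = predict_level(1, 80.0, [40.0, 20.0])
```

The dictionary lookup is the important part: once a location has a class, it has a model, and each class can use its own set of selected wells as inputs.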
26. The models are very accurate.
Accuracy can be checked at locations
where there are well histories.
The following compares predictions to
actual histories for wells of four
different classes. The water levels are
normalized to land surface elevation.
27. [Plot: Class 1. Normalized water level above sea level, actual vs. prediction. History from April 1982 to October 1998.]
28. [Plot: Class 3. Normalized water level above sea level, actual vs. prediction. History from April 1982 to October 1998.]
29. [Plot: Class 6. Normalized water level above sea level, actual vs. prediction. History from April 1982 to October 1998.]
30. [Plot: Class 10. Normalized water level above sea level, actual vs. prediction. History from April 1982 to October 1998.]
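A comparison between actual and predicted histories can also be summarized numerically. The presentation does not say which accuracy measure was used; the sketch below shows two common ones, root-mean-square error and the coefficient of determination (R squared), on made-up values.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length series."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical normalized water levels (illustration only).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]

error = rmse(actual, predicted)
fit = r_squared(actual, predicted)
```

For these values the error is 0.5 and R squared is 0.8. Computing both per class would show whether some classes are modeled better than others.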
31. The “model” of the aquifer is actually a
collection of models, one for each class.
A computer program was created that
integrates the models, a history database,
and a graphical user interface.
The following shows a long-term
simulation of the aquifer’s water level
generated by the model. Note the color
key at right, and that time runs in reverse.
68. Often multi-dimensional visualization
reveals important information that
would otherwise go unnoticed. ADMi
has world-class capabilities in
advanced visualization technology.
The following shows the model’s
prediction of the upper range (ceiling)
of the aquifer. The vertical scale is
exaggerated to show details.
85. The following shows the predicted
aquifer level for the period from
January 1995 to October 1998.
Note the spatially asynchronous
motions caused by variability in
rainfall and the Suwannee River’s
stage.
132. Conclusions
This Data Mining-based model required
about 10 weeks to develop.
A conventional finite-difference model of
the same natural system was developed
by a government agency. It took over 3
years to complete! It is much less
accurate at predicting water level.
133. Conclusions
Data Mining is incredibly powerful for
extracting knowledge about complex
natural systems from databases.
The models can be more accurate
than traditional approaches, and
require much less time to develop.