3.
What is Data Mining?
● The process of:
– selecting
– exploring
– modeling large volumes of data.
● Its aim is to discover regularities and previously
unknown relations.
● It seeks useful and clear results.
● It is very useful for the owner of the database.
5.
The Data mining process
● 1. Defining analysis goals.
● 2. Select, organize and prepare data.
● 3. Exploratory data analysis and possible
transformation.
● 4. Establish the statistical methods for the
elaboration phase:
– Exploratory methods
– Descriptive methods
– Predictive methods
– Local methods
6.
The Data mining process (continued)
● 5. Data elaboration, using the previously selected
methods.
● 6. Evaluation and validation of the statistical methods,
and selection of a final analysis model.
● 7. Interpretation of the model and its future application
in decision processes.
7.
1. Defining analysis goals.
● Show citizen behaviors, such as:
– Who lives/works in a certain area
– Which days are their “working days”
– Who likes certain things (such as going to the
soccer stadium every two weeks)
– Who shops at well-known supermarkets
– Who has small children, because they go to the
kindergarten every day at opening/closing hours
– Who uses the train service, because they follow
the rail lines
– Who uses a car daily, because they follow the freeway
route
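As an illustration of the first goal, here is a minimal sketch of how "who lives/works in a certain area" could be guessed from a cell trace. It assumes each record is a (user, cell, timestamp) triple and that the cell seen most often at night is "home" while the cell seen most often in weekday office hours is "work". The record layout, the hour windows and the name likely_home_and_work are all assumptions made for this example, not part of the project.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical trace records: (user_id, cell_id, ISO timestamp).
TRACE = [
    ("u1", "cell_A", "2010-03-01T02:10:00"),
    ("u1", "cell_A", "2010-03-01T23:40:00"),
    ("u1", "cell_B", "2010-03-01T10:15:00"),
    ("u1", "cell_B", "2010-03-02T11:30:00"),
    ("u1", "cell_B", "2010-03-02T15:05:00"),
    ("u1", "cell_A", "2010-03-03T01:20:00"),
]


def likely_home_and_work(trace, night=(22, 6), office=(9, 17)):
    """Return {user: (home_cell, work_cell)}.

    home_cell is the most frequent cell between night[0] and night[1];
    work_cell is the most frequent cell in weekday office hours.
    Either value is None when there is no matching record.
    """
    night_cells = defaultdict(Counter)
    work_cells = defaultdict(Counter)
    for user, cell, stamp in trace:
        when = datetime.fromisoformat(stamp)
        hour = when.hour
        if hour >= night[0] or hour < night[1]:
            night_cells[user][cell] += 1
        elif office[0] <= hour < office[1] and when.weekday() < 5:
            work_cells[user][cell] += 1

    result = {}
    for user in set(night_cells) | set(work_cells):
        home = night_cells[user].most_common(1)
        work = work_cells[user].most_common(1)
        result[user] = (home[0][0] if home else None,
                        work[0][0] if work else None)
    return result


print(likely_home_and_work(TRACE))
```

The same counting idea extends to the other goals (stadium visits, supermarket visits) by changing the time window and the set of cells of interest.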
8.
2. Select, organize and prepare data.
● Create a metadata database
● Populate it
– Limit the problem to a single city
– Extract from the database only the data
for that city
– Research all of the city's services and their
addresses
– Transform each address into a
geoposition
– Create relations between “Service
Places” and “base stations”
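The last step, relating service places to base stations, can be sketched as a nearest-neighbour lookup on geopositions. This example assumes the addresses have already been geocoded to (latitude, longitude) pairs and relates each place to the closest base station by great-circle (haversine) distance. All names and coordinates are invented for illustration, and the real relation may well use cell coverage rather than simple distance.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Hypothetical geocoded service places and base stations (lat, lon).
SERVICE_PLACES = {
    "stadium": (45.000, 9.000),
    "kindergarten": (45.050, 9.060),
}
BASE_STATIONS = {
    "bs_1": (45.001, 9.002),
    "bs_2": (45.049, 9.058),
}


def nearest_base_station(place_pos, stations):
    """Name of the base station closest to place_pos."""
    return min(stations,
               key=lambda name: haversine_km(*place_pos, *stations[name]))


# One relation per service place: place -> closest base station.
relations = {place: nearest_base_station(pos, BASE_STATIONS)
             for place, pos in SERVICE_PLACES.items()}
print(relations)
```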
9.
3. Exploratory data analysis and
possible transformation
● Transaction Data with Missing and Incomplete
Fields
– CELL_TO_LOCATION_TRACE()
– lookUnlocalizedCells()
● Content changes over time
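The slides do not show how CELL_TO_LOCATION_TRACE() and lookUnlocalizedCells() are implemented. The sketch below only illustrates the kind of check those names suggest: translating a cell trace into positions and reporting the cells that cannot be localized. The functions find_unlocalized_cells and to_location_trace, the record layout and the sample data are hypothetical stand-ins, not the project's code.

```python
# Hypothetical lookup table: cell id -> (lat, lon).
CELL_LOCATIONS = {
    "cell_A": (45.001, 9.002),
    "cell_B": (45.049, 9.058),
}

# Hypothetical trace with an unknown cell and a missing cell field.
TRACE = [
    {"user": "u1", "cell": "cell_A", "time": "2010-03-01T08:00:00"},
    {"user": "u1", "cell": "cell_X", "time": "2010-03-01T08:30:00"},
    {"user": "u1", "cell": None, "time": "2010-03-01T09:00:00"},
    {"user": "u1", "cell": "cell_B", "time": "2010-03-01T09:30:00"},
]


def find_unlocalized_cells(trace, locations):
    """Sorted ids of cells that appear in the trace but have no position."""
    return sorted({rec["cell"] for rec in trace
                   if rec["cell"] is not None and rec["cell"] not in locations})


def to_location_trace(trace, locations):
    """Split the trace into located (user, time, position) tuples and
    the raw records that could not be located."""
    located, skipped = [], []
    for rec in trace:
        pos = locations.get(rec["cell"])
        if pos is None:
            skipped.append(rec)
        else:
            located.append((rec["user"], rec["time"], pos))
    return located, skipped


located, skipped = to_location_trace(TRACE, CELL_LOCATIONS)
print(find_unlocalized_cells(TRACE, CELL_LOCATIONS))
print(len(located), "located,", len(skipped), "skipped")
```

Keeping the skipped records, rather than dropping them silently, makes it possible to re-run the translation once the missing cells have been localized.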
10.
4. Establish the statistical methods for the
elaboration phase
● We decided to use a business rule engine
– The underlying idea of a rule engine is
to externalize the business or application logic
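To make "externalizing the logic" concrete, here is a minimal sketch in which the rules live in a data structure (which could equally be loaded from a file) and a small generic function evaluates them against the facts known about a user. It is not the rule engine used in the project; the rule format, the field names and the thresholds are assumptions invented for this example.

```python
import operator

# Comparison operators that a rule condition may use.
OPS = {
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}

# Rules are plain data, kept apart from the code that evaluates them.
# Field names and thresholds are hypothetical.
RULES = [
    {"name": "train_user",
     "when": [("rail_cell_ratio", ">=", 0.8)],
     "then": "uses train service"},
    {"name": "stadium_fan",
     "when": [("stadium_visits_per_month", ">=", 2)],
     "then": "attends the soccer stadium"},
    {"name": "daily_driver",
     "when": [("freeway_days_per_week", ">=", 5)],
     "then": "uses car daily"},
]


def evaluate(rules, facts):
    """Return the conclusions of every rule whose conditions all hold.

    A condition on a field that is absent from facts counts as false.
    """
    conclusions = []
    for rule in rules:
        if all(field in facts and OPS[op](facts[field], value)
               for field, op, value in rule["when"]):
            conclusions.append(rule["then"])
    return conclusions


facts = {"rail_cell_ratio": 0.9,
         "stadium_visits_per_month": 2,
         "freeway_days_per_week": 1}
print(evaluate(RULES, facts))
# prints ['uses train service', 'attends the soccer stadium']
```

Because the rules are data, a behavior definition can be changed or added without touching the evaluation code, which is the main appeal of the approach.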
11.
Data Mining Differs from
Typical Operational Business
12.
Next steps?
● Finish Geophone Data Mining
– Continue working with the rule engine
– Build decision trees
– Link analysis
– Cluster analysis
● Create a real-time embedded system:
– this piece of software will replace the mobile application
– it will be installed on the base stations
– it will avoid all cell management problems and many
of the current data acquisition problems.
● Get ready for the anthropological approach