By integrating new techniques from data mining and operational research, we develop a novel travel planning system that designs multi-day and multi-stay travel plans based on geo-tagged photos. Specifically, a modified Iterated Local Search heuristic algorithm is developed to find an approximately optimal solution to the multi-day and multi-stay travel planning problem, using points of interest (POIs) and recurrence weights between POIs in a travel graph model, both of which are discovered from photos. To demonstrate the feasibility of this approach, we retrieved geo-tagged photos taken in Australia from the photo-sharing website Panoramio.com and designed experimental multi-day and multi-stay travel plans for tourists. Travel patterns mined with a flow-mapping technique at different geographical scales are used to evaluate the experimental results.
Travel Plan using Geo-tagged Photos in Geocrowd2013
1.
2. The Problem
Multi-Day and Multi-Stay Travel Planning using Geo-Tagged Photos
Nov 5, 2013
GEOCROWD 2013
Xun Li
3. Related Work in Geo-tagged Photos Research
Exploring landmarks or attraction places
• k-means
• mean shift
• density-based clustering
• hierarchical clustering
Mining travel patterns from geo-tagged photos
• plotting travel routes
• weighted lines
• flow maps
• traffic flows
Making travel recommendations/plans based on geo-tagged photos
• One-day travel planning
• Multi-day travel planning
4. Related Work in Operational Research
Orienteering Problem
• Given a set of attractions and a time budget, find a tour to maximize the
collected scores from selected attractions
Team Orienteering Problem (TOP) and TOP with Time Windows (TOPTW)
• Given a set of attractions and a time budget, send out several teams to find a
set of tours to maximize the collected scores from selected attractions
• TOP problem plus setting up attractions with different time windows
Algorithms:
• Heuristic solutions (e.g. dynamic programming, greedy algorithms, iterated local search)
Limitations:
• Travel attractions are taken from existing resources (e.g. tourist offices)
• Attractiveness scores of POIs are predefined as categorical scores
according to the types of attractions (e.g. museum, archaeology, nature)
• Correlations between POIs are not considered.
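To make the orienteering objective concrete, here is a minimal brute-force sketch for a single tour. It is not one of the heuristics listed above and only works for tiny inputs. The function names and the toy scores, visit durations, and travel times are illustrative assumptions.

```python
from itertools import permutations

def tour_time(order, travel, visit, depot=0):
    """Total time of depot -> order... -> depot, including visit durations."""
    elapsed, here = 0.0, depot
    for poi in order:
        elapsed += travel[here][poi] + visit[poi]
        here = poi
    return elapsed + travel[here][depot]

def best_tour(scores, travel, visit, budget, depot=0):
    """Try every subset and ordering of attractions; keep the best feasible one."""
    pois = [p for p in range(len(scores)) if p != depot]
    best_score, best_order = 0, ()
    for k in range(1, len(pois) + 1):
        for order in permutations(pois, k):
            if tour_time(order, travel, visit, depot) <= budget:
                collected = sum(scores[p] for p in order)
                if collected > best_score:
                    best_score, best_order = collected, order
    return best_score, best_order

# Hypothetical toy instance: node 0 is the start/end point, times in hours.
scores = [0, 10, 20, 15]
visit = [0.0, 1.0, 2.0, 1.0]
travel = [
    [0.0, 0.5, 1.0, 0.5],
    [0.5, 0.0, 0.5, 1.0],
    [1.0, 0.5, 0.0, 0.5],
    [0.5, 1.0, 0.5, 0.0],
]
score, order = best_tour(scores, travel, visit, budget=5.0)
```

With a 5-hour budget, visiting all three attractions is infeasible, so the search settles on the best-scoring pair. The exhaustive search grows factorially with the number of attractions, which is why heuristics are used in practice.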
5. Research Problem
Automatically recommend multi-day and multi-stay travel plans to
travellers based on travel knowledge mined from
geo-tagged photos
• Integrates geo-tagged photo research with the latest techniques
in operational research
• Driven by rich travel information mined from geo-tagged
photos:
• Points of interest, attractiveness scores, suggested visiting times,
opening and closing times, travel recurrence weights
• Relevance
• Tourism research and practice
• Personal guide services
7. Finding POIs from Geo-Tagged Photos
Algorithm: OPTICS (Ordering Points To Identify the Clustering Structure)
• Density-based clustering
• Hierarchical clustering structure
POI Properties:
• Name: the cluster peak (found with mean shift) is mapped to and labeled with the nearest feature in preloaded OSM data
• Attractiveness score $S_i$: number of tourists
• Suggested visiting time $T_i$: $\mathrm{avg}(t_{\text{last\_photo}} - t_{\text{first\_photo}})$
• Time window $[O_i, C_i]$, with $O_i > \text{8am}$ and $C_i < \text{6pm}$: $[\min\{t_{\text{first\_photo}}\}, \max\{t_{\text{last\_photo}}\}]$
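The POI properties above can be sketched as follows. This is a minimal illustration, not the system's actual code: the function name `poi_properties`, the record layout, and the sample photos are hypothetical. It assumes each user visits the POI once, and it reads the 8am/6pm bounds as a clamp on the observed window.

```python
from collections import defaultdict
from datetime import datetime, time

def poi_properties(photos, earliest=time(8, 0), latest=time(18, 0)):
    """Derive POI properties from (user_id, timestamp) pairs of one cluster.

    Assumption: each user visits the POI once, so the user's first and
    last photo bound that visit.
    """
    by_user = defaultdict(list)
    for user, stamp in photos:
        by_user[user].append(stamp)

    stays, firsts, lasts = [], [], []
    for stamps in by_user.values():
        t_first, t_last = min(stamps), max(stamps)
        stays.append((t_last - t_first).total_seconds() / 3600.0)
        firsts.append(t_first.time())
        lasts.append(t_last.time())

    return {
        # Attractiveness score: number of distinct tourists.
        "score": len(by_user),
        # Suggested visiting time: average stay in hours.
        "visit_hours": sum(stays) / len(stays),
        # Time window, clamped to [8am, 6pm] (assumed reading of the slide).
        "window": (max(min(firsts), earliest), min(max(lasts), latest)),
    }

# Hypothetical photos taken at one POI by two users.
photos = [
    ("a", datetime(2013, 1, 5, 9, 0)),
    ("a", datetime(2013, 1, 5, 10, 0)),
    ("b", datetime(2013, 2, 1, 14, 0)),
    ("b", datetime(2013, 2, 1, 16, 0)),
]
props = poi_properties(photos)
```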
8. Travel Information and Travel Graph Model
Travel Information:
• Traveling time between POIs: computed using OSM data and the Open Source Routing Machine System (OSRMS, Luxen et al. 2011)
• Traveling weights and Travel Graph Model:
Reconstruct individual travel route
Travel graph model
Recurrence weight of sub-trip
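A minimal sketch of how a travel graph might be built from individual routes. The slide does not give the exact definition of the recurrence weight, so this sketch assumes it is the number of travellers whose route contains the sub-trip. The function names and sample visits are hypothetical.

```python
from collections import Counter

def reconstruct_route(visits):
    """Order one traveller's (timestamp, poi) records by time and
    merge consecutive photos taken at the same POI."""
    route = []
    for _, poi in sorted(visits):
        if not route or route[-1] != poi:
            route.append(poi)
    return route

def recurrence_weights(routes):
    """Count, for each directed sub-trip (a, b), how many routes contain it."""
    counts = Counter()
    for route in routes:
        # set() so that a traveller repeating a sub-trip is counted once.
        for edge in set(zip(route, route[1:])):
            counts[edge] += 1
    return counts

# Hypothetical photo records for three travellers.
routes = [
    reconstruct_route([(1, "Opera House"), (2, "Opera House"), (3, "Harbour Bridge")]),
    reconstruct_route([(5, "Harbour Bridge"), (1, "Opera House")]),
    reconstruct_route([(1, "Harbour Bridge"), (2, "Aquarium")]),
]
weights = recurrence_weights(routes)
```

The resulting counter is a weighted directed edge list: POIs are the nodes, and each key `(a, b)` is an edge from `a` to `b`.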
10. A Heuristic Solution
A modified Iterated Local Search (ILS) based heuristic algorithm to
solve the multi-day and multi-stay travel planning problem
• A variant of the TSP (NP-hard)
• Finds an approximately optimal solution as fast as possible
11. Heuristic Solution—Construct Step
Initialize tour with virtual POIs
• Virtual POI: no location information
Inserting POI_j between POI_i and POI_k
• Find the best candidate
• Uses the recurrence travel weights between POIs
• Filtering
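The slide's insertion formulas did not survive, so the following is only a sketch of what a construct step could look like. It ranks candidates by squared score over added time, a ratio commonly used in ILS heuristics for the TOPTW. The recurrence-weight multiplier, the function names, and the toy data are assumptions made for illustration, not the author's formula. Time windows are omitted, and the tour starts and ends at a real depot rather than a virtual POI.

```python
def tour_duration(tour, travel, visit):
    """Travel time along the tour plus visit time at interior POIs."""
    moving = sum(travel[a][b] for a, b in zip(tour, tour[1:]))
    return moving + sum(visit.get(p, 0.0) for p in tour[1:-1])

def best_insertion(tour, candidates, travel, visit, score, weight, budget):
    """Return (ratio, poi, position) of the best feasible insertion, or None."""
    used = tour_duration(tour, travel, visit)
    best = None
    for j in candidates:
        for pos in range(1, len(tour)):
            i, k = tour[pos - 1], tour[pos]
            shift = travel[i][j] + visit[j] + travel[j][k] - travel[i][k]
            if used + shift > budget:
                continue  # filtering: the insertion would exceed the day budget
            # Hypothetical bonus for sub-trips that travellers often repeat.
            bonus = 1 + weight.get((i, j), 0) + weight.get((j, k), 0)
            ratio = bonus * score[j] ** 2 / max(shift, 1e-9)
            if best is None or ratio > best[0]:
                best = (ratio, j, pos)
    return best

def construct(tour, candidates, travel, visit, score, weight, budget):
    """Insert the best candidate repeatedly until nothing else fits."""
    tour, remaining = list(tour), set(candidates)
    while True:
        found = best_insertion(tour, sorted(remaining), travel, visit,
                               score, weight, budget)
        if found is None:
            return tour
        _, poi, pos = found
        tour.insert(pos, poi)
        remaining.discard(poi)

# Hypothetical toy data: "A" is the start/end point, times in hours.
travel = {
    "A": {"A": 0.0, "x": 1.0, "y": 1.0},
    "x": {"A": 1.0, "x": 0.0, "y": 1.0},
    "y": {"A": 1.0, "x": 1.0, "y": 0.0},
}
visit = {"x": 1.0, "y": 2.0}
score = {"x": 10, "y": 10}
weight = {("A", "x"): 2}
first = best_insertion(["A", "A"], ["x", "y"], travel, visit, score, weight, 4.0)
plan = construct(["A", "A"], ["x", "y"], travel, visit, score, weight, 4.0)
```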
12. Heuristic Solution—Shake Step
Shake: remove a set of selected POIs from each sub-tour
• Shown in [] to be a good technique for exploring the entire
solution space and correcting earlier mistakes
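A minimal sketch of a shake step for one sub-tour, assuming the removed POIs form a consecutive block that wraps around the end of the tour. The function name `shake` and its `start` and `count` parameters are hypothetical. The first and last entries stand for the start and end of the day and are never removed.

```python
def shake(tour, start, count):
    """Remove `count` consecutive interior POIs, beginning at interior
    index `start` and wrapping around; the endpoints are kept."""
    interior = tour[1:-1]
    if not interior:
        return list(tour), []
    n = len(interior)
    drop = {(start + r) % n for r in range(min(count, n))}
    kept = [p for idx, p in enumerate(interior) if idx not in drop]
    removed = [interior[idx] for idx in sorted(drop)]
    return [tour[0]] + kept + [tour[-1]], removed

# Hypothetical sub-tour: remove two POIs starting at the last interior slot.
shaken, removed = shake(["S", "a", "b", "c", "d", "E"], start=3, count=2)
```

Advancing `start` between iterations moves the removed block along the tour, so different parts of the plan are rebuilt by the next construct step.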
13. Experiments
Application area
• Australia (Sydney)
• Tourism industry contributes 3% of GDP (2011)
• 5.9 million international tourists (2011)
Data:
• 118,736 geo-tagged photos contributed by 4,920 registered Internet users on Panoramio.com
• On average, 24 geo-tagged photos per user
14. Results
• 2,135 POIs in Australia
Figures: OPTICS result; POIs and travel patterns
15. Results
A 2-day tourist trip itinerary, which starts and ends at
Sydney International Airport
16. Results
The detail of the 2-day tourist trip itinerary is shown below:
Day 1 (pink route):
start from Sydney International Airport at 8am;
drive about 0.12 hours to Chinese Garden of Friendship at 8:20, spend 1 hour there;
drive 0.01 hours to Sydney Town Hall at 9:30, spend about 2.4 hours there;
drive 0.01 hours to Sydney Aquarium at 12:00, spend about 1.5 hours there;
drive 0.03 hours to the Mercantile at 13:40, spend 3.2 hours there;
drive 0.15 hours to the Gap Park at 16:50, spend 1 hour there;
drive 0.14 hours to Sydney Harbour Bridge at 18:00, find a hotel nearby to stay.
Day 2 (green route):
start from near Sydney Harbour Bridge at 8am, spend about 3.9 hours there;
drive 0.01 hours to Sydney Opera House at 11:50, spend about 3.9 hours there;
drive 0.01 hours to Museum of Contemporary Art at 15:30, spend about 1.9 hours there;
drive 0.15 hours to Sydney International Airport at 18:00.
17. Results
A 4-day tourist trip itinerary, which starts and ends at
Sydney International Airport
18. Results
The detail of the 4-day tourist trip itinerary is shown below:
Day 1 (pink route):
start from Sydney International Airport at 8am;
drive 0.15 hours to Customs House at 8:15, spend about 4.5 hours there;
drive 1.8 hours to The Giant Stairway at 14:45, spend about 1 hour there;
drive 0.01 hours to The Three Sisters at 15:40, spend about 1.5 hours there;
drive 0.02 hours to Scenic World Blue Mountains at 16:50, spend about 1 hour there;
drive 1.8 hours to Sydney Aquarium, and find a hotel nearby to stay
Day 2 (green route):
start from Sydney Aquarium at 8am, spend about 1.7 hours there;
drive 0.03 hours to Royal Botanic Gardens at 9:50, spend about 1.6 hours there;
drive 0.03 hours to Milsons Point at 11:20, spend about 2.5 hours there;
drive 0.01 hours to Olympic Pool North Sydney at 13:50, spend about 2.5 hours there;
drive 0.15 hours to The Gap Park at 16:35, spend about 1 hour there;
drive 0.14 hours to Sydney Opera House, and find a hotel nearby to stay
Day 3 (blue route):
start from Sydney Opera House at 8am, spend 4 hours there;
drive 0.01 hours to Sydney Visitors Information Centre at 12:00, spend about 4 hours there;
drive 0.01 hours to Museum of Contemporary Art at 16:20, spend about 1 hour there;
drive 0.03 hours to Chinese Garden of Friendship at 17:20, spend about 0.5 hours there;
drive 0.05 hours to Sydney Harbour Bridge, and find a hotel nearby to stay
Day 4 (light yellow route):
start from Sydney Harbour Bridge at 8am, spend about 3.5 hours there;
drive 0.01 hours to the Mercantile at 11:30, spend about 3.2 hours there;
drive 0.02 hours to the Cenotaph at 14:50, spend about 0.8 hours there;
drive 0.01 hours to the Sydney Town Hall at 15:30, spend about 2.3 hours there;
drive 0.13 hours to Sydney International Airport at 18:00.
21. Conclusion
Main contribution
• An Intelligent Tourist Trip Plan System
• Solves the multi-day and multi-stay travel planning problem using a modified ILS-based heuristic algorithm
• More applicable to realistic problems than existing solutions
• large Internet social media data
• results visualization (travel patterns, travel plans)
To explore landmarks and popular places from geo-tagged photographs, three types of spatial clustering algorithms have been applied.
The first is the k-means clustering algorithm, a partitional clustering technique that attempts to find a user-predefined number of clusters (k). The resulting clusters are spherical.
The second is density-based clustering, such as mean shift and DBSCAN. DBSCAN is also a partitional clustering technique, but it does not require a predefined number of clusters; instead, a density threshold (minPts) must be specified.
Mean shift is a clustering algorithm based on density estimation. It is non-parametric, so the number of clusters does not need to be specified in advance (only a kernel bandwidth). It assumes that the distribution of points can be approximated by kernel density estimation, so that dense clusters can be found as the modes of the estimated probability density function.
The third is hierarchical clustering, which decomposes the points into several levels of nested clusters. The result is usually organized as a tree, which is close to the natural structure of spatial scales. Hierarchical clusters have therefore been used to represent landmarks at different spatial scales.
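As an illustration of the mode-seeking idea behind mean shift, here is a minimal pure-Python sketch with a flat kernel. It is not the implementation used in the system: the function name, the `bandwidth` value, the rule that merges modes closer than half a bandwidth, and the sample points are all assumptions.

```python
import math

def mean_shift(points, bandwidth, iters=50, tol=1e-6):
    """Shift each point to the mean of its neighbours until it settles,
    then merge the resulting modes that lie close together."""
    modes = []
    for x, y in points:
        for _ in range(iters):
            near = [(px, py) for px, py in points
                    if math.hypot(px - x, py - y) <= bandwidth]
            if not near:
                break
            nx = sum(p[0] for p in near) / len(near)
            ny = sum(p[1] for p in near) / len(near)
            moved = math.hypot(nx - x, ny - y)
            x, y = nx, ny
            if moved < tol:
                break
        modes.append((x, y))

    centers, labels = [], []
    for mx, my in modes:
        for idx, (cx, cy) in enumerate(centers):
            if math.hypot(mx - cx, my - cy) <= bandwidth / 2:
                labels.append(idx)
                break
        else:  # no existing center is close enough: start a new cluster
            centers.append((mx, my))
            labels.append(len(centers) - 1)
    return centers, labels

# Hypothetical photo coordinates forming two tight groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, labels = mean_shift(points, bandwidth=1.0)
```

The number of clusters falls out of the data, but the result still depends on the chosen bandwidth.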
To mine travel patterns from geo-tagged photos, a common need is to reconstruct movements or travel routes from the photos.
Some researchers reconstruct travel routes directly by connecting the photos in order of shooting time (see the first figure). By plotting all travel routes on a map, we can see some patterns (dense areas and dense routes).
Most research uses region-based approaches. The regions of photos are either manually defined or extracted using a clustering algorithm (e.g. DBSCAN). The travel routes are then represented as connected sequences of regions, and travel patterns can be further mined from these sequential data.
To visualize travel patterns, current research either describes them as textual sequences or uses geovisualization techniques to present them on maps. These techniques include weighted lines, flow maps, and traffic flows.
Recently, a large number of geo-located photographs volunteered by Internet users have become accessible to the public. By connecting one individual’s geo-tagged photos chronologically, the photographer’s travel route can be represented as a space-time path.
A problem that is encountered in practice but not yet addressed in existing research.
The geo-tagged photos and the additional travel information have been used successfully in many applications, such as exploring landmarks, finding travel patterns for travel recommendation, location-based services, and web applications and services such as auto-tagging.
They bring a new research problem to researchers: in this essay, I will address the problem of automatically discovering popular places and travel patterns from geo-tagged photos at different spatial scales.
Information retrieval
Data mining with OPTICS and mean shift -> POIs and their characteristics
Reconstruct the travel graph from POIs
Association analysis -> travel patterns
This information -> inputs for an intelligent tourist trip planning system
Core: an iterated local search based heuristic algorithm -> multi-day and multi-stay problem
Web interface: easy for users to use
Virtual POI: has no location, but has time information to indicate when to start and end the trip each day
This heuristic can ensure that every POI inserted into the tour is removed at least once.
2,135 POIs discovered in Australia
Results of OPTICS
Convert the results to a tree structure
Find travel routes
Mine the travel patterns
Display them at different levels of the hierarchy as supplementary travel information
Inputs for the ILS-based heuristic solution
To test usefulness: 2 experiments
Abstract travel routes are shown in the left plot
The travel route for each day is assigned a different color
An actual driving plan is also provided through integration with Bing Maps
A text travel itinerary for this 2-day trip is also provided.
It tells you the details: when to start, how long to drive, how long to stay
Another experiment
More time, more POIs to visit, a longer trip, to get the maximum visiting score
Interesting: Blue Mountains National Park
Domestic travellers
Matches the travel patterns discovered from photos
Text-based travel itinerary for this 4-day tourist trip
Multi-day and multi-stay planning is a very complex problem
Even without considering the number of days and the different stays, it is a variant of the TSP -> NP-hard
The state-of-the-art heuristic solution -> multi-day and one-stay problem
Solve the problem by extending the existing algorithm and introducing virtual POIs as stays
This heuristic approximates the optimal solution
Find travel routes that have the maximum visiting score as fast as possible
Construction step / Shake step
Geo-tagged photos are used as the travel information of experienced travellers
More applicable
Also applied to a large dataset
Results are displayed in both text mode and graph mode
Travel patterns as additional travel information for travellers