This document presents a sequence-based approach for recommending modes of transport to users based on their past activity patterns. It extends the authors' previous framework to extract and match subsequences from user timelines. A machine learning approach learns an optimal subsequence length for matching current and past user activity patterns. The framework is evaluated on a real-world GPS trajectory dataset containing transport mode labels for 18 users. Results show the proposed sequence-based recommender outperforms baseline methods that recommend frequent or long-duration transport modes.
Personalised Transport Recommendations Using Sequence Modelling
1. Personalised Recommendations for Modes of Transport:
A Sequence-based Approach
Gunjan Kumar, Houssem Jerbi, and Michael P. O’Mahony
Insight Centre for Data Analytics
University College Dublin
AICS ‘17, Dublin
Dec 8, 2017
4. Rich User Activity Data
• Sequential nature of user activities
• Activities have associated features/context, e.g.
location, time, weather, etc.
7. For Recommender Systems
Facilitates real-time recommendations for a given user and context
(e.g. time, location, weather, etc.)
Previous work:
A framework for sequence- and context-based activity
recommendation [Kumar et al., 2014]
Current Research Problem:
Recommending the next mode of transport to users.
Insight Centre for Data Analytics AICS 2017 Slide 4
8. Motivation for Mode of Transport Recommendation
[Figure: Google Now and Microsoft Cortana]
9. Motivation for Mode of Transport Recommendation
But limited by fixed and manual selection of transport:
[Figure: manual transport settings in Google Now and Microsoft Cortana]
10. Motivation for Mode of Transport Recommendation
Recommending a mode of transport can:
• Help users better plan their days
• Facilitate travel
• Help service providers better cater to the needs of the community
11. Related Work
Capturing Sequence in UrbComp/RecSys
• Hierarchical-graph-based model:
- [Li et al., 2008; Zheng et al., 2009; Yoon et al., 2010]
• All-kth-order Markov models:
- [Bohnenberger and Jameson, 2001; Deshpande and Karypis, 2004; Shani
et al., 2005]
Capturing Context in UrbComp/RecSys
• Tensor and matrix factorization models:
- [Zheng et al., 2010a, 2012; Symeonidis et al., 2011; Wang et al., 2010;
Adomavicius et al., 2011]
12. Related Work
Capturing Both Sequence & Context
• To improve recommendations
- [Adomavicius and Tuzhilin, 2005; Zheng et al., 2012]
• Content-based Activity Recommendation Framework
- [Kumar et al., 2014]
• Stochastic Modelling
- [Sun et al., 2016]
13. Our Contributions
• A content-based approach for recommending the next activity
(mode of transport) to users based on past activity patterns.
• Extending our previous framework [Kumar et al., 2014] with new
approaches to extract and match subsequences drawn from
the past activity patterns of users.
• An ML approach to learn the optimal subsequence length for
matching current and past subsequences of user activity
patterns.
• Experiments using a real-world mode-of-transport dataset.
15. Framework Overview
[Figure: framework pipeline with the stages User Data, Data Modelling, Timelines, Timeline Matching, Similarity Assessment, Ranking and Top-N Recommendations.]
21. Data Model
Activity object (ao_i)
A single occurrence of an activity (mode of transport), consisting
of a set of features describing the activity or its context:
• mode of transport,
• start-time,
• duration,
• distance-travelled,
• average altitude,
• start geo-coordinates,
• end geo-coordinates.
22. Data Model
Activity Timeline
A chronological sequence of n activity objects performed by the
user during a time interval δ:
$T = \langle ao_1, ao_2, \ldots, ao_n \rangle$
Example, in time order:
ao_1: Train, 08:19, 28 mins, (53.38N, -6.07W), (53.35N, -6.25W)
ao_2: Bus, 08:37, 10 mins, (53.35N, -6.25W), (53.31N, -6.21W)
ao_3: Walk, 08:47, 9 mins, (53.31N, -6.21W), (53.30N, -6.22W)
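The data model above can be sketched in code. This is a minimal illustration, not the authors' implementation: the class name `ActivityObject`, the field names, the units and the helper `build_timeline` are all assumptions, and the calendar date is arbitrary because the slide gives only clock times.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ActivityObject:
    """One occurrence of a mode of transport plus its context features."""
    mode: str
    start_time: datetime
    duration: timedelta
    start_geo: Tuple[float, float]
    end_geo: Tuple[float, float]
    # Not given in the slide example, so left optional here (assumed units).
    distance_km: Optional[float] = None
    avg_altitude_m: Optional[float] = None


def build_timeline(objects: List[ActivityObject]) -> List[ActivityObject]:
    """Return the activity objects in chronological order."""
    return sorted(objects, key=lambda ao: ao.start_time)


# Values taken from the slide example; the date itself is arbitrary.
day = datetime(2017, 12, 8)
timeline = build_timeline([
    ActivityObject("Walk", day.replace(hour=8, minute=47), timedelta(minutes=9),
                   (53.31, -6.21), (53.30, -6.22)),
    ActivityObject("Train", day.replace(hour=8, minute=19), timedelta(minutes=28),
                   (53.38, -6.07), (53.35, -6.25)),
    ActivityObject("Bus", day.replace(hour=8, minute=37), timedelta(minutes=10),
                   (53.35, -6.25), (53.31, -6.21)),
])
```

Sorting by start time is what makes the list a timeline in the sense defined above; each activity's end coordinates match the next activity's start coordinates.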
29. Recommendation Algorithm
N-count matching
[Figure: a user timeline along a time axis with 00 hrs day boundaries, showing the current timeline (N = 4) ending at the current activity (ao_c), candidate timelines #1 and #2, and the unknown target activity (ao_t).]
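A rough sketch of N-count matching follows. It assumes, since the slide does not state it, that a candidate timeline is any earlier run of N activities ending in the same activity as the current one, and that the activity immediately after it is the recommendation candidate. Timelines are reduced to lists of activity names, and the function name `n_count_candidates` is hypothetical.

```python
from typing import List, Tuple


def n_count_candidates(
    timeline: List[str], n: int
) -> Tuple[List[str], List[Tuple[List[str], str]]]:
    """Return the current timeline and (candidate timeline, next activity) pairs."""
    if n < 1 or n > len(timeline):
        raise ValueError("n must be between 1 and the timeline length")
    current = timeline[-n:]
    current_activity = timeline[-1]
    candidates = []
    # Stop before the last element so every candidate has a following activity.
    for end in range(n - 1, len(timeline) - 1):
        if timeline[end] == current_activity:  # assumed matching criterion
            window = timeline[end - n + 1:end + 1]
            candidates.append((window, timeline[end + 1]))
    return current, candidates


history = ["walk", "bus", "train", "walk", "bus", "car", "walk", "bus"]
current, candidates = n_count_candidates(history, n=2)
```

In this toy history the current timeline is `["walk", "bus"]`, and the two earlier occurrences of that pattern were followed by `train` and `car`.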
31. Similarity Assessment
Two-level Edit Distance [Kumar et al., 2014]
[Figure: the current timeline compared against candidate timelines #1 and #2 on the user timeline, with the current activity (ao_c) and the unknown target activity (ao_t).]
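The slides do not spell out the two-level edit distance, so the following is only an illustrative simplification and not the formulation of Kumar et al. (2014). It assumes a standard Levenshtein distance in which a substitution costs 1 when the activity names differ (first level) and a fraction of `feature_weight` when the names match but features differ (second level). The function names and `feature_weight` are hypothetical.

```python
from typing import Dict, List, Tuple

Activity = Tuple[str, Dict[str, object]]  # (activity name, feature values)


def substitution_cost(a: Activity, b: Activity, feature_weight: float) -> float:
    """Cost 1 for different names; otherwise a weighted share of differing features."""
    name_a, feats_a = a
    name_b, feats_b = b
    if name_a != name_b:
        return 1.0
    keys = set(feats_a) | set(feats_b)
    if not keys:
        return 0.0
    differing = sum(1 for k in keys if feats_a.get(k) != feats_b.get(k))
    return feature_weight * differing / len(keys)


def two_level_edit_distance(
    s: List[Activity], t: List[Activity], feature_weight: float = 0.5
) -> float:
    """Levenshtein distance with unit insert/delete cost and the cost above."""
    rows, cols = len(s) + 1, len(t) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = float(i)
    for j in range(cols):
        dist[0][j] = float(j)
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + 1.0,  # deletion
                dist[i][j - 1] + 1.0,  # insertion
                dist[i - 1][j - 1] + substitution_cost(s[i - 1], t[j - 1], feature_weight),
            )
    return dist[-1][-1]


a = [("bus", {"slot": "am"}), ("walk", {"slot": "am"})]
b = [("bus", {"slot": "pm"}), ("walk", {"slot": "am"})]
c = [("train", {"slot": "am"}), ("walk", {"slot": "am"})]
```

With these toy timelines, `a` and `b` differ only in a feature, while `a` and `c` differ in an activity name, so the second pair is further apart.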
35. Ranking
[Figure: candidate timelines #1 and #2 ranked against the current timeline on the user timeline.]
$$\mathrm{Score}(ao_j^{rec}) = 1 - \frac{d(T_j, T_c) - \min_{T_p \in \mathcal{T}} d(T_p, T_c)}{\max_{T_p \in \mathcal{T}} d(T_p, T_c) - \min_{T_p \in \mathcal{T}} d(T_p, T_c)}$$
where $T_c$ is the current timeline, $T_j$ a candidate timeline, $\mathcal{T}$ the set of candidate timelines and $d$ the distance between two timelines.
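The score is a min-max normalisation of the candidate distances, inverted so that the closest candidate scores 1. A minimal sketch follows, with the hypothetical function name `score_candidates`. The case where all distances are equal is not covered by the formula; returning 1 for every candidate is an assumption.

```python
from typing import List


def score_candidates(distances: List[float]) -> List[float]:
    """Map each candidate's distance to the current timeline to a score in [0, 1]."""
    lo, hi = min(distances), max(distances)
    if hi == lo:
        # Formula is undefined here (zero denominator); assumed tie handling.
        return [1.0] * len(distances)
    return [1.0 - (d - lo) / (hi - lo) for d in distances]


scores = score_candidates([2.0, 4.0, 3.0])
```

Here the candidate at distance 2.0 gets score 1, the one at distance 4.0 gets 0, and the one in between gets 0.5.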
36. What value for N?
N-count matching
[Figure: the same user timeline with the current and candidate timelines shown at an undetermined length (N = ?).]
37. Why is N important?
Figure: MRR versus matching unit for three representative users.
39. Learning Personalised Optimal Matching Units
• Supervised classification to learn the optimal N for each
user.
• Given the natural variation in the activity patterns of users,
learning an exact value for the optimal N is not feasible.
• Hence, the approach is to learn a range of values within
which the optimal N is likely to lie for each user.
Opt. matching range    Opt. matching unit
[0, 1]                 1
[2, 4]                 3
[5+]                   5
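The table can be read as a lookup from a range class to the single matching unit used for that class. A small sketch follows, with the hypothetical names `matching_range` and `RANGE_TO_UNIT`; the class labels are simply the range strings from the table.

```python
# Range class -> matching unit, copied from the table above.
RANGE_TO_UNIT = {"[0, 1]": 1, "[2, 4]": 3, "[5+]": 5}


def matching_range(optimal_n: int) -> str:
    """Return the range class that an observed optimal N falls into."""
    if optimal_n < 0:
        raise ValueError("optimal_n must be non-negative")
    if optimal_n <= 1:
        return "[0, 1]"
    if optimal_n <= 4:
        return "[2, 4]"
    return "[5+]"


# A user whose observed optimal N is 4 is labelled "[2, 4]" for training;
# a prediction of that class then uses matching unit 3.
unit = RANGE_TO_UNIT[matching_range(4)]
```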
42. Attribute Extraction: Timeline Decomposition
• Each user is represented by an attribute vector.
• For attribute extraction, timelines are decomposed into
feature sequences.
[Figure: a user timeline decomposed into one sequence per feature, e.g. start-time, start-geo and dist-travel.]
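Timeline decomposition amounts to transposing a list of activity objects into one sequence per feature. In this minimal sketch, activity objects are plain dictionaries and `decompose` is a hypothetical name.

```python
from typing import Dict, List


def decompose(timeline: List[Dict[str, object]]) -> Dict[str, List[object]]:
    """Turn a timeline into one chronological value sequence per feature."""
    sequences: Dict[str, List[object]] = {}
    for activity_object in timeline:
        for feature, value in activity_object.items():
            sequences.setdefault(feature, []).append(value)
    return sequences


feature_sequences = decompose([
    {"mode": "Train", "start-time": "08:19"},
    {"mode": "Bus", "start-time": "08:37"},
    {"mode": "Walk", "start-time": "08:47"},
])
```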
44. Timeline Attributes
Regularity Attributes: Sample Entropy
• Capturing the degree of regularity in the timelines
• Previously used to quantify regularity in physiological and
biological time-series (ECG, fMRI)
[Costa and Goldberger, 2015; Sokunbi, 2014]
Given: a feature sequence $S_z$ with $n$ elements, epoch length $p$ and
tolerance $r$
Then: Sample Entropy is:
$$\mathrm{SampEn}_z(p, r, n) = -\ln \frac{\sum_{i=1}^{n-p} k_i^{p+1}}{\sum_{i=1}^{n-p} k_i^{p}}$$
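A sketch of the computation follows. The slide does not define $k_i^p$; the code assumes the standard sample entropy definition, in which $k_i^p$ counts the other epochs of length $p$ that match the epoch starting at position $i$ within tolerance $r$, using the maximum absolute difference. It also assumes the feature sequence is numerically encoded. The function name is hypothetical.

```python
import math
from typing import Sequence


def sample_entropy(seq: Sequence[float], p: int, r: float) -> float:
    """Sample entropy of a numeric sequence for epoch length p and tolerance r."""
    n = len(seq)
    if p < 1 or n <= p + 1:
        raise ValueError("sequence too short for this epoch length")

    def count_matches(length: int) -> int:
        # Sum over i of k_i: epochs j != i within tolerance r of epoch i.
        total = 0
        for i in range(n - p):
            for j in range(n - p):
                if i == j:
                    continue
                if max(abs(seq[i + k] - seq[j + k]) for k in range(length)) <= r:
                    total += 1
        return total

    matches_p = count_matches(p)
    matches_p1 = count_matches(p + 1)
    if matches_p == 0 or matches_p1 == 0:
        return math.inf  # no matching epochs: entropy is unbounded (assumed handling)
    return -math.log(matches_p1 / matches_p)


regular = sample_entropy([1, 2, 1, 2, 1, 2, 1, 2], p=2, r=0.0)
irregular = sample_entropy([1, 2, 3, 1, 2, 4, 1, 2, 3, 1], p=2, r=0.0)
```

A perfectly periodic sequence yields 0, and the less regular one yields a positive value, which is why the measure serves as a regularity attribute.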
45. Timeline Attributes
Regularity Attributes: Sample Entropy
1. $\mathrm{SampEn}_z^p$: sample entropy of a feature sequence $S_z$ for epoch
length $p$,
2. $\mu\mathrm{SampEn}_T^p$: mean sample entropy over all feature sequences
$S_z$, $z = 1, 2, \ldots, m$, of timeline $T$ for epoch length $p$,
3. $\sigma\mathrm{SampEn}_T^p$: standard deviation of sample entropy over all
feature sequences $S_z$, $z = 1, 2, \ldots, m$, of timeline $T$ for epoch
length $p$.
Resulting attributes:
$\mathrm{SampEn}^p_{\text{transport mode}}$, $\mathrm{SampEn}^p_{\text{start time}}$,
$\mathrm{SampEn}^p_{\text{duration}}$, $\mathrm{SampEn}^p_{\text{distance travelled}}$,
$\mathrm{SampEn}^p_{\text{start geo}}$, $\mathrm{SampEn}^p_{\text{end geo}}$,
$\mathrm{SampEn}^p_{\text{avg altitude}}$,
$\mu\mathrm{SampEn}^p$, $\sigma\mathrm{SampEn}^p$
Here, $p = 2, 3$.
46. Timeline Attributes
Repetition Attributes: k-gram attributes
Previously used for sequence classification, biological sequence
analysis and text classification [Xing et al., 2010; Dong and Pei, 2007].
1. $\eta_z^k$: total number of distinct k-grams in feature sequence $S_z$,
normalised by total number of k-grams occurring in $S_z$,
2. $\mu f_z^k$: mean frequency of occurrence of distinct k-grams in
feature sequence $S_z$, normalised by total number of k-grams
occurring in $S_z$,
3. $\sigma f_z^k$: standard deviation of frequency of occurrence of distinct
k-grams in feature sequence $S_z$, normalised by length of $S_z$.
Resulting attributes:
$\eta^1_{\text{transport mode}}$, $\eta^k_{\text{transport mode}}$,
$\mu f^k_{\text{transport mode}}$, $\sigma f^k_{\text{transport mode}}$
Here, $k = 2, 3$.
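The three k-gram attributes can be computed directly from the definitions above. This is a sketch under two assumptions: k-grams are overlapping windows of k consecutive elements, and the standard deviation is the population form. The function name and dictionary keys are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev
from typing import Dict, Sequence


def kgram_attributes(seq: Sequence[str], k: int) -> Dict[str, float]:
    """Compute eta, mu_f and sigma_f for one feature sequence."""
    grams = [tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    if k < 1 or not grams:
        raise ValueError("sequence is shorter than k")
    counts = list(Counter(grams).values())  # frequency of each distinct k-gram
    total = len(grams)
    return {
        "eta": len(counts) / total,            # distinct k-grams / all k-grams
        "mu_f": mean(counts) / total,          # mean frequency / all k-grams
        "sigma_f": pstdev(counts) / len(seq),  # frequency spread / sequence length
    }


attrs = kgram_attributes(["walk", "bus", "walk", "bus", "walk"], k=2)
```

For this alternating sequence there are four 2-grams but only two distinct ones, so a low `eta` reflects a highly repetitive pattern.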
48. Dataset
• GPS trajectory dataset: Geolife Trajectories 1.3 [Zheng et al., 2010b]
• Extracted subset containing mode of transport labels:
51 days, 334 activity objects, 18 users.
• Modes of transport:
bike, bus, car, subway, taxi, train, walk, airplane, boat
• Features of each activity object (ao_i):
mode of transport,
start-time,
duration,
distance-travelled,
average altitude,
start geo-coordinates,
end geo-coordinates.
49. Methodology
• 80-20 temporal split: each user’s complete timeline is split
into training and test timelines, where the test timeline holds the
most recent 20% of available days.
• Evaluation measure: Mean Reciprocal Rank (MRR)
• Recommendation algorithms:
  • N-count recommendation algorithm (SeqNCRec)
  • Daywise sequence-based recommender (DW_ActivRec)
  • High occurrence recommender (OccurRec)
  • High duration recommender (DurationRec)
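The split and the evaluation measure can be sketched as follows. The function names are hypothetical, and rounding the test share down with a minimum of one day is an assumption, since the slides give only the 80-20 ratio. MRR averages 1/rank of the actual next mode in each ranked recommendation list, counting 0 when it is absent.

```python
from typing import List, Sequence, Tuple


def temporal_split(days: Sequence[str], test_fraction: float = 0.2) -> Tuple[List[str], List[str]]:
    """Split sortable day labels into (training days, most recent test days)."""
    ordered = sorted(days)
    n_test = max(1, int(len(ordered) * test_fraction))  # assumed rounding rule
    return ordered[:-n_test], ordered[-n_test:]


def mean_reciprocal_rank(ranked_lists: List[List[str]], actuals: List[str]) -> float:
    """Average reciprocal rank of the actual item across recommendation lists."""
    if not actuals or len(ranked_lists) != len(actuals):
        raise ValueError("need one ranked list per actual item")
    total = 0.0
    for ranked, actual in zip(ranked_lists, actuals):
        if actual in ranked:
            total += 1.0 / (ranked.index(actual) + 1)
    return total / len(actuals)


days = [f"2008-05-{d:02d}" for d in range(1, 11)]  # ten illustrative day labels
train_days, test_days = temporal_split(days)
mrr = mean_reciprocal_rank([["bus", "walk"], ["walk", "bus"]], ["walk", "walk"])
```

In the toy example the actual mode is ranked second in the first list and first in the second, giving an MRR of 0.75.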
50. Recommendation Performance
[Figure: MRR distribution over all users for the SeqNCRec,
DW_ActivRec, OccurRec and DurationRec recommenders; y-axis MRR, ticks from 0.4 to 1.]
51. Methodology
Learning Optimal Matching Unit range
• Wrapper attribute selection: C4.5 algorithm, greedy
backward search and area under the ROC curve as evaluation
measure.
• Classification: the pruned attribute vectors for each user are fed
into a C4.5 induction algorithm to predict the optimal matching unit
range.
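Greedy backward search can be sketched independently of the classifier. In the code below, `evaluate` is a hypothetical callback that stands in for training C4.5 on an attribute subset and returning its area under the ROC curve. The toy scoring function is invented purely to make the example runnable, and accepting removals that tie the current score is an assumption.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def backward_select(
    attributes: Sequence[str], evaluate: Callable[[List[str]], float]
) -> List[str]:
    """Repeatedly drop the attribute whose removal scores best, while not worse."""
    selected = list(attributes)
    best_score = evaluate(selected)
    while len(selected) > 1:
        best_trial: Optional[Tuple[float, List[str]]] = None
        for attr in selected:
            trial = [a for a in selected if a != attr]
            score = evaluate(trial)
            # Accept removals that do not hurt the score (ties favour fewer attributes).
            if score >= best_score and (best_trial is None or score > best_trial[0]):
                best_trial = (score, trial)
        if best_trial is None:
            break
        best_score, selected = best_trial
    return selected


# Toy stand-in for the wrapper's evaluation: rewards two useful attributes
# and mildly penalises the rest.
USEFUL = {"SampEn_mode", "eta_mode"}


def toy_evaluate(subset: List[str]) -> float:
    chosen = set(subset)
    return len(chosen & USEFUL) - 0.1 * len(chosen - USEFUL)


kept = backward_select(["SampEn_mode", "eta_mode", "noise_1", "noise_2"], toy_evaluate)
```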
53. Using the predicted N, the mean reduction in MRR
(recommendation performance) is only 3.1%.
54. Conclusions
Experiments using a real-world dataset showed good results for our
proposed:
• Content-based recommendation approach which captures
both sequence and context
• N-count subsequence matching approach
• ML approach to learn the optimal matching unit.
55. Recent Publications
• Recommendations for Modes of Transport: A
Sequence-based Approach
The 5th ACM SIGKDD International Workshop on Urban
Computing (UrbComp 2016), 2016
• Towards the Recommendation of Personalised Activity
Sequences in the Tourism Domain
The 2nd ACM RecSys Workshop on Recommenders in Tourism
(RecTour 2017), 2017
56. Future Work
• Recommend a sequence of activities, along with associated
context.
• Investigate collaborative approaches.
• Consider new probabilistic and RNN-based approaches.
• Improve diversity and novelty of recommended sequences.
58. References I
G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems:
A survey of the state-of-the-art and possible extensions. IEEE Trans. on Knowl.
and Data Eng., 17(6):734–749, June 2005.
G. Adomavicius, B. Mobasher, F. Ricci, and A. Tuzhilin. Context-aware recommender
systems. AI Magazine, 32(3), 2011.
T. Bohnenberger and A. Jameson. When policies are better than plans:
Decision-theoretic planning of recommendation sequences. In Proceedings of the
6th International Conference on Intelligent User Interfaces, IUI ’01, pages 21–24.
ACM, 2001.
M. D. Costa and A. L. Goldberger. Generalized multiscale entropy analysis:
Application to quantifying the complex volatility of human heartbeat time series.
Entropy, 17(3):1197–1203, 2015.
M. Deshpande and G. Karypis. Selective Markov models for predicting web page
accesses. ACM Trans. Internet Technol., 4(2):163–184, May 2004.
G. Dong and J. Pei. Sequence Data Mining (Advances in Database Systems).
Springer-Verlag New York, Inc., 2007.
B. Hayes. Crinkly curves. American Scientist, 101(3):178, 2013.
G. Kumar, H. Jerbi, C. Gurrin, and M. P. O’Mahony. Towards activity
recommendation from lifelogs. In Proceedings of the 16th International Conference
on Information Integration and Web-based Applications & Services, iiWAS ’14,
pages 87–96. ACM, 2014.
59. References II
Q. Li, Y. Zheng, X. Xie, Y. Chen, W. Liu, and W.-Y. Ma. Mining user similarity based
on location history. In Proceedings of the 16th ACM SIGSPATIAL International
Conference on Advances in Geographic Information Systems, GIS ’08, pages
34:1–34:10. ACM, 2008.
H. Sagan. Space-filling curves. Springer Science & Business Media, 2012.
G. Shani, D. Heckerman, and R. I. Brafman. An MDP-based recommender system. J.
Mach. Learn. Res., 6:1265–1295, Dec. 2005.
M. O. Sokunbi. Sample entropy reveals high discriminative power between young and
elderly adults in short fMRI data sets. Frontiers in neuroinformatics, 8, 2014.
Y. Sun, N. J. Yuan, X. Xie, K. McDonald, and R. Zhang. Collaborative nowcasting for
contextual recommendation. In Proceedings of the 25th International Conference
on World Wide Web, WWW ’16, pages 1407–1418. International World Wide Web
Conferences Steering Committee, 2016.
P. Symeonidis, A. Papadimitriou, Y. Manolopoulos, P. Senkul, and I. Toroslu.
Geo-social recommendations based on incremental tensor reduction and local path
traversal. In Proceedings of the 3rd ACM SIGSPATIAL International Workshop on
Location-Based Social Networks, LBSN ’11, pages 89–96. ACM, 2011.
C.-Y. Wang, Y.-H. Wu, and S.-C. T. Chou. Toward a ubiquitous personalized daily-life
activity recommendation service with contextual information: A services science
perspective. Information Systems and E-Business Management, 8(1):13–32,
January 2010.
Z. Xing, J. Pei, and E. Keogh. A brief survey on sequence classification. SIGKDD
Explor. Newsl., 12(1):40–48, Nov. 2010.
60. References III
H. Yoon, Y. Zheng, X. Xie, and W. Woo. Smart itinerary recommendation based on
user-generated GPS trajectories. In Proceedings of the 7th International
Conference on Ubiquitous Intelligence and Computing, UIC’10, pages 19–34.
Springer-Verlag, 2010.
V. W. Zheng, B. Cao, Y. Zheng, X. Xie, and Q. Yang. Collaborative filtering meets
mobile recommendation: A user-centered approach. In AAAI 2010. Association for
Computing Machinery, Inc., July 2010a.
V. W. Zheng, Y. Zheng, X. Xie, and Q. Yang. Towards mobile intelligence: Learning
from GPS history data for collaborative recommendation. Artif. Intell., 184-185:
17–37, June 2012.
Y. Zheng, L. Zhang, X. Xie, and W.-Y. Ma. Mining interesting locations and travel
sequences from GPS trajectories. In Proceedings of the 18th International
Conference on World Wide Web, WWW ’09, pages 791–800. ACM, 2009.
Y. Zheng, X. Xie, and W.-Y. Ma. Geolife: A collaborative social networking service
among user, location and trajectory. IEEE Database Engineering Bulletin, June
2010b.
62. Recommendation Performance
[Figure: MRR achieved for each user by the SeqNCRec recommender
using observed optimal matching units; x-axis: airplane, bike, bus, car, subway, taxi, train, walk; y-axis ticks from 0 to 1.]