12. How does Arnetminer work?
Ranks pairs of researchers according to four factors:
f1: they published many papers together
f2: the advisor published more than the student
f3: the advisor is older than the student
f4: the student published her first paper(s) with the advisor
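The four factors can be illustrated with a minimal sketch. The slide does not say how Arnetminer actually computes them, so everything here is an assumption: the `Paper` record, the `pair_factors` helper, the use of 0/1 indicators for f2 to f4, and the approximation of "older" by an earlier first publication year are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical publication record: year plus the set of author names."""
    year: int
    authors: frozenset


def pair_factors(papers, advisor, student):
    """Return (f1, f2, f3, f4) for a candidate advisor-student pair.

    Assumed encodings (not taken from the slide):
      f1 = number of co-authored papers
      f2 = 1 if the advisor published more papers than the student
      f3 = 1 if the advisor's first paper predates the student's first paper
           (a proxy for "older")
      f4 = 1 if the student's earliest paper(s) include the advisor
    """
    by_advisor = [p for p in papers if advisor in p.authors]
    by_student = [p for p in papers if student in p.authors]
    if not by_advisor or not by_student:
        return (0, 0, 0, 0)

    joint = [p for p in by_student if advisor in p.authors]
    student_first_year = min(p.year for p in by_student)

    f1 = len(joint)
    f2 = int(len(by_advisor) > len(by_student))
    f3 = int(min(p.year for p in by_advisor) < student_first_year)
    f4 = int(any(p.year == student_first_year for p in joint))
    return (f1, f2, f3, f4)


# Made-up example data, for illustration only.
papers = [
    Paper(2000, frozenset({"alice"})),
    Paper(2003, frozenset({"alice"})),
    Paper(2005, frozenset({"alice", "bob"})),
    Paper(2006, frozenset({"alice", "bob"})),
    Paper(2007, frozenset({"bob"})),
]
print(pair_factors(papers, "alice", "bob"))  # (2, 1, 1, 1)
print(pair_factors(papers, "bob", "alice"))  # (2, 0, 0, 0)
```

Swapping the two roles leaves f1 unchanged but sets the other three factors to 0, which is what lets the factors tell advisor from student.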
35. Empirical Study
• Goal: analyze data from mailing lists and versioning systems
• Purpose: investigate which factors can be used to identify mentors
• Quality focus: recommending mentors in software projects
• Context: mailing lists and versioning systems of five software projects: Apache, FreeBSD, PostgreSQL, Python, and Samba
36. Context
Training and Test sets for evaluating Yoda.
Project      Period            Period            # of Mentors     # of Newcomers   # of Newcomers
             (Training set)    (Test set)        (Training set)   (Training set)   (Test set)
Apache       08/2001-03/2002   04/2002-12/2008   19               13               13
FreeBSD      11/1998-02/2000   03/2000-10/2008   65               33               33
PostgreSQL   10/1998-05/2001   06/2001-03/2008   10               8                7
Python       05/2000-05/2001   06/2001-12/2008   28               32               31
Samba        04/1998-09/2000   10/2000-12/2008   17               33               33
38. RQ1: How can we identify mentors from the past
history of a software project?
[Figure: table of COUPLES ranked by SCORE (5, 2.5, 1.5, 1.5, 1.5, 1.5, 1.5, ...)]
SCORE = $\sum_{i=1}^{n} w_i f_i$
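The ranking on this slide scores each couple as a weighted sum of its factor values. A minimal sketch of that step follows. The function names, the sample couples, and their factor values are hypothetical. The weights 0.5, 0.25, 0.25 are borrowed from one of the configurations shown later in the deck, with f4 switched off.

```python
def score(factors, weights):
    """Weighted sum of factor values: sum of w_i * f_i over all factors."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_couples(couple_factors, weights):
    """Return (couple, score) tuples sorted by decreasing score."""
    scored = [(couple, score(f, weights)) for couple, f in couple_factors.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up (newcomer, mentor) couples with illustrative factor values (f1..f4).
couple_factors = {
    ("carl", "dave"): (2, 1, 0, 1),
    ("ann", "bob"): (4, 1, 1, 0),
    ("eve", "frank"): (1, 1, 1, 0),
}
weights = (0.5, 0.25, 0.25, 0.0)  # 0.5*f1 + 0.25*f2 + 0.25*f3, f4 ignored

for couple, s in rank_couples(couple_factors, weights):
    print(couple, s)
# ('ann', 'bob') 2.5
# ('carl', 'dave') 1.25
# ('eve', 'frank') 1.0
```

Setting a weight to zero is one way to express the different configurations (f1 alone, f1 + f2 + f3, and so on) with a single scoring function.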
39. RQ1: How can we identify mentors from the past
history of a software project?
[Figure: table of COUPLES ranked by SCORE (5, 2.5, 1.5, 1.5, 1.5, 1.5, 1.5, ...), with SCORE = $\sum_{i=1}^{n} w_i f_i$, followed by Manual Validation of the ranked couples]
40. RQ1: How can we identify mentors from the past
history of a software project?
[Plot: Precision (0%-100%) vs. number of newcomer-mentor pairs (18-24). Possible configuration: f1]
41. RQ1: How can we identify mentors from the past
history of a software project?
[Plot: Precision (0%-100%) vs. number of newcomer-mentor pairs (18-24). Possible configuration: f1 + f2 + f3]
42. RQ1: How can we identify mentors from the past
history of a software project?
[Plot: Precision (0%-100%) vs. number of newcomer-mentor pairs (18-24). Possible configuration: f1 + f2 + f4]
43. RQ1: How can we identify mentors from the past
history of a software project?
[Plot: Precision (0%-100%) vs. number of newcomer-mentor pairs (18-24). Possible configuration: f5 (Baseline)]
44. RQ1: How can we identify mentors from the past
history of a software project?
[Plots: Precision (0%-100%) vs. number of newcomer-mentor pairs. Panels: Apache (18-24 pairs), PostgreSQL (12-22 pairs). Configurations: f1; f1 + f2 + f3; f1 + f2 + f4; f5 (Baseline)]
46. RQ1: How can we identify mentors from the past
history of a software project?
[Plots: Precision (0%-100%) vs. number of newcomer-mentor pairs. Panels: FreeBSD, Samba, Python]
47. RQ1: How can we identify mentors from the past
history of a software project?
USEFUL FACTORS FOR MENTOR IDENTIFICATION
[Plots: Precision (0%-100%) vs. number of newcomer-mentor pairs. Panels: FreeBSD, Samba, Python. Configurations: f1; 0.5*f1 + 0.25*f2 + 0.25*f3; 0.5*f1 + 0.25*f2 + 0.25*f4]
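The plots relate precision to the number of top-ranked newcomer-mentor pairs considered. The sketch below shows one way such a curve could be computed, assuming precision means the fraction of the top-k ranked pairs that were confirmed as real mentoring pairs by manual validation. The slides do not spell out the exact definition, and the function names and sample pairs are hypothetical.

```python
def precision_at_k(ranked_pairs, validated_pairs, k):
    """Fraction of the top-k ranked pairs that appear in the validated set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_pairs[:k]
    if not top:
        return 0.0
    hits = sum(1 for pair in top if pair in validated_pairs)
    return hits / len(top)


def precision_curve(ranked_pairs, validated_pairs, ks):
    """Precision for each cut-off k, i.e. the points of one plotted line."""
    return {k: precision_at_k(ranked_pairs, validated_pairs, k) for k in ks}


# Made-up ranking (best score first) and manually validated pairs.
ranked = [("n1", "m1"), ("n2", "m2"), ("n3", "m3"), ("n4", "m4")]
validated = {("n1", "m1"), ("n2", "m2"), ("n4", "m4")}

curve = precision_curve(ranked, validated, ks=[2, 3, 4])
print(curve[2], curve[4])  # 1.0 0.75
```

Running the same function over the rankings produced by each configuration would give one curve per configuration, which is how the plotted lines can be compared.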
48. RQ2: To what extent would it be possible to
recommend mentors to newcomers joining a
software project?
50. RQ2: To what extent would it be possible to
recommend mentors to newcomers joining a
software project?
YODA makes it possible to recommend mentors.
58. Done/received mentoring?
Had a mentor? 58% / 42%
"Yes, I received mentoring. My mentor was…"
Did mentoring? 92% / 8%
"Yes, I did mentoring…"
Legend: YES / NO
59. Perceived importance of mentoring
[Bar chart, two series: Effect of mentor, Effect on newcomer]
Useless at all: 0%, 0%
Not important: 0%, 0%
Neutral, Important, Very important: bar labels 11%, 45%, 56%, 36%, 33%, 18%
62. Perceived importance of mentoring
[Bar chart, two series: Effect of mentor, Effect on newcomer]
Useless at all: 0%, 0%
Not important: 0%, 0%
Neutral, Important, Very important: bar labels 11%, 45%, 56%, 36%, 33%, 18%
"Is very important that mentor share knowledge with a mentee…"
63. What makes a good Mentor
Others: 0%
Project knowledge: 38%
Communication skills: 42%
Experience: 19%
65. What makes a good Mentor
Others: 0%
Project knowledge: 38%
Communication skills: 42%
Experience: 19%
"My first Mentor had a very strong and technical background"