SlideShare a Scribd company logo
1 of 19
Evaluating Recommended
      Applications

        Mark Grechanik
  Accenture Technology Labs
         David Coppit
      Denys Poshyvanyk
The College of William and Mary
The College of William and Mary
      Denys Poshyvanyk
The College of William and Mary
How Many Open Source Applications Are There?

• Sourceforge.net reports that they host
  180,000 projects as of August 1, 2008.

• There are dozens of other open source
  repositories containing tens of
  thousands of different applications.

• Companies have internal source
  control management systems
  containing hundreds of thousands of
  applications.


                                               2
Problem

• Finding/checking existing software
  matching high-level user requirements
  • Would reduce the cost of many software
    projects
  • Would provide users with examples of
    different implementations

• Challenges:
  • Finding relevant applications is difficult
  • Evaluating recommended applications is very
    difficult
                                                  3
What Search Engines Do




        Matching
        keywords


         descriptions
           of apps



                         4
What Search Engines Do




        Matching
        keywords


         descriptions
           of apps



                         5
Fundamental Problems

• Vocabulary problem
  • Mismatch between the high-level intent reflected in
    the descriptions of applications and their low-level
    implementation details
• Concept assignment problem
                          Code snippet implementing
  High level concept
                                 “Send data”
     “Send data”          s = socket.socket(proto, socket.SOCK_DGRAM)
                          s.sendto(teststring, addr)
                          buf = data = receive(s, 100)
                          while data and 'n' not in buf:
                          data = receive(s, 100) buf += data

• Many application repositories are polluted with
  poorly functioning projects
                                                                        6
Working without a Recommender

• Find relevant application (s)
• Download application
• Locate and examine fragments of the code that
  implement the desired features
• Observe the runtime behavior of this application
  to ensure that this behavior matches
  requirements
• This process is manual since programmers:
   • study the source code of the retrieved applications
   • locate various API calls
   • read information about these calls in help documents
• Still, it is difficult for programmers to link high-
  level concepts from requirements to their
  implementations in source code
                                                            7
Our Goal




           search




                    8
Our Goal




           9
Key observations

• While studying retrieved apps developers :
  • locate various API calls
  • read information about these calls in help documents
• Help docs are supplied by the same vendors
  whose packages/APIs are used in software
• Programmers read and rely on these API docs
• Help docs are written and reviewed by many
  developers
• Help documents are usually more verbose and
  accurate than project descriptions

                                                           10
How K9 Recommender System Works


                                          API call1


                        descriptions      API call2
                match
                        of API calls


                                          API call3


 Automatically matching words in user queries against API
help docs instead of:
    searching in project descriptions;
    searching in source code.
 K9 uses help documents to produce a list of relevant API calls

                                                               11
The Architecture behind K9


 API calls     API call                        Integrity
                              API calls
Dictionary     lookup                           Engine




                                             Applications
Help page                    Recommended
                 User                         metadata
processor                     applications




Help pages    Source code     Retrieved
                                                Static
             search engine   applications
                                               Analyzer




                                                            12
API Call Lookup (ACL)

• After the user enters keywords, the ACL
  searches through the documentation text to find
  these keywords
• Use Information Retrieval to rank the
  documents to the user query
• The ACL returns the absolute path to the
  programming entity for the text that contains
  entered keywords




                                                    13
The Architecture behind K9


 API calls     API call                        Integrity
                              API calls
Dictionary     lookup                           Engine




                                             Applications
Help page                    Recommended
                 User                         metadata
processor                     applications




Help pages    Source code     Retrieved
                                                Static
             search engine   applications
                                               Analyzer




                                                            14
Integrity Engine

• We assume that keywords in queries represent
  logically connected concepts
  • Consider query “compress encrypt XML”
  • Queries are often formed by constructing a question
    first, and then extracting keywords from this
    sentence
• If a relation is present between concepts in the
  query, then there should be a relation between
  API calls that implement these concepts in the
  program code
• This heuristics is based on the observation that
  relations between concepts are often preserved
  in their implementations in the program code
                                                          15
Computing the Ranking

• Total number of API calls is n
• Total number of connections between these API calls is
  less or equal to n(n-1)
• Each connection i is assign some weight wi
   • Weak connections have weights 0.5 and strong connections
     have weights 1.0
• l is the connectivity value
   • 1 : loop link, 0.5 :single link, 0 : no link
• Rank is computed using the following formula

                          wi li
                 rank 
                        n  n  1
                                                                16
Current Status


• Restricting the scope to Java projects
• Challenges:
   • How to automatically locate and download the latest
     version of the software (e.g., from sourceforge)?
   • How to automatically locate the correct entry point
     (i.e., main) for static analysis?
   • How to reduce the time for the static analysis?
   • How and when to update API call dictionary?
   • Testing other ranking heuristics
• Evaluation is pending

                                                           17
Related Work

•   CodeFinder/Helgon
•   ExpertFinder
•   CodeBroker
•   Hipikat
•   Automated Method Completion (AMC)
•   Strathcona
•   Prospector
•   XSnippet
•   Google code search, Krugle,…


                                        18
Conclusions & Future Work

• K9 recommends/checks relevant applications
  based on:
  • the analysis of relevant API help documents;
  • the connectivity analysis of retrieved API calls.

• Indexing available open-source projects and
  pre-computing data and control flow among
  API calls

• Analyzing multiple releases of the same project


                                                        19

More Related Content

What's hot

API-first development
API-first developmentAPI-first development
API-first developmentVasco Veloso
 
Fundamental essentials for api design
Fundamental essentials for api designFundamental essentials for api design
Fundamental essentials for api designMichael James Cyrus
 
API Description Languages
API Description LanguagesAPI Description Languages
API Description LanguagesAkana
 
API Athens Meetup - API standards 25-6-2014
API Athens Meetup - API standards   25-6-2014API Athens Meetup - API standards   25-6-2014
API Athens Meetup - API standards 25-6-2014Michael Petychakis
 
SwiftRiver 2011 Overview
SwiftRiver 2011 OverviewSwiftRiver 2011 Overview
SwiftRiver 2011 OverviewUshahidi
 
Himansu-Java&BigdataDeveloper
Himansu-Java&BigdataDeveloperHimansu-Java&BigdataDeveloper
Himansu-Java&BigdataDeveloperHimansu Behera
 
API Description Languages: Which Is The Right One For Me?
 API Description Languages: Which Is The Right One For Me?  API Description Languages: Which Is The Right One For Me?
API Description Languages: Which Is The Right One For Me? ProgrammableWeb
 
A Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification LinksA Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification LinksTony Tam
 
API Docs Made Right / RAML - Swagger rant
API Docs Made Right / RAML - Swagger rantAPI Docs Made Right / RAML - Swagger rant
API Docs Made Right / RAML - Swagger rantVladimir Shulyak
 
Making Sense of Hypermedia APIs – Hype or Reality?
Making Sense of Hypermedia APIs – Hype or Reality?Making Sense of Hypermedia APIs – Hype or Reality?
Making Sense of Hypermedia APIs – Hype or Reality?Akana
 
History and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of DistributionHistory and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of DistributionDaniel Jacobson
 
AppliFire Blue Print Design Guidelines
AppliFire Blue Print Design GuidelinesAppliFire Blue Print Design Guidelines
AppliFire Blue Print Design GuidelinesAppliFire Platform
 
Scaling Indexing and Replication in Jira Data Center Apps
Scaling Indexing and Replication in Jira Data Center AppsScaling Indexing and Replication in Jira Data Center Apps
Scaling Indexing and Replication in Jira Data Center AppsAtlassian
 
SiLCC Overview
SiLCC OverviewSiLCC Overview
SiLCC OverviewUshahidi
 

What's hot (18)

API-first development
API-first developmentAPI-first development
API-first development
 
Fundamental essentials for api design
Fundamental essentials for api designFundamental essentials for api design
Fundamental essentials for api design
 
Netflix API
Netflix APINetflix API
Netflix API
 
API Description Languages
API Description LanguagesAPI Description Languages
API Description Languages
 
API Athens Meetup - API standards 25-6-2014
API Athens Meetup - API standards   25-6-2014API Athens Meetup - API standards   25-6-2014
API Athens Meetup - API standards 25-6-2014
 
Api pattern
Api patternApi pattern
Api pattern
 
SwiftRiver 2011 Overview
SwiftRiver 2011 OverviewSwiftRiver 2011 Overview
SwiftRiver 2011 Overview
 
Himansu-Java&BigdataDeveloper
Himansu-Java&BigdataDeveloperHimansu-Java&BigdataDeveloper
Himansu-Java&BigdataDeveloper
 
API Description Languages: Which Is The Right One For Me?
 API Description Languages: Which Is The Right One For Me?  API Description Languages: Which Is The Right One For Me?
API Description Languages: Which Is The Right One For Me?
 
REST APIs
REST APIsREST APIs
REST APIs
 
A Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification LinksA Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification Links
 
API Docs Made Right / RAML - Swagger rant
API Docs Made Right / RAML - Swagger rantAPI Docs Made Right / RAML - Swagger rant
API Docs Made Right / RAML - Swagger rant
 
Making Sense of Hypermedia APIs – Hype or Reality?
Making Sense of Hypermedia APIs – Hype or Reality?Making Sense of Hypermedia APIs – Hype or Reality?
Making Sense of Hypermedia APIs – Hype or Reality?
 
Huge: Running an API at Scale
Huge: Running an API at ScaleHuge: Running an API at Scale
Huge: Running an API at Scale
 
History and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of DistributionHistory and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of Distribution
 
AppliFire Blue Print Design Guidelines
AppliFire Blue Print Design GuidelinesAppliFire Blue Print Design Guidelines
AppliFire Blue Print Design Guidelines
 
Scaling Indexing and Replication in Jira Data Center Apps
Scaling Indexing and Replication in Jira Data Center AppsScaling Indexing and Replication in Jira Data Center Apps
Scaling Indexing and Replication in Jira Data Center Apps
 
SiLCC Overview
SiLCC OverviewSiLCC Overview
SiLCC Overview
 

Similar to Evaluating Recommended Applications

Pain Points In API Development? They’re Everywhere
Pain Points In API Development? They’re EverywherePain Points In API Development? They’re Everywhere
Pain Points In API Development? They’re EverywhereNordic APIs
 
Refining Your API Design - Architecture and Modeling Learning Event
Refining Your API Design - Architecture and Modeling Learning EventRefining Your API Design - Architecture and Modeling Learning Event
Refining Your API Design - Architecture and Modeling Learning EventLaunchAny
 
API Documentation.pptx
API Documentation.pptxAPI Documentation.pptx
API Documentation.pptxRahulCR31
 
API Documentation.pptx
API Documentation.pptxAPI Documentation.pptx
API Documentation.pptxRahulCR31
 
apidays LIVE Australia - Evaluating the usability of security APIs by Dr Nali...
apidays LIVE Australia - Evaluating the usability of security APIs by Dr Nali...apidays LIVE Australia - Evaluating the usability of security APIs by Dr Nali...
apidays LIVE Australia - Evaluating the usability of security APIs by Dr Nali...apidays
 
Stack overflow code_laundering
Stack overflow code_launderingStack overflow code_laundering
Stack overflow code_launderingFoutse Khomh
 
Streamlined Geek Talk
Streamlined Geek TalkStreamlined Geek Talk
Streamlined Geek TalkSarah Allen
 
Reaching 1 Million APIs and what to do when we get there
Reaching 1 Million APIs and what to do when we get thereReaching 1 Million APIs and what to do when we get there
Reaching 1 Million APIs and what to do when we get there3scale
 
Metrics-Driven Devops: Delivering High Quality Software Faster!
Metrics-Driven Devops: Delivering High Quality Software Faster! Metrics-Driven Devops: Delivering High Quality Software Faster!
Metrics-Driven Devops: Delivering High Quality Software Faster! Dynatrace
 
Is your SAP system vulnerable to cyber attacks?
Is your SAP system vulnerable to cyber attacks?Is your SAP system vulnerable to cyber attacks?
Is your SAP system vulnerable to cyber attacks?Virtual Forge
 
Whitebox Testing for Blackbox Testers: Simplifying API Testing
Whitebox Testing for Blackbox Testers: Simplifying API TestingWhitebox Testing for Blackbox Testers: Simplifying API Testing
Whitebox Testing for Blackbox Testers: Simplifying API TestingQASymphony
 
Practices and tools for building better APIs
Practices and tools for building better APIsPractices and tools for building better APIs
Practices and tools for building better APIsNLJUG
 
Practices and tools for building better API (JFall 2013)
Practices and tools for building better API (JFall 2013)Practices and tools for building better API (JFall 2013)
Practices and tools for building better API (JFall 2013)Peter Hendriks
 
RACK-Tool-ICSE2017
RACK-Tool-ICSE2017RACK-Tool-ICSE2017
RACK-Tool-ICSE2017Masud Rahman
 
Software Analytics: Data Analytics for Software Engineering and Security
Software Analytics: Data Analytics for Software Engineering and SecuritySoftware Analytics: Data Analytics for Software Engineering and Security
Software Analytics: Data Analytics for Software Engineering and SecurityTao Xie
 
Content Strategy and Developer Engagement for DevPortals
Content Strategy and Developer Engagement for DevPortalsContent Strategy and Developer Engagement for DevPortals
Content Strategy and Developer Engagement for DevPortalsAxway
 
Outpost24 webinar - Api security
Outpost24 webinar - Api securityOutpost24 webinar - Api security
Outpost24 webinar - Api securityOutpost24
 

Similar to Evaluating Recommended Applications (20)

Pain Points In API Development? They’re Everywhere
Pain Points In API Development? They’re EverywherePain Points In API Development? They’re Everywhere
Pain Points In API Development? They’re Everywhere
 
Refining Your API Design - Architecture and Modeling Learning Event
Refining Your API Design - Architecture and Modeling Learning EventRefining Your API Design - Architecture and Modeling Learning Event
Refining Your API Design - Architecture and Modeling Learning Event
 
In app search 1
In app search 1In app search 1
In app search 1
 
API Documentation.pptx
API Documentation.pptxAPI Documentation.pptx
API Documentation.pptx
 
API Documentation.pptx
API Documentation.pptxAPI Documentation.pptx
API Documentation.pptx
 
apidays LIVE Australia - Evaluating the usability of security APIs by Dr Nali...
apidays LIVE Australia - Evaluating the usability of security APIs by Dr Nali...apidays LIVE Australia - Evaluating the usability of security APIs by Dr Nali...
apidays LIVE Australia - Evaluating the usability of security APIs by Dr Nali...
 
Stack overflow code_laundering
Stack overflow code_launderingStack overflow code_laundering
Stack overflow code_laundering
 
Streamlined Geek Talk
Streamlined Geek TalkStreamlined Geek Talk
Streamlined Geek Talk
 
Reaching 1 Million APIs and what to do when we get there
Reaching 1 Million APIs and what to do when we get thereReaching 1 Million APIs and what to do when we get there
Reaching 1 Million APIs and what to do when we get there
 
Metrics-Driven Devops: Delivering High Quality Software Faster!
Metrics-Driven Devops: Delivering High Quality Software Faster! Metrics-Driven Devops: Delivering High Quality Software Faster!
Metrics-Driven Devops: Delivering High Quality Software Faster!
 
Is your SAP system vulnerable to cyber attacks?
Is your SAP system vulnerable to cyber attacks?Is your SAP system vulnerable to cyber attacks?
Is your SAP system vulnerable to cyber attacks?
 
Whitebox Testing for Blackbox Testers: Simplifying API Testing
Whitebox Testing for Blackbox Testers: Simplifying API TestingWhitebox Testing for Blackbox Testers: Simplifying API Testing
Whitebox Testing for Blackbox Testers: Simplifying API Testing
 
Practices and tools for building better APIs
Practices and tools for building better APIsPractices and tools for building better APIs
Practices and tools for building better APIs
 
Practices and tools for building better API (JFall 2013)
Practices and tools for building better API (JFall 2013)Practices and tools for building better API (JFall 2013)
Practices and tools for building better API (JFall 2013)
 
RACK-Tool-ICSE2017
RACK-Tool-ICSE2017RACK-Tool-ICSE2017
RACK-Tool-ICSE2017
 
Presentation
PresentationPresentation
Presentation
 
Apsec18.ppt
Apsec18.pptApsec18.ppt
Apsec18.ppt
 
Software Analytics: Data Analytics for Software Engineering and Security
Software Analytics: Data Analytics for Software Engineering and SecuritySoftware Analytics: Data Analytics for Software Engineering and Security
Software Analytics: Data Analytics for Software Engineering and Security
 
Content Strategy and Developer Engagement for DevPortals
Content Strategy and Developer Engagement for DevPortalsContent Strategy and Developer Engagement for DevPortals
Content Strategy and Developer Engagement for DevPortals
 
Outpost24 webinar - Api security
Outpost24 webinar - Api securityOutpost24 webinar - Api security
Outpost24 webinar - Api security
 

Recently uploaded

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 

Recently uploaded (20)

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 

Evaluating Recommended Applications

  • 1. Evaluating Recommended Applications Mark Grechanik Accenture Technology Labs David Coppit Denys Poshyvanyk The College of William and Mary The College of William and Mary Denys Poshyvanyk The College of William and Mary
  • 2. How Many Open Source Applications Are There? • Sourceforge.net reports that they host 180,000 projects as of August 1, 2008. • There are dozens of other open source repositories containing tens of thousands of different applications. • Companies have internal source control management systems containing hundreds of thousands of applications. 2
  • 3. Problem • Finding/checking existing software matching high-level user requirements • Would reduce the cost of many software projects • Would provide users with examples of different implementations • Challenges: • Finding relevant applications is difficult • Evaluating recommended applications is very difficult 3
  • 4. What Search Engines Do Matching keywords descriptions of apps 4
  • 5. What Search Engines Do Matching keywords descriptions of apps 5
  • 6. Fundamental Problems • Vocabulary problem • Mismatch between the high-level intent reflected in the descriptions of applications and their low-level implementation details • Concept assignment problem Code snippet implementing High level concept “Send data” “Send data” s = socket.socket(proto, socket.SOCK_DGRAM) s.sendto(teststring, addr) buf = data = receive(s, 100) while data and 'n' not in buf: data = receive(s, 100) buf += data • Many application repositories are polluted with poorly functioning projects 6
  • 7. Working without a Recommender • Find relevant application (s) • Download application • Locate and examine fragments of the code that implement the desired features • Observe the runtime behavior of this application to ensure that this behavior matches requirements • This process is manual since programmers: • study the source code of the retrieved applications • locate various API calls • read information about these calls in help documents • Still, it is difficult for programmers to link high- level concepts from requirements to their implementations in source code 7
  • 8. Our Goal search 8
  • 10. Key observations • While studying retrieved apps developers : • locate various API calls • read information about these calls in help documents • Help docs are supplied by the same vendors whose packages/APIs are used in software • Programmers read and rely on these API docs • Help docs are written and reviewed by many developers • Help documents are usually more verbose and accurate than project descriptions 10
  • 11. How K9 Recommender System Works API call1 descriptions API call2 match of API calls API call3  Automatically matching words in user queries against API help docs instead of:  searching in project descriptions;  searching in source code.  K9 uses help documents to produce a list of relevant API calls 11
  • 12. The Architecture behind K9 API calls API call Integrity API calls Dictionary lookup Engine Applications Help page Recommended User metadata processor applications Help pages Source code Retrieved Static search engine applications Analyzer 12
  • 13. API Call Lookup (ACL) • After the user enters keywords, the ACL searches through the documentation text to find these keywords • Use Information Retrieval to rank the documents to the user query • The ACL returns the absolute path to the programming entity for the text that contains entered keywords 13
  • 14. The Architecture behind K9 API calls API call Integrity API calls Dictionary lookup Engine Applications Help page Recommended User metadata processor applications Help pages Source code Retrieved Static search engine applications Analyzer 14
  • 15. Integrity Engine • We assume that keywords in queries represent logically connected concepts • Consider query “compress encrypt XML” • Queries are often formed by constructing a question first, and then extracting keywords from this sentence • If a relation is present between concepts in the query, then there should be a relation between API calls that implement these concepts in the program code • This heuristics is based on the observation that relations between concepts are often preserved in their implementations in the program code 15
  • 16. Computing the Ranking • Total number of API calls is n • Total number of connections between these API calls is less or equal to n(n-1) • Each connection i is assign some weight wi • Weak connections have weights 0.5 and strong connections have weights 1.0 • l is the connectivity value • 1 : loop link, 0.5 :single link, 0 : no link • Rank is computed using the following formula wi li rank  n  n  1 16
  • 17. Current Status • Restricting the scope to Java projects • Challenges: • How to automatically locate and download the latest version of the software (e.g., from sourceforge)? • How to automatically locate the correct entry point (i.e., main) for static analysis? • How to reduce the time for the static analysis? • How and when to update API call dictionary? • Testing other ranking heuristics • Evaluation is pending 17
  • 18. Related Work • CodeFinder/Helgon • ExpertFinder • CodeBroker • Hipikat • Automated Method Completion (AMC) • Strathcona • Prospector • XSnippet • Google code search, Krugle,… 18
  • 19. Conclusions & Future Work • K9 recommends/checks relevant applications based on: • the analysis of relevant API help documents; • the connectivity analysis of retrieved API calls. • Indexing available open-source projects and pre-computing data and control flow among API calls • Analyzing multiple releases of the same project 19