Selenium & Scrapy
Web UI testing and Web Scraping
About me
Arcangelo Saracino
IT student at Bari University
2016-2018 Web developer at Aryma
2018- Feb2019 Web developer at Enterprise Digital Solution
saracinoarcangelo@gmail.com github.com/Arkango
Selenium
Selenium is a portable framework for testing web applications.
Selenium provides a playback (formerly also recording) tool for authoring
functional tests without the need to learn a test scripting language (Selenium IDE).
It also provides a test domain-specific language (Selenese) to write tests in a
number of popular programming languages, including C#, Groovy, Java, Perl,
PHP, Python, Ruby and Scala.
The tests can then run against most modern web browsers.
Selenium deploys on Windows, Linux, and macOS platforms.
It is open-source software, released under the Apache 2.0 license: web
developers can download and use it without charge.
Source: Wikipedia
Selenium Components
● Selenium IDE
● Selenium Client API
● Selenium WebDriver
● Selenium Remote Control
● Selenium Grid
Selenium IDE
Selenium IDE is a complete integrated development environment (IDE) for Selenium tests.
It is implemented as a Firefox Add-On and as a Chrome Extension.
It allows for recording, editing, and debugging of functional tests. It was previously known
as Selenium Recorder.
Selenium-IDE was originally created by Shinya Kasatani and donated to the Selenium
project in 2006.
Selenium IDE received little maintenance for some time; active maintenance
resumed in 2018.
Scripts may be automatically recorded and edited manually providing autocompletion
support and the ability to move commands around quickly. Scripts are recorded in
Selenese, a special test scripting language for Selenium. Selenese provides commands
for performing actions in a browser (click a link, select an option), and for retrieving data
from the resulting pages.
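To make the shape of a Selenese script more concrete, here is a minimal Python sketch that models a recorded script as rows of command, target, and value, which is how Selenium IDE lays out steps in its editor. The `Step` class, the `render` helper, the locators, and the `/login` path are all hypothetical; the command names follow classic Selenese, and this is not the file format the IDE actually saves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One row of a recorded script: what to do, where, and with what value."""
    command: str
    target: str = ""
    value: str = ""


# Hypothetical recording of a login flow; locators and path are made up.
script = [
    Step("open", "/login"),
    Step("type", "id=username", "alice"),
    Step("click", "css=button[type=submit]"),
    Step("assertText", "css=h1", "Welcome"),
]


def render(steps):
    """Format steps as pipe-separated rows, one per line."""
    return "\n".join(f"{s.command} | {s.target} | {s.value}" for s in steps)


print(render(script))
```

Each row either performs an action in the browser (`open`, `type`, `click`) or reads data back from the page (`assertText`), matching the two kinds of commands described above.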
Selenium Client API
As an alternative to writing tests in Selenese, tests can
also be written in various programming languages. These
tests then communicate with Selenium by calling methods
in the Selenium Client API. Selenium currently provides
client APIs for Java, C#, Ruby, JavaScript, R and Python.
With Selenium 2, a new Client API was introduced (with
WebDriver as its central component). However, the old API
(using class Selenium) is still supported.
Selenium WebDriver
Selenium WebDriver is the successor to Selenium RC.
Selenium WebDriver accepts commands (sent in Selenese, or
via a Client API) and sends them to a browser.
This is implemented through a browser-specific browser driver,
which sends commands to a browser and retrieves results.
Most browser drivers actually launch and access a browser
application (such as Firefox, Chrome, Internet Explorer, Safari,
or Microsoft Edge); there is also an HtmlUnit browser driver,
which simulates a browser using the headless browser
HtmlUnit.
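The flow described above, where a test calls the client API and the browser driver forwards each command to the browser, might look like this with the Python bindings. This is a sketch and not the example shown in the talk: `page_heading` and the URL are hypothetical, and the function relies only on the `get` and `find_element` methods that Selenium's WebDriver objects provide.

```python
def page_heading(driver, url):
    """Load `url` in the browser behind `driver` and return the first <h1> text."""
    driver.get(url)  # the driver tells the browser to navigate
    # "css selector" is the locator strategy string behind By.CSS_SELECTOR.
    heading = driver.find_element("css selector", "h1")
    return heading.text


# Usage with a real browser (assumes `pip install selenium` and a local Chrome):
#
#     from selenium import webdriver
#
#     driver = webdriver.Chrome()
#     try:
#         print(page_heading(driver, "https://example.org/"))
#     finally:
#         driver.quit()  # always close the browser session
```

Passing the driver in as a parameter keeps the test logic independent of the browser, so the same function could be run against Firefox or a remote driver by constructing a different driver object.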
Hands on code
● An example …..
Scrapy
Scrapy (/ˈskreɪpi/ SKRAY-pee) is a free and open-source web-crawling
framework written in Python. Originally designed for web scraping, it
can also be used to extract data using APIs or as a general-purpose
web crawler. It is currently maintained by Scrapinghub Ltd., a web-scraping
development and services company.
Scrapy project architecture is built around "spiders", which are self-contained
crawlers that are given a set of instructions. Following the
spirit of other "don't repeat yourself" frameworks, such as Django, it
makes it easier to build and scale large crawling projects by allowing
developers to reuse their code. Scrapy also provides a web-crawling
shell, which can be used by developers to test their assumptions on a
site’s behavior.
Scrapy: Basic Concept
● Command line tools
Scrapy is controlled through the scrapy command-line tool, referred to here as the “Scrapy tool” to
differentiate it from its sub-commands, which are simply called “commands” or “Scrapy commands”.
● Spiders
Spiders are classes which define how a certain site (or a group of sites) will be scraped,
including how to perform the crawl (i.e. follow links) and how to extract structured data from
their pages (i.e. scraping items). In other words, Spiders are the place where you define the
custom behaviour for crawling and parsing pages for a particular site (or, in some cases, a
group of sites).
● Selectors
Extract the data from web pages using XPath or CSS expressions.
● Scrapy Shell
Test your extraction code in an interactive environment.
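As a rough illustration of XPath-based extraction, the sketch below pulls quote text and authors out of a small document. It deliberately uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath and needs well-formed markup, rather than Scrapy's own selectors. The markup and the `extract_quotes` name are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment.
PAGE = """
<html><body>
  <div class="quote"><span class="text">Simple is better.</span><small class="author">Tim</small></div>
  <div class="quote"><span class="text">Readability counts.</span><small class="author">Tim</small></div>
  <div class="ad">Buy now</div>
</body></html>
"""


def extract_quotes(markup):
    """Return one dict per <div class="quote"> found in `markup`."""
    root = ET.fromstring(markup)
    items = []
    # ".//div[@class='quote']" selects every matching div below the root.
    for div in root.findall(".//div[@class='quote']"):
        items.append({
            "text": div.findtext("span[@class='text']"),
            "author": div.findtext("small[@class='author']"),
        })
    return items


for item in extract_quotes(PAGE):
    print(item)
```

Inside a Scrapy spider or the Scrapy shell, a comparable query would typically be run against the response object, for example `response.xpath("//div[@class='quote']")`.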
Scrapy: Basic Concept 2
● Items
Define the data you want to scrape.
● Item Loaders
Populate your items with the extracted data.
● Item Pipeline
Post-process and store your scraped data.
● Feed Exports
Output your scraped data using different formats and storages.
● Requests and Responses
Scrapy uses Request and Response objects for crawling web sites.
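A minimal sketch of how an item and an item pipeline fit together, assuming a Scrapy version that accepts dataclass objects as items. `QuoteItem` and `CleanTextPipeline` are hypothetical names; the only Scrapy convention relied on is that a pipeline is a plain class exposing a `process_item(item, spider)` method, so check the signature against the version you use.

```python
from dataclasses import dataclass


@dataclass
class QuoteItem:
    """The data we want to scrape for each quote."""
    text: str
    author: str


class CleanTextPipeline:
    """Post-process each scraped item before it is stored or exported."""

    def process_item(self, item, spider):
        # Collapse runs of whitespace and trim both fields.
        item.text = " ".join(item.text.split())
        item.author = " ".join(item.author.split())
        return item  # hand the item on to the next pipeline stage


# Stand-alone check; inside a crawl Scrapy calls process_item itself.
raw = QuoteItem(text="  Readability \n counts. ", author=" Tim  Peters ")
print(CleanTextPipeline().process_item(raw, spider=None))
```

To activate such a pipeline in a project, it would be listed in the `ITEM_PIPELINES` setting with an order number, for example `{"myproject.pipelines.CleanTextPipeline": 300}` (the module path is hypothetical).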
Scrapy: Basic Concept 3
● Link Extractors
Convenient classes to extract links to follow from pages.
● Settings
Learn how to configure Scrapy and see all available settings.
● Exceptions
See all available exceptions and their meaning.
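For illustration, a project's `settings.py` might contain entries like the following. The setting names are standard Scrapy settings; the values, the bot name, the contact URL, and the output file name are assumptions chosen for a small, polite crawl.

```python
# settings.py (sketch; values are illustrative, not recommendations)

BOT_NAME = "demo_bot"  # hypothetical project name

# Identify the crawler to site owners; the URL is a placeholder.
USER_AGENT = "demo_bot (+https://example.org/contact)"

ROBOTSTXT_OBEY = True      # respect robots.txt rules
DOWNLOAD_DELAY = 1.0       # seconds to wait between requests to the same site
CONCURRENT_REQUESTS = 8    # upper bound on requests in flight

# Export scraped items to a JSON Lines file via feed exports.
FEEDS = {
    "items.jsonl": {"format": "jsonlines"},
}
```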
Let’s code
● An example …..
Usages
● UI testing
● Web crawling
● Hacking
Sources
● Wikipedia.org
● https://www.seleniumhq.org/
● https://scrapy.org/
● Tutorials: https://selenium-python.readthedocs.io/, https://www.youtube.com/watch?v=XDn60jw68tM,
https://docs.scrapy.org/en/latest/intro/tutorial.html
Questions & Answers
Thank you
