This document discusses automating graphical user interfaces visually, by searching screenshots. It presents template matching for finding small patterns on screen, together with a learned feature model that tolerates resized versions, rotation, and texturally similar patterns with different color palettes. It also introduces functions such as find(), click(), type(), and wait() that act on the user interface based on visual matches against screenshots. Finally, it encourages readers to experiment with building their own automation tools using these approaches.
3. Visual approach to search and automation of graphical user interfaces using screenshots.
4. Just to see if it can be done
Usually needed to automate GUIs:
support from developers
API access
language/OS dependency
position/naming dependencies
7. Template Matching for small patterns
Invariance
Resized versions
Texturally similar, different color palettes
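To make the idea concrete, here is a minimal sketch of plain template matching on a tiny grayscale grid. It slides the target over the image and keeps the position with the lowest sum of squared differences. The function name `match_template` and the toy data are assumptions for illustration, not part of the tool described in these slides. Note that this naive version has none of the invariance listed above: a resized or recolored target would no longer score well.

```python
def match_template(image, template):
    """Return (row, col, ssd) of the best match by sum of squared differences."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    # Try every top-left position where the template fits entirely.
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            ssd = 0
            for dr in range(th):
                for dc in range(tw):
                    diff = image[r + dr][c + dc] - template[dr][dc]
                    ssd += diff * diff
            # Lower SSD means a closer match; 0 is a pixel-exact match.
            if best is None or ssd < best[2]:
                best = (r, c, ssd)
    return best


# Toy "screenshot" and "button" (hypothetical pixel intensities).
screen = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 7, 6, 0, 0],
    [0, 0, 0, 0, 0],
]
button = [[9, 8], [7, 6]]

print(match_template(screen, button))
```

Real tools typically use a normalized score rather than raw SSD, so that a uniform brightness change does not break the match.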
8. Learn
Extract model from training pattern
▪ Invariant to scale & rotation
Encode in the model the relative position of the center
Search
Extract invariant features
Infer possible model center position
Cluster consistent features
Validate supposition
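The learn and search steps above can be sketched as a simple voting scheme. This is a simplified illustration under strong assumptions: features are given as `(descriptor, x, y)` tuples with descriptors matched by exact equality, and only translation is handled (scale and rotation are ignored). The names `learn` and `search`, and the `cell` and `min_votes` parameters, are hypothetical.

```python
from collections import defaultdict


def learn(pattern_features, center):
    """Model: for each descriptor, the offset from the feature to the pattern center."""
    cx, cy = center
    return {desc: (cx - x, cy - y) for desc, x, y in pattern_features}


def search(model, scene_features, cell=4, min_votes=3):
    """Return (center_x, center_y, votes) for the best-supported center, or None."""
    clusters = defaultdict(list)
    for desc, x, y in scene_features:
        if desc in model:
            dx, dy = model[desc]
            center = (x + dx, y + dy)  # infer a possible model center
            # Cluster consistent votes by quantizing into coarse cells.
            clusters[(center[0] // cell, center[1] // cell)].append(center)
    if not clusters:
        return None
    votes = max(clusters.values(), key=len)
    # Validate the supposition: require enough agreeing features.
    if len(votes) < min_votes:
        return None
    mean_x = sum(v[0] for v in votes) / len(votes)
    mean_y = sum(v[1] for v in votes) / len(votes)
    return (mean_x, mean_y, len(votes))


# Training pattern: four corner features around center (5, 5).
pattern = [("a", 0, 0), ("b", 10, 0), ("c", 0, 10), ("d", 10, 10)]
model = learn(pattern, center=(5, 5))

# Scene: the same pattern shifted by (100, 50), plus two distractors.
scene = [("a", 100, 50), ("b", 110, 50), ("c", 100, 60), ("d", 110, 60),
         ("a", 300, 20), ("z", 5, 5)]

print(search(model, scene))
```

The stray "a" feature votes for a different center, but it stands alone and is outvoted by the four consistent features.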
12. Region(x, y, w, h)
Region(region)
Region(Rectangle)
Search in a given region
Observe a region in background for changes
Retrieve matches
Optimize the search by chaining regions
No need to be aware of the content
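As a rough illustration of searching within a region and chaining regions, here is a toy `Region` class. It is a hypothetical stand-in, not the actual API: the screen is a list of strings, matching is exact, and `find` and `right` are simplified methods written for this sketch with no clipping to the screen bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    x: int
    y: int
    w: int
    h: int

    def find(self, screen, target):
        """Return the first exact match of target inside this region, or None."""
        th, tw = len(target), len(target[0])
        for r in range(self.y, self.y + self.h - th + 1):
            for c in range(self.x, self.x + self.w - tw + 1):
                if all(screen[r + dr][c:c + tw] == target[dr] for dr in range(th)):
                    return Region(c, r, tw, th)  # a match is itself a region
        return None

    def right(self, width):
        """Region of the given width directly to the right, same height."""
        return Region(self.x + self.w, self.y, width, self.h)


# Toy screen: "OK" appears twice; "L" is a label next to the one we want.
screen = [
    "OK......",
    ".L..OK..",
    "........",
]
whole = Region(0, 0, 8, 3)

first_ok = whole.find(screen, ["OK"])           # whole-screen search hits the wrong one
label = whole.find(screen, ["L"])
chained_ok = label.right(6).find(screen, ["OK"])  # search only to the right of the label

print(first_ok, chained_ok)
```

Because every match is returned as a region, a search can start from a previous match, which both narrows the area scanned and disambiguates repeated targets.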
19. How performant do you want it to be?
A typical call to find() for a 100x100 target on a 1600x1200 screen takes less than 200 msec
Improve your searches
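One way to see why narrowing the search helps is to count the positions a brute-force sliding window would have to test. This is back-of-the-envelope arithmetic only, not a description of how find() is implemented, and the 400x300 region is a hypothetical example.

```python
def candidate_positions(area_w, area_h, target_w, target_h):
    """Number of top-left positions where the target fits inside the area."""
    if target_w > area_w or target_h > area_h:
        return 0
    return (area_w - target_w + 1) * (area_h - target_h + 1)


full = candidate_positions(1600, 1200, 100, 100)  # whole screen
sub = candidate_positions(400, 300, 100, 100)     # hypothetical 400x300 region

print(full)  # 1652601
print(sub)   # 60501
print(round(full / sub, 1))
```

Under this simple model, restricting the search to the smaller region cuts the number of candidate positions by more than an order of magnitude.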
20. Why not give it a try and make your own?
Record-playback
Sikuli Guide
…
21. Ok, the moment of truth!
Screenshots – unstable interfaces
Visibility constraints
A paradigm shift requires a thinking shift