Formal Verification of Web Service Interaction Contracts
1.
2. E-Business Scenario Your server command (process id #20) has been terminated. Re-run your command (severity 13) in /opt/www/your-reliable-eshop.biz/mb_1300_db.mb1 place your order!
3.
4. Transaction recovery is idempotent. However, … Web Client Web Application Server Database Server Timeline Non-idempotent execution ! ACK Purchase Request Order Confirmation Start Transaction SQL Request SQL Response SQL Request SQL Response Commit Transaction ACK Transaction Restart Purchase Request Resubmission
5. Real-World n-Tier Application Expedia Sabre Server Amadeus Expedia App Server Sabre App Server Amadeus App Server Client Web Server DB 1 DB 2 DB 3 DB 4
6.
7.
8. Committed IC Sender [Figure: statechart CIC_SNDR_SC, shown in several build-up steps. Chart elements include STABLE_S, SENDING, INSTALLED_S, RECOVERY, MSG_LOOKUP, PREPARE_PERSISTENCE and SNDR_CRASH. Legend: * EVENT_OK = EVENT LINK_OUTAGE; _TM means TIMEOUT.]
9. Committed IC Receiver [Figure: statechart CIC_RCVR_SC, shown in several build-up steps. Chart elements include MSG_RECOVERY, STABLE_R, INSTALLED_R, MSG_RECEIVED, RECOVERY, MSG_PROCESSED and RCVR_CRASH. Legend: * EVENT_OK = EVENT LINK_OUTAGE; _TM means TIMEOUT.]
10.
11.
12.
13. EOS Demo USER 1 Backend Server Frontend Server B2B_LINK B2C_LINK
14.
15.
16. Statecharts [Harel'87, UML'97] Step-wise refinement INIT END S1 S3 E[C]/A S2 E23/A23 [OK] [!OK]
17. 2PC Message Sequence Coordinator DB i force-log begin Timeline prepare force-log prepared commit force-log commit force-log commit force-log end ack yes
We use the state-and-activity chart language to formally specify the interaction contracts. This language is supported by Statemate, a leading tool for the specification of reactive systems. The specification process begins with an activity chart that provides the functional view of the system. Internal activities are represented by solid-line boxes. Dashed-line boxes specify external activities: the execution environment and external applications. The arrows represent the data flow, and their labels indicate which data or events are concerned. In this concrete scenario we specify an activity ensuring that a message is passed from one CIC component to another according to the CIC rules, in a failure-prone environment that non-deterministically supplies failure events (crashes and link outages). All the application needs to know is that it should activate the "sender trigger" and await an occurrence of the event "message processed". This is important, please remember it. The system administrator specifies the timeout values suitable for the given application, along with some other options. A manager may stop the specification process at this stage. Activities are hierarchical and allow for step-wise refinement. The next employee will say that the behavior of the cic activity is actually controlled by a so-called control activity cic_sc (sc stands for statechart), depicted as a green rounded box, and that it has two further sub-activities, cic_sender and cic_receiver, which exchange the messages and notifications as I described informally before. The behaviors of these sub-activities are defined by their corresponding control activities.
The CIC can be informally described as follows. By sending a message to a different component, the CIC sender commits its state. Usually, it forces the log to disk to make its state and the message recoverable. The sender deterministically tags its message with a unique id, a message sequence number (MSN). The sender keeps sending the message periodically until it gets a stable notification from the receiver. It keeps the message even after that, because the receiver may request it again after a failure. The sender is released from all of its obligations when it gets an installed notification from the receiver. The CIC receiver eliminates message duplicates based on the MSN. It persists an interaction before sending a stable notification to the sender; normally this is done by logging the message header and forcing the log. After a failure, the receiver requests the original message from the sender when its log contains only the message header. The receiver ensures its autonomous recovery by forcing the complete message to disk or by creating an installation point before sending an installed notification to the sender.
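The obligations described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class and method names (Sender, Receiver, commit_and_send, on_stable and so on) are hypothetical, in-memory dictionaries stand in for the forced log on disk, and timeouts and the network are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Sender:
    """CIC sender: keeps a message until the receiver reports it installed."""
    next_msn: int = 0
    log: dict = field(default_factory=dict)    # msn -> payload, stands in for the forced log
    stable: set = field(default_factory=set)   # MSNs confirmed by a stable notification

    def commit_and_send(self, payload):
        msn = self.next_msn                    # deterministic, unique message sequence number
        self.next_msn += 1
        self.log[msn] = payload                # "force the log" before the message leaves
        return msn, payload

    def resend_pending(self):
        """Messages to retransmit periodically: everything not yet stable."""
        return [(m, p) for m, p in sorted(self.log.items()) if m not in self.stable]

    def on_stable(self, msn):
        self.stable.add(msn)                   # stop resending, but keep the message

    def get_message(self, msn):
        return self.log[msn]                   # the receiver may re-request it after a failure

    def on_installed(self, msn):
        self.log.pop(msn, None)                # released from all obligations
        self.stable.discard(msn)


@dataclass
class Receiver:
    """CIC receiver: deduplicates by MSN and persists before notifying."""
    header_log: set = field(default_factory=set)   # stable: logged message headers
    installed: dict = field(default_factory=dict)  # stable: complete messages
    volatile: dict = field(default_factory=dict)   # message bodies lost in a crash

    def on_message(self, msn, payload):
        if msn in self.header_log:
            return "duplicate"                 # duplicate eliminated by MSN
        self.header_log.add(msn)               # persist the interaction first
        self.volatile[msn] = payload
        return "stable"                        # then send the stable notification

    def install(self, msn):
        self.installed[msn] = self.volatile[msn]   # force the complete message
        return "installed"                     # then send the installed notification

    def crash(self):
        self.volatile.clear()

    def recover(self, sender):
        # Only the header survived: fetch the original message from the sender.
        for msn in sorted(self.header_log - set(self.installed)):
            self.volatile[msn] = sender.get_message(msn)
```

The ordering inside on_message and install is the point of the sketch: each notification is returned only after the corresponding information has been made persistent.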
In the end, we learned that we need to compromise between the realism of the models and their verifiability. A web service model that uses integer expressions to generate timeouts periodically, as would happen in a real system, could not be verified. We succeeded after replacing the integer-based timeouts with nondeterministic 1-bit timeouts, which is a more general case. However, no engineering tricks helped to obtain any results for a multi-user model or for the liveness of the single-user model.
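Why the 1-bit timeout is the more general case can be illustrated with a small Python sketch. It is not taken from the Statemate models: the function names are hypothetical, and a timeout is reduced to a boolean per step. Every behavior of a periodic integer-counter timeout is also a behavior of the nondeterministic timeout, so a safety property verified for the latter also holds for the former.

```python
from itertools import product


def counter_timeout_trace(period, steps):
    """Integer model: the timeout fires every `period` steps."""
    return tuple((t + 1) % period == 0 for t in range(steps))


def nondeterministic_timeout_traces(steps):
    """1-bit model: at every step the timeout may or may not fire."""
    return set(product((False, True), repeat=steps))


# Every periodic trace is contained in the nondeterministic trace set.
all_traces = nondeterministic_timeout_traces(6)
covered = all(counter_timeout_trace(p, 6) in all_traces for p in range(1, 7))
```

The price of the abstraction is that the model checker also explores timeout patterns that no real timer would produce.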
We performed measurements to evaluate the overhead of the interaction contracts in a 3-tier application whose structure is similar to that of an eBay-like auction service. The front-end server manages private user settings that are accessed simultaneously without contention. The backend server manages the current highest bids for auction items, which are accessed concurrently. The load was generated by Apache JMeter, a synthetic load generator, running on 5 different machines.
The run-time overhead of EOS-PHP is on average about 100% in terms of both elapsed time and CPU time. At this price we support failure masking, which radically simplifies the development process and provides a correct and highly available service to customers.
I implemented the committed and external interaction contracts for PHP-based Web services. PHP is a scripting language that is embedded into ordinary HTML pages. PHP is interpreted by the Zend engine, which has a great variety of modules extending the capabilities of the PHP language. With PHP we can manage the application state across multiple HTTP requests using the Session module. There are a number of options for invoking remote Web services to build a complex multi-tier application. In my work I concentrated on the CURL module. The reply message of a PHP script is normally an HTML page that is displayed by the browser.
Our prototype implements exactly-once semantics. It delivers the recovery guarantees to the end user by implementing the external and the committed interaction contracts for Internet Explorer. On the PHP side we can recover concurrent requests accessing shared objects. We can recover calls to the nondeterministic functions time and curl_exec and to the random number generator rand. We really do support n-tier applications for any n, with any fanout in the call structure. We have enhanced the performance of the original PHP implementation with regard to disk I/Os and improved its concurrency control. For instance, it is now possible to access the session data read-only.
Before we start with the verification of the ICs, we need some additional definitions. A finite-state computational system, e.g. a Statemate specification, can be represented as a Kripke structure. It contains a finite state transition graph whose nodes are labeled with the atomic propositions that are valid in that node. In a software system, these atomic propositions would refer to individual memory bits. If we unwind the state transition graph, we obtain a computation tree with potentially infinite branches.
A computation tree over the set of atomic propositions P can be characterized by the temporal logic called CTL. Its syntax is inductively defined as shown on this slide. The temporal aspects of the execution paths originating in a given state can be characterized by the path quantifiers Exists and All combined with the temporal modalities Next, Until, Finally, and Globally. The modality Finally means that some property holds eventually. Globally means that a property holds in every state of a path.
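The slide with the grammar is not reproduced in this transcript. For reference, the usual textbook syntax of CTL over atomic propositions p in P can be written as follows; this is a reconstruction and not necessarily the exact form shown on the slide.

```latex
\varphi ::= p \mid \neg\varphi \mid \varphi \lor \varphi
      \mid \mathbf{EX}\,\varphi \mid \mathbf{AX}\,\varphi
      \mid \mathbf{E}[\varphi\,\mathbf{U}\,\varphi]
      \mid \mathbf{A}[\varphi\,\mathbf{U}\,\varphi]

% Finally and Globally are derived modalities:
\mathbf{EF}\,\varphi \equiv \mathbf{E}[\mathit{true}\,\mathbf{U}\,\varphi] \qquad
\mathbf{AF}\,\varphi \equiv \mathbf{A}[\mathit{true}\,\mathbf{U}\,\varphi] \qquad
\mathbf{EG}\,\varphi \equiv \neg\mathbf{AF}\,\neg\varphi \qquad
\mathbf{AG}\,\varphi \equiv \neg\mathbf{EF}\,\neg\varphi
```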
Explicit model checking is a rather simple recursive algorithm with quadratic run-time. There are heuristic solutions using ordered binary decision diagrams, as in Statemate's symbolic model checker. Other model checkers use SAT solvers.
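A minimal sketch of such an explicit checker in Python is given below. It is not Statemate's algorithm. It assumes a Kripke structure given as a state set, a total successor map and a labeling, and it encodes formulas as nested tuples; the names check, succ and label and the three-state toy model are all hypothetical. The naive fixpoint loops may rescan the state set once per newly added or removed state.

```python
def check(formula, states, succ, label):
    """Return the set of states of a Kripke structure that satisfy `formula`.

    formula: nested tuples such as ("EU", ("ap", "p"), ("ap", "q"))
    succ:    dict mapping each state to its set of successors (assumed total)
    label:   dict mapping each state to its set of atomic propositions
    """
    def sat(f):
        return check(f, states, succ, label)

    op = formula[0]
    if op == "true":
        return set(states)
    if op == "ap":
        return {s for s in states if formula[1] in label[s]}
    if op == "not":
        return set(states) - sat(formula[1])
    if op == "or":
        return sat(formula[1]) | sat(formula[2])
    if op == "EX":
        target = sat(formula[1])
        return {s for s in states if succ[s] & target}
    if op == "EU":  # least fixpoint: grow from the states satisfying the right operand
        left, result = sat(formula[1]), sat(formula[2])
        while True:
            new = {s for s in left if succ[s] & result} - result
            if not new:
                return result
            result |= new
    if op == "EG":  # greatest fixpoint: drop states without a successor in the set
        result = sat(formula[1])
        while True:
            keep = {s for s in result if succ[s] & result}
            if keep == result:
                return result
            result = keep
    if op == "EF":
        return sat(("EU", ("true",), formula[1]))
    if op == "AX":
        return sat(("not", ("EX", ("not", formula[1]))))
    if op == "AF":
        return sat(("not", ("EG", ("not", formula[1]))))
    if op == "AG":
        return sat(("not", ("EF", ("not", formula[1]))))
    raise ValueError(f"unknown operator: {op}")


# Toy model (hypothetical): a message moves from sending to stable to installed.
states = {"sending", "stable", "installed"}
succ = {
    "sending": {"sending", "stable"},
    "stable": {"stable", "installed"},
    "installed": {"installed"},
}
label = {"sending": set(), "stable": set(), "installed": {"done"}}

done = ("ap", "done")
can_finish = check(("EF", done), states, succ, label)   # some path reaches "done"
must_finish = check(("AF", done), states, succ, label)  # every path reaches "done"
```

In this toy model can_finish contains all three states, while must_finish contains only "installed", because nothing prevents the self-loops from being taken forever. Liveness properties of this kind typically need additional fairness assumptions.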
To provide recovery guarantees, all Pcoms, such as client and server components, need to be equipped with logging and recovery capabilities. Unlike in database systems, we do not want and do not need to enable undo. Components are piecewise deterministic: they execute deterministically between two consecutive nondeterministic events, such as incoming messages from other components or reading the system clock. So, logging nondeterministic events turns piecewise-deterministic components into truly deterministic ones. We can recreate a Pcom's state and messages by simply replaying the log from some initial state. To accelerate the deterministic replay, the component needs to truncate the log on a regular basis. Before doing this, it has to dump its current state to disk. We call such state dumps "installation points". Our failure model includes crashes of the sending and receiving components as well as network failures causing message losses. Such transient failures are due to nondeterministic so-called Heisenbugs, which cannot be reproduced in order to remove them. We do not consider malicious manipulations, called commission failures. And we do not deal with the corruption of stable storage, as this can be avoided by sufficient replication.
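Logging, replay and installation points can be sketched as follows. This is a simplified Python illustration, not the EOS-PHP implementation: the class Pcom, its methods and the toy state update are hypothetical, in-memory attributes stand in for stable storage, and a random number generator plays the role of the nondeterministic event source.

```python
import random


class Pcom:
    """Piecewise-deterministic component with logging and replay."""

    def __init__(self, rng):
        self.rng = rng
        self.stable_state = 0   # last installation point ("on disk")
        self.stable_log = []    # nondeterministic events since that point ("on disk")
        self.state = 0          # volatile application state

    def handle(self, message, replay_value=None):
        if replay_value is None:
            value = self.rng.randint(0, 9)             # nondeterministic event
            self.stable_log.append((message, value))   # log the message and the result
        else:
            value = replay_value                       # recovery: reuse the logged result
        # Deterministic code between nondeterministic events.
        self.state = self.state * 31 + message + value
        return self.state

    def create_installation_point(self):
        self.stable_state = self.state   # dump the current state first ...
        self.stable_log.clear()          # ... then truncate the log

    def crash(self):
        self.state = None                # volatile state is lost

    def recover(self):
        self.state = self.stable_state
        for message, value in list(self.stable_log):
            self.handle(message, replay_value=value)


component = Pcom(random.Random())
component.handle(5)
component.create_installation_point()
component.handle(7)
state_before_crash = component.handle(11)
component.crash()
component.recover()   # replays two logged events from the installation point
```

After recover, the state equals state_before_crash even though the random values are not known in advance, because replay never consults the random number generator again.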