FIWARE Global Summit - Data Usage Control in FIWARE
1. FIWARE Data Usage Control
Context Management (Core) Chapter
Data/API Management, Publication and Monetization Chapter
Universidad Politécnica de Madrid (ETSIT)
3. Data Access / Usage Control
● Data Access Control:
■ Specifies who can access which resource
■ And which actions they are allowed to perform on it
● Data Usage Control:
■ Ensures data sovereignty
■ Regulates what is allowed to happen with the data (future usage)
■ Related to data ingestion and processing
■ Relevant to intellectual property protection, privacy protection, compliance with regulations, and digital rights management
Source: IDS Reference Architecture Model Version 2.0
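The distinction can be sketched in code: access control is a one-shot, pre-access decision, while usage control keeps evaluating what happens with the data while it is in use. A minimal illustration (the names are ours, not a FIWARE API):

```scala
sealed trait Decision
case object Permit extends Decision
case object Deny extends Decision

// Access control: decided once, before access, from subject/resource/action.
def accessDecision(subject: String, resource: String, action: String,
                   acl: Map[(String, String), Set[String]]): Decision =
  if (acl.getOrElse((subject, resource), Set.empty[String]).contains(action)) Permit
  else Deny

// Usage control: re-evaluated during use; the operations performed ON the data matter.
def usageDecision(operationsSoFar: Seq[String], forbidden: Set[String]): Decision =
  if (operationsSoFar.exists(op => forbidden.contains(op))) Deny else Permit
```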
4. Data Usage Control in FIWARE
Policy definition
We define the FI-UCON model, based on the UCON specification and model.
It defines:
● Obligations
● Authorizations
● Conditions
over data and processing.
Lifecycle (diagram): try access → pre decision (permit access) → start access → ongoing decision, re-evaluated over time (revoke access / end access)
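As a minimal sketch of the pre/ongoing decision lifecycle above (our own names, not the FI-UCON API), the session can be modelled as a small state machine:

```scala
sealed trait UsageState
case object Requested extends UsageState
case object Accessing extends UsageState
case object Revoked extends UsageState
case object Ended extends UsageState

final case class UsageSession(state: UsageState) {
  // Pre decision: "try access" -> "permit access" -> "start access";
  // a denial never lets the session start.
  def tryAccess(preDecisionPermits: Boolean): UsageSession =
    if (state == Requested && preDecisionPermits) copy(state = Accessing)
    else copy(state = Revoked)

  // Ongoing decision: re-evaluated while access lasts; may "revoke access".
  def ongoingCheck(stillCompliant: Boolean): UsageSession =
    if (state == Accessing && !stillCompliant) copy(state = Revoked) else this

  // "end access" terminates a session that was never revoked.
  def endAccess: UsageSession =
    if (state == Accessing) copy(state = Ended) else this
}
```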
5. Data Access Control in FIWARE
Resources protection (diagram): the access-token carried by each request is checked against the user's permissions before the resource is served.
6. Data Usage Control in FIWARE
Proposed scenario
▪ The Security Framework provides Usage Control (FI-UCON)
• For data processed in Big Data components
• Provided by the Orion Context Broker
▪ Usage Control policies are defined using an ODRL-based model extension (through a UI)
• And stored in Keyrock's PAP
▪ Policies are transformed into a program that processes the traces generated by the user's data-processing engines
• And enforces punishments if the user does not comply with the policies (the algebra is transformed into CSP-like behaviour detection)
➔ A user with permissions to access a specific entity in the CB will be able to use it only if compliance with the data usage policies is ensured.
7. Data Usage Control in FIWARE
Policy definition: ODRL 2.2 (W3C)
ODRL is a policy expression language that provides a flexible and interoperable information model, vocabulary, and encoding mechanisms for representing statements about the usage of content and services.
We define our own profile, FI-DUsageML (based on a modified RightsML profile).
Entities:
● Dataset (URL)
● NGSIStream (URL)
● Processing Engines (Apache Flink, Spark/Scala)
8. Data Usage Control in FIWARE
FI-ODRL: an ODRL extension for data processing and data provenance.
An extension of the ODRL 2.2 W3C standard (Open Digital Rights Language) with:
● New vocabulary (based on https://www.w3.org/TR/odrl-vocab/)
● A new profile oriented toward data processing.
This provides an algebraic specification (a labelled transition system) for Obligations and Permissions at a fairly abstract level.
The specification is then translated into an extended-automata processing tool. To keep the implementation simple, we use the Complex Event Processing capabilities of Flink (an FI-ODRL compiler is to be integrated).
This triggers events that prevent the processed data from being delivered or serialized.
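To illustrate the algebraic view, an obligation can be encoded as a labelled transition system and checked against a trace of operations. This is a hypothetical sketch with illustrative names; FI-ODRL's actual encoding is richer:

```scala
// A labelled transition system: states, accepting states, and a partial
// transition function on (state, label) pairs.
final case class LTS(initial: String,
                     accepting: Set[String],
                     delta: Map[(String, String), String]) {
  // Run the trace; an undefined transition means the obligation is violated.
  def accepts(trace: Seq[String]): Boolean =
    trace.foldLeft(Option(initial)) { (st, label) =>
      st.flatMap(s => delta.get((s, label)))
    }.exists(s => accepting.contains(s))
}

// Example obligation: data read from the source must be aggregated
// before it reaches the sink.
val aggregateBeforeSink = LTS(
  initial = "idle",
  accepting = Set("done"),
  delta = Map(
    ("idle", "source")       -> "reading",
    ("reading", "aggregate") -> "aggregated",
    ("aggregated", "sink")   -> "done"
  )
)
```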
9. Data Usage Control in FIWARE
Policy definition: Attributes
● Constraints
● Permissions
● Prohibitions
● Obligations
This is the ODRL 2.2 / RightsML model.
10. Data Usage Control in FIWARE
Reference Architecture Model 1 (diagram)
The Data Controller defines the Access/Usage Control policies, stored as ODRL policies in the PIP/PAP (IdM Keyrock). The Data Provider exposes stored data, "real-time" data, and shared data through storage systems and processing engines. On the Data Consumer side, the PXP/PDP receives the policy rules and takes ongoing usage-control decisions over the traces produced by the data-processing engine.
11. Data Usage Control in FIWARE
Reference Architecture Model 2 (diagram)
Same flow as Model 1, but the Access/Usage Control policies are defined directly in the PDP/PAP (IdM Keyrock), which distributes the ODRL policies as policy rules to the PXP/PDP. The PXP/PDP takes ongoing usage-control decisions over the traces of the data-processing engine on the Data Consumer side.
12. Data Usage Control in FIWARE
Architecture (diagram)
Components: the PDP/PAP (IdM Keyrock) holds the ODRL policies and distributes policy rules; the PXP/PDP runs on Apache Flink, is fed with NGSIv2 notifications and traces, and emits control signals; the FIWARE Context Broker (Orion) sits behind a PEP Proxy (Wilma) for access control; FIWARE DRACO handles persistence. The Data Consumer runs the processing, the Data Provider serves the context data.
13. Data Usage Control in FIWARE
Architecture (detail, diagram)
On the Data Consumer side, a Streaming Job on the Streaming Engine produces Data Events Logs and Execution Graph Logs. On the Data Provider side, the Usage Control block (PXP/PDP and PTP) consumes these logs together with the ODRL policies from the PDP/PAP (Keyrock), following the FI-ODRL specification, and emits control signals.
The ODRL specification is transformed into a PXP (extended automata) execution engine.
14. Data Usage Control in FIWARE
Deployment Diagram
Usage Control runs as a PXP/PDP engine on Apache Flink. The FIWARE Context Broker (Orion), behind a PEP Proxy (Wilma) enforcing access control, delivers NGSI data events. The Data Events Logs and Execution Graph Logs feed the FI-ODRL Policy Translation Point (Extended Automata), which takes the FI-ODRL specification from the IdM (Keyrock) and emits control signals.
15. Data Usage Control in FIWARE
Policies check
Logs used for monitoring and control:
⭓ Execution Logs
The chain of operations to be performed by the program run on the processing engine (Flink, data-user side)
⭓ Events Logs
All the events received at the source of the processing engine (Flink, data-user side)
These events are fed into the FI-ODRL CEP translation to verify their conformance with the specified policy.
This may be integrated with the container log interface or the cluster manager.
16. Data Usage Control in FIWARE
Policies check
■ Execution Logs example:
2019-05-14 11:22:23.820 [flink-akka.actor.default-dispatcher-3] INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source -> Flat
Map -> Map -> Map (1/1)
2019-05-14 11:22:23.993 [flink-akka.actor.default-dispatcher-2] INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph -
TriggerWindow(TumblingProcessingTimeWindows(15000),
AggregatingStateDescriptor{name=window-contents, defaultValue=null,
serializer=org.fiware.cosmos.orion.flink.cep.examples.example1.AveragePrice$$anon$26$$a
non$11@963b52f9}, ProcessingTimeTrigger(),
AllWindowedStream.aggregate(AllWindowedStream.java:475)) -> Sink: Print to Std. Out
(1/1)
■ Execution Graph (fed by the Data Events):
Data Source → FlatMap → Combine → Sink
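The operator chain in an ExecutionGraph log line like the one above can be recovered with a simple parse. This is a hypothetical sketch, not the shipped FI-ODRL log parser:

```scala
// Extract the operator chain from a Flink ExecutionGraph log line,
// e.g. "... ExecutionGraph - Source: Custom Source -> Flat Map -> Map -> Map (1/1)"
def operatorChain(logLine: String): Seq[String] = {
  // Everything after the first " - " separator is the operator chain.
  val chain = logLine.split(" - ", 2).last
  chain.split("->")
    .map(_.trim.replaceAll("""\s*\(\d+/\d+\)$""", "")) // drop "(1/1)" parallelism suffix
    .filter(_.nonEmpty)
    .toSeq
}
```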
18. Data Usage Control in FIWARE
Use case
Cash registers generate tickets and publish purchase data on the CB.
(Diagram: the Cash Registers of Supermarket Store 1 and Supermarket Store 2 publish Ticket entities to the FIWARE Context Broker (Orion) through PEP proxies.)
19. Data Usage Control in FIWARE
Use case
Client A wants to subscribe to the entity that contains the tickets' information.
(Diagram: Client A requests a subscription to processed data; the stores' Cash Registers publish Ticket entities to the FIWARE Context Broker (Orion) through PEP proxies.)
20. Data Usage Control in FIWARE
Use case
Client A deploys a Flink Job that performs analytics on the data received from Orion using the Cosmos connector. All the operations performed and events received are registered in the logs.
(Diagram: Data Processing and Usage Control sit between Client A's subscription to processed data and the FIWARE Context Broker (Orion), which receives Ticket entities from the stores' Cash Registers through PEP proxies.)
21. Data Usage Control in FIWARE
Use case
The logs generated by the Flink Job are sent to the PDP/PXP, which makes sure the operations performed on the data comply with the policies.
(Diagram: same setup as the previous slide, with Data Processing and Usage Control between Client A and the Context Broker.)
22. Data Usage Control in FIWARE
Use case: defining entities and policies
Context broker Entities
Ticket
● date
● client_id
● supermarket_id
● product_list
− description
− n_items
− price
Usage Policies
● The user shall NOT save the data without first aggregating them every 15 seconds, or else the processing job will be terminated
● The user shall NOT receive more than 200 notifications from Orion in a minute, or else the subscription to the entity will be deleted
23. Data Usage Control in FIWARE
Use case implementation: Policy translation
Policy in natural language
● The user shall NOT save the data without first aggregating them every 15 seconds, or else the processing job will be terminated
● The user shall NOT receive more than 200 notifications from Orion in a minute, or else the subscription to the entity will be deleted
{
  "@context": ["http://www.w3.org/ns/odrl.jsonld",
    "http://keyrock.fiware.org/FIDusageML/profile/FIDusageML.jsonld"],
  "@type": "Set",
  "uid": "http://keyrock.fiware.org/FIDusageML/policy:1010",
  "profile": "http://keyrock.fiware.org/FIDusageML/profile/",
  "permission": [{
    "target": "http://orion.fiware.org/NGSInotification",
    "action": "ReadNGSIWindow",
    "constraint": [{
      "leftOperand": "WindowNotification",
      "operator": "gt",
      "rightOperand": { "@value": "3", "@type": "xsd:integer" }
    }, {
      "leftOperand": "WindowNotificationValueSet",
      "operator": "gt",
      "rightOperand": { "@value": "2", "@type": "xsd:integer" }
    }]
  }],
  "prohibition": [{
    "target": "http://orion.fiware.org/NGSInotification",
    "action": "SingleEventProcessing"
  }]
}
24. Data Usage Control in FIWARE
Use case implementation: creating policies
Manage app policies
26. Data Usage Control in FIWARE
Use case implementation: creating policies
Assign policy to role
27. Data Usage Control in FIWARE
Use case implementation: Flink Job (User side)
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time
import org.fiware.cosmos.orion.flink.connector.OrionSource

case class SupermarketProduct(id: String, name: String, price: Float)

val env = StreamExecutionEnvironment.getExecutionEnvironment
// Create Orion Source. Receive notifications on port 9001
val eventStream = env.addSource(new OrionSource(9001))
// Process event stream
val processedDataStream = eventStream
  .flatMap(event => event.entities)
  .map(entity => {
    val id = entity.attrs("_id").value.toString
    val items = entity.attrs("items").value.asInstanceOf[List[Map[String, Any]]]
    items.map(product => {
      val productName = product("desc").asInstanceOf[String]
      val unitPrice = product("net_am").asInstanceOf[Number].floatValue()
      val unitNumber = product("n_unit").asInstanceOf[Number].floatValue()
      SupermarketProduct(id, productName, unitPrice * unitNumber)
    })
  })
  .map(_.map(_.price).sum)
  .timeWindowAll(Time.seconds(15))
  .aggregate(new AverageAggregate)
// Print the results with a single thread, rather than in parallel
processedDataStream.print().setParallelism(1)
env.execute("Supermarket Job")
28. Data Usage Control in FIWARE
Use case implementation: Flink CEP generated code
// First pattern: at least N events within time T.
val countPattern2 = Pattern.begin[Entity]("events")
  .timesOrMore(200).within(Time.seconds(15))
CEP.pattern(entityStream, countPattern2).select(events =>
  Signals.createAlert(Policy.COUNT_POLICY, events, Punishment.UNSUBSCRIBE))

// Second pattern: Source -> Sink without an aggregation TimeWindow in between.
val aggregatePattern = Pattern.begin[ExecutionGraph]("start",
    AfterMatchSkipStrategy.skipPastLastEvent())
  .where(Policies.executionGraphChecker(_, "source"))
  .notFollowedBy("middle").where(Policies.executionGraphChecker(_, "aggregation", 15000))
  .followedBy("end").where(Policies.executionGraphChecker(_, "sink")).timesOrMore(1)
CEP.pattern(operationStream, aggregatePattern).select(events =>
  Signals.createAlert(Policy.AGGREGATION_POLICY, events, Punishment.KILL_JOB))
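The counting policy's intent can also be illustrated outside Flink in plain Scala: flag a violation when more than a maximum number of event timestamps fall inside any sliding time window. Names are illustrative; the real enforcement is the CEP pattern above:

```scala
// Returns true when some window of `windowMillis` starting at an event
// contains more than `maxEvents` events (i.e. the policy is violated).
def violatesCountPolicy(timestamps: Seq[Long],
                        maxEvents: Int,
                        windowMillis: Long): Boolean = {
  val sorted = timestamps.sorted
  sorted.indices.exists { i =>
    val windowEnd = sorted(i) + windowMillis
    sorted.count(t => t >= sorted(i) && t < windowEnd) > maxEvents
  }
}
```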
31. Future work
▪ Consider integration with Apache Atlas and Apache Ranger (evolution of the Cosmos FIWARE GE). These projects are currently centered on batch scenarios.
▪ Propose the FI-ODRL extension to the ODRL 2.2 W3C standard.
▪ Consider the provenance of the data and even provide it as an additional result (even if the policy's denial of execution is not triggered).
▪ Possible integration with the containers' infrastructure to automate log collection and the blocking of execution and serialization.
▪ Ongoing research activity…